Working with Metrics
Sysdig Monitor metrics are divided into two groups: default metrics (out-of-the-box metrics associated with the system, orchestrator, and network infrastructure), and custom metrics (JMX, StatsD, and multiple other integrated application metrics).

Sysdig automatically collects all types of metrics, and auto-labels
them. Custom metrics can also have custom (user-defined) labels.
Out of the box, when an agent is deployed on a host, Sysdig Monitor automatically begins collecting and reporting on a wide array of metrics. The sections below describe how those metrics are conceptualized within the system.
1 - Types of Metrics
This topic introduces you to the types of metrics in Sysdig Monitor.
Default Metrics
Default metrics include various kinds of metadata which Sysdig Monitor
automatically knows how to label, segment, and display.
For example:
System metrics for hosts, containers, and processes (CPU used, etc.)
Orchestrator metrics (collected from Kubernetes, Mesos, etc.)
Network metrics (e.g. network traffic)
HTTP metrics
Platform metrics (in some cases)
Default metrics are collected mainly from two sources: syscalls and
Kubernetes.
Custom Metrics
About Custom Metrics
Custom metrics generally refer to any metrics that the Sysdig Agent
collects from a third-party integration. The type of infrastructure
and applications integrated determine the custom metrics that the Agent
collects and reports to Sysdig Monitor. The supported custom metric
types are StatsD, JMX, Prometheus, and app checks.
Each metric comes with a set of custom labels, and additional labels can
be user-created. Sysdig Monitor simply collects and reports them with
minimal or no internal processing. Use the metrics_filter option in the
dragent.yaml file to remove unwanted metrics, or to choose which metrics
to report when hosts exceed the custom metric limit. For more
information on editing the dragent.yaml file, see Understanding the
Agent Config Files.
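As an illustrative sketch (the metric name patterns are hypothetical), a metrics_filter block in dragent.yaml lists include and exclude rules that are applied in order:

```yaml
metrics_filter:
  # Keep queue metrics, drop the rest of this integration's metrics,
  # and drop anything matching test.* (example patterns only).
  - include: "rabbitmq.queue.*"
  - exclude: "rabbitmq.*"
  - exclude: "test.*"
```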
Unit for Custom Metrics
Sysdig Monitor automatically detects the default unit of a custom metric
from a delimiter suffix in the metric name. For example,
custom_expvar_time_seconds results in a base unit of seconds. The
supported base units are byte, percent, and time. A custom metric name
should carry one of these delimiter suffixes for Sysdig Monitor to
identify and configure the correct unit type.
If this naming convention is not followed, the unit is not auto-detected
and will be incorrect. For instance, custom_byte_expvar carries byte as
a prefix rather than a suffix, so the intended unit (MiB) is not
applied.
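The suffix check can be pictured with a small sketch (the helper function and its unit table are hypothetical, not Sysdig code):

```python
# Hypothetical sketch of suffix-based unit detection, following the
# naming convention described above. Not actual Sysdig agent code.
BASE_UNIT_SUFFIXES = {
    "byte": "byte",
    "bytes": "byte",
    "percent": "percent",
    "seconds": "time",
}

def detect_base_unit(metric_name: str):
    """Return the base unit implied by the metric's trailing suffix,
    or None when the naming convention is not followed."""
    suffix = metric_name.rsplit("_", 1)[-1]
    return BASE_UNIT_SUFFIXES.get(suffix)

detect_base_unit("custom_expvar_time_seconds")  # "time" (seconds)
detect_base_unit("custom_byte_expvar")          # None: "byte" is not a suffix here
```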
Editing the Unit Scale
You can change the unit scale by editing either the panel on a
Dashboard or the metric settings in the Explore module.
Explore
From the Search Metrics and Dashboard drop-down, select the custom
metrics you want to edit the unit selection for, then click More
Options. Select the desired unit scale from the Metric Format
drop-down and click Save.

Dashboard
Select the Dashboard Panel associated with the custom metrics you want to modify. Select the
desired unit scale from the Metrics drop-down and click Save.

Display Missing Data
Data can be missing for a few different reasons, such as metrics that are reported only sporadically or agents that are temporarily disconnected.
Sysdig Monitor allows you to configure the behavior of missing data in
Dashboards. Though metric type determines the default behavior, you can
configure how to visualize missing data and define it at the per-query
level. Use the No Data Display drop-down in the Options menu in
the panel configuration, and the No Data Message text box under the Panel tab. See Create a New Panel for more information.
Consider the following guidelines:
Use the No Data Message text box under the Panel tab to enter a custom message when no data is available
to render on the panels. This custom message, which could include links in markdown format and line breaks,
is shown when queries return no data and reports no errors.
The No Data Display drop-down has only two options for the
Stacked Area timechart: gap and show as zero.
For form-based timechart panels, the default option for a metrics
selection that does not contain a StatsD metric is gap.
Adding a StatsD metric to a query in a form-based timechart panel
defaults the selected No Data Display type to show as zero, which is
the default option for form-based StatsD metrics. You can change this
selection to any other type.
The default display option is gap for PromQL Timechart panels.
The options for No Data Display are:
gap: The default option for form-based timechart panels where the
query's metrics selection does not contain a StatsD metric. gap is
the best visualization type for most use cases because a gap is easy
to spot and indicates a potential problem.

show as zero: The best option for StatsD metrics which are only
submitted sporadically. For example, batch jobs and count of errors.
This is the default display option for StatsD metrics in form-based
panels.

For gauge-style metrics, however, showing zero can be misleading. For
example, this setting would report the value for free disk space as 0%
when the disk or host disappears, while in reality the value is
unknown.
connect - solid: Use this option for measuring the value of a metric,
typically a gauge, where you want missing samples flattened into a
solid connecting line.

The leftmost and rightmost visible data points are connected directly,
as Sysdig does not perform interpolation.
connect - dotted: The same as connect - solid, except that missing
samples are drawn as a dotted line rather than a solid one.
2 - Using Labels
Data aggregation and filtering in Sysdig Monitor are done through the use of assigned labels. The sections below explain how labels work, the ways they can be used, and how to work with groupings, scopes, and segments.
Labels are used to identify and differentiate characteristics of a
metric, allowing them to be aggregated or filtered for Explore module
views, dashboards, alerts, and captures. Labels can be used in different
ways:
To group infrastructure objects into logical hierarchies displayed
on the Explore tab (called groupings). For more information, refer
to Groupings.
To split aggregated data into segments. For more information, refer
to Segments.
To define the scope of dashboards, panels, alerts, and teams. For
more information, refer to Scopes.

There are two types of labels:
Infrastructure labels
Metric descriptor labels
Infrastructure Labels
Infrastructure labels are used to identify objects or entities within
the infrastructure that a metric is associated with, including hosts,
containers, and processes. An example label is kubernetes.pod.name in
Sysdig notation (the equivalent in Prometheus notation is
kube_pod_name).
The table below outlines what each part of the label represents:
Example Label Component | Description |
---|---|
kubernetes | The infrastructure type. |
pod | The object. |
name | The label key. |
Infrastructure labels are obtained from the infrastructure (including
from orchestrators, platforms, and the runtime processes), and Sysdig
automatically builds a relationship model using the labels. This allows
users to create logical hierarchical groupings to better aggregate the
infrastructure objects in the Explore module.
For more information, refer to Groupings.
Metric Descriptor Labels
Metric descriptor labels are custom descriptors or key-value pairs
applied directly to metrics, obtained from integrations like StatsD,
Prometheus, and JMX. Sysdig automatically collects custom metrics from
these integrations, and parses the labels from them. Unlike
infrastructure labels, these labels can be arbitrary, and do not
necessarily map to any entity or object.
Metric descriptor labels can only be used for segmenting, not grouping
or scoping.
An example metric descriptor label is shown below:
website_failedRequests:20|region='Asia', customer_ID='abc'
The table below outlines what each part of the label represents:
Example Label Component | Description |
---|---|
website_failedRequests | The metric name. |
20 | The metric value. |
region='Asia', customer_ID='abc' | The metric descriptor labels. Multiple key-value pairs can be assigned using a comma-separated list. |
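To make the anatomy concrete, here is a small sketch that splits a line in the illustrated format into its three parts (the parser is hypothetical, for illustration only, and not part of any Sysdig tooling):

```python
def parse_custom_metric(line: str):
    """Split a line like name:value|k='v', k2='v2' into its parts.
    Hypothetical helper for illustration only."""
    name_value, _, label_str = line.partition("|")
    name, _, value = name_value.partition(":")
    labels = {}
    for pair in label_str.split(","):
        key, _, val = pair.strip().partition("=")
        if key:
            labels[key] = val.strip("'")
    return name, int(value), labels

parse_custom_metric("website_failedRequests:20|region='Asia', customer_ID='abc'")
# → ("website_failedRequests", 20, {"region": "Asia", "customer_ID": "abc"})
```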
Sysdig recommends not using labels to store dimensions with high
cardinalities (numerous different label values), such as user IDs, email
addresses, URLs, or other unbounded sets of values. Each unique
key-value label pair represents a new time series, which can
dramatically increase the amount of data stored.
Groupings
Groupings are hierarchical organizations of labels, allowing users to
organize their infrastructure views on the Explore tab in a logical
hierarchy. An example grouping is shown below:

The example above groups the infrastructure into four levels. This
results in a tree view in the Explore module with four levels, with rows
for each infrastructure object applicable to each level.
As each label is selected, Sysdig Monitor automatically filters out
labels for the next selection that no longer fit the hierarchy, to
ensure that only logical groupings are created.
The example below shows the logical hierarchy structure for Kubernetes:
Clusters: Cluster > Namespace > Replicaset > Pod
Namespace: Cluster > Namespace > HorizontalPodAutoscaler > Deployment > Pod
Daemonsets: Cluster > Namespace > Daemonsets > Pod
Services: Cluster > Namespace > Service > StatefulSet > Pod
Job: Cluster > Namespace > Job > Pod
ReplicationController: Cluster > Namespace > ReplicationController > Pod

The default groupings are immutable: They cannot be modified or deleted.
However, you can make a copy of them that you can modify.
Unified Workload Labels
Sysdig provides the following labels to improve your infrastructure
organization and make troubleshooting easier.
kubernetes_workload_name: Displays the name of each Kubernetes
workload and indicates what type of workload resource (deployment,
daemonSet, replicaSet, and so on) it is.
kubernetes_workload_type: Indicates what type of workload resource
(deployment, daemonSet, replicaSet, and so on) it is.

The availability of these labels also simplifies Groupings: you do not
need a different grouping for each type of deployment; instead, you
have a single grouping for workloads.
The labels allow you to segment metrics, such as
sysdig_host_cpu_cores_used_percent, by kubernetes_workload_name to see
CPU core usage for all the workloads, instead of having a separate
query for segmenting by kubernetes_deployment_name,
kubernetes_replicaSet_name, and so on.
Scopes
A scope is a collection of labels that are used to filter out or define
the boundaries of a group of data points when creating dashboards,
dashboard panels, alerts, and teams. An example scope is shown below:

In the example above, the scope is defined by two labels with operators
and values defined. The table below defines each of the available
operators.
Operator | Description |
---|---|
is | The value matches the defined label value exactly. |
is not | The value does not match the defined label value exactly. |
in | The value is among the comma separated values entered. |
not in | The value is not among the comma separated values entered. |
contains | The label value contains the defined value. |
does not contain | The label value does not contain the defined value. |
starts with | The label value starts with the defined value. |
The scope editor provides dynamic filtering capabilities. It restricts
the scope of the selection for subsequent filters by rendering valid
values that are specific to the previously selected label. Expand the
list to view unfiltered suggestions. At run time, users can also supply
custom values to achieve more granular filtering. The custom values are
preserved. Note that changing a label higher up in the hierarchy might
render the subsequent labels incompatible. For example, changing the
kubernetes_namespace_name > kubernetes_deployment_name hierarchy to
swarm_service_name > kubernetes_deployment_name is invalid, as these
entities belong to different orchestrators and cannot be logically
grouped.
Dashboards and Panels
Dashboard scopes define the criteria for what metric data will be
included in the dashboard’s panels. The current dashboard’s scope can be
seen at the top of the dashboard:

By default, all dashboard panels abide by the scope of the overall
dashboard. However, an individual panel scope can be configured for a
different scope than the rest of the dashboard.
For more information on Dashboards and Panels, refer to the
Dashboards documentation.
Alerts
Alert scopes are defined during the creation process, and specify what
areas within the infrastructure the alert is applicable for. In the
example alerts below, the first alert has a scope defined, whereas the
second alert does not have a custom scope defined. If no scope is
defined, the alert is applicable to the entire infrastructure.

For more information on Alerts, refer to the
Alerts documentation.
Teams
A team’s scope determines the highest level of data that team members
have visibility for:
If a team’s scope is set to Host, team members can see all host-level
and container-level information.
If a team’s scope is set to Container, team members can only see
container-level information.
A team’s scope only applies to that team. Users that are members of
multiple teams may have different visibility depending on which team is
active.
For more information on teams and configuring team scope, refer to the
Manage Teams and Roles
documentation.
Segments
Aggregated data can be split into smaller sections by segmenting the
data with labels. This allows for the creation of multi-series
comparisons and multiple alerts. In the first image, the metric is not
segmented:

In the second image, the same metric has been segmented by
container_id:

Line and Area panels can display any number of segments for any given
metric. The example image below displays the
sysdig_connection_net_in_bytes metric segmented by both container_id
and host_hostname:

For more information regarding segmentation in dashboard panels, refer
to the Configure Panels
documentation. For more information regarding configuring alerts, refer
to the Alerts
documentation.
The Meaning of n/a
Sysdig Monitor imports data related to entities such as hosts,
containers, processes, and so on, and reports them in tables or panels
on the Explore and Dashboards UI, as well as in events, so across the UI
you see varieties of data. The term n/a can appear anywhere on the UI
where some form of data is displayed.
n/a is a term that indicates data that is not available or that it does
not apply to a particular instance. In Sysdig parlance, the term
signifies one or more entities defined by a particular label, such as
hostname or Kubernetes service, for which the label is invalid. In other
words, n/a collectively represent entities whose metadata is not
relevant to aggregation and filtering techniques—Grouping, Scoping, and
Segmenting. For instance, a list of Kubernetes services might display
the list of all the services as well as n/a that includes all the
containers without the metadata describing a Kubernetes service.
You might encounter n/a sporadically in the Explore UI as well as in
drill-down panels, dashboards, events, and elsewhere on the Sysdig
Monitor UI when no relevant metadata is available for that particular
display. How n/a should be treated depends on the nature of your
deployment; the deployment itself is not affected by the entities
marked n/a.
The following are some of the cases that yield n/a on the UI:
Labels are partially available or not available. For example, a host
has entities that are not associated with a monitored Kubernetes
deployment, or a monitored host has an unmonitored Kubernetes
deployment running.
Labels that do not apply to the grouping criteria or at a given
hierarchy level. For example:
Containers that are not managed by Kubernetes. The containers managed
by Kubernetes are identified by their container_name labels.
In certain groupings by DaemonSet, Deployments render n/a, and vice
versa. Not all containers belong to both DaemonSet and Deployment
objects concurrently. Likewise, a Kubernetes ReplicaSet grouping with
the kubernetes_replicaset_name label will not show StatefulSets.
In a kubernetes_cluster_name > kubernetes_namespace_name > kubernetes_deployment_name
grouping, the entities without the kubernetes_cluster_name label yield
n/a.
Entities are incorrectly labeled in the infrastructure.
Kubernetes features that are not yet in sync with Sysdig Monitor.
The format is not applicable to a particular record in the database.
3 - Data Aggregation
Sysdig Monitor allows you to adjust the aggregation settings when graphing or creating alerts for a metric, informing how Sysdig rolls up the available data samples in order to create the chart or evaluate the alert. This topic familiarizes you with aggregation concepts and settings, and explains some of the mechanics Sysdig uses to allow for efficient query performance and data retention.
Data Aggregation Concepts
Data Sampling
Sysdig agents collect 1-second samples and report data at 10-second resolution, the lowest resolution at which the backend stores data. To do so, the agent downsamples the 1-second samples into 10-second samples.
Note: This is true for all metrics except Prometheus metrics, for which data is sampled every 1 second but the value reported for the 10-second interval is the latest value, not a downsampled aggregate.
Samples are initially stored at the lowest supported resolution of 10 seconds, after which they are periodically rolled up to coarser timelines as new data arrives. For example, the data registered every 10 seconds is rolled up into blocks of 1-minute intervals, and the data stored in 1-minute blocks is rolled up into 10-minute blocks.
Downsampling
Downsampling refers to the process of aggregating multiple samples over a defined time interval into a set of values that estimate the aggregated time range. In Sysdig parlance, downsampling is simply the data aggregation performed by the backend before exposing it as time aggregation on the UI or through the API. In effect, the data available for time aggregation during query evaluation is not the raw data, but values that represent estimates for the given time range.
Reducing the number of samples helps reduce data retention costs and improves query performance by reducing the amount of data loaded during query evaluation.
Downsampled data is used only for longer time ranges. If you are viewing recent data, such as the last 10 minutes or the last hour, raw data is used for evaluation.
Data Rollup
Sysdig Monitor rolls up historical data over time.
Sysdig downsampling produces data rollups of aggregated samples. In each data rollup, Sysdig calculates and records four values: maximum, minimum, sum, and count. These values allow for exposing the following time aggregations on the UI as well as through the APIs: max, min, sum, count, avg, rate, and rateOfChange.
For example, the data collected every 10 seconds is aggregated and rolled up into blocks of 1-minute intervals. From the values recorded in the 1-minute rollups, data is rolled up again into blocks of 10-minute intervals.
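The recorded values can be sketched as follows (the sample numbers are hypothetical):

```python
# Sketch of a rollup as described above: each block records four values,
# from which derived aggregations such as the average are computed.
def rollup(samples):
    return {
        "maximum": max(samples),
        "minimum": min(samples),
        "sum": sum(samples),
        "count": len(samples),
    }

# Six 10-second samples form one 1-minute rollup.
minute = rollup([12, 18, 15, 20, 9, 16])
average = minute["sum"] / minute["count"]  # 90 / 6 = 15.0
```

Because sum and count are stored, the average never has to be recorded separately; it is derived at query time.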
Data Resolution
Data resolution is the frequency with which the data is displayed. Sysdig Monitor supports the data resolution of 10 seconds, 1 minute, 10 minutes, 1 hour, and 1 day.
Time and Group Aggregations
There are two forms of aggregation used for metrics in Sysdig: time aggregation and group aggregation. Time aggregation is always performed before group aggregation.
Time Aggregation
Time aggregation comes into effect in two situations (that can sometimes overlap):
- Aggregation: Graphs can only render a limited number of data points. To look at a wide range of data, Sysdig Monitor aggregates granular data into larger blocks of samples for visualization, as described in Downsampling.
- Data Rollup: Sysdig retains rollups based on each aggregation type to allow users to choose which data points to utilize when evaluating older data.
Aggregation Types
Aggregation Type | Description |
---|---|
average | The average of the retrieved metric values across the time period. |
rate | The sum of the retrieved metric values divided by the number of expected samples in the time period (see below). |
maximum | The highest value during the time period evaluated. |
minimum | The lowest value during the time period evaluated. |
sum | The combined sum of the metric across the time period evaluated. |
Difference Between Rate and Average
Rate and average are very similar and often provide the same result.
However, the calculation of each is different.
If time aggregation is set to one minute, the agent is expected to
retrieve six samples (one every 10 seconds). In some cases, samples may
be missing due to disconnections or other circumstances. Suppose four
samples are available: the average would be calculated by dividing by
four, while the rate would be calculated by dividing by six.
Most metrics are sampled once for each time interval, resulting in
average and rate returning the same value. However, there will be a
distinction for any metric not reported at every time interval, for
example, some custom StatsD metrics.
Rate is currently referred to as timeAvg in the Sysdig Monitor API and
the advanced alerting language.
By default, average is used when displaying data points for a time
interval.
Time Aggregation on the UI
On the Sysdig Monitor UI, you select the time aggregation from the Metric drop-down.
Depending on the time range you have selected, how old the data is, and what the resolution is, panels display data at a granularity of 10 seconds, 1 minute, 10 minutes, 1 hour, or 1 day.
The data drawn at 10-second resolution is reported every 10 seconds with the available aggregations (average, rate, min, max, sum), making them available via the Sysdig Monitor UI and the API. For time series panels covering 5 minutes or less, data points are drawn at this 10-second resolution, and any time aggregation selection has no effect.
When a panel displays an amount of time greater than 5 minutes, data points are drawn as an aggregate over an appropriate time interval. For example, for a panel covering 1 hour, each data point would reflect a 1-minute interval.
At time intervals of 1 minute and above, charts can be configured to display different aggregates for the 10-second metrics used to calculate each datapoint.
Time Aggregation and Time Range Mapping on the UI
Aggregation Interval | Time Range |
---|---|
10 seconds | 10 Minutes |
1 minute | 1 Hour |
10 minutes | 6 Hours, 12 Hours |
1 hour | 1 Day, 4 Days, 1 Week |
1 day | 2 Weeks |
Group Aggregation
Metrics applied to a group of items (for example, several containers, hosts, or nodes) are averaged between the members of the group by default. For example, three hosts report different CPU usage for one sample interval. The three values will be averaged, and reported on the chart as a single datapoint for that metric.
There are several different types of group aggregation:
Aggregation Type | Description |
---|---|
average | The average value of the interval’s samples. |
maximum | The maximum value of the interval’s samples. |
minimum | The minimum value of the interval’s samples. |
sum | The sum of values of all of the interval’s samples. |
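The aggregation types above can be sketched as follows (the CPU values are hypothetical):

```python
# Sketch of group aggregation: one value per group member for a sample
# interval is reduced to a single datapoint, averaged by default.
GROUP_AGGREGATIONS = {
    "average": lambda values: sum(values) / len(values),
    "maximum": max,
    "minimum": min,
    "sum": sum,
}

def aggregate_group(values, how="average"):
    return GROUP_AGGREGATIONS[how](values)

cpu_percent = [30.0, 60.0, 90.0]  # three hosts, one sample interval
aggregate_group(cpu_percent)         # 60.0, the single charted datapoint
aggregate_group(cpu_percent, "sum")  # 180.0
```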
If a chart or alert is segmented, the group aggregation settings are
used both for aggregation across the whole group and for aggregation
within each individual segment.
For example, the image below shows a chart for CPU% across the infrastructure:

When segmented by proc_name, the chart shows one CPU% line for each
process:

Each line provides the average value for every process with the same
name. To see the difference, change the group aggregation type to sum:

The metric aggregation value shown beside the metric name is for the
time aggregation. While the screenshot shows AVG, the group aggregation
is set to SUM.
Aggregation Examples
The tables below provide an example of how each type of aggregation
works. The first table provides the metric data, while the second
displays the resulting value for each type of aggregation.

In the example below, the CPU% metric is applied to a group of servers
called webserver. The first chart shows metrics using average
aggregation for both time and group. The second chart shows the metrics
using maximum aggregation for both time and group.

For each one-minute interval, the second chart renders the highest CPU
usage value found from the servers in the webserver group and from all
of the samples reported during the one-minute interval. This view can
be useful when searching for transient spikes in metrics over long
periods of time that would otherwise be missed with average
aggregation.
The group aggregation type is dependent on the segmentation. For a view
showing metrics for a group of items, the current group aggregation
setting will revert to the default setting if the Segment By selection
is changed.
4 - Metric Limits
Metric limits determine the number of custom time series ingested by the Sysdig agent. This is primarily a tool for limiting each user's cost exposure, but it also affects the total number of time series available for tracking metrics.
The Sysdig agent metric limit is different from the entitlement limit
imposed on custom time series. Your time series entitlement could be
lower than agent metric limits. For more information, see Time Series
Billing.
View Metric Limits
The metric limits are automatically defined by Sysdig backend components
based on your plan, agent version, and backend configuration. Metric limits
are set per-category, and when aggregated the per-category limits define
your overall metric limit per agent. Metric limits are global per account
and the same limit will apply to each agent within a Sysdig account.
Use the Sysdig Agent Health & Status dashboard under Host
Infrastructure templates to view per-category metric limits for your
account, along with the current usage per host for each metric type.
Contact Sysdig Support to adjust metric limits for any category.

Metrics | Description |
---|---|
statsd_dragent_metricCount_limit_appCheck | The maximum number of unique appCheck timeseries that are allowed in an individual sample from the agent per node. |
statsd_dragent_metricCount_limit_statsd | The maximum number of unique statsd timeseries that are allowed in an individual sample from the agent per node. |
statsd_dragent_metricCount_limit_jmx | The maximum number of unique JMX timeseries that are allowed in an individual sample from the agent per node. |
statsd_dragent_metricCount_limit_prometheus | The maximum number of unique Prometheus timeseries that are allowed in an individual sample from the agent per node. |
5 - Manage Metric Scale
Sysdig provides several knobs for managing metric scale. This topic introduces the primary ways you can include or exclude metrics should you run up against metric limits.
Include/exclude custom metrics by name filters.
See Include/Exclude Custom Metrics.
Include/exclude metrics emitted by certain containers, Kubernetes annotations, or any other container label at collection time. See Prioritize/Include/Exclude Designated Containers.
Exclude metrics from unwanted ports. See Blacklist Ports.
6 - Metrics Library
The Sysdig metrics dictionary lists all the metrics, both in Sysdig legacy and Prometheus-compatible notation, supported by the Sysdig product suite, as well as kube state and cloud provider metrics. The Metrics Dictionary is a living document and is updated as new metrics are added to the product.
6.1 - Metrics and Labels Mapping
This topic outlines the mapping between the metrics and label naming conventions in the Sysdig legacy datastore and the new Sysdig datastore.
6.1.1 - Mapping Classic Metrics with Context-Specific PromQL Metrics
Sysdig classic metrics, such as cpu.used.percent, previously returned values from a process, container, or host depending on the query segmentation or scope. You can now use context-explicit metrics, which align with the flat model and resource-specific semantics of the Prometheus naming schema.
Your existing dashboards and alerts will be automatically migrated to the new naming convention.
Sysdig Classic Metrics | Context-Specific Metrics in Prometheus Notation |
---|---|
cpu.cores.used | sysdig_container_cpu_cores_used sysdig_host_cpu_cores_used sysdig_program_cpu_cores_used |
cpu.cores.used.percent | sysdig_container_cpu_cores_used_percent sysdig_host_cpu_cores_used_percent sysdig_program_cpu_cores_used_percent |
cpu.used.percent | sysdig_container_cpu_used_percent sysdig_host_cpu_used_percent sysdig_program_cpu_used_percent |
fd.used.percent | sysdig_container_fd_used_percent sysdig_host_fd_used_percent sysdig_program_fd_used_percent |
file.bytes.in | sysdig_container_file_in_bytes sysdig_host_file_in_bytes sysdig_program_file_in_bytes |
file.bytes.out | sysdig_container_file_out_bytes sysdig_host_file_out_bytes sysdig_program_file_out_bytes |
file.bytes.total | sysdig_container_file_total_bytes sysdig_host_file_total_bytes sysdig_program_file_total_bytes |
file.error.open.count | sysdig_container_file_error_open_count sysdig_host_file_error_open_count sysdig_program_file_error_open_count |
file.error.total.count | sysdig_container_file_error_total_count sysdig_host_file_error_total_count sysdig_program_file_error_total_count |
file.iops.in | sysdig_container_file_in_iops sysdig_host_file_in_iops sysdig_program_file_in_iops |
file.iops.out | sysdig_container_file_out_iops sysdig_host_file_out_iops sysdig_program_file_out_iops |
file.iops.total | sysdig_container_file_total_iops sysdig_host_file_total_iops sysdig_program_file_total_iops |
file.open.count | sysdig_container_file_open_count sysdig_host_file_open_count sysdig_program_file_open_count |
file.time.in | sysdig_container_file_in_time sysdig_host_file_in_time sysdig_program_file_in_time |
file.time.out | sysdig_container_file_out_time sysdig_host_file_out_time sysdig_program_file_out_time |
file.time.total | sysdig_container_file_total_time sysdig_host_file_total_time sysdig_program_file_total_time |
fs.bytes.free | sysdig_container_fs_free_bytes sysdig_fs_free_bytes sysdig_host_fs_free_bytes |
fs.bytes.total | sysdig_container_fs_total_bytes sysdig_fs_total_bytes sysdig_host_fs_total_bytes |
fs.bytes.used | sysdig_container_fs_used_bytes sysdig_fs_used_bytes sysdig_host_fs_used_bytes |
fs.free.percent | sysdig_container_fs_free_percent sysdig_fs_free_percent sysdig_host_fs_free_percent |
fs.inodes.total.count | sysdig_container_fs_inodes_total_count sysdig_fs_inodes_total_count sysdig_host_fs_inodes_total_count |
fs.inodes.used.count | sysdig_container_fs_inodes_used_count sysdig_fs_inodes_used_count sysdig_host_fs_inodes_used_count |
fs.inodes.used.percent | sysdig_container_fs_inodes_used_percent sysdig_fs_inodes_used_percent sysdig_host_fs_inodes_used_percent |
fs.largest.used.percent | sysdig_container_fs_largest_used_percent sysdig_host_fs_largest_used_percent |
fs.root.used.percent | sysdig_container_fs_root_used_percent sysdig_host_fs_root_used_percent |
fs.used.percent | sysdig_container_fs_used_percent sysdig_fs_used_percent sysdig_host_fs_used_percent |
host.error.count | sysdig_container_syscall_error_count sysdig_host_syscall_error_count |
info | sysdig_agent_info sysdig_container_info sysdig_host_info |
memory.bytes.total | sysdig_host_memory_total_bytes sysdig_container_memory_used_bytes sysdig_host_memory_used_bytes sysdig_program_memory_used_bytes |
memory.bytes.virtual | sysdig_container_memory_virtual_bytes sysdig_host_memory_virtual_bytes |
memory.swap.bytes.used | sysdig_container_memory_swap_used_bytes sysdig_host_memory_swap_used_bytes |
memory.used.percent | sysdig_container_memory_used_percent sysdig_host_memory_used_percent |
net.bytes.in | sysdig_connection_net_in_bytes sysdig_container_net_in_bytes sysdig_host_net_in_bytes sysdig_program_net_in_bytes |
net.bytes.out | sysdig_connection_net_out_bytes sysdig_container_net_out_bytes sysdig_host_net_out_bytes sysdig_program_net_out_bytes |
net.bytes.total | sysdig_connection_net_total_bytes sysdig_container_net_total_bytes sysdig_host_net_total_bytes sysdig_program_net_total_bytes |
net.connection.count.in | sysdig_connection_net_connection_in_count sysdig_container_net_connection_in_count sysdig_host_net_connection_in_count sysdig_program_net_connection_in_count |
net.connection.count.out | sysdig_connection_net_connection_out_count sysdig_container_net_connection_out_count sysdig_host_net_connection_out_count sysdig_program_net_connection_out_count |
net.connection.count.total | sysdig_connection_net_connection_total_count sysdig_container_net_connection_total_count sysdig_host_net_connection_total_count sysdig_program_net_connection_total_count |
net.error.count | sysdig_container_net_error_count sysdig_host_net_error_count sysdig_program_net_error_count |
net.request.count | sysdig_connection_net_request_count sysdig_container_net_request_count sysdig_host_net_request_count sysdig_program_net_request_count |
net.request.count.in | sysdig_connection_net_request_in_count sysdig_container_net_request_in_count sysdig_host_net_request_in_count sysdig_program_net_request_in_count |
net.request.count.out | sysdig_connection_net_request_out_count sysdig_container_net_request_out_count sysdig_host_net_request_out_count sysdig_program_net_request_out_count |
net.request.time | sysdig_connection_net_request_time sysdig_container_net_request_time sysdig_host_net_request_time sysdig_program_net_request_time |
net.request.time.in | sysdig_connection_net_request_in_time sysdig_container_net_request_in_time sysdig_host_net_request_in_time sysdig_program_net_request_in_time |
net.request.time.out | sysdig_connection_net_request_out_time sysdig_container_net_request_out_time sysdig_host_net_request_out_time sysdig_program_net_request_out_time |
net.server.bytes.in | sysdig_container_net_server_in_bytes sysdig_host_net_server_in_bytes |
net.server.bytes.out | sysdig_container_net_server_out_bytes sysdig_host_net_server_out_bytes |
net.server.bytes.total | sysdig_container_net_server_total_bytes sysdig_host_net_server_total_bytes |
net.sql.error.count | sysdig_container_net_sql_error_count sysdig_host_net_sql_error_count |
net.sql.request.count | sysdig_container_net_sql_request_count sysdig_host_net_sql_request_count |
net.tcp.queue.len | sysdig_container_net_tcp_queue_len sysdig_host_net_tcp_queue_len sysdig_program_net_tcp_queue_len |
proc.count | sysdig_container_proc_count sysdig_host_proc_count sysdig_program_proc_count |
thread.count | sysdig_container_thread_count sysdig_host_thread_count sysdig_program_thread_count |
uptime | sysdig_container_up sysdig_host_up sysdig_program_up |
6.1.2 - Mapping Classic Metrics with PromQL Metrics
Starting with SaaS v3.2.6, Sysdig classic metrics and labels have been renamed to align with the Prometheus naming convention. The two systems organize metrics differently: Sysdig classic metrics use a dot-separated hierarchy, whereas Prometheus uses flat metric names with labels. The table below helps you identify the Prometheus metrics and labels and the corresponding ones in the Sysdig classic system.
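The renaming follows a broadly consistent pattern: a dot-separated classic name becomes an underscore-separated Prometheus name, prefixed with `sysdig_` and a scope such as `host` or `container`, with direction qualifiers (`in`, `out`, `total`) moved ahead of the unit. The sketch below illustrates that pattern only; the authoritative mapping is the table that follows, and some metrics (for example, `uptime` → `sysdig_host_up`) do not follow the simple rule.

```python
# Illustrative sketch of the classic -> Prometheus renaming pattern.
# Not an official converter: consult the mapping table for exact names.
def classic_to_prometheus(classic_name: str, scope: str = "host") -> str:
    """Convert a dot-separated classic name such as 'net.bytes.in'
    into the Prometheus-style name 'sysdig_host_net_in_bytes'."""
    parts = classic_name.split(".")
    # Classic names put the qualifier last (bytes.in); Prometheus names
    # put the unit last (in_bytes), so swap the final two components.
    if len(parts) >= 2 and parts[-1] in ("in", "out", "total", "free", "used"):
        parts = parts[:-2] + [parts[-1], parts[-2]]
    return f"sysdig_{scope}_" + "_".join(parts)

print(classic_to_prometheus("net.bytes.in"))                # sysdig_host_net_in_bytes
print(classic_to_prometheus("fs.bytes.free", "container"))  # sysdig_container_fs_free_bytes
```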
host | info | sysdig_host_info | Not exposed | host_mac host instance_id agent_tag_{*} | host.mac host.hostName host.instanceId agent.tag.{*} |
| | sysdig_cloud_provider_info | | host_mac provider_id account_id region availability_zone instance_type tag_{*} security_groups host_ip_public host_ip_private host_name name | host.mac cloudProvider.id cloudProvider.account.id cloudProvider.region cloudProvider.availabilityZone cloudProvider.instance.type cloudProvider.tag.{*} cloudProvider.securityGroups cloudProvider.host.ip.public cloudProvider.host.ip.private cloudProvider.host.name cloudProvider.name |
| data | sysdig_host_cpu_used_percent | cpu.used.percent | | | |
| | sysdig_host_cpu_cores_used | cpu.cores.used | | |
| | sysdig_host_cpu_user_percent | cpu.user.percent | | |
| | sysdig_host_cpu_idle_percent | cpu.idle.percent | | |
| | sysdig_host_cpu_iowait_percent | cpu.iowait.percent | | |
| | sysdig_host_cpu_nice_percent | cpu.nice.percent | | |
| | sysdig_host_cpu_stolen_percent | cpu.stolen.percent | | |
| | sysdig_host_cpu_system_percent | cpu.system.percent | | |
| | sysdig_host_fd_used_percent | fd.used.percent | | |
| | sysdig_host_file_error_open_count | file.error.open.count | | |
| | sysdig_host_file_error_total_count | file.error.total.count | | |
| | sysdig_host_file_in_bytes | file.bytes.in | | |
| | sysdig_host_file_in_iops | file.iops.in | | |
| | sysdig_host_file_in_time | file.time.in | | |
| | sysdig_host_file_open_count | file.open.count | | |
| | sysdig_host_file_out_bytes | file.bytes.out | | |
| | sysdig_host_file_out_iops | file.iops.out | | |
| | sysdig_host_file_out_time | file.time.out | | |
| | sysdig_host_load_average_15m | load.average.15m | | |
| | sysdig_host_load_average_1m | load.average.1m | | |
| | sysdig_host_load_average_5m | load.average.5m | | |
| | sysdig_host_memory_available_bytes | memory.bytes.available | | |
| | sysdig_host_memory_total_bytes | memory.bytes.total | | |
| | sysdig_host_memory_used_bytes | memory.bytes.used | | |
| | sysdig_host_memory_swap_available_bytes | memory.swap.bytes.available | | |
| | sysdig_host_memory_swap_total_bytes | memory.swap.bytes.total | | |
| | sysdig_host_memory_swap_used_bytes | memory.swap.bytes.used | | |
| | sysdig_host_memory_virtual_bytes | memory.bytes.virtual | | |
| | sysdig_host_net_connection_in_count | net.connection.count.in | | |
| | sysdig_host_net_connection_out_count | net.connection.count.out | | |
| | sysdig_host_net_error_count | net.error.count | | |
| | sysdig_host_net_in_bytes | net.bytes.in | | |
| | sysdig_host_net_out_bytes | net.bytes.out | | |
| | sysdig_host_net_tcp_queue_len | net.tcp.queue.len | | |
| | sysdig_host_proc_count | proc.count | | |
| | sysdig_host_system_uptime | system.uptime | | |
| | sysdig_host_thread_count | thread.count | | |
container | info | sysdig_container_info | Not exposed | container_id | container_id |
| | | | container_full_id | none |
| | | | host_mac | host.mac |
| | | | container | container.name |
| | | | container_type | container.type |
| | | | image | container.image |
| | | | image_id | container.image.id |
| | | | mesos_task_id | container.mesosTaskId Only available in Mesos orchestrator. |
| | | | cluster | kubernetes.cluster.name Present only if the container is part of Kubernetes. |
| | | | pod | kubernetes.pod.name Present only if the container is part of Kubernetes |
| | | | namespace | kubernetes.namespace.name Present only if the container is part of Kubernetes. |
| data | sysdig_container_cpu_used_percent | cpu.used.percent | host_mac container_id container_type container | host.mac container.id container.type container.name |
| | sysdig_container_cpu_cores_used | cpu.cores.used | | |
| | sysdig_container_cpu_cores_used_percent | cpu.cores.used.percent | | |
| | sysdig_container_cpu_quota_used_percent | cpu.quota.used.percent | | |
| | sysdig_container_cpu_shares | cpu.shares.count | | |
| | sysdig_container_cpu_shares_used_percent | cpu.shares.used.percent | | |
| | sysdig_container_fd_used_percent | fd.used.percent | | |
| | sysdig_container_file_error_open_count | file.error.open.count | | |
| | sysdig_container_file_error_total_count | file.error.total.count | | |
| | sysdig_container_file_in_bytes | file.bytes.in | | |
| | sysdig_container_file_in_iops | file.iops.in | | |
| | sysdig_container_file_in_time | file.time.in | | |
| | sysdig_container_file_open_count | file.open.count | | |
| | sysdig_container_file_out_bytes | file.bytes.out | | |
| | sysdig_container_file_out_iops | file.iops.out | | |
| | sysdig_container_file_out_time | file.time.out | | |
| | sysdig_container_memory_limit_bytes | memory.limit.bytes | | |
| | sysdig_container_memory_limit_used_percent | memory.limit.used.percent | | |
| | sysdig_container_memory_swap_available_bytes | memory.swap.bytes.available | | |
| | sysdig_container_memory_swap_total_bytes | memory.swap.bytes.total | | |
| | sysdig_container_memory_swap_used_bytes | memory.swap.bytes.used | | |
| | sysdig_container_memory_used_bytes | memory.bytes.used | | |
| | sysdig_container_memory_virtual_bytes | memory.bytes.virtual | | |
| | sysdig_container_net_connection_in_count | net.connection.count.in | | |
| | sysdig_container_net_connection_out_count | net.connection.count.out | | |
| | sysdig_container_net_error_count | net.error.count | | |
| | sysdig_container_net_in_bytes | net.bytes.in | | |
| | sysdig_container_net_out_bytes | net.bytes.out | | |
| | sysdig_container_net_tcp_queue_len | net.tcp.queue.len | | |
| | sysdig_container_proc_count | proc.count | | |
| | sysdig_container_swap_limit_bytes | swap.limit.bytes | | |
| | sysdig_container_thread_count | thread.count | | |
Process/Program | info | sysdig_program_info | Not exposed | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | | | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| data | sysdig_program_cpu_used_percent | cpu.used.percent | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | sysdig_program_memory_used_bytes | memory.bytes.used | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | sysdig_program_net_in_bytes | net.bytes.in | container_id | container.id |
| | | | host_mac | host.mac |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | sysdig_program_net_out_bytes | net.bytes.out | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | sysdig_program_proc_count | proc.count | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
| | sysdig_program_thread_count | thread.count | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | program | proc.name |
| | | | cmd_line | proc.commandLine |
fs | info | sysdig_fs_info | Not exposed | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
| | | | mount_dir | fs.mountDir |
| | | | type | fs.type |
| data | sysdig_fs_free_bytes | fs.bytes.free | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
| | sysdig_fs_inodes_total_count | fs.inodes.total.count | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
| | sysdig_fs_inodes_used_count | fs.inodes.used.count | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
| | sysdig_fs_total_bytes | fs.bytes.total | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
| | sysdig_fs_used_bytes | fs.bytes.used | host_mac | host.mac |
| | | | container_id | container.id |
| | | | container_type | container.type |
| | | | device | fs.device |
6.1.3 - Mapping Legacy Sysdig Kubernetes Metrics with Prometheus Metrics
In the Kubernetes context, these Prometheus metrics are Kube State Metrics. They are available in Sysdig PromQL and can be mapped to existing Sysdig Kubernetes metrics.
For descriptions of the Kube State Metrics, see Kubernetes State
Metrics.
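Kube State Metrics are exposed in the standard Prometheus text format, with the labels shown in the sample cells below (for example, `resource` and `unit` on `kube_pod_container_resource_limits`). A minimal sketch of how one such exposition line breaks down into name, labels, and value, assuming the simple single-line format without escaped quotes:

```python
import re

# Minimal parser for one Prometheus exposition line, such as the
# kube-state-metrics samples shown in the tables below. Sketch only:
# it does not handle escaped quotes, timestamps, or HELP/TYPE lines.
LINE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')

def parse_sample(line: str):
    name, labelblock, value = LINE_RE.match(line).groups()
    labels = dict(re.findall(r'(\w+)="([^"]*)"', labelblock))
    return name, labels, float(value)

name, labels, value = parse_sample(
    'kube_pod_container_resource_limits{namespace="default",pod="pod0",resource="cpu",unit="core"} 2'
)
print(name, labels["resource"], value)  # kube_pod_container_resource_limits cpu 2.0
```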
Pod | kubernetes.pod.containers.waiting | kube_pod_container_status_waiting | | |
| kubernetes.pod.resourceLimits.cpuCores kubernetes.pod.resourceLimits.memBytes | kube_pod_container_resource_limits kube_pod_sysdig_resource_limits_memory_bytes kube_pod_sysdig_resource_limits_cpu_cores | | {namespace="default",pod="pod0",container="pod1_con1",resource="cpu",unit="core"} {namespace="default",pod="pod0",container="pod1_con1",resource="memory",unit="byte"} |
| kubernetes.pod.resourceRequests.cpuCores kubernetes.pod.resourceRequests.memBytes | kube_pod_container_resource_requests kube_pod_sysdig_resource_requests_cpu_cores kube_pod_sysdig_resource_requests_memory_bytes | | {namespace="default",pod="pod0",container="pod1_con1",resource="cpu",unit="core"} {namespace="default",pod="pod0",container="pod1_con1",resource="memory",unit="byte"} |
| kubernetes.pod.status.ready | kube_pod_status_ready | | |
| | kube_pod_info | | {namespace="default",pod="pod0",host_ip="1.1.1.1",pod_ip="1.2.3.4",uid="abc-0",node="node1",created_by_kind="<none>",created_by_name="<none>",priority_class=""} |
| | kube_pod_owner | | {namespace="default",pod="pod0",owner_kind="<none>",owner_name="<none>;",owner_is_controller="<none>"} |
| | kube_pod_labels | | {namespace="default",pod="pod0", label_app="myApp"} |
| | kube_pod_container_info | | {namespace="default",pod="pod0",container="container2",image="k8s.gcr.io/hyperkube2",image_id="docker://sha256:bbb",container_id="docker://cd456"} |
node | kubernetes.node.allocatable.cpuCores | kube_node_status_allocatable_cpu_cores | node=<node-address> resource=<resource-name> unit=<resource-unit> | resource/unit take one of the value pairs: (cpu, core); (memory, byte); (pods, integer). Sysdig currently supports only the CPU, pods, and memory resources for the kube_node_status_capacity metrics. Example:
# HELP kube_node_status_capacity The capacity for different resources of a node.
kube_node_status_capacity{node="k8s-master",resource="hugepages_1Gi",unit="byte"} 0
kube_node_status_capacity{node="k8s-master",resource="hugepages_2Mi",unit="byte"} 0
kube_node_status_capacity{node="k8s-master",resource="memory",unit="byte"} 4.16342016e+09
kube_node_status_capacity{node="k8s-master",resource="pods",unit="integer"} 110
kube_node_status_capacity{node="k8s-node1",resource="pods",unit="integer"} 110
kube_node_status_capacity{node="k8s-node1",resource="cpu",unit="core"} 2
kube_node_status_capacity{node="k8s-node1",resource="hugepages_1Gi",unit="byte"} 0
kube_node_status_capacity{node="k8s-node1",resource="hugepages_2Mi",unit="byte"} 0
kube_node_status_capacity{node="k8s-node1",resource="memory",unit="byte"} 6.274154496e+09
kube_node_status_capacity{node="k8s-node2",resource="hugepages_1Gi",unit="byte"} 0
kube_node_status_capacity{node="k8s-node2",resource="hugepages_2Mi",unit="byte"} 0
kube_node_status_capacity{node="k8s-node2",resource="memory",unit="byte"} 6.274154496e+09
kube_node_status_capacity{node="k8s-node2",resource="pods",unit="integer"} 110
kube_node_status_capacity{node="k8s-node2",resource="cpu",unit="core"} 2
|
| kubernetes.node.allocatable.memBytes | kube_node_status_allocatable_memory_bytes | | |
| kubernetes.node.allocatable.pods | kube_node_status_allocatable_pods | | |
| kubernetes.node.capacity.cpuCores | kube_node_status_capacity_cpu_cores | node=<node-address> resource=<resource-name> unit=<resource-unit> | |
| kubernetes.node.capacity.memBytes | kube_node_status_capacity_memory_bytes | | |
| kubernetes.node.capacity.pod | kube_node_status_capacity_pods | | |
| kubernetes.node.diskPressure | kube_node_status_condition | | |
| kubernetes.node.memoryPressure | | | |
| kubernetes.node.networkUnavailable | | | |
| kubernetes.node.outOfDisk | | | |
| kubernetes.node.ready | | | |
| kubernetes.node.unschedulable | kube_node_spec_unschedulable | | |
| | kube_node_info | | |
| | kube_node_labels | | |
Deployment | kubernetes.deployment.replicas.available | kube_deployment_status_replicas_available | | |
| kubernetes.deployment.replicas.desired | kube_deployment_spec_replicas | | |
| kubernetes.deployment.replicas.paused | kube_deployment_spec_paused | | |
| kubernetes.deployment.replicas.running | kube_deployment_status_replicas | | |
| kubernetes.deployment.replicas.unavailable | kube_deployment_status_replicas_unavailable | | |
| kubernetes.deployment.replicas.updated | kube_deployment_status_replicas_updated | | |
| | kube_deployment_labels | | |
job | kubernetes.job.completions | kube_job_spec_completions | | |
| kubernetes.job.numFailed | kube_job_failed | | |
| kubernetes.job.numSucceeded | kube_job_complete | | |
| kubernetes.job.parallelism | kube_job_spec_parallelism | | |
| | kube_job_status_active | | |
| | kube_job_info | | |
| | kube_job_owner | | |
| | kube_job_labels | | |
daemonSet | kubernetes.daemonSet.pods.desired | kube_daemonset_status_desired_number_scheduled | | |
| kubernetes.daemonSet.pods.misscheduled | kube_daemonset_status_number_misscheduled | | |
| kubernetes.daemonSet.pods.ready | kube_daemonset_status_number_ready | | |
| kubernetes.daemonSet.pods.scheduled | kube_daemonset_status_current_number_scheduled | | |
| | kube_daemonset_labels | daemonset=<daemonset-name> namespace=<daemonset-namespace> label_daemonset_label=<daemonset_label> | |
replicaSet | kubernetes.replicaSet.replicas.fullyLabeled | kube_replicaset_status_fully_labeled_replicas | | |
| kubernetes.replicaSet.replicas.ready | kube_replicaset_status_ready_replicas | | |
| kubernetes.replicaSet.replicas.running | kube_replicaset_status_replicas | | |
| kubernetes.replicaSet.replicas.desired | kube_replicaset_spec_replicas | | |
| | kube_replicaset_owner | | | |
| | kube_replicaset_labels | label_replicaset_label=<replicaset_label> replicaset=<replicaset-name> namespace=<replicaset-namespace> | |
statefulset | kubernetes.statefulset.replicas | kube_statefulset_replicas | | |
| kubernetes.statefulset.status.replicas | kube_statefulset_status_replicas | | |
| kubernetes.statefulset.status.replicas.current | kube_statefulset_status_replicas_current | | |
| kubernetes.statefulset.status.replicas.ready | kube_statefulset_status_replicas_ready | | |
| kubernetes.statefulset.status.replicas.updated | kube_statefulset_status_replicas_updated | | |
| | kube_statefulset_labels | | |
hpa | kubernetes.hpa.replicas.min | kube_horizontalpodautoscaler_spec_min_replicas | | |
| kubernetes.hpa.replicas.max | kube_horizontalpodautoscaler_spec_max_replicas | | |
| kubernetes.hpa.replicas.current | kube_horizontalpodautoscaler_status_current_replicas | | |
| kubernetes.hpa.replicas.desired | kube_horizontalpodautoscaler_status_desired_replicas | | |
| | kube_horizontalpodautoscaler_labels | | |
resourcequota | kubernetes.resourcequota.configmaps.hard kubernetes.resourcequota.configmaps.used kubernetes.resourcequota.limits.cpu.hard kubernetes.resourcequota.limits.cpu.used kubernetes.resourcequota.limits.memory.hard kubernetes.resourcequota.limits.memory.used kubernetes.resourcequota.persistentvolumeclaims.hard kubernetes.resourcequota.persistentvolumeclaims.used kubernetes.resourcequota.cpu.hard kubernetes.resourcequota.memory.hard kubernetes.resourcequota.pods.hard kubernetes.resourcequota.pods.used kubernetes.resourcequota.replicationcontrollers.hard kubernetes.resourcequota.replicationcontrollers.used kubernetes.resourcequota.requests.cpu.hard kubernetes.resourcequota.requests.cpu.used kubernetes.resourcequota.requests.memory.hard kubernetes.resourcequota.requests.memory.used kubernetes.resourcequota.requests.storage.hard kubernetes.resourcequota.requests.storage.used kubernetes.resourcequota.resourcequotas.hard kubernetes.resourcequota.resourcequotas.used kubernetes.resourcequota.secrets.hard kubernetes.resourcequota.secrets.used kubernetes.resourcequota.services.hard kubernetes.resourcequota.services.used kubernetes.resourcequota.services.loadbalancers.hard kubernetes.resourcequota.services.loadbalancers.used kubernetes.resourcequota.services.nodeports.hard kubernetes.resourcequota.services.nodeports.used | kube_resourcequota | | |
namespace | | kube_namespace_labels | | |
replicationcontroller | kubernetes.replicationcontroller.replicas.desired | kube_replicationcontroller_spec_replicas | | |
| kubernetes.replicationcontroller.replicas.running | kube_replicationcontroller_status_replicas | | |
| | kube_replicationcontroller_status_fully_labeled_replicas kube_replicationcontroller_status_ready_replicas kube_replicationcontroller_status_available_replicas kube_replicationcontroller_status_observed_generation kube_replicationcontroller_metadata_generation kube_replicationcontroller_created | | |
| | kube_replicationcontroller_owner | | |
service | | kube_service_info | service=<service-name> namespace=<service-namespace> cluster_ip=<service cluster ip> external_name=<service external name> load_balancer_ip=<service load balancer ip> | |
| | kube_service_labels | | |
persistentvolume | kubernetes.persistentvolume.storage | kube_persistentvolume_capacity_bytes | | |
| | kube_persistentvolume_info | | |
| | kube_persistentvolume_labels | | |
persistentvolumeclaim | kubernetes.persistentvolumeclaim.requests.storage | kube_persistentvolumeclaim_resource_requests_storage_bytes | | |
| | kube_persistentvolumeclaim_info | | |
| | kube_persistentvolumeclaim_labels | persistentvolumeclaim=<persistentvolumeclaim-name> namespace=<persistentvolumeclaim-namespace> label_persistentvolumeclaim_label=<persistentvolumeclaim_label> | |
6.2 - Metrics and Labels in Prometheus Format
The Prometheus metrics library lists the metrics in Prometheus format
supported by the Sysdig product suite, as well as kube state and cloud
provider metrics.
The metrics listed in this section follow the StatsD-compatible Sysdig naming convention. To see the mapping between Prometheus notation and Sysdig notation, see Metrics and Label Mapping.
Overview
Each metric in the dictionary has several pieces of metadata listed to
provide greater context for how the metric can be used within Sysdig
products. An example layout is displayed below:
Metric Name
Metric definition. For some metrics, the equation for how the value is
determined is provided.
Metric Type | Indicates whether the metric is a counter or a gauge. Sysdig Monitor offers two metric types: Counter: A cumulative value that only increases and depends on its previous values. It records how many times something has happened, for example, user logins. Gauge: A single numerical value that can fluctuate arbitrarily over time. Each value is an instantaneous measurement, for example, CPU usage. |
Value Type | The type of value the metric can have. The possible values are: Percent (%) Byte Date Double Integer (int) relativeTime String
|
Segment By | The levels within the infrastructure that the metric can be segmented at: Host Container Process Kubernetes Mesos Swarm CloudProvider
|
Default Time Aggregation | The default time aggregation format for the metric. |
Available Time Aggregation Formats | The time aggregation formats the metric can be aggregated by: Average (Avg) Rate Sum Minimum (Min) Maximum (Max)
|
Default Group Aggregation | The default group aggregation format for the metric. |
Available Group Aggregation Formats | The group aggregation formats the metric can be aggregated by: Average (Avg) Sum Minimum (Min) Maximum (Max)
|
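The counter/gauge distinction above determines how a metric is sensibly aggregated over time: a counter is usually converted to a per-second rate over the window, while a gauge is typically averaged. A small illustrative sketch with made-up samples:

```python
# Counter vs. gauge time aggregation (illustrative sketch only).
# Each sample is a (timestamp_seconds, value) pair.
counter_samples = [(0, 100), (10, 150), (20, 230)]     # e.g. bytes received
gauge_samples = [(0, 40.0), (10, 55.0), (20, 46.0)]    # e.g. CPU used percent

def counter_rate(samples):
    """Per-second rate over the window: delta(value) / delta(time)."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    return (vn - v0) / (tn - t0)

def gauge_avg(samples):
    """A gauge is an instantaneous reading, so average the readings."""
    return sum(v for _, v in samples) / len(samples)

print(counter_rate(counter_samples))  # 6.5 (bytes/s)
print(gauge_avg(gauge_samples))       # 47.0
```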
6.2.1 - Agent
sysdig_agent_info
| |
---|
Prometheus ID | sysdig_agent_info |
Legacy ID | info |
Metric Type | gauge |
Unit | number |
Description | This metric always has the value 1. |
Additional Notes | |
sysdig_agent_timeseries_count_appcheck
| |
---|
Prometheus ID | sysdig_agent_timeseries_count_appcheck |
Legacy ID | metricCount.appCheck |
Metric Type | gauge |
Unit | number |
Description | The total number of time series received from appcheck integrations. |
Additional Notes | |
sysdig_agent_timeseries_count_jmx
| |
---|
Prometheus ID | sysdig_agent_timeseries_count_jmx |
Legacy ID | metricCount.jmx |
Metric Type | gauge |
Unit | number |
Description | The total number of time series received from JMX integrations. |
Additional Notes | |
sysdig_agent_timeseries_count_prometheus
| |
---|
Prometheus ID | sysdig_agent_timeseries_count_prometheus |
Legacy ID | metricCount.prometheus |
Metric Type | gauge |
Unit | number |
Description | The total number of time series received from Prometheus integrations. |
Additional Notes | |
sysdig_agent_timeseries_count_statsd
| |
---|
Prometheus ID | sysdig_agent_timeseries_count_statsd |
Legacy ID | metricCount.statsd |
Metric Type | gauge |
Unit | number |
Description | The total number of time series received from StatsD integrations. |
Additional Notes | |
6.2.2 - Containers
sysdig_container_count
| |
---|
Prometheus ID | sysdig_container_count |
Legacy ID | container.count |
Metric Type | gauge |
Unit | number |
Description | The number of containers. |
Additional Notes | This metric is well suited for dashboards and alerts. In particular, you can create alerts that notify you when you have too many (or too few) containers of a certain type in a certain group or node. Try segmenting by container.image, .id, or .name. See also: host.count. |
sysdig_container_cpu_cgroup_used_percent
| |
---|
Prometheus ID | sysdig_container_cpu_cgroup_used_percent |
Legacy ID | cpu.cgroup.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of a container’s cgroup CPU limit that is actually used. The limit is the minimum of the underlying cgroup limits: cpuset.limit and quota.limit. |
Additional Notes | |
sysdig_container_cpu_cores_cgroup_limit
| |
---|
Prometheus ID | sysdig_container_cpu_cores_cgroup_limit |
Legacy ID | cpu.cores.cgroup.limit |
Metric Type | gauge |
Unit | number |
Description | The number of CPU cores assigned to a container. This is the minimum of the cgroup limits: cpuset.limit and quota.limit. |
Additional Notes | |
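As the description above states, the effective core limit is the minimum of the two cgroup limits. A minimal sketch of that computation, with assumed cgroup values (the quota-derived limit is quota divided by period):

```python
def effective_core_limit(cpuset_cores: float, quota_us: float, period_us: float) -> float:
    """Effective CPU core limit for a container: the minimum of the
    cpuset limit and the quota-derived limit (quota / period).
    Sketch only; parameter names are illustrative."""
    quota_cores = quota_us / period_us
    return min(cpuset_cores, quota_cores)

# The cpuset allows 4 cores, but the quota is 200ms of CPU per 100ms period:
print(effective_core_limit(4, 200_000, 100_000))  # 2.0
```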
sysdig_container_cpu_cores_quota_limit
| |
---|
Prometheus ID | sysdig_container_cpu_cores_quota_limit |
Legacy ID | cpu.cores.quota.limit |
Metric Type | gauge |
Unit | number |
Description | The number of CPU cores assigned to a container. Technically, the container’s cgroup quota and period. This is a way of creating a CPU limit for a container. |
Additional Notes | |
sysdig_container_cpu_cores_used
| |
---|
Prometheus ID | sysdig_container_cpu_cores_used |
Legacy ID | cpu.cores.used |
Metric Type | gauge |
Unit | number |
Description | The CPU core usage of each container is obtained from cgroups, and is equal to the number of cores used by the container. For example, if a container uses two of an available four cores, the value of sysdig_container_cpu_cores_used will be two. |
Additional Notes | |
sysdig_container_cpu_cores_used_percent
| |
---|
Prometheus ID | sysdig_container_cpu_cores_used_percent |
Legacy ID | cpu.cores.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The CPU core usage percent for each container is obtained from cgroups, and is equal to the number of cores used multiplied by 100. For example, if a container uses three cores, the value of sysdig_container_cpu_cores_used_percent is 300%. |
Additional Notes | |
sysdig_container_cpu_quota_used_percent
| |
---|
Prometheus ID | sysdig_container_cpu_quota_used_percent |
Legacy ID | cpu.quota.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of a container’s CPU quota that is actually used. CPU quotas are a common way of creating a CPU limit for a container. Quotas are based on a percentage of time: a container can only spend its quota of time on CPU cycles within a given period (the default period is 100ms). Note that, unlike CPU shares, a CPU quota is a hard limit on the amount of CPU the container can use, so this metric, CPU Quota %, should not exceed 100%. |
Additional Notes | |
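The quota percentage described above can be sketched as CPU time consumed divided by the time the quota allows over the measurement window. A hypothetical example with assumed values:

```python
def quota_used_percent(cpu_time_used_us: float, quota_us: float,
                       period_us: float, window_us: float) -> float:
    """CPU time actually consumed, as a percentage of the time the quota
    allows over the window. Sketch only; names are illustrative."""
    allowed_us = quota_us / period_us * window_us
    return cpu_time_used_us / allowed_us * 100

# Quota of 50ms per 100ms period; the container consumed 250ms of CPU
# over a 1-second window, i.e. half of the 500ms the quota allows:
print(quota_used_percent(250_000, 50_000, 100_000, 1_000_000))  # 50.0
```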
sysdig_container_cpu_shares_count
| |
---|
Prometheus ID | sysdig_container_cpu_shares_count |
Legacy ID | cpu.shares.count |
Metric Type | gauge |
Unit | number |
Description | The number of CPU shares assigned to a container (technically, to the container’s cgroup); this is a common way of creating a CPU limit for a container. CPU shares represent a relative weight used by the kernel to distribute CPU cycles across different containers. The default value for a container is 1024. Each container receives its own allocation of CPU cycles, according to the ratio of its share count to the total number of shares claimed by all containers. For example, if you have three containers, each with 1024 shares, each receives 1/3 of the CPU cycles. Note that this is not a hard limit: a container can consume more than its allocation if the CPU has cycles that aren’t being consumed by the container they were originally allocated to. |
Additional Notes | |
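The share-based allocation described above is a simple ratio of a container's shares to the total shares claimed. A minimal sketch:

```python
def share_allocation(shares: int, all_shares: list[int]) -> float:
    """Fraction of CPU cycles a container is entitled to, based on the
    ratio of its share count to the total shares claimed. Sketch only."""
    return shares / sum(all_shares)

# Three containers, each with the default 1024 shares: each gets 1/3.
containers = [1024, 1024, 1024]
print(share_allocation(1024, containers))  # 0.3333333333333333
```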
sysdig_container_cpu_shares_used_percent
| |
---|
Prometheus ID | sysdig_container_cpu_shares_used_percent |
Legacy ID | cpu.shares.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of a container’s allocated CPU shares that are actually used. CPU shares are a common way of creating a CPU limit for a container. CPU shares represent a relative weight used by the kernel to distribute CPU cycles across different containers. The default value for a container is 1024. Each container receives its own allocation of CPU cycles, according to the ratio of its share count to the total number of shares claimed by all containers. For example, if you have three containers, each with 1024 shares, each receives 1/3 of the CPU cycles. Note that this is not a hard limit: a container can consume more than its allocation if the CPU has cycles that aren’t being consumed by the container they were originally allocated to, so this metric, CPU Shares %, can actually exceed 100%. |
Additional Notes | |
sysdig_container_cpu_used_percent
| |
---|
Prometheus ID | sysdig_container_cpu_used_percent |
Legacy ID | cpu.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The CPU usage for each container is obtained from cgroups, and normalized by dividing by the number of cores to determine an overall percentage. For example, if the environment contains six cores on a host, and the container or processes are assigned two cores, Sysdig will report CPU usage of 2/6 * 100% = 33.33%. This metric is calculated differently for hosts and processes. |
Additional Notes | |
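The normalization in the description above divides the cores the container uses by the host's total core count. A sketch reproducing the worked example:

```python
def container_cpu_used_percent(cores_used: float, host_cores: int) -> float:
    """Container CPU usage normalized by the host's core count.
    Sketch of the calculation described in the metric's description."""
    return cores_used / host_cores * 100

# The example from the description: 2 of 6 host cores in use.
print(round(container_cpu_used_percent(2, 6), 2))  # 33.33
```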
sysdig_container_fd_used_percent
| |
---|
Prometheus ID | sysdig_container_fd_used_percent |
Legacy ID | fd.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of used file descriptors out of the maximum available. |
Additional Notes | Usually, when a process reaches its FD limit it will stop operating properly and possibly crash. As a consequence, this is a metric you want to monitor carefully, or even better use for alerts. |
sysdig_container_file_error_open_count
| |
---|
Prometheus ID | sysdig_container_file_error_open_count |
Legacy ID | file.error.open.count |
Metric Type | counter |
Unit | number |
Description | The number of errors in opening files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_error_total_count
| |
---|
Prometheus ID | sysdig_container_file_error_total_count |
Legacy ID | file.error.total.count |
Metric Type | counter |
Unit | number |
Description | The number of errors caused by file access. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_in_bytes
| |
---|
Prometheus ID | sysdig_container_file_in_bytes |
Legacy ID | file.bytes.in |
Metric Type | counter |
Unit | data |
Description | The number of bytes read from files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_in_iops
| |
---|
Prometheus ID | sysdig_container_file_in_iops |
Legacy ID | file.iops.in |
Metric Type | counter |
Unit | number |
Description | The number of file read operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_container_file_in_time
| |
---|
Prometheus ID | sysdig_container_file_in_time |
Legacy ID | file.time.in |
Metric Type | counter |
Unit | time |
Description | The time spent in file reading. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_open_count
| |
---|
Prometheus ID | sysdig_container_file_open_count |
Legacy ID | file.open.count |
Metric Type | counter |
Unit | number |
Description | The number of times files have been opened. |
Additional Notes | |
sysdig_container_file_out_bytes
| |
---|
Prometheus ID | sysdig_container_file_out_bytes |
Legacy ID | file.bytes.out |
Metric Type | counter |
Unit | data |
Description | The number of bytes written to files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_out_iops
| |
---|
Prometheus ID | sysdig_container_file_out_iops |
Legacy ID | file.iops.out |
Metric Type | counter |
Unit | number |
Description | The number of file write operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_container_file_out_time
| |
---|
Prometheus ID | sysdig_container_file_out_time |
Legacy ID | file.time.out |
Metric Type | counter |
Unit | time |
Description | The time spent in file writing. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_total_bytes
| |
---|
Prometheus ID | sysdig_container_file_total_bytes |
Legacy ID | file.bytes.total |
Metric Type | counter |
Unit | data |
Description | The number of bytes read from and written to file. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_file_total_iops
| |
---|
Prometheus ID | sysdig_container_file_total_iops |
Legacy ID | file.iops.total |
Metric Type | counter |
Unit | number |
Description | The number of read and write file operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_container_file_total_time
| |
---|
Prometheus ID | sysdig_container_file_total_time |
Legacy ID | file.time.total |
Metric Type | counter |
Unit | time |
Description | The time spent in file I/O. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_fs_free_bytes
| |
---|
Prometheus ID | sysdig_container_fs_free_bytes |
Legacy ID | fs.bytes.free |
Metric Type | gauge |
Unit | data |
Description | The available space in the filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_free_percent
| |
---|
Prometheus ID | sysdig_container_fs_free_percent |
Legacy ID | fs.free.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of free space in the filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_inodes_total_count
| |
---|
Prometheus ID | sysdig_container_fs_inodes_total_count |
Legacy ID | fs.inodes.total.count |
Metric Type | gauge |
Unit | number |
Description | The total number of inodes in the filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_inodes_used_count
| |
---|
Prometheus ID | sysdig_container_fs_inodes_used_count |
Legacy ID | fs.inodes.used.count |
Metric Type | gauge |
Unit | number |
Description | The number of inodes used in the filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_inodes_used_percent
| |
---|
Prometheus ID | sysdig_container_fs_inodes_used_percent |
Legacy ID | fs.inodes.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of inodes usage in the filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_largest_used_percent
| |
---|
Prometheus ID | sysdig_container_fs_largest_used_percent |
Legacy ID | fs.largest.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of the largest filesystem in use. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_root_used_percent
| |
---|
Prometheus ID | sysdig_container_fs_root_used_percent |
Legacy ID | fs.root.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of the root filesystem in use in the container. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_total_bytes
| |
---|
Prometheus ID | sysdig_container_fs_total_bytes |
Legacy ID | fs.bytes.total |
Metric Type | gauge |
Unit | data |
Description | The size of container filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_used_bytes
| |
---|
Prometheus ID | sysdig_container_fs_used_bytes |
Legacy ID | fs.bytes.used |
Metric Type | gauge |
Unit | data |
Description | The used space in the container filesystem. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_fs_used_percent
| |
---|
Prometheus ID | sysdig_container_fs_used_percent |
Legacy ID | fs.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of the sum of all filesystems in use in the container. |
Additional Notes | Container Filesystem metrics report data on filesystems mounted to containers. These are the most useful metrics for stateful containers which have dedicated file storage mounted. Use these metrics with appropriate scoping. Care should be taken when aggregating filesystem metrics to ensure that there is no “double counting” of filesystems that are mounted to multiple containers. Additionally, the metrics from overlay type file systems are generally not reported, so these metrics typically will not show the actual space consumed by a container. |
sysdig_container_info
| |
---|
Prometheus ID | sysdig_container_info |
Legacy ID | info |
Metric Type | gauge |
Unit | number |
Description | The info metric always has a value of 1. |
Additional Notes | |
sysdig_container_memory_limit_bytes
| |
---|
Prometheus ID | sysdig_container_memory_limit_bytes |
Legacy ID | memory.limit.bytes |
Metric Type | gauge |
Unit | data |
Description | The memory limit in bytes assigned to a container. |
Additional Notes | |
sysdig_container_memory_limit_used_percent
| |
---|
Prometheus ID | sysdig_container_memory_limit_used_percent |
Legacy ID | memory.limit.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of memory limit used by a container. |
Additional Notes | |
sysdig_container_memory_used_bytes
| |
---|
Prometheus ID | sysdig_container_memory_used_bytes |
Legacy ID | memory.bytes.used |
Metric Type | gauge |
Unit | data |
Description | The amount of physical memory currently in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_container_memory_used_percent
| |
---|
Prometheus ID | sysdig_container_memory_used_percent |
Legacy ID | memory.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of physical memory in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_memory_virtual_bytes
| |
---|
Prometheus ID | sysdig_container_memory_virtual_bytes |
Legacy ID | memory.bytes.virtual |
Metric Type | gauge |
Unit | data |
Description | The virtual memory size of the process, in bytes. This value is obtained from Sysdig events. |
Additional Notes | |
sysdig_container_net_connection_in_count
| |
---|
Prometheus ID | sysdig_container_net_connection_in_count |
Legacy ID | net.connection.count.in |
Metric Type | counter |
Unit | number |
Description | The number of currently established client (inbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_container_net_connection_out_count
| |
---|
Prometheus ID | sysdig_container_net_connection_out_count |
Legacy ID | net.connection.count.out |
Metric Type | counter |
Unit | number |
Description | The number of currently established server (outbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_container_net_connection_total_count
| |
---|
Prometheus ID | sysdig_container_net_connection_total_count |
Legacy ID | net.connection.count.total |
Metric Type | counter |
Unit | number |
Description | The number of currently established connections. This value may exceed the sum of the inbound and outbound metrics since it represents client and server inter-host connections as well as internal only connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_container_net_error_count
| |
---|
Prometheus ID | sysdig_container_net_error_count |
Legacy ID | net.error.count |
Metric Type | counter |
Unit | number |
Description | The number of network errors. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_net_http_error_count
| |
---|
Prometheus ID | sysdig_container_net_http_error_count |
Legacy ID | net.http.error.count |
Metric Type | counter |
Unit | number |
Description | The number of failed HTTP requests as counted from 4xx/5xx status codes. |
Additional Notes | |
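The 4xx/5xx classification behind this metric can be sketched as follows. This is an illustrative example of the counting rule, not Sysdig agent code; the function name and sample statuses are hypothetical.

```python
def is_http_error(status: int) -> bool:
    """HTTP requests with 4xx (client) or 5xx (server) status codes count as failed."""
    return 400 <= status < 600

# Of these six responses, 404, 500, and 503 are counted as errors.
statuses = [200, 201, 404, 500, 302, 503]
errors = sum(is_http_error(s) for s in statuses)
print(errors)  # 3
```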
sysdig_container_net_http_request_count
| |
---|
Prometheus ID | sysdig_container_net_http_request_count |
Legacy ID | net.http.request.count |
Metric Type | counter |
Unit | number |
Description | The count of HTTP requests. |
Additional Notes | |
sysdig_container_net_http_request_time
| |
---|
Prometheus ID | sysdig_container_net_http_request_time |
Legacy ID | net.http.request.time |
Metric Type | counter |
Unit | time |
Description | The average time taken for HTTP requests. |
Additional Notes | |
sysdig_container_net_http_statuscode_error_count
| |
---|
Prometheus ID | sysdig_container_net_http_statuscode_error_count |
Legacy ID | net.http.statuscode.error.count |
Metric Type | counter |
Unit | number |
Description | The number of HTTP error codes returned. |
Additional Notes | |
sysdig_container_net_http_statuscode_request_count
| |
---|
Prometheus ID | sysdig_container_net_http_statuscode_request_count |
Legacy ID | net.http.statuscode.request.count |
Metric Type | counter |
Unit | number |
Description | The number of HTTP requests, segmented by status code. |
Additional Notes | |
sysdig_container_net_http_url_error_count
| |
---|
Prometheus ID | sysdig_container_net_http_url_error_count |
Legacy ID | net.http.url.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_http_url_request_count
| |
---|
Prometheus ID | sysdig_container_net_http_url_request_count |
Legacy ID | net.http.url.request.count |
Metric Type | counter |
Unit | number |
Description | The number of HTTP requests, segmented by URL. |
Additional Notes | |
sysdig_container_net_http_url_request_time
| |
---|
Prometheus ID | sysdig_container_net_http_url_request_time |
Legacy ID | net.http.url.request.time |
Metric Type | counter |
Unit | time |
Description | The time taken for HTTP requests, segmented by URL. |
Additional Notes | |
sysdig_container_net_in_bytes
| |
---|
Prometheus ID | sysdig_container_net_in_bytes |
Legacy ID | net.bytes.in |
Metric Type | counter |
Unit | data |
Description | The number of inbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_net_mongodb_error_count
| |
---|
Prometheus ID | sysdig_container_net_mongodb_error_count |
Legacy ID | net.mongodb.error.count |
Metric Type | counter |
Unit | number |
Description | The number of failed MongoDB requests. |
Additional Notes | |
sysdig_container_net_mongodb_request_count
| |
---|
Prometheus ID | sysdig_container_net_mongodb_request_count |
Legacy ID | net.mongodb.request.count |
Metric Type | counter |
Unit | number |
Description | The total number of MongoDB requests. |
Additional Notes | |
sysdig_container_net_out_bytes
| |
---|
Prometheus ID | sysdig_container_net_out_bytes |
Legacy ID | net.bytes.out |
Metric Type | counter |
Unit | data |
Description | The number of outbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_net_request_count
| |
---|
Prometheus ID | sysdig_container_net_request_count |
Legacy ID | net.request.count |
Metric Type | counter |
Unit | number |
Description | The total number of network requests. Note, this value may exceed the sum of inbound and outbound requests, because this count includes requests over internal connections. |
Additional Notes | |
sysdig_container_net_request_in_count
| |
---|
Prometheus ID | sysdig_container_net_request_in_count |
Legacy ID | net.request.count.in |
Metric Type | counter |
Unit | number |
Description | The number of inbound network requests. |
Additional Notes | |
sysdig_container_net_request_in_time
| |
---|
Prometheus ID | sysdig_container_net_request_in_time |
Legacy ID | net.request.time.in |
Metric Type | counter |
Unit | time |
Description | The average time to serve an inbound request. |
Additional Notes | |
sysdig_container_net_request_out_count
| |
---|
Prometheus ID | sysdig_container_net_request_out_count |
Legacy ID | net.request.count.out |
Metric Type | counter |
Unit | number |
Description | The number of outbound network requests. |
Additional Notes | |
sysdig_container_net_request_out_time
| |
---|
Prometheus ID | sysdig_container_net_request_out_time |
Legacy ID | net.request.time.out |
Metric Type | counter |
Unit | time |
Description | The average time spent waiting for an outbound request. |
Additional Notes | |
sysdig_container_net_request_time
| |
---|
Prometheus ID | sysdig_container_net_request_time |
Legacy ID | net.request.time |
Metric Type | counter |
Unit | time |
Description | The average time to serve a network request. |
Additional Notes | |
sysdig_container_net_server_connection_in_count
| |
---|
Prometheus ID | sysdig_container_net_server_connection_in_count |
Legacy ID | net.server.connection.count.in |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_server_in_bytes
| |
---|
Prometheus ID | sysdig_container_net_server_in_bytes |
Legacy ID | net.server.bytes.in |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_container_net_server_out_bytes
| |
---|
Prometheus ID | sysdig_container_net_server_out_bytes |
Legacy ID | net.server.bytes.out |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_container_net_server_total_bytes
| |
---|
Prometheus ID | sysdig_container_net_server_total_bytes |
Legacy ID | net.server.bytes.total |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_container_net_sql_error_count
| |
---|
Prometheus ID | sysdig_container_net_sql_error_count |
Legacy ID | net.sql.error.count |
Metric Type | counter |
Unit | number |
Description | The number of failed SQL requests. |
Additional Notes | |
sysdig_container_net_sql_query_error_count
| |
---|
Prometheus ID | sysdig_container_net_sql_query_error_count |
Legacy ID | net.sql.query.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_sql_query_request_count
| |
---|
Prometheus ID | sysdig_container_net_sql_query_request_count |
Legacy ID | net.sql.query.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_sql_query_request_time
| |
---|
Prometheus ID | sysdig_container_net_sql_query_request_time |
Legacy ID | net.sql.query.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_container_net_sql_querytype_error_count
| |
---|
Prometheus ID | sysdig_container_net_sql_querytype_error_count |
Legacy ID | net.sql.querytype.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_sql_querytype_request_count
| |
---|
Prometheus ID | sysdig_container_net_sql_querytype_request_count |
Legacy ID | net.sql.querytype.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_container_net_sql_querytype_request_time
| |
---|
Prometheus ID | sysdig_container_net_sql_querytype_request_time |
Legacy ID | net.sql.querytype.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_container_net_sql_request_count
| |
---|
Prometheus ID | sysdig_container_net_sql_request_count |
Legacy ID | net.sql.request.count |
Metric Type | counter |
Unit | number |
Description | The number of SQL requests. |
Additional Notes | |
sysdig_container_net_sql_request_time
| |
---|
Prometheus ID | sysdig_container_net_sql_request_time |
Legacy ID | net.sql.request.time |
Metric Type | counter |
Unit | time |
Description | The average time to complete an SQL request. |
Additional Notes | |
sysdig_container_net_sql_table_error_count
| |
---|
Prometheus ID | sysdig_container_net_sql_table_error_count |
Legacy ID | net.sql.table.error.count |
Metric Type | counter |
Unit | number |
Description | The total number of SQL errors returned. |
Additional Notes | |
sysdig_container_net_sql_table_request_count
| |
---|
Prometheus ID | sysdig_container_net_sql_table_request_count |
Legacy ID | net.sql.table.request.count |
Metric Type | counter |
Unit | number |
Description | The total number of SQL table requests. |
Additional Notes | |
sysdig_container_net_sql_table_request_time
| |
---|
Prometheus ID | sysdig_container_net_sql_table_request_time |
Legacy ID | net.sql.table.request.time |
Metric Type | counter |
Unit | time |
Description | The average time to serve an SQL table request. |
Additional Notes | |
sysdig_container_net_tcp_queue_len
| |
---|
Prometheus ID | sysdig_container_net_tcp_queue_len |
Legacy ID | net.tcp.queue.len |
Metric Type | counter |
Unit | number |
Description | The length of the TCP request queue. |
Additional Notes | |
sysdig_container_net_total_bytes
| |
---|
Prometheus ID | sysdig_container_net_total_bytes |
Legacy ID | net.bytes.total |
Metric Type | counter |
Unit | data |
Description | The total number of network bytes, including inbound and outbound connections. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_proc_count
| |
---|
Prometheus ID | sysdig_container_proc_count |
Legacy ID | proc.count |
Metric Type | counter |
Unit | number |
Description | The number of processes running on the host or in the container. |
Additional Notes | |
sysdig_container_swap_limit_bytes
| |
---|
Prometheus ID | sysdig_container_swap_limit_bytes |
Legacy ID | swap.limit.bytes |
Metric Type | gauge |
Unit | data |
Description | The swap limit in bytes assigned to a container. |
Additional Notes | |
sysdig_container_swap_limit_used_percent
| |
---|
Prometheus ID | sysdig_container_swap_limit_used_percent |
Legacy ID | swap.limit.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of swap limit used by the container. |
Additional Notes | |
sysdig_container_syscall_count
| |
---|
Prometheus ID | sysdig_container_syscall_count |
Legacy ID | syscall.count |
Metric Type | gauge |
Unit | number |
Description | The total number of syscalls seen. |
Additional Notes | Syscalls are resource intensive. This metric tracks how many have been made by a given process or container. |
sysdig_container_syscall_error_count
| |
---|
Prometheus ID | sysdig_container_syscall_error_count |
Legacy ID | host.error.count |
Metric Type | counter |
Unit | number |
Description | The number of system call errors. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_container_thread_count
| |
---|
Prometheus ID | sysdig_container_thread_count |
Legacy ID | thread.count |
Metric Type | counter |
Unit | number |
Description | The number of threads running in a container. |
Additional Notes | |
sysdig_container_timeseries_count_appcheck
| |
---|
Prometheus ID | sysdig_container_timeseries_count_appcheck |
Legacy ID | metricCount.appCheck |
Metric Type | gauge |
Unit | number |
Description | The number of appcheck custom metrics. |
Additional Notes | |
sysdig_container_timeseries_count_jmx
| |
---|
Prometheus ID | sysdig_container_timeseries_count_jmx |
Legacy ID | metricCount.jmx |
Metric Type | gauge |
Unit | number |
Description | The number of JMX custom metrics. |
Additional Notes | |
sysdig_container_timeseries_count_prometheus
| |
---|
Prometheus ID | sysdig_container_timeseries_count_prometheus |
Legacy ID | metricCount.prometheus |
Metric Type | gauge |
Unit | number |
Description | The number of Prometheus custom metrics. |
Additional Notes | |
sysdig_container_timeseries_count_statsd
| |
---|
Prometheus ID | sysdig_container_timeseries_count_statsd |
Legacy ID | metricCount.statsd |
Metric Type | gauge |
Unit | number |
Description | The number of StatsD custom metrics. |
Additional Notes | |
sysdig_container_up
| |
---|
Prometheus ID | sysdig_container_up |
Legacy ID | uptime |
Metric Type | gauge |
Unit | number |
Description | The percentage of time the selected entity was down during the visualized time sample. This can be used to determine if a machine (or a group of machines) went down. |
Additional Notes | |
6.2.3 - Metric Labels
_sysdig_datasource
| |
---|
Prometheus ID | _sysdig_datasource |
Legacy ID | _sysdig_datasource |
OSS KSM ID | - |
Category | Sysdig |
Description | Indicates the ingestion data source for the metric. |
Additional Notes | |
agent_id
| |
---|
Prometheus ID | agent_id |
Legacy ID | agent.id |
OSS KSM ID | - |
Category | Agent |
Description | The unique ID of the agent that sent the metric time series from the host. |
Additional Notes | |
agent_mode
| |
---|
Prometheus ID | agent_mode |
Legacy ID | agent.mode |
OSS KSM ID | - |
Category | Agent |
Description | |
Additional Notes | |
agent_version
| |
---|
Prometheus ID | agent_version |
Legacy ID | agent.version |
OSS KSM ID | - |
Category | Agent |
Description | The installed version of the Sysdig agent. |
Additional Notes | |
cloud_provider_account_id
| |
---|
Prometheus ID | cloud_provider_account_id |
Legacy ID | cloudProvider.account.id |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The account number related to your AWS account - useful when you have multiple AWS accounts linked with Sysdig Monitor. |
Additional Notes | |
cloud_provider_availability_zone
| |
---|
Prometheus ID | cloud_provider_availability_zone |
Legacy ID | cloudProvider.availabilityZone |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The AWS Availability Zone where the entity or entities are located. Each Availability zone is an isolated subsection of an AWS region (see cloudProvider.region). |
Additional Notes | |
cloud_provider_host_ip_private
| |
---|
Prometheus ID | cloud_provider_host_ip_private |
Legacy ID | cloudProvider.host.ip.private |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The private IP address allocated by the cloud provider for the instance. This address can be used for communication between instances in the same network. |
Additional Notes | |
cloud_provider_host_ip_public
| |
---|
Prometheus ID | cloud_provider_host_ip_public |
Legacy ID | cloudProvider.host.ip.public |
OSS KSM ID | - |
Category | Cloud Provider |
Description | Public IP addresses of the selected host. |
Additional Notes | |
cloud_provider_host_name
| |
---|
Prometheus ID | cloud_provider_host_name |
Legacy ID | cloudProvider.host.name |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The name of the host as reported by the cloud provider (e.g. AWS). |
Additional Notes | |
cloud_provider_id
| |
---|
Prometheus ID | cloud_provider_id |
Legacy ID | cloudProvider.id |
OSS KSM ID | - |
Category | Cloud Provider |
Description | ID number as assigned and reported by the cloud provider. |
Additional Notes | |
cloud_provider_instance_type
| |
---|
Prometheus ID | cloud_provider_instance_type |
Legacy ID | cloudProvider.instance.type |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The type of AWS instance. |
Additional Notes | This metric is extremely useful to segment instances and compare their resource usage and saturation. You can use it as a grouping criteria for the explore table to quickly explore AWS usage on a per-instance-type basis. You can also use it to compare things like CPU usage, number of requests or network utilization for different instance types. |
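For example, the per-instance-type comparison described in the note above can be sketched as a PromQL query. The metric name `sysdig_host_cpu_used_percent` is illustrative; substitute a host CPU metric available in your environment:

```promql
# Average CPU usage segmented by AWS instance type
# (metric name is illustrative; substitute a host CPU metric from your account)
avg by (cloud_provider_instance_type) (sysdig_host_cpu_used_percent)
```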
cloud_provider_name
| |
---|
Prometheus ID | cloud_provider_name |
Legacy ID | cloudProvider.name |
OSS KSM ID | - |
Category | Cloud Provider |
Description | Name of the cloud service provider (AWS, etc.). |
Additional Notes | |
cloud_provider_region
| |
---|
Prometheus ID | cloud_provider_region |
Legacy ID | cloudProvider.region |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The AWS region where the host (or group of hosts) is located. |
Additional Notes | Use this grouping criteria in conjunction with the host.count metric to easily create a report on how many instances you have in each region. |
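A per-region instance count like the one described above can be sketched in PromQL. The series name `sysdig_host_up` is a placeholder; any always-present per-host series works:

```promql
# Count of monitored hosts per AWS region
# (sysdig_host_up is a placeholder for an always-present per-host series)
count by (cloud_provider_region) (sysdig_host_up)
```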
cloud_provider_resource_endpoint
| |
---|
Prometheus ID | cloud_provider_resource_endpoint |
Legacy ID | cloudProvider.resource.endPoint |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The DNS name at which the resource can be accessed. |
Additional Notes | |
cloud_provider_resource_name
| |
---|
Prometheus ID | cloud_provider_resource_name |
Legacy ID | cloudProvider.resource.name |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The AWS service name (e.g. EC2, RDS, ELB). |
Additional Notes | |
cloud_provider_resource_type
| |
---|
Prometheus ID | cloud_provider_resource_type |
Legacy ID | cloudProvider.resource.type |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The service type (e.g. INSTANCE, LOAD_BALANCER, DATABASE). |
Additional Notes | |
cloud_provider_security_groups
| |
---|
Prometheus ID | cloud_provider_security_groups |
Legacy ID | cloudProvider.securityGroups |
OSS KSM ID | - |
Category | Cloud Provider |
Description | The names of the security groups associated with the instance. |
Additional Notes | |
cloud_provider_status
| |
---|
Prometheus ID | cloud_provider_status |
Legacy ID | cloudProvider.status |
OSS KSM ID | - |
Category | Cloud Provider |
Description | Resource status. |
Additional Notes | |
container_full_id
| |
---|
Prometheus ID | container_full_id |
Legacy ID | container.full.id |
OSS KSM ID | - |
Category | Container |
Description | The full UID of the running container as retrieved from the container runtime. |
Additional Notes | |
container_id
| |
---|
Prometheus ID | container_id |
Legacy ID | container.id |
OSS KSM ID | - |
Category | Container |
Description | The short ID of the running container, obtained by truncating the full ID. For Docker, this is a 12-character hexadecimal string. |
Additional Notes | |
container_image
| |
---|
Prometheus ID | container_image |
Legacy ID | container.image |
OSS KSM ID | - |
Category | Container |
Description | The name of the image used to run the container. |
Additional Notes | |
container_image_digest
| |
---|
Prometheus ID | container_image_digest |
Legacy ID | container.image.digest |
OSS KSM ID | - |
Category | Container |
Description | The digest of the image used to run the container. |
Additional Notes | |
container_image_id
| |
---|
Prometheus ID | container_image_id |
Legacy ID | container.image.id |
OSS KSM ID | - |
Category | Container |
Description | The ID of the image used to run the container. |
Additional Notes | |
container_image_repo
| |
---|
Prometheus ID | container_image_repo |
Legacy ID | container.image.repo |
OSS KSM ID | - |
Category | Container |
Description | The repository from which the image used to run the container was retrieved. Empty if the image wasn’t retrieved from a remote repository. |
Additional Notes | |
container_image_tag
| |
---|
Prometheus ID | container_image_tag |
Legacy ID | container.image.tag |
OSS KSM ID | - |
Category | Container |
Description | The tag of the image used to run the container. |
Additional Notes | |
container_label_io_kubernetes_container_name
| |
---|
Prometheus ID | container_label_io_kubernetes_container_name |
Legacy ID | container.label.io.kubernetes.container.name |
OSS KSM ID | - |
Category | Container |
Description | Label set on the container in the container runtime when running in a Kubernetes environment. This label will match the container name set in the Kubernetes manifest for the Pod. |
Additional Notes | |
container_label_io_kubernetes_pod_name
| |
---|
Prometheus ID | container_label_io_kubernetes_pod_name |
Legacy ID | container.label.io.kubernetes.pod.name |
OSS KSM ID | - |
Category | Container |
Description | Label set on the container in the container runtime when running in a Kubernetes environment. This label will match the Pod name set in the Kubernetes manifest for the Pod. |
Additional Notes | |
container_label_io_kubernetes_pod_namespace
| |
---|
Prometheus ID | container_label_io_kubernetes_pod_namespace |
Legacy ID | container.label.io.kubernetes.pod.namespace |
OSS KSM ID | - |
Category | Container |
Description | Label set on the container in the container runtime when running in a Kubernetes environment. This label will match the Pod namespace set in the Kubernetes manifest for the Pod. |
Additional Notes | |
container_label_io_prometheus_path
| |
---|
Prometheus ID | container_label_io_prometheus_path |
Legacy ID | container.label.io.prometheus.path |
OSS KSM ID | - |
Category | Container |
Description | Container label specifying the URL path from which Prometheus metrics are scraped. |
Additional Notes | |
container_label_io_prometheus_port
| |
---|
Prometheus ID | container_label_io_prometheus_port |
Legacy ID | container.label.io.prometheus.port |
OSS KSM ID | - |
Category | Container |
Description | Container label specifying the port on which Prometheus metrics are exposed for scraping. |
Additional Notes | |
container_label_io_prometheus_scrape
| |
---|
Prometheus ID | container_label_io_prometheus_scrape |
Legacy ID | container.label.io.prometheus.scrape |
OSS KSM ID | - |
Category | Container |
Description | Container label indicating whether the container should be scraped for Prometheus metrics. |
Additional Notes | |
container_name
| |
---|
Prometheus ID | container_name |
Legacy ID | container.name |
OSS KSM ID | - |
Category | Container |
Description | The name of a running container. |
Additional Notes | |
container_type
| |
---|
Prometheus ID | container_type |
Legacy ID | container.type |
OSS KSM ID | - |
Category | Container |
Description | The type of container runtime (e.g. docker). |
Additional Notes | |
cpu_core
| |
---|
Prometheus ID | cpu_core |
Legacy ID | cpu.core |
OSS KSM ID | - |
Category | Host |
Description | CPU core number |
Additional Notes | |
ecs_cluster_name
| |
---|
Prometheus ID | ecs_cluster_name |
Legacy ID | ecs.clusterName |
OSS KSM ID | - |
Category | ECS |
Description | Amazon ECS cluster name |
Additional Notes | |
ecs_service_name
| |
---|
Prometheus ID | ecs_service_name |
Legacy ID | ecs.serviceName |
OSS KSM ID | - |
Category | ECS |
Description | Amazon ECS service name |
Additional Notes | |
ecs_task_family_name
| |
---|
Prometheus ID | ecs_task_family_name |
Legacy ID | ecs.taskFamilyName |
OSS KSM ID | - |
Category | ECS |
Description | Amazon ECS task family name |
Additional Notes | |
file_mount
| |
---|
Prometheus ID | file_mount |
Legacy ID | file.mount |
OSS KSM ID | - |
Category | File Stats |
Description | File stats mount path |
Additional Notes | |
file_name
| |
---|
Prometheus ID | file_name |
Legacy ID | file.name |
OSS KSM ID | - |
Category | File Stats |
Description | The file name, including its path, for file stats |
Additional Notes | |
fs_device
| |
---|
Prometheus ID | fs_device |
Legacy ID | fs.device |
OSS KSM ID | - |
Category | File System |
Description | File system device name |
Additional Notes | |
fs_mount_dir
| |
---|
Prometheus ID | fs_mount_dir |
Legacy ID | fs.mountDir |
OSS KSM ID | - |
Category | File System |
Description | File system mount directory |
Additional Notes | |
fs_type
| |
---|
Prometheus ID | fs_type |
Legacy ID | fs.type |
OSS KSM ID | - |
Category | File System |
Description | File system type (e.g. EXT, NTFS) |
Additional Notes | |
host_domain
| |
---|
Prometheus ID | host_domain |
Legacy ID | host.domain |
OSS KSM ID | - |
Category | Host |
Description | The domain name for external websites. |
Additional Notes | This label has been deprecated. |
host_hostname
| |
---|
Prometheus ID | host_hostname |
Legacy ID | host.hostName |
OSS KSM ID | - |
Category | Host |
Description | Host name as defined in the /etc/hostname file. |
Additional Notes | |
host_instance_id
| |
---|
Prometheus ID | host_instance_id |
Legacy ID | host.instanceId |
OSS KSM ID | - |
Category | Host |
Description | The instance ID of the host as assigned by the cloud provider. |
Additional Notes | |
host_ip_private
| |
---|
Prometheus ID | host_ip_private |
Legacy ID | host.ip.private |
OSS KSM ID | - |
Category | Host |
Description | Private machine IP address. |
Additional Notes | |
host_ip_public
| |
---|
Prometheus ID | host_ip_public |
Legacy ID | host.ip.public |
OSS KSM ID | - |
Category | Host |
Description | Public machine IP address. |
Additional Notes | |
host_mac
| |
---|
Prometheus ID | host_mac |
Legacy ID | host.mac |
OSS KSM ID | - |
Category | Host |
Description | Media Access Control address of the host. |
Additional Notes | |
kube_cluster_id
| |
---|
Prometheus ID | kube_cluster_id |
Legacy ID | kubernetes.cluster.id |
OSS KSM ID | id |
Category | Kubernetes |
Description | Uniquely identifying ID for a cluster |
Additional Notes | As Kubernetes has no native concept of a cluster ID, this label is populated with the UID of the “default” namespace in the cluster. |
kube_cluster_name
| |
---|
Prometheus ID | kube_cluster_name |
Legacy ID | kubernetes.cluster.name |
OSS KSM ID | cluster |
Category | Kubernetes |
Description | User-defined name for the cluster |
Additional Notes | The cluster name is set via the “k8s_cluster_name” configuration parameter in the agent, or by adding an agent tag with the key “cluster”. If neither is set, this label does not exist. |
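A minimal dragent.yaml sketch showing the two ways the note above describes for setting the cluster name (the value `prod-cluster` is an example):

```yaml
# Option 1: dedicated configuration parameter
k8s_cluster_name: prod-cluster

# Option 2: agent tag with the key "cluster"
tags: cluster:prod-cluster
```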
concurrency_policy
| |
---|
Prometheus ID | concurrency_policy |
Legacy ID | kubernetes.cronjob.concurrencyPolicy |
OSS KSM ID | - |
Category | Kubernetes |
Description | Specifies how to treat concurrent executions created by this Cron Job. Value can be “Allow”, “Forbid”, or “Replace” |
Additional Notes | |
kube_cronjob_name
| |
---|
Prometheus ID | kube_cronjob_name |
Legacy ID | kubernetes.cronjob.name |
OSS KSM ID | cronjob |
Category | Kubernetes |
Description | Name of the Cron Job as retrieved from the API server. |
Additional Notes | |
schedule
| |
---|
Prometheus ID | schedule |
Legacy ID | kubernetes.cronjob.schedule |
OSS KSM ID | - |
Category | Kubernetes |
Description | The schedule on which the Cron Job runs, expressed as a cron-format string. |
Additional Notes | |
kube_daemonset_name
| |
---|
Prometheus ID | kube_daemonset_name |
Legacy ID | kubernetes.daemonSet.name |
OSS KSM ID | daemonset |
Category | Kubernetes |
Description | Name of the DaemonSet as retrieved from the API server. |
Additional Notes | |
kube_daemonset_uid
| |
---|
Prometheus ID | kube_daemonset_uid |
Legacy ID | kubernetes.daemonSet.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the DaemonSet as retrieved from the API server. |
Additional Notes | |
kube_deployment_name
| |
---|
Prometheus ID | kube_deployment_name |
Legacy ID | kubernetes.deployment.name |
OSS KSM ID | deployment |
Category | Kubernetes |
Description | Name of the Deployment as retrieved from the API server. |
Additional Notes | |
kube_deployment_uid
| |
---|
Prometheus ID | kube_deployment_uid |
Legacy ID | kubernetes.deployment.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Deployment as retrieved from the API server. |
Additional Notes | |
kube_hpa_name
| |
---|
Prometheus ID | kube_hpa_name |
Legacy ID | kubernetes.hpa.name |
OSS KSM ID | hpa |
Category | Kubernetes |
Description | Name of the HPA as retrieved from the API server. |
Additional Notes | |
kube_hpa_uid
| |
---|
Prometheus ID | kube_hpa_uid |
Legacy ID | kubernetes.hpa.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the HPA as retrieved from the API server. |
Additional Notes | |
kube_job_name
| |
---|
Prometheus ID | kube_job_name |
Legacy ID | kubernetes.job.name |
OSS KSM ID | job_name |
Category | Kubernetes |
Description | Name of the Job as retrieved from the API server. |
Additional Notes | |
kube_job_owner_is_controller
| |
---|
Prometheus ID | kube_job_owner_is_controller |
Legacy ID | kubernetes.job.owner.isController |
OSS KSM ID | owner_is_controller |
Category | Kubernetes |
Description | Designates whether the Job is created by a higher-level controller object |
Additional Notes | |
kube_job_owner_kind
| |
---|
Prometheus ID | kube_job_owner_kind |
Legacy ID | kubernetes.job.owner.kind |
OSS KSM ID | owner_kind |
Category | Kubernetes |
Description | The workload resource type of the object that created the Job if owned by a higher-level controller object |
Additional Notes | |
kube_job_owner_name
| |
---|
Prometheus ID | kube_job_owner_name |
Legacy ID | kubernetes.job.owner.name |
OSS KSM ID | owner_name |
Category | Kubernetes |
Description | The name of the object that created the Job if owned by a higher-level controller object |
Additional Notes | |
kube_job_uid
| |
---|
Prometheus ID | kube_job_uid |
Legacy ID | kubernetes.job.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Job as retrieved from the API server. |
Additional Notes | |
kube_namespace_name
| |
---|
Prometheus ID | kube_namespace_name |
Legacy ID | kubernetes.namespace.name |
OSS KSM ID | namespace |
Category | Kubernetes |
Description | Name of the Namespace as retrieved from the API server. |
Additional Notes | |
kube_namespace_uid
| |
---|
Prometheus ID | kube_namespace_uid |
Legacy ID | kubernetes.namespace.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Namespace as retrieved from the API server. |
Additional Notes | |
kube_node_condition
| |
---|
Prometheus ID | kube_node_condition |
Legacy ID | kubernetes.node.condition |
OSS KSM ID | condition |
Category | Kubernetes |
Description | Describes the status of the Node. Can be Ready, DiskPressure, OutOfDisk, MemoryPressure, or Unschedulable. |
Additional Notes | |
kube_node_name
| |
---|
Prometheus ID | kube_node_name |
Legacy ID | kubernetes.node.name |
OSS KSM ID | node |
Category | Kubernetes |
Description | Name of the Node as retrieved from the API server. |
Additional Notes | |
kube_node_resource
| |
---|
Prometheus ID | kube_node_resource |
Legacy ID | kubernetes.node.resource |
OSS KSM ID | resource |
Category | Kubernetes |
Description | Indicates the capacity or allocatable limit for the different resources of a node |
Additional Notes | |
kube_node_status
| |
---|
Prometheus ID | kube_node_status |
Legacy ID | kubernetes.node.status |
OSS KSM ID | status |
Category | Kubernetes |
Description | Used in combination with the kube_node_condition label to indicate the boolean value of that label |
Additional Notes | |
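Since a condition label is only meaningful together with its status, a query selecting Ready nodes might look like the following sketch. The metric name `some_node_metric` is a placeholder for a node-level metric carrying these labels:

```promql
# Select time series for nodes whose Ready condition is true
# (some_node_metric is a placeholder for a node-level metric)
some_node_metric{kube_node_condition="Ready", kube_node_status="true"}
```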
kube_node_uid
| |
---|
Prometheus ID | kube_node_uid |
Legacy ID | kubernetes.node.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Node as retrieved from the API server. |
Additional Notes | |
kube_node_unit
| |
---|
Prometheus ID | kube_node_unit |
Legacy ID | kubernetes.node.unit |
OSS KSM ID | unit |
Category | Kubernetes |
Description | Used in combination with the kube_node_resource label to indicate the unit of that label |
Additional Notes | |
name
| |
---|
Prometheus ID | name |
Legacy ID | kubernetes.persistentvolume.claim.ref.name |
OSS KSM ID | - |
Category | Kubernetes |
Description | Name of the Persistent Volume’s claimRef as retrieved from the API server. |
Additional Notes | |
claim_namespace
| |
---|
Prometheus ID | claim_namespace |
Legacy ID | kubernetes.persistentvolume.claim.ref.namespace |
OSS KSM ID | - |
Category | Kubernetes |
Description | Namespace of the Persistent Volume’s claimRef as retrieved from the API server. |
Additional Notes | |
kube_persistentvolume_name
| |
---|
Prometheus ID | kube_persistentvolume_name |
Legacy ID | kubernetes.persistentvolume.name |
OSS KSM ID | persistentvolume |
Category | Kubernetes |
Description | Name of the Persistent Volume as retrieved from the API server. |
Additional Notes | |
kube_persistentvolume_uid
| |
---|
Prometheus ID | kube_persistentvolume_uid |
Legacy ID | kubernetes.persistentvolume.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Persistent Volume as retrieved from the API server. |
Additional Notes | |
access_mode
| |
---|
Prometheus ID | access_mode |
Legacy ID | kubernetes.persistentvolumeclaim.accessMode |
OSS KSM ID | - |
Category | Kubernetes |
Description | Access mode of the PVC as retrieved from the API server. |
Additional Notes | |
status
| |
---|
Prometheus ID | status |
Legacy ID | kubernetes.persistentvolumeclaim.condition.status |
OSS KSM ID | - |
Category | Kubernetes |
Description | Used in combination with the type label to indicate the boolean value of that label |
Additional Notes | |
type
| |
---|
Prometheus ID | type |
Legacy ID | kubernetes.persistentvolumeclaim.condition.type |
OSS KSM ID | - |
Category | Kubernetes |
Description | The type of the condition that the PVC is in |
Additional Notes | |
kube_persistentvolumeclaim_name
| |
---|
Prometheus ID | kube_persistentvolumeclaim_name |
Legacy ID | kubernetes.persistentvolumeclaim.name |
OSS KSM ID | persistentvolumeclaim |
Category | Kubernetes |
Description | Name of the PVC as retrieved from the API server. |
Additional Notes | |
phase
| |
---|
Prometheus ID | phase |
Legacy ID | kubernetes.persistentvolumeclaim.phase |
OSS KSM ID | - |
Category | Kubernetes |
Description | The phase that the PVC is in. Will be Available, Bound, Released, or Failed. |
Additional Notes | |
kube_persistentvolumeclaim_uid
| |
---|
Prometheus ID | kube_persistentvolumeclaim_uid |
Legacy ID | kubernetes.persistentvolumeclaim.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the PVC as retrieved from the API server. |
Additional Notes | |
kube_pod_condition
| |
---|
Prometheus ID | kube_pod_condition |
Legacy ID | kubernetes.pod.condition |
OSS KSM ID | condition |
Category | Kubernetes |
Description | The condition that the Pod is in. Can be PodScheduled, ContainersReady, Initialized, or Ready. |
Additional Notes | |
kube_pod_container_full_id
| |
---|
Prometheus ID | kube_pod_container_full_id |
Legacy ID | kubernetes.pod.container.full.id |
OSS KSM ID | container_full_id |
Category | Kubernetes |
Description | The full UID of the container in the Pod |
Additional Notes | |
kube_pod_container_id
| |
---|
Prometheus ID | kube_pod_container_id |
Legacy ID | kubernetes.pod.container.id |
OSS KSM ID | container_id |
Category | Kubernetes |
Description | A short ID from truncating the full UID of the container in the Pod |
Additional Notes | |
kube_pod_container_name
| |
---|
Prometheus ID | kube_pod_container_name |
Legacy ID | kubernetes.pod.container.name |
OSS KSM ID | container |
Category | Kubernetes |
Description | The name of the container in the Pod |
Additional Notes | |
kube_pod_container_reason
| |
---|
Prometheus ID | kube_pod_container_reason |
Legacy ID | kubernetes.pod.container.reason |
OSS KSM ID | reason |
Category | Kubernetes |
Description | The reason that the container is in the state that it is in. |
Additional Notes | |
kube_pod_internal_ip
| |
---|
Prometheus ID | kube_pod_internal_ip |
Legacy ID | kubernetes.pod.internalIp |
OSS KSM ID | internal_ip |
Category | Kubernetes |
Description | The IP address associated with the Pod |
Additional Notes | |
kube_pod_name
| |
---|
Prometheus ID | kube_pod_name |
Legacy ID | kubernetes.pod.name |
OSS KSM ID | pod |
Category | Kubernetes |
Description | Name of the Pod as retrieved from the API server. |
Additional Notes | |
kube_pod_node
| |
---|
Prometheus ID | kube_pod_node |
Legacy ID | kubernetes.pod.node |
OSS KSM ID | node |
Category | Kubernetes |
Description | The Node on which the Pod is running. |
Additional Notes | |
kube_pod_owner_is_controller
| |
---|
Prometheus ID | kube_pod_owner_is_controller |
Legacy ID | kubernetes.pod.owner.isController |
OSS KSM ID | owner_is_controller |
Category | Kubernetes |
Description | Designates whether the Pod is created by a higher-level controller object |
Additional Notes | |
kube_pod_owner_kind
| |
---|
Prometheus ID | kube_pod_owner_kind |
Legacy ID | kubernetes.pod.owner.kind |
OSS KSM ID | owner_kind |
Category | Kubernetes |
Description | The workload resource type of the object that created the Pod if owned by a higher-level controller object |
Additional Notes | |
kube_pod_owner_name
| |
---|
Prometheus ID | kube_pod_owner_name |
Legacy ID | kubernetes.pod.owner.name |
OSS KSM ID | owner_name |
Category | Kubernetes |
Description | The name of the object that created the Pod if owned by a higher-level controller object |
Additional Notes | |
kube_pod_persistentvolumeclaim
| |
---|
Prometheus ID | kube_pod_persistentvolumeclaim |
Legacy ID | kubernetes.pod.persistentvolumeclaim |
OSS KSM ID | persistentvolumeclaim |
Category | Kubernetes |
Description | The name of the PVC associated with the Pod |
Additional Notes | |
kube_pod_phase
| |
---|
Prometheus ID | kube_pod_phase |
Legacy ID | kubernetes.pod.phase |
OSS KSM ID | phase |
Category | Kubernetes |
Description | The phase that the Pod is in. Can be Pending, Running, Succeeded, Failed, or Unknown. |
Additional Notes | |
kube_pod_pod_ip
| |
---|
Prometheus ID | kube_pod_pod_ip |
Legacy ID | kubernetes.pod.pod.ip |
OSS KSM ID | pod_ip |
Category | Kubernetes |
Description | The IP address associated with the Pod |
Additional Notes | |
kube_pod_reason
| |
---|
Prometheus ID | kube_pod_reason |
Legacy ID | kubernetes.pod.reason |
OSS KSM ID | reason |
Category | Kubernetes |
Description | The reason the Pod is in the phase that it is in. |
Additional Notes | |
kube_pod_resource
| |
---|
Prometheus ID | kube_pod_resource |
Legacy ID | kubernetes.pod.resource |
OSS KSM ID | resource |
Category | Kubernetes |
Description | The Pod’s resource limits and requests. Individual labels are created for memory limits, memory requests, CPU limits, and CPU requests |
Additional Notes | |
kube_pod_uid
| |
---|
Prometheus ID | kube_pod_uid |
Legacy ID | kubernetes.pod.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Pod as retrieved from the API server. |
Additional Notes | |
kube_pod_unit
| |
---|
Prometheus ID | kube_pod_unit |
Legacy ID | kubernetes.pod.unit |
OSS KSM ID | unit |
Category | Kubernetes |
Description | Used in combination with the kube_pod_resource label to indicate the unit of the resource limit or request |
Additional Notes | |
kube_pod_volume
| |
---|
Prometheus ID | kube_pod_volume |
Legacy ID | kubernetes.pod.volume |
OSS KSM ID | volume |
Category | Kubernetes |
Description | Name of the volume associated with the Pod. |
Additional Notes | |
kube_replicaset_name
| |
---|
Prometheus ID | kube_replicaset_name |
Legacy ID | kubernetes.replicaSet.name |
OSS KSM ID | replicaset |
Category | Kubernetes |
Description | Name of the ReplicaSet as retrieved from the API server. |
Additional Notes | |
kube_replicaset_owner_is_controller
| |
---|
Prometheus ID | kube_replicaset_owner_is_controller |
Legacy ID | kubernetes.replicaSet.owner.isController |
OSS KSM ID | owner_is_controller |
Category | Kubernetes |
Description | Designates whether the ReplicaSet is created by a higher-level controller object |
Additional Notes | |
kube_replicaset_owner_kind
| |
---|
Prometheus ID | kube_replicaset_owner_kind |
Legacy ID | kubernetes.replicaSet.owner.kind |
OSS KSM ID | owner_kind |
Category | Kubernetes |
Description | The workload resource type of the object that created the ReplicaSet if owned by a higher-level controller object |
Additional Notes | |
kube_replicaset_owner_name
| |
---|
Prometheus ID | kube_replicaset_owner_name |
Legacy ID | kubernetes.replicaSet.owner.name |
OSS KSM ID | owner_name |
Category | Kubernetes |
Description | The name of the object that created the ReplicaSet if owned by a higher-level controller object |
Additional Notes | |
kube_replicaset_uid
| |
---|
Prometheus ID | kube_replicaset_uid |
Legacy ID | kubernetes.replicaSet.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the ReplicaSet as retrieved from the API server. |
Additional Notes | |
kube_replicationcontroller_name
| |
---|
Prometheus ID | kube_replicationcontroller_name |
Legacy ID | kubernetes.replicationController.name |
OSS KSM ID | replicationcontroller |
Category | Kubernetes |
Description | Name of the Replication Controller as retrieved from the API server. |
Additional Notes | |
kube_replicationcontroller_owner_is_controller
| |
---|
Prometheus ID | kube_replicationcontroller_owner_is_controller |
Legacy ID | kubernetes.replicationController.owner.isController |
OSS KSM ID | owner_is_controller |
Category | Kubernetes |
Description | Designates whether the Replication Controller is created by a higher-level controller object |
Additional Notes | |
kube_replicationcontroller_owner_kind
| |
---|
Prometheus ID | kube_replicationcontroller_owner_kind |
Legacy ID | kubernetes.replicationController.owner.kind |
OSS KSM ID | owner_kind |
Category | Kubernetes |
Description | The workload resource type of the object that created the Replication Controller if owned by a higher-level controller object |
Additional Notes | |
kube_replicationcontroller_owner_name
| |
---|
Prometheus ID | kube_replicationcontroller_owner_name |
Legacy ID | kubernetes.replicationController.owner.name |
OSS KSM ID | owner_name |
Category | Kubernetes |
Description | The name of the object that created the Replication Controller if owned by a higher-level controller object |
Additional Notes | |
kube_replicationcontroller_uid
| |
---|
Prometheus ID | kube_replicationcontroller_uid |
Legacy ID | kubernetes.replicationController.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Replication Controller as retrieved from the API server. |
Additional Notes | |
kube_resourcequota_name
| |
---|
Prometheus ID | kube_resourcequota_name |
Legacy ID | kubernetes.resourcequota.name |
OSS KSM ID | resourcequota |
Category | Kubernetes |
Description | Name of the Resource Quota as retrieved from the API server. |
Additional Notes | |
kube_resourcequota_namespace
| |
---|
Prometheus ID | kube_resourcequota_namespace |
Legacy ID | kubernetes.resourcequota.namespace |
OSS KSM ID | namespace |
Category | Kubernetes |
Description | Namespace in which the Resource Quota is being enforced |
Additional Notes | |
kube_resourcequota_resource
| |
---|
Prometheus ID | kube_resourcequota_resource |
Legacy ID | kubernetes.resourcequota.resource |
OSS KSM ID | resource |
Category | Kubernetes |
Description | The resource and the amount of it in which the Resource Quota is being enforced |
Additional Notes | |
kube_resourcequota_resourcequota
| |
---|
Prometheus ID | kube_resourcequota_resourcequota |
Legacy ID | kubernetes.resourcequota.resourcequota |
OSS KSM ID | resourcequota |
Category | Kubernetes |
Description | Name of the Resource Quota as retrieved from the API server. |
Additional Notes | |
kube_resourcequota_type
| |
---|
Prometheus ID | kube_resourcequota_type |
Legacy ID | kubernetes.resourcequota.type |
OSS KSM ID | type |
Category | Kubernetes |
Description | Used in combination with kube_resourcequota_resource to designate whether the amount is Used or is the Hard limit |
Additional Notes | |
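The used/hard pairing described above lends itself to a utilization ratio. The following is a sketch only; `quota_metric` is a placeholder for the quota metric carrying these labels in your environment, and the lowercase `used`/`hard` values are assumptions:

```promql
# Ratio of used quota to hard limit, per namespace and resource
# (quota_metric and the label values "used"/"hard" are placeholders)
sum by (kube_resourcequota_namespace, kube_resourcequota_resource) (quota_metric{kube_resourcequota_type="used"})
/
sum by (kube_resourcequota_namespace, kube_resourcequota_resource) (quota_metric{kube_resourcequota_type="hard"})
```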
kube_resourcequota_uid
| |
---|
Prometheus ID | kube_resourcequota_uid |
Legacy ID | kubernetes.resourcequota.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Resource Quota as retrieved from the API server. |
Additional Notes | |
kube_service_cluster_ip
| |
---|
Prometheus ID | kube_service_cluster_ip |
Legacy ID | kubernetes.service.clusterIp |
OSS KSM ID | cluster_ip |
Category | Kubernetes |
Description | The IP address associated with the Service |
Additional Notes | |
kube_service_name
| |
---|
Prometheus ID | kube_service_name |
Legacy ID | kubernetes.service.name |
OSS KSM ID | service |
Category | Kubernetes |
Description | Name of the Service as retrieved from the API server. |
Additional Notes | |
kube_service_service_ip
| |
---|
Prometheus ID | kube_service_service_ip |
Legacy ID | kubernetes.service.service.ip |
OSS KSM ID | service_ip |
Category | Kubernetes |
Description | The IP address associated with the Service |
Additional Notes | |
kube_service_uid
| |
---|
Prometheus ID | kube_service_uid |
Legacy ID | kubernetes.service.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Service as retrieved from the API server. |
Additional Notes | |
kube_statefulset_name
| |
---|
Prometheus ID | kube_statefulset_name |
Legacy ID | kubernetes.statefulSet.name |
OSS KSM ID | statefulset |
Category | Kubernetes |
Description | Name of the StatefulSet as retrieved from the API server. |
Additional Notes | |
kube_statefulset_uid
| |
---|
Prometheus ID | kube_statefulset_uid |
Legacy ID | kubernetes.statefulSet.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the StatefulSet as retrieved from the API server. |
Additional Notes | |
kube_storageclass_name
| |
---|
Prometheus ID | kube_storageclass_name |
Legacy ID | kubernetes.storageclass.name |
OSS KSM ID | storageclass |
Category | Kubernetes |
Description | Name of the Storage Class as retrieved from the API server. |
Additional Notes | |
provisioner
| |
---|
Prometheus ID | provisioner |
Legacy ID | kubernetes.storageclass.provisioner |
OSS KSM ID | - |
Category | Kubernetes |
Description | The Provisioner of the Storage Class as retrieved from the API server. |
Additional Notes | |
reclaim_policy
| |
---|
Prometheus ID | reclaim_policy |
Legacy ID | kubernetes.storageclass.reclaimPolicy |
OSS KSM ID | - |
Category | Kubernetes |
Description | The reclaim policy for the Storage Class as retrieved from the API server. |
Additional Notes | |
kube_storageclass_uid
| |
---|
Prometheus ID | kube_storageclass_uid |
Legacy ID | kubernetes.storageclass.uid |
OSS KSM ID | uid |
Category | Kubernetes |
Description | Unique ID of the Storage Class as retrieved from the API server. |
Additional Notes | |
volume_binding_mode
| |
---|
Prometheus ID | volume_binding_mode |
Legacy ID | kubernetes.storageclass.volumeBindingMode |
OSS KSM ID | - |
Category | Kubernetes |
Description | The volume binding mode for the Storage Class as retrieved from the API server. |
Additional Notes | |
kube_workload_name
| |
---|
Prometheus ID | kube_workload_name |
Legacy ID | kubernetes.workload.name |
OSS KSM ID | workload_name |
Category | Kubernetes |
Description | The name of the Kubernetes workload resource object. |
Additional Notes | |
kube_workload_type
| |
---|
Prometheus ID | kube_workload_type |
Legacy ID | kubernetes.workload.type |
OSS KSM ID | workload_type |
Category | Kubernetes |
Description | The type of the Kubernetes workload resource, e.g. Deployment, DaemonSet, or Job. |
Additional Notes | |
marathon_app_id
| |
---|
Prometheus ID | marathon_app_id |
Legacy ID | marathon.app.id |
OSS KSM ID | - |
Category | Marathon |
Description | Unique ID of the Marathon application. |
Additional Notes | |
marathon_app_name
| |
---|
Prometheus ID | marathon_app_name |
Legacy ID | marathon.app.name |
OSS KSM ID | - |
Category | Marathon |
Description | Name of the Marathon application. |
Additional Notes | |
marathon_group_id
| |
---|
Prometheus ID | marathon_group_id |
Legacy ID | marathon.group.id |
OSS KSM ID | - |
Category | Marathon |
Description | Unique ID of the Marathon group. |
Additional Notes | |
marathon_group_name
| |
---|
Prometheus ID | marathon_group_name |
Legacy ID | marathon.group.name |
OSS KSM ID | - |
Category | Marathon |
Description | Name of the Marathon group. |
Additional Notes | |
mesos_cluster_id
| |
---|
Prometheus ID | mesos_cluster_id |
Legacy ID | mesos.cluster.id |
OSS KSM ID | - |
Category | Mesos |
Description | Unique ID of the Mesos cluster. |
Additional Notes | |
mesos_cluster_name
| |
---|
Prometheus ID | mesos_cluster_name |
Legacy ID | mesos.cluster.name |
OSS KSM ID | - |
Category | Mesos |
Description | Name of the Mesos cluster. |
Additional Notes | |
mesos_framework_id
| |
---|
Prometheus ID | mesos_framework_id |
Legacy ID | mesos.framework.id |
OSS KSM ID | - |
Category | Mesos |
Description | Unique ID of the Mesos framework. |
Additional Notes | |
mesos_framework_name
| |
---|
Prometheus ID | mesos_framework_name |
Legacy ID | mesos.framework.name |
OSS KSM ID | - |
Category | Mesos |
Description | Name of the Mesos framework. |
Additional Notes | |
mesos_slave_id
| |
---|
Prometheus ID | mesos_slave_id |
Legacy ID | mesos.slave.id |
OSS KSM ID | - |
Category | Mesos |
Description | Unique ID of the Mesos agent (slave) node. |
Additional Notes | |
mesos_slave_name
| |
---|
Prometheus ID | mesos_slave_name |
Legacy ID | mesos.slave.name |
OSS KSM ID | - |
Category | Mesos |
Description | Name of the Mesos agent (slave) node. |
Additional Notes | |
mesos_task_id
| |
---|
Prometheus ID | mesos_task_id |
Legacy ID | mesos.task.id |
OSS KSM ID | - |
Category | Mesos |
Description | Unique ID of the Mesos task. |
Additional Notes | |
mesos_task_name
| |
---|
Prometheus ID | mesos_task_name |
Legacy ID | mesos.task.name |
OSS KSM ID | - |
Category | Mesos |
Description | Name of the Mesos task. |
Additional Notes | |
net_client_ip
| |
---|
Prometheus ID | net_client_ip |
Legacy ID | net.client.ip |
OSS KSM ID | - |
Category | Network |
Description | Client IP address. |
Additional Notes | |
net_http_method
| |
---|
Prometheus ID | net_http_method |
Legacy ID | net.http.method |
OSS KSM ID | - |
Category | Network |
Description | HTTP request method. |
Additional Notes | |
net_http_statuscode
| |
---|
Prometheus ID | net_http_statuscode |
Legacy ID | net.http.statusCode |
OSS KSM ID | - |
Category | Network |
Description | HTTP response status code. |
Additional Notes | |
net_http_url
| |
---|
Prometheus ID | net_http_url |
Legacy ID | net.http.url |
OSS KSM ID | - |
Category | Network |
Description | URL from an HTTP request. |
Additional Notes | |
net_local_endpoint
| |
---|
Prometheus ID | net_local_endpoint |
Legacy ID | net.local.endpoint |
OSS KSM ID | - |
Category | Network |
Description | IP address of a local node. |
Additional Notes | |
net_local_service
| |
---|
Prometheus ID | net_local_service |
Legacy ID | net.local.service |
OSS KSM ID | - |
Category | Network |
Description | Service (port number) of a local node. |
Additional Notes | |
net_mongodb_collection
| |
---|
Prometheus ID | net_mongodb_collection |
Legacy ID | net.mongodb.collection |
OSS KSM ID | - |
Category | Network |
Description | MongoDB collection. |
Additional Notes | |
net_mongodb_operation
| |
---|
Prometheus ID | net_mongodb_operation |
Legacy ID | net.mongodb.operation |
OSS KSM ID | - |
Category | Network |
Description | MongoDB operation. |
Additional Notes | |
net_protocol
| |
---|
Prometheus ID | net_protocol |
Legacy ID | net.protocol |
OSS KSM ID | - |
Category | Network |
Description | The network protocol of a request (e.g. HTTP, MySQL). |
Additional Notes | |
net_remote_endpoint
| |
---|
Prometheus ID | net_remote_endpoint |
Legacy ID | net.remote.endpoint |
OSS KSM ID | - |
Category | Network |
Description | IP address of a remote node. |
Additional Notes | |
net_remote_service
| |
---|
Prometheus ID | net_remote_service |
Legacy ID | net.remote.service |
OSS KSM ID | - |
Category | Network |
Description | Service (port number) of a remote node. |
Additional Notes | |
net_server_ip
| |
---|
Prometheus ID | net_server_ip |
Legacy ID | net.server.ip |
OSS KSM ID | - |
Category | Network |
Description | Server IP address. |
Additional Notes | |
net_server_port
| |
---|
Prometheus ID | net_server_port |
Legacy ID | net.server.port |
OSS KSM ID | - |
Category | Network |
Description | TCP/UDP server port number. |
Additional Notes | |
net_sql_query
| |
---|
Prometheus ID | net_sql_query |
Legacy ID | net.sql.query |
OSS KSM ID | - |
Category | Network |
Description | The full SQL query. |
Additional Notes | |
net_sql_querytype
| |
---|
Prometheus ID | net_sql_querytype |
Legacy ID | net.sql.query.type |
OSS KSM ID | - |
Category | Network |
Description | SQL query type (SELECT, INSERT, DELETE, etc.). |
Additional Notes | |
net_sql_table
| |
---|
Prometheus ID | net_sql_table |
Legacy ID | net.sql.table |
OSS KSM ID | - |
Category | Network |
Description | SQL query table name. |
Additional Notes | |
program_name
| |
---|
Prometheus ID | program_name |
Legacy ID | proc.client.name |
OSS KSM ID | - |
Category | Program |
Description | Name of the client process. |
Additional Notes | |
program_cmd_line
| |
---|
Prometheus ID | program_cmd_line |
Legacy ID | proc.commandLine |
OSS KSM ID | - |
Category | Program |
Description | Command line used to start the process. |
Additional Notes | |
program_name
| |
---|
Prometheus ID | program_name |
Legacy ID | proc.name |
OSS KSM ID | - |
Category | Program |
Description | Name of the process. |
Additional Notes | |
program_name
| |
---|
Prometheus ID | program_name |
Legacy ID | proc.server.name |
OSS KSM ID | - |
Category | Program |
Description | Name of the server process. |
Additional Notes | |
swarm_cluster_id
| |
---|
Prometheus ID | swarm_cluster_id |
Legacy ID | swarm.cluster.id |
OSS KSM ID | - |
Category | Swarm |
Description | Unique ID of the Swarm cluster. |
Additional Notes | |
swarm_cluster_name
| |
---|
Prometheus ID | swarm_cluster_name |
Legacy ID | swarm.cluster.name |
OSS KSM ID | - |
Category | Swarm |
Description | Name of the Swarm cluster. |
Additional Notes | |
swarm_manager_reachability
| |
---|
Prometheus ID | swarm_manager_reachability |
Legacy ID | swarm.manager.reachability |
OSS KSM ID | - |
Category | Swarm |
Description | Reachability status of the Swarm manager node. |
Additional Notes | |
swarm_node_availability
| |
---|
Prometheus ID | swarm_node_availability |
Legacy ID | swarm.node.availability |
OSS KSM ID | - |
Category | Swarm |
Description | Availability of the Swarm node (e.g. active, pause, drain). |
Additional Notes | |
swarm_node_id
| |
---|
Prometheus ID | swarm_node_id |
Legacy ID | swarm.node.id |
OSS KSM ID | - |
Category | Swarm |
Description | Unique ID of the Swarm node. |
Additional Notes | |
swarm_node_ip_address
| |
---|
Prometheus ID | swarm_node_ip_address |
Legacy ID | swarm.node.ip_address |
OSS KSM ID | - |
Category | Swarm |
Description | IP address of the Swarm node. |
Additional Notes | |
swarm_node_name
| |
---|
Prometheus ID | swarm_node_name |
Legacy ID | swarm.node.name |
OSS KSM ID | - |
Category | Swarm |
Description | Name of the Swarm node. |
Additional Notes | |
swarm_node_role
| |
---|
Prometheus ID | swarm_node_role |
Legacy ID | swarm.node.role |
OSS KSM ID | - |
Category | Swarm |
Description | Role of the Swarm node (manager or worker). |
Additional Notes | |
swarm_node_state
| |
---|
Prometheus ID | swarm_node_state |
Legacy ID | swarm.node.state |
OSS KSM ID | - |
Category | Swarm |
Description | State of the Swarm node (e.g. ready, down). |
Additional Notes | |
swarm_node_version
| |
---|
Prometheus ID | swarm_node_version |
Legacy ID | swarm.node.version |
OSS KSM ID | - |
Category | Swarm |
Description | Docker Engine version running on the Swarm node. |
Additional Notes | |
swarm_service_id
| |
---|
Prometheus ID | swarm_service_id |
Legacy ID | swarm.service.id |
OSS KSM ID | - |
Category | Swarm |
Description | Unique ID of the Swarm service. |
Additional Notes | |
swarm_service_name
| |
---|
Prometheus ID | swarm_service_name |
Legacy ID | swarm.service.name |
OSS KSM ID | - |
Category | Swarm |
Description | Name of the Swarm service. |
Additional Notes | |
swarm_task_container_id
| |
---|
Prometheus ID | swarm_task_container_id |
Legacy ID | swarm.task.container_id |
OSS KSM ID | - |
Category | Swarm |
Description | ID of the container backing the Swarm task. |
Additional Notes | |
swarm_task_id
| |
---|
Prometheus ID | swarm_task_id |
Legacy ID | swarm.task.id |
OSS KSM ID | - |
Category | Swarm |
Description | Unique ID of the Swarm task. |
Additional Notes | |
swarm_task_name
| |
---|
Prometheus ID | swarm_task_name |
Legacy ID | swarm.task.name |
OSS KSM ID | - |
Category | Swarm |
Description | Name of the Swarm task. |
Additional Notes | |
swarm_task_node_id
| |
---|
Prometheus ID | swarm_task_node_id |
Legacy ID | swarm.task.node_id |
OSS KSM ID | - |
Category | Swarm |
Description | ID of the node the Swarm task is scheduled on. |
Additional Notes | |
swarm_task_service_id
| |
---|
Prometheus ID | swarm_task_service_id |
Legacy ID | swarm.task.service_id |
OSS KSM ID | - |
Category | Swarm |
Description | ID of the service the Swarm task belongs to. |
Additional Notes | |
swarm_task_state
| |
---|
Prometheus ID | swarm_task_state |
Legacy ID | swarm.task.state |
OSS KSM ID | - |
Category | Swarm |
Description | State of the Swarm task (e.g. running, shutdown, failed). |
Additional Notes | |
6.2.4 - File
sysdig_filestats_host_file_error_total_count
| |
---|
Prometheus ID | sysdig_filestats_host_file_error_total_count |
Legacy ID | file.error.total.count |
Metric Type | counter |
Unit | number |
Description | Number of errors caused by file access. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_filestats_host_file_in_bytes
| |
---|
Prometheus ID | sysdig_filestats_host_file_in_bytes |
Legacy ID | file.bytes.in |
Metric Type | counter |
Unit | data |
Description | Number of bytes read from files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_filestats_host_file_open_count
| |
---|
Prometheus ID | sysdig_filestats_host_file_open_count |
Legacy ID | file.open.count |
Metric Type | counter |
Unit | number |
Description | Number of times files have been opened. |
Additional Notes | |
sysdig_filestats_host_file_out_bytes
| |
---|
Prometheus ID | sysdig_filestats_host_file_out_bytes |
Legacy ID | file.bytes.out |
Metric Type | counter |
Unit | data |
Description | Number of bytes written to files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_filestats_host_file_total_bytes
| |
---|
Prometheus ID | sysdig_filestats_host_file_total_bytes |
Legacy ID | file.bytes.total |
Metric Type | counter |
Unit | data |
Description | Number of bytes read from and written to files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_filestats_host_file_total_time
| |
---|
Prometheus ID | sysdig_filestats_host_file_total_time |
Legacy ID | file.time.total |
Metric Type | counter |
Unit | time |
Description | Time spent in file I/O. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_fs_free_bytes
| |
---|
Prometheus ID | sysdig_fs_free_bytes |
Legacy ID | fs.bytes.free |
Metric Type | gauge |
Unit | data |
Description | Filesystem available space. |
Additional Notes | |
sysdig_fs_free_percent
| |
---|
Prometheus ID | sysdig_fs_free_percent |
Legacy ID | fs.free.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of filesystem free space. |
Additional Notes | |
sysdig_fs_inodes_total_count
| |
---|
Prometheus ID | sysdig_fs_inodes_total_count |
Legacy ID | fs.inodes.total.count |
Metric Type | gauge |
Unit | number |
Description | Total number of inodes in the filesystem. |
Additional Notes | |
sysdig_fs_inodes_used_count
| |
---|
Prometheus ID | sysdig_fs_inodes_used_count |
Legacy ID | fs.inodes.used.count |
Metric Type | gauge |
Unit | number |
Description | Number of inodes in use in the filesystem. |
Additional Notes | |
sysdig_fs_inodes_used_percent
| |
---|
Prometheus ID | sysdig_fs_inodes_used_percent |
Legacy ID | fs.inodes.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of inodes in use in the filesystem. |
Additional Notes | |
sysdig_fs_total_bytes
| |
---|
Prometheus ID | sysdig_fs_total_bytes |
Legacy ID | fs.bytes.total |
Metric Type | gauge |
Unit | data |
Description | Filesystem size. |
Additional Notes | |
sysdig_fs_used_bytes
| |
---|
Prometheus ID | sysdig_fs_used_bytes |
Legacy ID | fs.bytes.used |
Metric Type | gauge |
Unit | data |
Description | Filesystem used space. |
Additional Notes | |
sysdig_fs_used_percent
| |
---|
Prometheus ID | sysdig_fs_used_percent |
Legacy ID | fs.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of total capacity, summed across all filesystems, in use. |
Additional Notes | |
6.2.5 - Host
sysdig_host_container_count
| |
---|
Prometheus ID | sysdig_host_container_count |
Legacy ID | container.count |
Metric Type | gauge |
Unit | number |
Description | Number of containers. |
Additional Notes | This metric is perfect for dashboards and alerts. In particular, you can create alerts that notify you when you have too many (or too few) containers of a certain type in a certain group or node - try segmenting by container.image, .id or .name. See also: host.count. |
sysdig_host_container_start_count
| |
---|
Prometheus ID | sysdig_host_container_start_count |
Legacy ID | host.container.start.count |
Metric Type | counter |
Unit | number |
Description | Number of containers started on the host. |
Additional Notes | |
sysdig_host_count
| |
---|
Prometheus ID | sysdig_host_count |
Legacy ID | host.count |
Metric Type | gauge |
Unit | number |
Description | Number of hosts. |
Additional Notes | This metric is perfect for dashboards and alerts. In particular, you can create alerts that notify you when you have too many (or too few) machines of a certain type in a certain group - try segment by tag or hostname. See also: container.count. |
sysdig_host_cpu_cores_used
| |
---|
Prometheus ID | sysdig_host_cpu_cores_used |
Legacy ID | cpu.cores.used |
Metric Type | gauge |
Unit | number |
Description | Number of CPU cores in use. |
Additional Notes | |
sysdig_host_cpu_cores_used_percent
| |
---|
Prometheus ID | sysdig_host_cpu_cores_used_percent |
Legacy ID | cpu.cores.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of available CPU cores in use. |
Additional Notes | |
sysdig_host_cpu_idle_percent
| |
---|
Prometheus ID | sysdig_host_cpu_idle_percent |
Legacy ID | cpu.idle.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_cpu_iowait_percent
| |
---|
Prometheus ID | sysdig_host_cpu_iowait_percent |
Legacy ID | cpu.iowait.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_cpu_nice_percent
| |
---|
Prometheus ID | sysdig_host_cpu_nice_percent |
Legacy ID | cpu.nice.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of CPU utilization that occurred while executing at the user level with nice priority. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_cpu_stolen_percent
| |
---|
Prometheus ID | sysdig_host_cpu_stolen_percent |
Legacy ID | cpu.stolen.percent |
Metric Type | gauge |
Unit | percent |
Description | CPU steal time is the percentage of time that a virtual machine’s CPU is in a state of involuntary wait because the physical CPU is shared among virtual machines. The operating system kernel records steal time when it has work available but cannot access the physical CPU to perform it. |
Additional Notes | If steal time is consistently high, you may want to stop and restart the instance (since it will most likely start on different physical hardware) or upgrade to a virtual machine with more CPU power. Also see the metric ‘capacity total percent’ to see how steal time directly impacts the number of server requests that could not be handled. On AWS EC2, steal time does not depend on the activity of neighbouring virtual machines; EC2 simply ensures your instance is not using more CPU cycles than paid for. |
sysdig_host_cpu_system_percent
| |
---|
Prometheus ID | sysdig_host_cpu_system_percent |
Legacy ID | cpu.system.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of CPU utilization that occurred while executing at the system level (kernel). |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_cpu_used_percent
| |
---|
Prometheus ID | sysdig_host_cpu_used_percent |
Legacy ID | cpu.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The CPU usage for each container is obtained from cgroups, and normalized by dividing by the number of cores to determine an overall percentage. For example, if the environment contains six cores on a host, and the container or processes are assigned two cores, Sysdig will report CPU usage of 2/6 * 100% = 33.33%. This metric is calculated differently for hosts and processes. |
Additional Notes | |
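The normalization described above can be reproduced with a small calculation. This is an illustrative sketch, not Sysdig agent code; the per-container core count is a made-up input.

```python
def normalized_cpu_percent(cores_used: float, host_cores: int) -> float:
    """Normalize CPU usage measured in cores to a host-wide percentage."""
    return cores_used / host_cores * 100.0

# A container saturating 2 of a host's 6 cores reports ~33.33% usage,
# matching the 2/6 * 100% example above.
print(round(normalized_cpu_percent(2, 6), 2))  # → 33.33
```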
sysdig_host_cpu_user_percent
| |
---|
Prometheus ID | sysdig_host_cpu_user_percent |
Legacy ID | cpu.user.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of CPU utilization that occurred while executing at the user level (application). |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_cpucore_idle_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_idle_percent |
Legacy ID | cpucore.idle.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of time an individual CPU core was idle. |
Additional Notes | |
sysdig_host_cpucore_iowait_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_iowait_percent |
Legacy ID | cpucore.iowait.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of time an individual CPU core was idle while the system had an outstanding disk I/O request. |
Additional Notes | |
sysdig_host_cpucore_nice_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_nice_percent |
Legacy ID | cpucore.nice.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of an individual CPU core’s utilization that occurred at the user level with nice priority. |
Additional Notes | |
sysdig_host_cpucore_stolen_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_stolen_percent |
Legacy ID | cpucore.stolen.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of time an individual CPU core was in involuntary wait (steal time). |
Additional Notes | |
sysdig_host_cpucore_system_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_system_percent |
Legacy ID | cpucore.system.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of an individual CPU core’s utilization that occurred at the system level (kernel). |
Additional Notes | |
sysdig_host_cpucore_used_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_used_percent |
Legacy ID | cpucore.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of an individual CPU core in use. |
Additional Notes | |
sysdig_host_cpucore_user_percent
| |
---|
Prometheus ID | sysdig_host_cpucore_user_percent |
Legacy ID | cpucore.user.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of an individual CPU core’s utilization that occurred at the user level (application). |
Additional Notes | |
sysdig_host_fd_used_percent
| |
---|
Prometheus ID | sysdig_host_fd_used_percent |
Legacy ID | fd.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of used file descriptors out of the maximum available. |
Additional Notes | Usually, when a process reaches its FD limit it will stop operating properly and possibly crash. As a consequence, this is a metric you want to monitor carefully, or even better use for alerts. |
sysdig_host_file_error_open_count
| |
---|
Prometheus ID | sysdig_host_file_error_open_count |
Legacy ID | file.error.open.count |
Metric Type | counter |
Unit | number |
Description | Number of errors in opening files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_error_total_count
| |
---|
Prometheus ID | sysdig_host_file_error_total_count |
Legacy ID | file.error.total.count |
Metric Type | counter |
Unit | number |
Description | Number of errors caused by file access. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_in_bytes
| |
---|
Prometheus ID | sysdig_host_file_in_bytes |
Legacy ID | file.bytes.in |
Metric Type | counter |
Unit | data |
Description | Number of bytes read from files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_in_iops
| |
---|
Prometheus ID | sysdig_host_file_in_iops |
Legacy ID | file.iops.in |
Metric Type | counter |
Unit | number |
Description | Number of file read operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_host_file_in_time
| |
---|
Prometheus ID | sysdig_host_file_in_time |
Legacy ID | file.time.in |
Metric Type | counter |
Unit | time |
Description | Time spent in file reading. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_open_count
| |
---|
Prometheus ID | sysdig_host_file_open_count |
Legacy ID | file.open.count |
Metric Type | counter |
Unit | number |
Description | Number of times files have been opened. |
Additional Notes | |
sysdig_host_file_out_bytes
| |
---|
Prometheus ID | sysdig_host_file_out_bytes |
Legacy ID | file.bytes.out |
Metric Type | counter |
Unit | data |
Description | Number of bytes written to files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_out_iops
| |
---|
Prometheus ID | sysdig_host_file_out_iops |
Legacy ID | file.iops.out |
Metric Type | counter |
Unit | number |
Description | Number of file write operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_host_file_out_time
| |
---|
Prometheus ID | sysdig_host_file_out_time |
Legacy ID | file.time.out |
Metric Type | counter |
Unit | time |
Description | Time spent in file writing. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_total_bytes
| |
---|
Prometheus ID | sysdig_host_file_total_bytes |
Legacy ID | file.bytes.total |
Metric Type | counter |
Unit | data |
Description | Number of bytes read from and written to files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_file_total_iops
| |
---|
Prometheus ID | sysdig_host_file_total_iops |
Legacy ID | file.iops.total |
Metric Type | counter |
Unit | number |
Description | Number of read and write file operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_host_file_total_time
| |
---|
Prometheus ID | sysdig_host_file_total_time |
Legacy ID | file.time.total |
Metric Type | counter |
Unit | time |
Description | Time spent in file I/O. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_fs_free_bytes
| |
---|
Prometheus ID | sysdig_host_fs_free_bytes |
Legacy ID | fs.bytes.free |
Metric Type | gauge |
Unit | data |
Description | Filesystem available space. |
Additional Notes | |
sysdig_host_fs_free_percent
| |
---|
Prometheus ID | sysdig_host_fs_free_percent |
Legacy ID | fs.free.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of filesystem free space. |
Additional Notes | |
sysdig_host_fs_inodes_total_count
| |
---|
Prometheus ID | sysdig_host_fs_inodes_total_count |
Legacy ID | fs.inodes.total.count |
Metric Type | gauge |
Unit | number |
Description | Total number of inodes in the filesystem. |
Additional Notes | |
sysdig_host_fs_inodes_used_count
| |
---|
Prometheus ID | sysdig_host_fs_inodes_used_count |
Legacy ID | fs.inodes.used.count |
Metric Type | gauge |
Unit | number |
Description | Number of inodes in use in the filesystem. |
Additional Notes | |
sysdig_host_fs_inodes_used_percent
| |
---|
Prometheus ID | sysdig_host_fs_inodes_used_percent |
Legacy ID | fs.inodes.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of inodes in use in the filesystem. |
Additional Notes | |
sysdig_host_fs_largest_used_percent
| |
---|
Prometheus ID | sysdig_host_fs_largest_used_percent |
Legacy ID | fs.largest.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of the largest filesystem in use. |
Additional Notes | |
sysdig_host_fs_root_used_percent
| |
---|
Prometheus ID | sysdig_host_fs_root_used_percent |
Legacy ID | fs.root.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of the root filesystem in use. |
Additional Notes | |
sysdig_host_fs_total_bytes
| |
---|
Prometheus ID | sysdig_host_fs_total_bytes |
Legacy ID | fs.bytes.total |
Metric Type | gauge |
Unit | data |
Description | Filesystem size. |
Additional Notes | |
sysdig_host_fs_used_bytes
| |
---|
Prometheus ID | sysdig_host_fs_used_bytes |
Legacy ID | fs.bytes.used |
Metric Type | gauge |
Unit | data |
Description | Filesystem used space. |
Additional Notes | |
sysdig_host_fs_used_percent
| |
---|
Prometheus ID | sysdig_host_fs_used_percent |
Legacy ID | fs.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Percentage of total capacity, summed across all filesystems, in use. |
Additional Notes | |
sysdig_host_info
| |
---|
Prometheus ID | sysdig_host_info |
Legacy ID | info |
Metric Type | gauge |
Unit | number |
Description | Informational metric with a constant value; host metadata is attached as labels. |
Additional Notes | |
sysdig_host_load_average_15m
| |
---|
Prometheus ID | sysdig_host_load_average_15m |
Legacy ID | load.average.15m |
Metric Type | gauge |
Unit | number |
Description | The 15 minute system load average represents the average number of jobs in (1) the CPU run queue or (2) waiting for disk I/O, averaged over 15 minutes across all cores. The value corresponds to the third (and last) load average value displayed by the ‘uptime’ command. |
Additional Notes | |
sysdig_host_load_average_1m
| |
---|
Prometheus ID | sysdig_host_load_average_1m |
Legacy ID | load.average.1m |
Metric Type | gauge |
Unit | number |
Description | The 1 minute system load average represents the average number of jobs in (1) the CPU run queue or (2) waiting for disk I/O, averaged over 1 minute across all cores. The value corresponds to the first (of three) load average values displayed by the ‘uptime’ command. |
Additional Notes | |
sysdig_host_load_average_5m
| |
---|
Prometheus ID | sysdig_host_load_average_5m |
Legacy ID | load.average.5m |
Metric Type | gauge |
Unit | number |
Description | The 5 minute system load average represents the average number of jobs in (1) the CPU run queue or (2) waiting for disk I/O, averaged over 5 minutes across all cores. The value corresponds to the second (of three) load average values displayed by the ‘uptime’ command. |
Additional Notes | |
sysdig_host_load_average_percpu_15m
| |
---|
Prometheus ID | sysdig_host_load_average_percpu_15m |
Legacy ID | load.average.percpu.15m |
Metric Type | gauge |
Unit | number |
Description | The 15 minute system load average represents the average number of jobs in (1) the CPU run queue or (2) waiting for disk I/O averaged over 15 minutes, divided by number of system CPUs. |
Additional Notes | |
sysdig_host_load_average_percpu_1m
| |
---|
Prometheus ID | sysdig_host_load_average_percpu_1m |
Legacy ID | load.average.percpu.1m |
Metric Type | gauge |
Unit | number |
Description | The 1-minute system load average (the average number of jobs that are either in the CPU run queue or waiting for disk I/O, averaged over 1 minute), divided by the number of system CPUs. |
Additional Notes | |
sysdig_host_load_average_percpu_5m
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_load_average_percpu_5m |
Legacy ID | load.average.percpu.5m |
Metric Type | gauge |
Unit | number |
Description | The 5-minute system load average (the average number of jobs that are either in the CPU run queue or waiting for disk I/O, averaged over 5 minutes), divided by the number of system CPUs. |
Additional Notes | |
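The three percpu variants above are simply the plain load averages divided by the CPU count. A minimal Python sketch of the same normalization, using the values the OS reports (the same three numbers `uptime` prints); this is an illustration, not the agent's implementation:

```python
import os

def percpu_load_averages():
    """Return the 1m, 5m, and 15m load averages divided by CPU count,
    mirroring how the load.average.percpu.* metrics are defined."""
    cpus = os.cpu_count() or 1  # guard against None on exotic platforms
    one, five, fifteen = os.getloadavg()
    return one / cpus, five / cpus, fifteen / cpus
```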
sysdig_host_memory_available_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_available_bytes |
Legacy ID | memory.bytes.available |
Metric Type | gauge |
Unit | data |
Description | The available memory for a host is obtained from /proc/meminfo. On Linux kernel versions 3.12 and later, the value is read from the MemAvailable field in /proc/meminfo. On earlier kernels, it is estimated as MemFree + Cached + Buffers. |
Additional Notes | |
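The fallback logic described above can be sketched in a few lines of Python. This is an illustration under the stated assumptions, not the agent's implementation; the keys are the standard /proc/meminfo field names, which report values in kB:

```python
def available_memory_bytes(meminfo_text):
    """Compute available memory from the contents of /proc/meminfo.

    Prefer the kernel-provided MemAvailable field; on older kernels
    that lack it, fall back to MemFree + Cached + Buffers.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key] = int(parts[0]) * 1024  # /proc/meminfo values are in kB
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    return fields["MemFree"] + fields["Cached"] + fields["Buffers"]
```

On a Linux host you would feed it the contents of `/proc/meminfo`, e.g. `available_memory_bytes(open("/proc/meminfo").read())`.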
sysdig_host_memory_swap_available_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_swap_available_bytes |
Legacy ID | memory.swap.bytes.available |
Metric Type | gauge |
Unit | data |
Description | Available amount of swap memory. |
Additional Notes | Sum of free and cached swap memory. By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_memory_swap_total_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_swap_total_bytes |
Legacy ID | memory.swap.bytes.total |
Metric Type | gauge |
Unit | data |
Description | Total amount of swap memory. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_memory_swap_used_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_swap_used_bytes |
Legacy ID | memory.swap.bytes.used |
Metric Type | gauge |
Unit | data |
Description | Used amount of swap memory. |
Additional Notes | The amount of used swap memory is calculated by subtracting available from total swap memory. By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_memory_swap_used_percent
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_swap_used_percent |
Legacy ID | memory.swap.used.percent |
Metric Type | gauge |
Unit | percent |
Description | Used percent of swap memory. |
Additional Notes | The percentage of used swap memory is calculated as the ratio of used to total swap memory, expressed as a percentage. By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
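The derivations in the swap notes above (used = total - available; percent = used / total * 100) can be written out directly. A hypothetical helper, for illustration only:

```python
def swap_usage(total_bytes, available_bytes):
    """Derive used swap and used percent as the notes above describe:
    used = total - available; percent = used / total * 100."""
    used = total_bytes - available_bytes
    # Avoid dividing by zero on hosts with no swap configured.
    percent = (used / total_bytes * 100.0) if total_bytes else 0.0
    return used, percent
```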
sysdig_host_memory_total_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_total_bytes |
Legacy ID | memory.bytes.total |
Metric Type | gauge |
Unit | data |
Description | The total memory of a host, in bytes. This value is obtained from /proc. |
Additional Notes | |
sysdig_host_memory_used_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_used_bytes |
Legacy ID | memory.bytes.used |
Metric Type | gauge |
Unit | data |
Description | The amount of physical memory currently in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_host_memory_used_percent
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_used_percent |
Legacy ID | memory.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of physical memory in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_memory_virtual_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_memory_virtual_bytes |
Legacy ID | memory.bytes.virtual |
Metric Type | gauge |
Unit | data |
Description | The virtual memory size of the process, in bytes. This value is obtained from Sysdig events. |
Additional Notes | |
sysdig_host_net_connection_in_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_connection_in_count |
Legacy ID | net.connection.count.in |
Metric Type | counter |
Unit | number |
Description | Number of currently established client (inbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_host_net_connection_out_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_connection_out_count |
Legacy ID | net.connection.count.out |
Metric Type | counter |
Unit | number |
Description | Number of currently established server (outbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_host_net_connection_total_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_connection_total_count |
Legacy ID | net.connection.count.total |
Metric Type | counter |
Unit | number |
Description | Number of currently established connections. This value may exceed the sum of the inbound and outbound metrics, since it includes client and server inter-host connections as well as internal-only connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_host_net_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_error_count |
Legacy ID | net.error.count |
Metric Type | counter |
Unit | number |
Description | Number of network errors. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_net_http_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_error_count |
Legacy ID | net.http.error.count |
Metric Type | counter |
Unit | number |
Description | Number of failed HTTP requests as counted from 4xx/5xx status codes. |
Additional Notes | |
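The 4xx/5xx counting rule above is straightforward to mirror. A minimal sketch with illustrative names (not an agent API):

```python
def is_http_error(status_code):
    """An HTTP response counts as an error if its status is 4xx or 5xx."""
    return 400 <= status_code <= 599

def http_error_count(status_codes):
    """Count failed requests the way net.http.error.count is described."""
    return sum(1 for code in status_codes if is_http_error(code))
```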
sysdig_host_net_http_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_request_count |
Legacy ID | net.http.request.count |
Metric Type | counter |
Unit | number |
Description | Count of HTTP requests. |
Additional Notes | |
sysdig_host_net_http_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_request_time |
Legacy ID | net.http.request.time |
Metric Type | counter |
Unit | time |
Description | Average time for HTTP requests. |
Additional Notes | |
sysdig_host_net_http_statuscode_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_statuscode_error_count |
Legacy ID | net.http.statuscode.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_http_statuscode_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_statuscode_request_count |
Legacy ID | net.http.statuscode.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_http_url_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_url_error_count |
Legacy ID | net.http.url.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_http_url_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_url_request_count |
Legacy ID | net.http.url.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_http_url_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_http_url_request_time |
Legacy ID | net.http.url.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_collection_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_collection_error_count |
Legacy ID | net.mongodb.collection.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_collection_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_collection_request_count |
Legacy ID | net.mongodb.collection.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_collection_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_collection_request_time |
Legacy ID | net.mongodb.collection.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_error_count |
Legacy ID | net.mongodb.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_operation_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_operation_error_count |
Legacy ID | net.mongodb.operation.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_operation_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_operation_request_count |
Legacy ID | net.mongodb.operation.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_operation_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_operation_request_time |
Legacy ID | net.mongodb.operation.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_request_count |
Legacy ID | net.mongodb.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_mongodb_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_mongodb_request_time |
Legacy ID | net.mongodb.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_in_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_in_bytes |
Legacy ID | net.bytes.in |
Metric Type | counter |
Unit | data |
Description | Inbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_net_out_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_out_bytes |
Legacy ID | net.bytes.out |
Metric Type | counter |
Unit | data |
Description | Outbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_net_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_count |
Legacy ID | net.request.count |
Metric Type | counter |
Unit | number |
Description | Total number of network requests. Note that this value may exceed the sum of inbound and outbound requests, because the count includes requests over internal connections. |
Additional Notes | |
sysdig_host_net_request_in_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_in_count |
Legacy ID | net.request.count.in |
Metric Type | counter |
Unit | number |
Description | Number of inbound network requests. |
Additional Notes | |
sysdig_host_net_request_in_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_in_time |
Legacy ID | net.request.time.in |
Metric Type | counter |
Unit | time |
Description | Average time to serve an inbound request. |
Additional Notes | |
sysdig_host_net_request_out_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_out_count |
Legacy ID | net.request.count.out |
Metric Type | counter |
Unit | number |
Description | Number of outbound network requests. |
Additional Notes | |
sysdig_host_net_request_out_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_out_time |
Legacy ID | net.request.time.out |
Metric Type | counter |
Unit | time |
Description | Average time spent waiting for an outbound request. |
Additional Notes | |
sysdig_host_net_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_request_time |
Legacy ID | net.request.time |
Metric Type | counter |
Unit | time |
Description | Average time to serve a network request. |
Additional Notes | |
sysdig_host_net_server_connection_in_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_server_connection_in_count |
Legacy ID | net.server.connection.count.in |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_server_in_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_server_in_bytes |
Legacy ID | net.server.bytes.in |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_host_net_server_out_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_server_out_bytes |
Legacy ID | net.server.bytes.out |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_host_net_server_total_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_server_total_bytes |
Legacy ID | net.server.bytes.total |
Metric Type | counter |
Unit | data |
Description | |
Additional Notes | |
sysdig_host_net_sql_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_error_count |
Legacy ID | net.sql.error.count |
Metric Type | counter |
Unit | number |
Description | Number of failed SQL requests. |
Additional Notes | |
sysdig_host_net_sql_query_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_query_error_count |
Legacy ID | net.sql.query.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_query_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_query_request_count |
Legacy ID | net.sql.query.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_query_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_query_request_time |
Legacy ID | net.sql.query.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_sql_querytype_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_querytype_error_count |
Legacy ID | net.sql.querytype.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_querytype_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_querytype_request_count |
Legacy ID | net.sql.querytype.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_querytype_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_querytype_request_time |
Legacy ID | net.sql.querytype.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_sql_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_request_count |
Legacy ID | net.sql.request.count |
Metric Type | counter |
Unit | number |
Description | Number of SQL requests. |
Additional Notes | |
sysdig_host_net_sql_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_request_time |
Legacy ID | net.sql.request.time |
Metric Type | counter |
Unit | time |
Description | Average time to complete a SQL request. |
Additional Notes | |
sysdig_host_net_sql_table_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_table_error_count |
Legacy ID | net.sql.table.error.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_table_request_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_table_request_count |
Legacy ID | net.sql.table.request.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_net_sql_table_request_time
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_sql_table_request_time |
Legacy ID | net.sql.table.request.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
sysdig_host_net_tcp_queue_len
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_tcp_queue_len |
Legacy ID | net.tcp.queue.len |
Metric Type | counter |
Unit | number |
Description | Length of the TCP request queue. |
Additional Notes | |
sysdig_host_net_total_bytes
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_net_total_bytes |
Legacy ID | net.bytes.total |
Metric Type | counter |
Unit | data |
Description | Total network bytes, inbound and outbound. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_proc_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_proc_count |
Legacy ID | proc.count |
Metric Type | counter |
Unit | number |
Description | Number of processes on host or container. |
Additional Notes | |
sysdig_host_syscall_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_syscall_count |
Legacy ID | syscall.count |
Metric Type | gauge |
Unit | number |
Description | Total number of syscalls seen. |
Additional Notes | Syscalls are resource intensive. This metric tracks how many have been made by a given process or container. |
sysdig_host_syscall_error_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_syscall_error_count |
Legacy ID | host.error.count |
Metric Type | counter |
Unit | number |
Description | Number of system call errors. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_host_system_uptime
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_system_uptime |
Legacy ID | system.uptime |
Metric Type | gauge |
Unit | time |
Description | This metric is sent by the agent and represents the number of seconds since the host was booted. It is not available at container granularity. |
Additional Notes | |
sysdig_host_thread_count
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_thread_count |
Legacy ID | thread.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_timeseries_count_appcheck
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_timeseries_count_appcheck |
Legacy ID | metricCount.appCheck |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_timeseries_count_jmx
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_timeseries_count_jmx |
Legacy ID | metricCount.jmx |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_timeseries_count_prometheus
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_timeseries_count_prometheus |
Legacy ID | metricCount.prometheus |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_timeseries_count_statsd
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_timeseries_count_statsd |
Legacy ID | metricCount.statsd |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
sysdig_host_up
| Field | Value |
|---|---|
Prometheus ID | sysdig_host_up |
Legacy ID | uptime |
Metric Type | gauge |
Unit | number |
Description | The percentage of time the selected entity was down during the visualized time sample. This can be used to determine if a machine (or a group of machines) went down. |
Additional Notes | |
6.2.6 - JMX/JVM
jmx_jvm_class_loaded
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_class_loaded |
Legacy ID | jvm.class.loaded |
Metric Type | gauge |
Unit | number |
Description | The number of classes that are currently loaded in the JVM. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_class_unloaded
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_class_unloaded |
Legacy ID | jvm.class.unloaded |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
jmx_jvm_gc_ConcurrentMarkSweep_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_ConcurrentMarkSweep_count |
Legacy ID | jvm.gc.ConcurrentMarkSweep.count |
Metric Type | counter |
Unit | number |
Description | The number of times the Concurrent Mark-Sweep garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_ConcurrentMarkSweep_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_ConcurrentMarkSweep_time |
Legacy ID | jvm.gc.ConcurrentMarkSweep.time |
Metric Type | counter |
Unit | time |
Description | The amount of time the Concurrent Mark-Sweep garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_Copy_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_Copy_count |
Legacy ID | jvm.gc.Copy.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
jmx_jvm_gc_Copy_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_Copy_time |
Legacy ID | jvm.gc.Copy.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
jmx_jvm_gc_G1_Old_Generation_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_G1_Old_Generation_count |
Legacy ID | jvm.gc.G1_Old_Generation.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
jmx_jvm_gc_G1_Old_Generation_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_G1_Old_Generation_time |
Legacy ID | jvm.gc.G1_Old_Generation.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
jmx_jvm_gc_G1_Young_Generation_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_G1_Young_Generation_count |
Legacy ID | jvm.gc.G1_Young_Generation.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
jmx_jvm_gc_G1_Young_Generation_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_G1_Young_Generation_time |
Legacy ID | jvm.gc.G1_Young_Generation.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
jmx_jvm_gc_MarkSweepCompact_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_MarkSweepCompact_count |
Legacy ID | jvm.gc.MarkSweepCompact.count |
Metric Type | counter |
Unit | number |
Description | |
Additional Notes | |
jmx_jvm_gc_MarkSweepCompact_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_MarkSweepCompact_time |
Legacy ID | jvm.gc.MarkSweepCompact.time |
Metric Type | counter |
Unit | time |
Description | |
Additional Notes | |
jmx_jvm_gc_PS_MarkSweep_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_PS_MarkSweep_count |
Legacy ID | jvm.gc.PS_MarkSweep.count |
Metric Type | counter |
Unit | number |
Description | The number of times the parallel scavenge Mark-Sweep old generation garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_PS_MarkSweep_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_PS_MarkSweep_time |
Legacy ID | jvm.gc.PS_MarkSweep.time |
Metric Type | counter |
Unit | time |
Description | The amount of time the parallel scavenge Mark-Sweep old generation garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_PS_Scavenge_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_PS_Scavenge_count |
Legacy ID | jvm.gc.PS_Scavenge.count |
Metric Type | counter |
Unit | number |
Description | The number of times the parallel eden/survivor space garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_PS_Scavenge_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_PS_Scavenge_time |
Legacy ID | jvm.gc.PS_Scavenge.time |
Metric Type | counter |
Unit | time |
Description | The amount of time the parallel eden/survivor space garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_ParNew_count
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_ParNew_count |
Legacy ID | jvm.gc.ParNew.count |
Metric Type | counter |
Unit | number |
Description | The number of times the parallel garbage collector has run. |
Additional Notes | |
jmx_jvm_gc_ParNew_time
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_gc_ParNew_time |
Legacy ID | jvm.gc.ParNew.time |
Metric Type | counter |
Unit | time |
Description | The amount of time the parallel garbage collector has run. |
Additional Notes | |
jmx_jvm_heap_committed
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_heap_committed |
Legacy ID | jvm.heap.committed |
Metric Type | counter |
Unit | number |
Description | The amount of memory that is currently allocated to the JVM for heap memory. Heap memory is the storage area for Java objects. The JVM may release memory to the system, so Heap Committed can decrease below Heap Init, but it can never increase above Heap Max. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_heap_init
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_heap_init |
Legacy ID | jvm.heap.init |
Metric Type | counter |
Unit | number |
Description | The initial amount of memory that the JVM requests from the operating system for heap memory during startup (defined by the -Xms option). The JVM may request additional memory from the operating system and may also release memory to the system over time. The value of Heap Init may be undefined. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_heap_max
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_heap_max |
Legacy ID | jvm.heap.max |
Metric Type | counter |
Unit | number |
Description | The maximum size allocation of heap memory for the JVM (defined by the -Xmx option). Any memory allocation attempt that would exceed this limit will cause an OutOfMemoryError exception to be thrown. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_heap_used
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_heap_used |
Legacy ID | jvm.heap.used |
Metric Type | counter |
Unit | number |
Description | The amount of allocated heap memory (i.e., Heap Committed) currently in use. Heap memory is the storage area for Java objects. An object in the heap that is referenced by another object is ‘live’ and remains in the heap as long as it continues to be referenced. Objects that are no longer referenced are garbage and are cleared out of the heap to reclaim space. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_heap_used_percent
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_heap_used_percent |
Legacy ID | jvm.heap.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The ratio between Heap Used and Heap Committed. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_nonHeap_committed
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_nonHeap_committed |
Legacy ID | jvm.nonHeap.committed |
Metric Type | counter |
Unit | number |
Description | The amount of memory that is currently allocated to the JVM for non-heap memory. Non-heap memory is used by Java to store loaded classes and other metadata. The JVM may release memory to the system, so Non-Heap Committed can decrease below Non-Heap Init, but it can never increase above Non-Heap Max. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_nonHeap_init
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_nonHeap_init |
Legacy ID | jvm.nonHeap.init |
Metric Type | counter |
Unit | number |
Description | The initial amount of memory that the JVM requests from the operating system for non-heap memory during startup. The JVM may request additional memory from the operating system and may also release memory to the system over time. The value of Non-Heap Init may be undefined. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_nonHeap_max
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_nonHeap_max |
Legacy ID | jvm.nonHeap.max |
Metric Type | counter |
Unit | number |
Description | The maximum size allocation of non-heap memory for the JVM. This memory is used by Java to store loaded classes and other metadata. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_nonHeap_used
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_nonHeap_used |
Legacy ID | jvm.nonHeap.used |
Metric Type | counter |
Unit | number |
Description | The amount of allocated non-heap memory (i.e., Non-Heap Committed) currently in use. Non-heap memory is used by Java to store loaded classes and other metadata. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_nonHeap_used_percent
| Field | Value |
|---|---|
Prometheus ID | jmx_jvm_nonHeap_used_percent |
Legacy ID | jvm.nonHeap.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The ratio between Non-Heap Used and Non-Heap Committed. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_thread_count
| |
---|
Prometheus ID | jmx_jvm_thread_count |
Legacy ID | jvm.thread.count |
Metric Type | gauge |
Unit | number |
Description | The current number of live daemon and non-daemon threads. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
jmx_jvm_thread_daemon
| |
---|
Prometheus ID | jmx_jvm_thread_daemon |
Legacy ID | jvm.thread.daemon |
Metric Type | gauge |
Unit | number |
Description | The current number of live daemon threads. Daemon threads are used for background supporting tasks and are only needed while normal threads are executing. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
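The JVM metrics above follow standard Prometheus conventions, so the 'Segment by' behavior described in the notes corresponds to aggregating over a grouping label. A minimal PromQL sketch (the `host` label name is an assumption; use whatever grouping label your environment attaches):

```promql
# Average non-heap usage ratio, segmented per host
avg by (host) (jmx_jvm_nonHeap_used_percent)

# Total live threads across the selected scope
sum(jmx_jvm_thread_count)
```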
6.2.7 - Kubernetes
kube_certificatesigningrequest_created
| |
---|
Prometheus ID | kube_certificatesigningrequest_created |
Legacy ID | |
Metric Type | gauge |
Unit | - |
Description | The timestamp of when the CSR object was created. |
Additional Notes | The timestamp is in Unix epoch time. |
kube_certificatesigningrequest_condition
| |
---|
Prometheus ID | kube_certificatesigningrequest_condition |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores whether the CSR was approved or denied. |
Additional Notes | The metric will be 1 if the condition occurred and 0 if it didn’t. |
kube_certificatesigningrequest_labels
| |
---|
Prometheus ID | kube_certificatesigningrequest_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_certificatesigningrequest_cert_length
| |
---|
Prometheus ID | kube_certificatesigningrequest_cert_length |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of characters in the certificate. |
Additional Notes | |
kube_daemonset_labels
| |
---|
Prometheus ID | kube_daemonset_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_daemonset_status_current_number_scheduled
| |
---|
Prometheus ID | kube_daemonset_status_current_number_scheduled |
Legacy ID | kubernetes.daemonSet.pods.scheduled |
Metric Type | gauge |
Unit | number |
Description | The number of nodes that are running at least one daemon Pod and are supposed to be. |
Additional Notes | |
kube_daemonset_status_desired_number_scheduled
| |
---|
Prometheus ID | kube_daemonset_status_desired_number_scheduled |
Legacy ID | kubernetes.daemonSet.pods.desired |
Metric Type | gauge |
Unit | number |
Description | The number of nodes that should be running the daemon Pod. |
Additional Notes | |
kube_daemonset_status_number_misscheduled
| |
---|
Prometheus ID | kube_daemonset_status_number_misscheduled |
Legacy ID | kubernetes.daemonSet.pods.misscheduled |
Metric Type | gauge |
Unit | number |
Description | The number of nodes running a daemon Pod that are not supposed to run one. |
Additional Notes | |
kube_daemonset_status_number_ready
| |
---|
Prometheus ID | kube_daemonset_status_number_ready |
Legacy ID | kubernetes.daemonSet.pods.ready |
Metric Type | gauge |
Unit | number |
Description | The number of nodes that should be running the daemon Pod and have one or more of the daemon Pod running and ready. |
Additional Notes | |
kube_deployment_labels
| |
---|
Prometheus ID | kube_deployment_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_deployment_spec_paused
| |
---|
Prometheus ID | kube_deployment_spec_paused |
Legacy ID | kubernetes.deployment.replicas.paused |
Metric Type | gauge |
Unit | number |
Description | The number of paused Pods per deployment. These Pods will not be processed by the deployment controller. |
Additional Notes | |
kube_deployment_spec_replicas
| |
---|
Prometheus ID | kube_deployment_spec_replicas |
Legacy ID | kubernetes.deployment.replicas.desired |
Metric Type | gauge |
Unit | number |
Description | The number of desired Pods per deployment. |
Additional Notes | |
kube_deployment_status_replicas
| |
---|
Prometheus ID | kube_deployment_status_replicas |
Legacy ID | kubernetes.deployment.replicas.running |
Metric Type | gauge |
Unit | number |
Description | The number of running Pods per deployment. |
Additional Notes | |
kube_deployment_status_replicas_available
| |
---|
Prometheus ID | kube_deployment_status_replicas_available |
Legacy ID | kubernetes.deployment.replicas.available |
Metric Type | gauge |
Unit | number |
Description | The number of available Pods per deployment. |
Additional Notes | |
kube_deployment_status_replicas_unavailable
| |
---|
Prometheus ID | kube_deployment_status_replicas_unavailable |
Legacy ID | kubernetes.deployment.replicas.unavailable |
Metric Type | gauge |
Unit | number |
Description | The number of unavailable Pods per deployment. |
Additional Notes | |
kube_deployment_status_replicas_updated
| |
---|
Prometheus ID | kube_deployment_status_replicas_updated |
Legacy ID | kubernetes.deployment.replicas.updated |
Metric Type | gauge |
Unit | number |
Description | The number of updated Pods per deployment. |
Additional Notes | |
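The deployment replica metrics above are often combined into a health ratio. A sketch (assumes the series share matching deployment-identifying labels, as in standard kube-state-metrics output):

```promql
# Fraction of desired replicas currently available, per deployment
kube_deployment_status_replicas_available / kube_deployment_spec_replicas

# Alert-style condition: deployments with any unavailable replicas
kube_deployment_status_replicas_unavailable > 0
```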
kube_hpa_labels
| |
---|
Prometheus ID | kube_hpa_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_hpa_spec_max_replicas
| |
---|
Prometheus ID | kube_hpa_spec_max_replicas |
Legacy ID | kubernetes.hpa.replicas.max |
Metric Type | gauge |
Unit | number |
Description | Upper limit for the number of Pods that can be set by the autoscaler. |
Additional Notes | |
kube_hpa_spec_min_replicas
| |
---|
Prometheus ID | kube_hpa_spec_min_replicas |
Legacy ID | kubernetes.hpa.replicas.min |
Metric Type | gauge |
Unit | number |
Description | Lower limit for the number of Pods that can be set by the autoscaler. |
Additional Notes | |
kube_hpa_status_current_replicas
| |
---|
Prometheus ID | kube_hpa_status_current_replicas |
Legacy ID | kubernetes.hpa.replicas.current |
Metric Type | gauge |
Unit | number |
Description | Current number of replicas of Pods managed by this autoscaler. |
Additional Notes | |
kube_hpa_status_desired_replicas
| |
---|
Prometheus ID | kube_hpa_status_desired_replicas |
Legacy ID | kubernetes.hpa.replicas.desired |
Metric Type | gauge |
Unit | number |
Description | Desired number of replicas of Pods managed by this autoscaler. |
Additional Notes | |
kube_ingress_info
| |
---|
Prometheus ID | kube_ingress_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_ingress_labels
| |
---|
Prometheus ID | kube_ingress_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_ingress_created
| |
---|
Prometheus ID | kube_ingress_created |
Legacy ID | |
Metric Type | gauge |
Unit | - |
Description | The timestamp of when the ingress object was created. |
Additional Notes | The timestamp is in Unix epoch time. |
kube_ingress_path
| |
---|
Prometheus ID | kube_ingress_path |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Information about the path of the ingress object is stored as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_ingress_tls
| |
---|
Prometheus ID | kube_ingress_tls |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Information about the TLS configuration of the ingress object is stored as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_job_complete
| |
---|
Prometheus ID | kube_job_complete |
Legacy ID | kubernetes.job.numSucceeded |
Metric Type | gauge |
Unit | number |
Description | The number of Pods which reached Phase Succeeded. |
Additional Notes | |
kube_job_failed
| |
---|
Prometheus ID | kube_job_failed |
Legacy ID | kubernetes.job.numFailed |
Metric Type | gauge |
Unit | number |
Description | The number of Pods which reached Phase Failed. |
Additional Notes | |
kube_job_info
| |
---|
Prometheus ID | kube_job_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_job_labels
| |
---|
Prometheus ID | kube_job_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_job_owner
| |
---|
Prometheus ID | kube_job_owner |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Information about the owner of the job is stored as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_job_spec_completions
| |
---|
Prometheus ID | kube_job_spec_completions |
Legacy ID | kubernetes.job.completions |
Metric Type | gauge |
Unit | number |
Description | The desired number of successfully finished Pods that the job should be run with. |
Additional Notes | |
kube_job_spec_parallelism
| |
---|
Prometheus ID | kube_job_spec_parallelism |
Legacy ID | kubernetes.job.parallelism |
Metric Type | gauge |
Unit | number |
Description | The maximum desired number of Pods that the job should run at any given time. |
Additional Notes | |
kube_job_status_active
| |
---|
Prometheus ID | kube_job_status_active |
Legacy ID | kubernetes.job.status.active |
Metric Type | gauge |
Unit | number |
Description | The number of actively running Pods. |
Additional Notes | |
kube_namespace_labels
| |
---|
Prometheus ID | kube_namespace_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_namespace_sysdig_count
| |
---|
Prometheus ID | kube_namespace_sysdig_count |
Legacy ID | kubernetes.namespace.count |
Metric Type | gauge |
Unit | number |
Description | The number of namespaces. |
Additional Notes | |
kube_namespace_sysdig_deployment_count
| |
---|
Prometheus ID | kube_namespace_sysdig_deployment_count |
Legacy ID | kubernetes.namespace.deployment.count |
Metric Type | gauge |
Unit | number |
Description | The number of deployments per namespace. |
Additional Notes | |
kube_namespace_sysdig_hpa_count
| |
---|
Prometheus ID | kube_namespace_sysdig_hpa_count |
Legacy ID | kubernetes.namespace.hpa.count |
Metric Type | gauge |
Unit | number |
Description | The number of HPAs per namespace. |
Additional Notes | |
kube_namespace_sysdig_job_count
| |
---|
Prometheus ID | kube_namespace_sysdig_job_count |
Legacy ID | kubernetes.namespace.job.count |
Metric Type | gauge |
Unit | number |
Description | The number of jobs per namespace. |
Additional Notes | |
kube_namespace_sysdig_persistentvolumeclaim_count
| |
---|
Prometheus ID | kube_namespace_sysdig_persistentvolumeclaim_count |
Legacy ID | kubernetes.namespace.persistentvolumeclaim.count |
Metric Type | gauge |
Unit | number |
Description | The number of persistent volume claims per namespace. |
Additional Notes | |
kube_namespace_sysdig_pod_available_count
| |
---|
Prometheus ID | kube_namespace_sysdig_pod_available_count |
Legacy ID | kubernetes.namespace.pod.available.count |
Metric Type | gauge |
Unit | number |
Description | The number of available Pods per namespace. |
Additional Notes | |
kube_namespace_sysdig_pod_desired_count
| |
---|
Prometheus ID | kube_namespace_sysdig_pod_desired_count |
Legacy ID | kubernetes.namespace.pod.desired.count |
Metric Type | gauge |
Unit | number |
Description | The number of desired Pods per namespace. |
Additional Notes | |
kube_namespace_sysdig_pod_running_count
| |
---|
Prometheus ID | kube_namespace_sysdig_pod_running_count |
Legacy ID | kubernetes.namespace.pod.running.count |
Metric Type | gauge |
Unit | number |
Description | The number of Pods running per namespace. |
Additional Notes | |
kube_namespace_sysdig_replicaset_count
| |
---|
Prometheus ID | kube_namespace_sysdig_replicaset_count |
Legacy ID | kubernetes.namespace.replicaSet.count |
Metric Type | gauge |
Unit | number |
Description | The number of replicaSets per namespace. |
Additional Notes | |
kube_namespace_sysdig_resourcequota_count
| |
---|
Prometheus ID | kube_namespace_sysdig_resourcequota_count |
Legacy ID | kubernetes.namespace.resourcequota.count |
Metric Type | gauge |
Unit | number |
Description | The number of resource quotas per namespace. |
Additional Notes | |
kube_namespace_sysdig_service_count
| |
---|
Prometheus ID | kube_namespace_sysdig_service_count |
Legacy ID | kubernetes.namespace.service.count |
Metric Type | gauge |
Unit | number |
Description | The number of services per namespace. |
Additional Notes | |
kube_namespace_sysdig_statefulset_count
| |
---|
Prometheus ID | kube_namespace_sysdig_statefulset_count |
Legacy ID | kubernetes.namespace.statefulSet.count |
Metric Type | gauge |
Unit | number |
Description | The number of statefulSets per namespace. |
Additional Notes | |
kube_node_info
| |
---|
Prometheus ID | kube_node_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_node_labels
| |
---|
Prometheus ID | kube_node_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_node_spec_unschedulable
| |
---|
Prometheus ID | kube_node_spec_unschedulable |
Legacy ID | kubernetes.node.unschedulable |
Metric Type | gauge |
Unit | number |
Description | The number of nodes unavailable to schedule new Pods. |
Additional Notes | |
kube_node_spec_taint
| |
---|
Prometheus ID | kube_node_spec_taint |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the taint’s key, value, and effect as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_node_status_allocatable
| |
---|
Prometheus ID | kube_node_status_allocatable |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of a resource on a node that is freely available. |
Additional Notes | The type and unit of the resource are stored as labels on the metric. |
kube_node_status_allocatable_cpu_cores
| |
---|
Prometheus ID | kube_node_status_allocatable_cpu_cores |
Legacy ID | kubernetes.node.allocatable.cpuCores |
Metric Type | gauge |
Unit | number |
Description | The CPU resources of a node that are available for scheduling. |
Additional Notes | |
kube_node_status_allocatable_memory_bytes
| |
---|
Prometheus ID | kube_node_status_allocatable_memory_bytes |
Legacy ID | kubernetes.node.allocatable.memBytes |
Metric Type | gauge |
Unit | data |
Description | The memory resources of a node that are available for scheduling. |
Additional Notes | |
kube_node_status_allocatable_pods
| |
---|
Prometheus ID | kube_node_status_allocatable_pods |
Legacy ID | kubernetes.node.allocatable.pods |
Metric Type | gauge |
Unit | number |
Description | The Pod resources of a node that are available for scheduling. |
Additional Notes | |
kube_node_status_capacity
| |
---|
Prometheus ID | kube_node_status_capacity |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The total amount of a resource on a node. |
Additional Notes | The type and unit of the resource are stored as labels on the metric. |
kube_node_status_capacity_cpu_cores
| |
---|
Prometheus ID | kube_node_status_capacity_cpu_cores |
Legacy ID | kubernetes.node.capacity.cpuCores |
Metric Type | gauge |
Unit | number |
Description | The maximum CPU resources of the node. |
Additional Notes | |
kube_node_status_capacity_memory_bytes
| |
---|
Prometheus ID | kube_node_status_capacity_memory_bytes |
Legacy ID | kubernetes.node.capacity.memBytes |
Metric Type | gauge |
Unit | data |
Description | The maximum memory resources of the node. |
Additional Notes | |
kube_node_status_capacity_pods
| |
---|
Prometheus ID | kube_node_status_capacity_pods |
Legacy ID | kubernetes.node.capacity.pods |
Metric Type | gauge |
Unit | number |
Description | The maximum number of Pods of the node. |
Additional Notes | |
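The allocatable and capacity metrics above can be combined to see how much of a node is reserved for system components. A sketch (assumes both series carry a matching node-identifying label, as in standard kube-state-metrics output):

```promql
# Fraction of each node's CPU capacity not available for scheduling
1 - (kube_node_status_allocatable_cpu_cores / kube_node_status_capacity_cpu_cores)
```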
kube_node_status_condition
| |
---|
Prometheus ID | kube_node_status_condition |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the condition of the node as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_node_sysdig_disk_pressure
| |
---|
Prometheus ID | kube_node_sysdig_disk_pressure |
Legacy ID | kubernetes.node.diskPressure |
Metric Type | gauge |
Unit | number |
Description | The number of nodes with disk pressure. |
Additional Notes | |
kube_node_sysdig_host
| |
---|
Prometheus ID | kube_node_sysdig_host |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the hostname of the node as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_node_sysdig_memory_pressure
| |
---|
Prometheus ID | kube_node_sysdig_memory_pressure |
Legacy ID | kubernetes.node.memoryPressure |
Metric Type | gauge |
Unit | number |
Description | The number of nodes with memory pressure. |
Additional Notes | |
kube_node_sysdig_network_unavailable
| |
---|
Prometheus ID | kube_node_sysdig_network_unavailable |
Legacy ID | kubernetes.node.networkUnavailable |
Metric Type | gauge |
Unit | number |
Description | The number of nodes with network unavailable. |
Additional Notes | |
kube_node_sysdig_ready
| |
---|
Prometheus ID | kube_node_sysdig_ready |
Legacy ID | kubernetes.node.ready |
Metric Type | gauge |
Unit | number |
Description | The number of nodes that are ready. |
Additional Notes | |
kube_persistentvolume_capacity_bytes
| |
---|
Prometheus ID | kube_persistentvolume_capacity_bytes |
Legacy ID | kubernetes.persistentvolume.storage |
Metric Type | gauge |
Unit | number |
Description | The persistent volume’s capacity. |
Additional Notes | |
kube_persistentvolume_claim_ref
| |
---|
Prometheus ID | kube_persistentvolume_claim_ref |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the claim’s name and namespace as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolume_info
| |
---|
Prometheus ID | kube_persistentvolume_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolume_labels
| |
---|
Prometheus ID | kube_persistentvolume_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_persistentvolume_status_phase
| |
---|
Prometheus ID | kube_persistentvolume_status_phase |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the phase of the PV as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolumeclaim_access_mode
| |
---|
Prometheus ID | kube_persistentvolumeclaim_access_mode |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the access mode of the PVC as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolumeclaim_info
| |
---|
Prometheus ID | kube_persistentvolumeclaim_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolumeclaim_labels
| |
---|
Prometheus ID | kube_persistentvolumeclaim_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_persistentvolumeclaim_resource_requests_storage_bytes
| |
---|
Prometheus ID | kube_persistentvolumeclaim_resource_requests_storage_bytes |
Legacy ID | kubernetes.persistentvolumeclaim.requests.storage |
Metric Type | gauge |
Unit | number |
Description | The number of bytes of storage requested by the PVC. |
Additional Notes | |
kube_persistentvolumeclaim_status_phase
| |
---|
Prometheus ID | kube_persistentvolumeclaim_status_phase |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the phase of the PVC as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_persistentvolumeclaim_sysdig_storage
| |
---|
Prometheus ID | kube_persistentvolumeclaim_sysdig_storage |
Legacy ID | kubernetes.persistentvolumeclaim.storage |
Metric Type | gauge |
Unit | number |
Description | The actual resources of the underlying volume. |
Additional Notes | |
kube_pod_container_info
| |
---|
Prometheus ID | kube_pod_container_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_container_resource_limits
| |
---|
Prometheus ID | kube_pod_container_resource_limits |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of the resource limit for a container in a pod. |
Additional Notes | |
kube_pod_container_resource_requests
| |
---|
Prometheus ID | kube_pod_container_resource_requests |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of the resource request for a container in a pod. |
Additional Notes | |
kube_pod_container_status_last_terminated_reason
| |
---|
Prometheus ID | kube_pod_container_status_last_terminated_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason for the last terminated state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_container_status_ready
| |
---|
Prometheus ID | kube_pod_container_status_ready |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of containers in the Pod in the ready state. |
Additional Notes | |
kube_pod_container_status_restarts_total
| |
---|
Prometheus ID | kube_pod_container_status_restarts_total |
Legacy ID | |
Metric Type | counter |
Unit | number |
Description | The number of times that containers in the Pod have restarted. |
Additional Notes | |
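Because this metric is a cumulative counter, it is typically queried with `rate()` rather than read raw. A sketch (the `namespace` and `pod` grouping labels are assumptions based on standard kube-state-metrics labeling):

```promql
# Container restarts per second over the last five minutes, per Pod
sum by (namespace, pod) (rate(kube_pod_container_status_restarts_total[5m]))
```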
kube_pod_container_status_running
| |
---|
Prometheus ID | kube_pod_container_status_running |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of containers in the Pod in the running state. |
Additional Notes | |
kube_pod_container_status_terminated
| |
---|
Prometheus ID | kube_pod_container_status_terminated |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of containers in the Pod in the terminated state. |
Additional Notes | |
kube_pod_container_status_terminated_reason
| |
---|
Prometheus ID | kube_pod_container_status_terminated_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason that the container is in the terminated state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_container_status_waiting
| |
---|
Prometheus ID | kube_pod_container_status_waiting |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of containers in the Pod in the waiting state. |
Additional Notes | |
kube_pod_container_status_waiting_reason
| |
---|
Prometheus ID | kube_pod_container_status_waiting_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason that the container is in the waiting state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_info
| |
---|
Prometheus ID | kube_pod_info |
Legacy ID | kubernetes.pod.info |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_init_container_resource_limits
| |
---|
Prometheus ID | kube_pod_init_container_resource_limits |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of the resource limit for an init container in a pod. |
Additional Notes | |
kube_pod_init_container_resource_requests
| |
---|
Prometheus ID | kube_pod_init_container_resource_requests |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of the resource request for an init container in a pod. |
Additional Notes | |
kube_pod_init_container_status_last_terminated_reason
| |
---|
Prometheus ID | kube_pod_init_container_status_last_terminated_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason for the last terminated state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_init_container_status_ready
| |
---|
Prometheus ID | kube_pod_init_container_status_ready |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of init containers in the Pod in the ready state. |
Additional Notes | |
kube_pod_init_container_status_restarts_total
| |
---|
Prometheus ID | kube_pod_init_container_status_restarts_total |
Legacy ID | |
Metric Type | counter |
Unit | number |
Description | The number of times that init containers in the Pod have restarted. |
Additional Notes | |
kube_pod_init_container_status_running
| |
---|
Prometheus ID | kube_pod_init_container_status_running |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of init containers in the Pod in the running state. |
Additional Notes | |
kube_pod_init_container_status_terminated
| |
---|
Prometheus ID | kube_pod_init_container_status_terminated |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of init containers in the Pod in the terminated state. |
Additional Notes | |
kube_pod_init_container_status_terminated_reason
| |
---|
Prometheus ID | kube_pod_init_container_status_terminated_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason that the init container is in the terminated state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_init_container_status_waiting
| |
---|
Prometheus ID | kube_pod_init_container_status_waiting |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The number of init containers in the Pod in the waiting state. |
Additional Notes | |
kube_pod_init_container_status_waiting_reason
| |
---|
Prometheus ID | kube_pod_init_container_status_waiting_reason |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the reason that the init container is in the waiting state as a label on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_labels
| |
---|
Prometheus ID | kube_pod_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_pod_owner
| |
---|
Prometheus ID | kube_pod_owner |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Information about the owner of the pod is stored as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
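Info- and label-style metrics with a constant value of 1 are designed to be joined onto other series rather than graphed directly. A sketch using the `label_` prefix convention noted above (`label_app` is a hypothetical Pod label; `pod` and `namespace` are assumed matching labels):

```promql
# Attach a Pod's 'app' Kubernetes label to its info series
kube_pod_info * on (pod, namespace) group_left (label_app) kube_pod_labels
```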
kube_pod_spec_volumes_persistentvolumeclaims_info
| |
---|
Prometheus ID | kube_pod_spec_volumes_persistentvolumeclaims_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores information about the PVC specified in a Pod’s spec. |
Additional Notes | The value of the metric will always be 1. |
kube_pod_spec_volumes_persistentvolumeclaims_readonly
| |
---|
Prometheus ID | kube_pod_spec_volumes_persistentvolumeclaims_readonly |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Describes whether a PVC is mounted read-only. |
Additional Notes | The value of the metric will be 1 if the PVC is read-only and 0 if not. |
kube_pod_sysdig_containers_waiting
| |
---|
Prometheus ID | kube_pod_sysdig_containers_waiting |
Legacy ID | kubernetes.pod.containers.waiting |
Metric Type | gauge |
Unit | number |
Description | The number of containers in the waiting state per Pod. |
Additional Notes | |
kube_pod_sysdig_resource_limits_cpu_cores
| |
---|
Prometheus ID | kube_pod_sysdig_resource_limits_cpu_cores |
Legacy ID | kubernetes.pod.resourceLimits.cpuCores |
Metric Type | gauge |
Unit | number |
Description | The limit on CPU cores to be used by a container. |
Additional Notes | |
kube_pod_sysdig_resource_limits_memory_bytes
| |
---|
Prometheus ID | kube_pod_sysdig_resource_limits_memory_bytes |
Legacy ID | kubernetes.pod.resourceLimits.memBytes |
Metric Type | gauge |
Unit | data |
Description | The limit on memory to be used by a container in bytes. |
Additional Notes | |
kube_pod_sysdig_resource_requests_cpu_cores
| |
---|
Prometheus ID | kube_pod_sysdig_resource_requests_cpu_cores |
Legacy ID | kubernetes.pod.resourceRequests.cpuCores |
Metric Type | gauge |
Unit | number |
Description | The number of CPU cores requested by containers in the Pod. |
Additional Notes | |
kube_pod_sysdig_resource_requests_memory_bytes
| |
---|
Prometheus ID | kube_pod_sysdig_resource_requests_memory_bytes |
Legacy ID | kubernetes.pod.resourceRequests.memBytes |
Metric Type | gauge |
Unit | data |
Description | The number of memory bytes requested by containers in the Pod. |
Additional Notes | |
kube_pod_sysdig_restart_count
| |
---|
Prometheus ID | kube_pod_sysdig_restart_count |
Legacy ID | kubernetes.pod.restart.count |
Metric Type | gauge |
Unit | number |
Description | The number of container restarts for the Pod. |
Additional Notes | |
kube_pod_sysdig_restart_rate
| |
---|
Prometheus ID | kube_pod_sysdig_restart_rate |
Legacy ID | kubernetes.pod.restart.rate |
Metric Type | gauge |
Unit | number |
Description | The number of times the Pod has been restarted per second. |
Additional Notes | |
kube_pod_sysdig_status_ready
| |
---|
Prometheus ID | kube_pod_sysdig_status_ready |
Legacy ID | kubernetes.pod.status.ready |
Metric Type | gauge |
Unit | number |
Description | The number of pods ready to serve requests. |
Additional Notes | |
kube_replicaset_labels
| |
---|
Prometheus ID | kube_replicaset_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_replicaset_owner
| |
---|
Prometheus ID | kube_replicaset_owner |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Information about the owner of the pod is stored as labels on the metric. |
Additional Notes | The value of the metric will always be 1. |
kube_replicaset_spec_replicas
| |
---|
Prometheus ID | kube_replicaset_spec_replicas |
Legacy ID | kubernetes.replicaSet.replicas.desired |
Metric Type | gauge |
Unit | number |
Description | The number of desired Pods per replicaSet. |
Additional Notes | |
kube_replicaset_status_fully_labeled_replicas
| |
---|
Prometheus ID | kube_replicaset_status_fully_labeled_replicas |
Legacy ID | kubernetes.replicaSet.replicas.fullyLabeled |
Metric Type | gauge |
Unit | number |
Description | The number of fully labeled Pods per replicaSet. |
Additional Notes | |
kube_replicaset_status_ready_replicas
| |
---|
Prometheus ID | kube_replicaset_status_ready_replicas |
Legacy ID | kubernetes.replicaSet.replicas.ready |
Metric Type | gauge |
Unit | number |
Description | The number of ready Pods per replicaSet. |
Additional Notes | |
kube_replicaset_status_replicas
| |
---|
Prometheus ID | kube_replicaset_status_replicas |
Legacy ID | kubernetes.replicaSet.replicas.running |
Metric Type | gauge |
Unit | number |
Description | The number of running Pods per replicaSet. |
Additional Notes | |
kube_resourcequota
| |
---|
Prometheus ID | kube_resourcequota |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The amount of a resource that the resource quota is configured for. |
Additional Notes | The resource type and whether the quota is hard or soft is stored as labels on the metric. |
kube_resourcequota_sysdig_limits_cpu_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_limits_cpu_hard |
Legacy ID | kubernetes.resourcequota.limits.cpu.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced CPU Limit quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_limits_cpu_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_limits_cpu_used |
Legacy ID | kubernetes.resourcequota.limits.cpu.used |
Metric Type | gauge |
Unit | number |
Description | Current observed CPU limit usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_limits_memory_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_limits_memory_hard |
Legacy ID | kubernetes.resourcequota.limits.memory.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced memory limit quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_limits_memory_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_limits_memory_used |
Legacy ID | kubernetes.resourcequota.limits.memory.used |
Metric Type | gauge |
Unit | number |
Description | Current observed memory limit usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_persistentvolumeclaims_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_persistentvolumeclaims_hard |
Legacy ID | kubernetes.resourcequota.persistentvolumeclaims.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced PersistentVolumeClaim quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_persistentvolumeclaims_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_persistentvolumeclaims_used |
Legacy ID | kubernetes.resourcequota.persistentvolumeclaims.used |
Metric Type | gauge |
Unit | number |
Description | Current observed PersistentVolumeClaim usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_pods_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_pods_hard |
Legacy ID | kubernetes.resourcequota.pods.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced Pod quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_pods_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_pods_used |
Legacy ID | kubernetes.resourcequota.pods.used |
Metric Type | gauge |
Unit | number |
Description | Current observed Pod usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_requests_cpu_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_requests_cpu_hard |
Legacy ID | kubernetes.resourcequota.requests.cpu.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced CPU request quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_requests_cpu_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_requests_cpu_used |
Legacy ID | kubernetes.resourcequota.requests.cpu.used |
Metric Type | gauge |
Unit | number |
Description | Current observed CPU request usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_requests_memory_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_requests_memory_hard |
Legacy ID | kubernetes.resourcequota.requests.memory.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced memory request quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_requests_memory_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_requests_memory_used |
Legacy ID | kubernetes.resourcequota.requests.memory.used |
Metric Type | gauge |
Unit | number |
Description | Current observed memory request usage per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_services_hard
| |
---|
Prometheus ID | kube_resourcequota_sysdig_services_hard |
Legacy ID | kubernetes.resourcequota.services.hard |
Metric Type | gauge |
Unit | number |
Description | Enforced service quota per namespace. |
Additional Notes | |
kube_resourcequota_sysdig_services_used
| |
---|
Prometheus ID | kube_resourcequota_sysdig_services_used |
Legacy ID | kubernetes.resourcequota.services.used |
Metric Type | gauge |
Unit | number |
Description | Current observed service usage per namespace. |
Additional Notes | |
kube_service_info
| |
---|
Prometheus ID | kube_service_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_service_labels
| |
---|
Prometheus ID | kube_service_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_statefulset_labels
| |
---|
Prometheus ID | kube_statefulset_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_statefulset_replicas
| |
---|
Prometheus ID | kube_statefulset_replicas |
Legacy ID | kubernetes.statefulSet.replicas |
Metric Type | gauge |
Unit | number |
Description | Desired number of replicas of the given Template. |
Additional Notes | |
kube_statefulset_status_replicas
| |
---|
Prometheus ID | kube_statefulset_status_replicas |
Legacy ID | kubernetes.statefulSet.status.replicas |
Metric Type | gauge |
Unit | number |
Description | Number of Pods created by the StatefulSet controller. |
Additional Notes | |
kube_statefulset_status_replicas_current
| |
---|
Prometheus ID | kube_statefulset_status_replicas_current |
Legacy ID | kubernetes.statefulSet.status.replicas.current |
Metric Type | gauge |
Unit | number |
Description | The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by currentRevision. |
Additional Notes | |
kube_statefulset_status_replicas_ready
| |
---|
Prometheus ID | kube_statefulset_status_replicas_ready |
Legacy ID | kubernetes.statefulSet.status.replicas.ready |
Metric Type | gauge |
Unit | number |
Description | Number of Pods created by the StatefulSet controller that have a Ready Condition. |
Additional Notes | |
kube_statefulset_status_replicas_updated
| |
---|
Prometheus ID | kube_statefulset_status_replicas_updated |
Legacy ID | kubernetes.statefulSet.status.replicas.updated |
Metric Type | gauge |
Unit | number |
Description | Number of Pods created by the StatefulSet controller from the StatefulSet version indicated by updateRevision. |
Additional Notes | |
kube_storageclass_created
| |
---|
Prometheus ID | kube_storageclass_created |
Legacy ID | |
Metric Type | gauge |
Unit | - |
Description | Unix epoch time when the storageclass was created. |
Additional Notes | |
kube_storageclass_info
| |
---|
Prometheus ID | kube_storageclass_info |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | The labels on the metric store information about the object. |
Additional Notes | The value of the metric will always be 1. |
kube_storageclass_labels
| |
---|
Prometheus ID | kube_storageclass_labels |
Legacy ID | |
Metric Type | gauge |
Unit | number |
Description | Stores the labels associated with the object as labels on the metric. |
Additional Notes | The value of the metric will always be 1. The labels will be prepended with 'label_'. |
kube_workload_pods_status_phase
| |
---|
Prometheus ID | kube_workload_pods_status_phase |
Legacy ID | kubernetes.workload.pods.status.phase |
Metric Type | gauge |
Unit | number |
Description | The number of Pods in a particular phase for the workload. |
Additional Notes | Stores the phase as a label on the metric. |
kube_workload_status_replicas_misscheduled
| |
---|
Prometheus ID | kube_workload_status_replicas_misscheduled |
Legacy ID | kubernetes.workload.status.replicas.misscheduled |
Metric Type | gauge |
Unit | number |
Description | The number of running Pods for a workload that are not supposed to be running. |
Additional Notes | |
kube_workload_status_replicas_scheduled
| |
---|
Prometheus ID | kube_workload_status_replicas_scheduled |
Legacy ID | kubernetes.workload.status.replicas.scheduled |
Metric Type | gauge |
Unit | number |
Description | The number of Pods scheduled to be run for a workload. |
Additional Notes | |
kube_workload_status_replicas_updated
| |
---|
Prometheus ID | kube_workload_status_replicas_updated |
Legacy ID | kubernetes.workload.status.replicas.updated |
Metric Type | gauge |
Unit | number |
Description | The number of updated Pods per workload. |
Additional Notes | |
kube_workload_status_running
| |
---|
Prometheus ID | kube_workload_status_running |
Legacy ID | kubernetes.workload.status.running |
Metric Type | gauge |
Unit | number |
Description | The number of running Pods for a workload. |
Additional Notes | |
kube_workload_status_unavailable
| |
---|
Prometheus ID | kube_workload_status_unavailable |
Legacy ID | kubernetes.workload.status.unavailable |
Metric Type | gauge |
Unit | number |
Description | The number of unavailable Pods per workload. |
Additional Notes | |
6.2.8 - Network
sysdig_connection_net_connection_in_count
| |
---|
Prometheus ID | sysdig_connection_net_connection_in_count |
Legacy ID | net.connection.count.in |
Metric Type | counter |
Unit | number |
Description | The number of currently established client (inbound) connections. |
Additional Notes | The net_connection* metrics are especially useful when segmented by protocol, port, or process. They are TCP-level connection counts; Sysdig also uses heuristics to present UDP packets as connections, based on their source and destination IPs. These calculations depend on syscalls such as connect and accept. This is different from the net_request* metrics, which are calculated by classifying the read/write buffers associated with read/write and send/receive syscalls into requests and responses. Although the buffers are not evaluated for protocol-level information, Sysdig can determine that a request has been made (for example, a server process has received a read syscall, or a client process has issued a write syscall) and that an associated response has been sent. Using this information, Sysdig generates the net_request* metrics without protocol-level segmentation. Latency is determined from the time delta between a request and its response. |
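The request/response pairing described in the note above can be sketched as follows. The event stream and timestamps here are invented for illustration; the real agent works from syscall events, not a Python list:

```python
# Hypothetical stream of (timestamp_seconds, kind) events, where a "request"
# is a read observed on a server process and "response" is the matching write.
events = [
    (10.000, "request"),
    (10.042, "response"),
    (11.500, "request"),
    (11.510, "response"),
]

def request_latencies(events):
    """Pair each request with the next response and return the time deltas."""
    latencies, pending = [], None
    for ts, kind in events:
        if kind == "request":
            pending = ts
        elif kind == "response" and pending is not None:
            latencies.append(ts - pending)
            pending = None
    return latencies

lats = request_latencies(events)
avg = sum(lats) / len(lats)  # average request latency over the window
```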
sysdig_connection_net_connection_out_count
| |
---|
Prometheus ID | sysdig_connection_net_connection_out_count |
Legacy ID | net.connection.count.out |
Metric Type | counter |
Unit | number |
Description | The number of currently established server (outbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_connection_net_connection_total_count
| |
---|
Prometheus ID | sysdig_connection_net_connection_total_count |
Legacy ID | net.connection.count.total |
Metric Type | counter |
Unit | number |
Description | The number of currently established connections. This value may exceed the sum of the inbound and outbound metrics since it represents client and server inter-host connections as well as internal only connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_connection_net_in_bytes
| |
---|
Prometheus ID | sysdig_connection_net_in_bytes |
Legacy ID | net.bytes.in |
Metric Type | counter |
Unit | data |
Description | The number of inbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_connection_net_out_bytes
| |
---|
Prometheus ID | sysdig_connection_net_out_bytes |
Legacy ID | net.bytes.out |
Metric Type | counter |
Unit | data |
Description | The number of outbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_connection_net_request_count
| |
---|
Prometheus ID | sysdig_connection_net_request_count |
Legacy ID | net.request.count |
Metric Type | counter |
Unit | number |
Description | The total number of network requests. Note, this value may exceed the sum of inbound and outbound requests, because this count includes requests over internal connections. |
Additional Notes | |
sysdig_connection_net_request_in_count
| |
---|
Prometheus ID | sysdig_connection_net_request_in_count |
Legacy ID | net.request.count.in |
Metric Type | counter |
Unit | number |
Description | The number of inbound network requests. |
Additional Notes | |
sysdig_connection_net_request_in_time
| |
---|
Prometheus ID | sysdig_connection_net_request_in_time |
Legacy ID | net.request.time.in |
Metric Type | counter |
Unit | time |
Description | The average time to serve an inbound request. |
Additional Notes | |
sysdig_connection_net_request_out_count
| |
---|
Prometheus ID | sysdig_connection_net_request_out_count |
Legacy ID | net.request.count.out |
Metric Type | counter |
Unit | number |
Description | The number of outbound network requests. |
Additional Notes | |
sysdig_connection_net_request_out_time
| |
---|
Prometheus ID | sysdig_connection_net_request_out_time |
Legacy ID | net.request.time.out |
Metric Type | counter |
Unit | time |
Description | The average time spent waiting for an outbound request. |
Additional Notes | |
sysdig_connection_net_request_time
| |
---|
Prometheus ID | sysdig_connection_net_request_time |
Legacy ID | net.request.time |
Metric Type | counter |
Unit | time |
Description | The average time to serve a network request. |
Additional Notes | |
sysdig_connection_net_total_bytes
| |
---|
Prometheus ID | sysdig_connection_net_total_bytes |
Legacy ID | net.bytes.total |
Metric Type | counter |
Unit | data |
Description | The total network bytes, including both inbound and outbound connections. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
6.2.9 - Program
sysdig_program_cpu_cores_used
| |
---|
Prometheus ID | sysdig_program_cpu_cores_used |
Legacy ID | cpu.cores.used |
Metric Type | gauge |
Unit | number |
Description | The CPU core usage of each program is obtained from cgroups, and is equal to the number of cores used by the program. For example, if a program uses two of an available four cores, the value of sysdig_program_cpu_cores_used will be two. |
Additional Notes | |
sysdig_program_cpu_cores_used_percent
| |
---|
Prometheus ID | sysdig_program_cpu_cores_used_percent |
Legacy ID | cpu.cores.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The CPU core usage percent for each program is obtained from cgroups, and is equal to the number of cores used multiplied by 100. For example, if a program uses three cores, the value of sysdig_program_cpu_cores_used_percent would be 300%. |
Additional Notes | |
sysdig_program_cpu_used_percent
| |
---|
Prometheus ID | sysdig_program_cpu_used_percent |
Legacy ID | cpu.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The CPU usage for each program is obtained from cgroups, and normalized by dividing by the number of cores to determine an overall percentage. For example, if the environment contains six cores on a host, and the processes are assigned two cores, Sysdig will report CPU usage of 2/6 * 100% = 33.33%. This metric is calculated differently for hosts and containers. |
Additional Notes | |
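The two CPU percentage conventions above (per-core percent versus host-normalized percent) reduce to simple arithmetic; a minimal sketch, using the figures from the descriptions:

```python
def cpu_cores_used_percent(cores_used):
    """Cores -> percent, as in sysdig_program_cpu_cores_used_percent:
    2.0 cores in use reads as 200%."""
    return cores_used * 100.0

def cpu_used_percent(cores_used, host_cores):
    """Usage normalized by the host's core count, as in sysdig_program_cpu_used_percent:
    2 of 6 cores reads as 33.33%."""
    return cores_used / host_cores * 100.0

print(cpu_cores_used_percent(2.0))       # 200.0
print(round(cpu_used_percent(2, 6), 2))  # 33.33
```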
sysdig_program_fd_used_percent
| |
---|
Prometheus ID | sysdig_program_fd_used_percent |
Legacy ID | fd.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of used file descriptors out of the maximum available. |
Additional Notes | Usually, when a process reaches its FD limit it will stop operating properly and possibly crash. As a consequence, this is a metric you want to monitor carefully, or even better use for alerts. |
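An alert on file-descriptor exhaustion is just a threshold on this percentage. A sketch of the calculation (the sample values and the 80% threshold are arbitrary; on Linux, real inputs could come from `resource.getrlimit(resource.RLIMIT_NOFILE)` and the entry count of `/proc/<pid>/fd`):

```python
def fd_used_percent(fds_in_use, fd_limit):
    """Percentage of used file descriptors out of the process's maximum."""
    return fds_in_use / fd_limit * 100.0

usage = fd_used_percent(812, 1024)  # hypothetical sample: 812 FDs open, limit 1024
if usage > 80:  # arbitrary alert threshold
    print("warning: nearing the file-descriptor limit")
```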
sysdig_program_file_error_open_count
| |
---|
Prometheus ID | sysdig_program_file_error_open_count |
Legacy ID | file.error.open.count |
Metric Type | counter |
Unit | number |
Description | The number of errors caused by opening files. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_error_total_count
| |
---|
Prometheus ID | sysdig_program_file_error_total_count |
Legacy ID | file.error.total.count |
Metric Type | counter |
Unit | number |
Description | The number of errors caused by file access. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_in_bytes
| |
---|
Prometheus ID | sysdig_program_file_in_bytes |
Legacy ID | file.bytes.in |
Metric Type | counter |
Unit | data |
Description | The number of bytes read from file. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_in_iops
| |
---|
Prometheus ID | sysdig_program_file_in_iops |
Legacy ID | file.iops.in |
Metric Type | counter |
Unit | number |
Description | The number of file read operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_program_file_in_time
| |
---|
Prometheus ID | sysdig_program_file_in_time |
Legacy ID | file.time.in |
Metric Type | counter |
Unit | time |
Description | The time spent in file reading. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_open_count
| |
---|
Prometheus ID | sysdig_program_file_open_count |
Legacy ID | file.open.count |
Metric Type | counter |
Unit | number |
Description | The number of times the file has been opened. |
Additional Notes | |
sysdig_program_file_out_bytes
| |
---|
Prometheus ID | sysdig_program_file_out_bytes |
Legacy ID | file.bytes.out |
Metric Type | counter |
Unit | data |
Description | The number of bytes written to file. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_out_iops
| |
---|
Prometheus ID | sysdig_program_file_out_iops |
Legacy ID | file.iops.out |
Metric Type | counter |
Unit | number |
Description | The number of file write operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_program_file_out_time
| |
---|
Prometheus ID | sysdig_program_file_out_time |
Legacy ID | file.time.out |
Metric Type | counter |
Unit | time |
Description | The time spent in file writing. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_total_bytes
| |
---|
Prometheus ID | sysdig_program_file_total_bytes |
Legacy ID | file.bytes.total |
Metric Type | counter |
Unit | data |
Description | The number of bytes read from and written to file. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_file_total_iops
| |
---|
Prometheus ID | sysdig_program_file_total_iops |
Legacy ID | file.iops.total |
Metric Type | counter |
Unit | number |
Description | The number of read and write file operations per second. |
Additional Notes | This is calculated by measuring the actual number of read and write requests made by a process. Therefore, it can differ from what other tools show, which is usually based on interpolating this value from the number of bytes read and written to the file system. |
sysdig_program_file_total_time
| |
---|
Prometheus ID | sysdig_program_file_total_time |
Legacy ID | file.time.total |
Metric Type | counter |
Unit | time |
Description | The time spent in file I/O. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_info
| |
---|
Prometheus ID | sysdig_program_info |
Legacy ID | info |
Metric Type | gauge |
Unit | number |
Description | |
Additional Notes | |
sysdig_program_memory_used_bytes
| |
---|
Prometheus ID | sysdig_program_memory_used_bytes |
Legacy ID | memory.bytes.used |
Metric Type | gauge |
Unit | data |
Description | The amount of physical memory currently in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, the metric can also be segmented by using ‘Segment by’ in the UI. |
sysdig_program_memory_used_percent
| |
---|
Prometheus ID | sysdig_program_memory_used_percent |
Legacy ID | memory.used.percent |
Metric Type | gauge |
Unit | percent |
Description | The percentage of physical memory in use. |
Additional Notes | By default, this metric shows the average value for the selected scope. For instance, if you apply it to a group of machines, you will see the average value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_net_connection_in_count
| |
---|
Prometheus ID | sysdig_program_net_connection_in_count |
Legacy ID | net.connection.count.in |
Metric Type | counter |
Unit | number |
Description | The number of currently established client (inbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_program_net_connection_out_count
| |
---|
Prometheus ID | sysdig_program_net_connection_out_count |
Legacy ID | net.connection.count.out |
Metric Type | counter |
Unit | number |
Description | The number of currently established server (outbound) connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_program_net_connection_total_count
| |
---|
Prometheus ID | sysdig_program_net_connection_total_count |
Legacy ID | net.connection.count.total |
Metric Type | counter |
Unit | number |
Description | The number of currently established connections. This value may exceed the sum of the inbound and outbound metrics since it represents client and server inter-host connections as well as internal only connections. |
Additional Notes | This metric is especially useful when segmented by protocol, port or process. |
sysdig_program_net_error_count
| |
---|
Prometheus ID | sysdig_program_net_error_count |
Legacy ID | net.error.count |
Metric Type | counter |
Unit | number |
Description | The total number of network errors that occurred per second. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_net_in_bytes
| |
---|
Prometheus ID | sysdig_program_net_in_bytes |
Legacy ID | net.bytes.in |
Metric Type | counter |
Unit | data |
Description | The number of inbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_net_out_bytes
| |
---|
Prometheus ID | sysdig_program_net_out_bytes |
Legacy ID | net.bytes.out |
Metric Type | counter |
Unit | data |
Description | The number of outbound network bytes. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_net_request_count
| |
---|
Prometheus ID | sysdig_program_net_request_count |
Legacy ID | net.request.count |
Metric Type | counter |
Unit | number |
Description | The total number of network requests. Note, this value may exceed the sum of inbound and outbound requests, because this count includes requests over internal connections. |
Additional Notes | |
sysdig_program_net_request_in_count
| |
---|
Prometheus ID | sysdig_program_net_request_in_count |
Legacy ID | net.request.count.in |
Metric Type | counter |
Unit | number |
Description | The number of inbound network requests. |
Additional Notes | |
sysdig_program_net_request_in_time
| |
---|
Prometheus ID | sysdig_program_net_request_in_time |
Legacy ID | net.request.time.in |
Metric Type | counter |
Unit | time |
Description | The average time to serve an inbound request. |
Additional Notes | |
sysdig_program_net_request_out_count
| |
---|
Prometheus ID | sysdig_program_net_request_out_count |
Legacy ID | net.request.count.out |
Metric Type | counter |
Unit | number |
Description | The number of outbound network requests. |
Additional Notes | |
sysdig_program_net_request_out_time
| |
---|
Prometheus ID | sysdig_program_net_request_out_time |
Legacy ID | net.request.time.out |
Metric Type | counter |
Unit | time |
Description | The average time spent waiting for an outbound request. |
Additional Notes | |
sysdig_program_net_request_time
| |
---|
Prometheus ID | sysdig_program_net_request_time |
Legacy ID | net.request.time |
Metric Type | counter |
Unit | time |
Description | Average time to serve a network request. |
Additional Notes | |
sysdig_program_net_tcp_queue_len
| |
---|
Prometheus ID | sysdig_program_net_tcp_queue_len |
Legacy ID | net.tcp.queue.len |
Metric Type | counter |
Unit | number |
Description | The length of the TCP request queue. |
Additional Notes | |
sysdig_program_net_total_bytes
| |
---|
Prometheus ID | sysdig_program_net_total_bytes |
Legacy ID | net.bytes.total |
Metric Type | counter |
Unit | data |
Description | The total network bytes, including inbound and outbound connections, in a program. |
Additional Notes | By default, this metric shows the total value for the selected scope. For instance, if you apply it to a group of machines, you will see the total value for the whole group. However, you can easily segment the metric to see it by host, process, container, and so on. Just use ‘Segment by’ in the UI. |
sysdig_program_proc_count
| |
---|
Prometheus ID | sysdig_program_proc_count |
Legacy ID | proc.count |
Metric Type | counter |
Unit | number |
Description | The number of processes on a host or container. |
Additional Notes | |
sysdig_program_syscall_count
| |
---|
Prometheus ID | sysdig_program_syscall_count |
Legacy ID | syscall.count |
Metric Type | gauge |
Unit | number |
Description | The total number of syscalls seen. |
Additional Notes | Syscalls are resource-intensive. This metric tracks how many syscalls have been made by a given process or container. |
sysdig_program_thread_count
| |
---|
Prometheus ID | sysdig_program_thread_count |
Legacy ID | thread.count |
Metric Type | counter |
Unit | number |
Description | The total number of threads running in a program. |
Additional Notes | |
sysdig_program_timeseries_count_appcheck
| |
---|
Prometheus ID | sysdig_program_timeseries_count_appcheck |
Legacy ID | metricCount.appCheck |
Metric Type | gauge |
Unit | number |
Description | The number of app check custom metrics. |
Additional Notes | |
sysdig_program_timeseries_count_jmx
| | |
|---|---|
Prometheus ID | sysdig_program_timeseries_count_jmx |
Legacy ID | metricCount.jmx |
Metric Type | gauge |
Unit | number |
Description | The number of JMX custom metrics. |
Additional Notes | |
sysdig_program_timeseries_count_prometheus
| | |
|---|---|
Prometheus ID | sysdig_program_timeseries_count_prometheus |
Legacy ID | metricCount.prometheus |
Metric Type | gauge |
Unit | number |
Description | The number of Prometheus custom metrics. |
Additional Notes | |
sysdig_program_up
| | |
|---|---|
Prometheus ID | sysdig_program_up |
Legacy ID | uptime |
Metric Type | gauge |
Unit | number |
Description | The percentage of time the selected entity was down during the visualized time sample. This can be used to determine if a machine (or a group of machines) went down. |
Additional Notes | |
6.2.10 - Provider
sysdig_cloud_provider_info
| | |
|---|---|
Prometheus ID | sysdig_cloud_provider_info |
Legacy ID | info |
Metric Type | gauge |
Unit | number |
Description | This metric always has the value 1. |
Additional Notes | |
6.3 - Metrics in Sysdig Legacy Format
The Sysdig legacy metrics dictionary lists the default legacy metrics supported by the Sysdig product suite, as well as kube state and cloud provider metrics.
The metrics listed in this section follow the StatsD-compatible Sysdig naming convention. To see the mapping between Prometheus notation and Sysdig notation, see Metrics and Label Mapping.
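As a concrete illustration of the two naming conventions, a few legacy-to-Prometheus pairs can be expressed as a simple lookup. This is a minimal sketch using only pairs that appear in the tables on this page, not the complete mapping:

```python
# A small excerpt of the legacy-to-Prometheus metric name mapping.
# The pairs below are taken from the metric tables on this page;
# this is NOT the full mapping (see Metrics and Label Mapping).
LEGACY_TO_PROMETHEUS = {
    "net.bytes.total": "sysdig_program_net_total_bytes",
    "proc.count": "sysdig_program_proc_count",
    "thread.count": "sysdig_program_thread_count",
    "uptime": "sysdig_program_up",
}

def to_prometheus(legacy_name: str) -> str:
    """Return the Prometheus-style name for a legacy Sysdig metric name."""
    return LEGACY_TO_PROMETHEUS[legacy_name]
```

For example, `to_prometheus("net.bytes.total")` returns `sysdig_program_net_total_bytes`, the Prometheus ID listed for that metric above.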
Overview
Each metric in the dictionary has several pieces of metadata listed to
provide greater context for how the metric can be used within Sysdig
products. An example layout is displayed below:
Metric Name
Metric definition. For some metrics, the equation for how the value is
determined is provided.
| | |
|---|---|
Metric Type | Determines whether the metric value is a counter or a gauge. Sysdig Monitor offers two metric types. Counter: a value that keeps increasing and relies on previous values; it records how many times something has happened, for example, a user login. Gauge: a single numerical value that can fluctuate arbitrarily over time; each value is an instantaneous measurement, for example, CPU usage. |
Value Type | The type of value the metric can have: Percent (%), Byte, Date, Double, Integer (int), relativeTime, String. |
Segment By | The levels of the infrastructure at which the metric can be segmented: Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider. |
Default Time Aggregation | The default time aggregation format for the metric. |
Available Time Aggregation Formats | The time aggregation formats the metric can be aggregated by: Average (Avg), Rate, Sum, Minimum (Min), Maximum (Max). |
Default Group Aggregation | The default group aggregation format for the metric. |
Available Group Aggregation Formats | The group aggregation formats the metric can be aggregated by: Average (Avg), Sum, Minimum (Min), Maximum (Max). |
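The counter/gauge distinction can be sketched in a few lines of Python. This is a hypothetical illustration, not Sysdig code: a counter only accumulates, while a gauge is overwritten with each instantaneous measurement.

```python
class Counter:
    """Monotonically increasing value, e.g. how many user logins occurred."""
    def __init__(self) -> None:
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Instantaneous measurement that can fluctuate, e.g. CPU usage."""
    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value  # each sample replaces the previous one


logins = Counter()
logins.inc()   # one login happened
logins.inc()   # another login; the value builds on the previous one

cpu = Gauge()
cpu.set(73.5)  # current CPU usage
cpu.set(12.0)  # a later sample simply replaces the old value
```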
6.3.1 - Agent
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the legacy StatsD-compatible Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
dragent.analyzer
dragent is the main process in the agent that collects and collates data from multiple sources, including syscall events from the kernel, in order to generate metrics. The analyzer module that runs in the dragent process does much of the work involved in generating metrics.
These internal metrics are used to troubleshoot the health of the
analyzer component.
Sysdig Monitor provides the following analyzer metrics:
Metrics | Type | Minimum Agent Version | Description |
---|---|---|---|
dragent.analyzer.processes | gauge | 0.80.0 or above | The number of processes found by the analyzer. |
dragent.analyzer.threads | | | The number of threads found by the analyzer. |
dragent.analyzer.threads.dropped | counter | | The number of threads not reported due to thread limits. |
dragent.analyzer.containers | gauge | | The number of containers found by the analyzer. |
dragent.analyzer.javaprocs | | | The number of Java processes found by the analyzer. |
dragent.analyzer.appchecks | | | The number of application checks reporting to the analyzer. |
dragent.analyzer.mesos.autodetect | | | If the agent is configured to autodetect a Mesos environment, the value is 1; otherwise, 0. |
dragent.analyzer.mesos.detected | | | If the agent actually found a Mesos environment, the value is 1; otherwise, 0. |
dragent.analyzer.fp.pct100 | | | The analyzer flush CPU percentage (0-100). |
dragent.analyzer.fl.ms | | | The analyzer flush duration, in milliseconds. |
dragent.analyzer.sr | | | The current sampling ratio (1 = all events analyzed, 2 = half of events analyzed, 4 = one fourth of events analyzed, and so on). |
dragent.analyzer.n_evts | | | The number of events processed. |
dragent.analyzer.n_drops | | | The number of events dropped. |
dragent.analyzer.n_drops_buffer | | | The number of events dropped due to the buffer being full. |
dragent.analyzer.n_preemptions | | | The number of driver preemptions. |
dragent.analyzer.n_command_lines | | | The number of command lines collected and sent to the collector. |
dragent.analyzer.command_line_cats.n_none | | | |
dragent.analyzer.n_container_healthcheck_command_lines | | 0.80.1 or above | The number of command lines identified as container health checks. This metric does not change even if health check command lines are not sent to the collector. |
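Because dragent.analyzer.n_evts counts only the events the analyzer actually processed, the sampling ratio can be used for a back-of-the-envelope estimate of the underlying event volume. This sketch assumes sampling is uniform and that n_evts counts only the analyzed subset; the exact agent behavior may differ:

```python
def estimated_total_events(n_evts: int, sampling_ratio: int) -> int:
    """Rough estimate of total events seen by the driver.

    Assumes uniform sampling: with a ratio of 2 only half of events
    are analyzed, with 4 only one fourth, and so on.
    """
    return n_evts * sampling_ratio

# With a sampling ratio of 4, 1,000 analyzed events suggest
# roughly 4,000 events overall.
```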
6.3.2 - Applications
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the legacy StatsD-compatible Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
The metrics in this section are collected from either default or
customized agent configurations for integrated applications. See also:
Integrate Applications (Default App
Checks).
Contents
6.3.2.1 - Apache Metrics
See Application Integrations for more information.
apache.conns_async_closing
The number of asynchronous closing connections.
apache.conns_async_keep_alive
The number of asynchronous keep-alive connections.
apache.conns_async_writing
The number of asynchronous write connections.
apache.conns_total
The total number of connections handled.
apache.net.bytes
The total number of bytes served.
apache.net.bytes_per_s
The number of bytes served per second.
apache.net.hits
The total number of requests performed.
apache.net.request_per_s
The number of requests performed per second.
The number of workers currently serving requests.
The percentage of CPU used.
The number of idle workers in the instance.
The amount of time the server has been running in seconds.
6.3.2.2 - Apache Kafka Metrics
Contents
6.3.2.2.1 - Apache Kafka Consumer Metrics
See Application Integrations for more information.
kafka.broker_offset
The current message offset value on the broker.
kafka.consumer_lag
The lag in messages between the consumer and the broker.
kafka.consumer_offset
The current message offset value on the consumer.
6.3.2.2.2 - Apache Kafka JMX Metrics
See Application Integrations for more information.
The kafka.consumer.* and kafka.producer.* metrics are only available with JMX customization, as documented in Integrate JMX Metrics from Java Virtual Machines.
kafka.consumer.bytes_consumed
The average number of bytes consumed for a specific topic per second.
kafka.consumer.bytes_in
The rate of bytes coming in to the consumer.
kafka.consumer.delayed_requests
The number of delayed consumer requests.
kafka.consumer.expires_per_second
The rate of delayed consumer request expiration.
kafka.consumer.fetch_rate
The minimum rate at which the consumer sends fetch requests to a broker.
kafka.consumer.fetch_size_avg
The average number of bytes fetched for a specific topic per request.
kafka.consumer.fetch_size_max
The maximum number of bytes fetched for a specific topic per request.
kafka.consumer.kafka_commits
The rate of offset commits to Kafka.
kafka.consumer.max_lag
The maximum consumer lag.
kafka.consumer.messages_in
The rate of consumer message consumption.
kafka.consumer.records_consumed
The average number of records consumed per second for a specific topic.
kafka.consumer.records_per_request_avg
The average number of records in each request for a specific topic.
kafka.consumer.zookeeper_commits
The rate of offset commits to ZooKeeper.
kafka.expires_sec
The rate of delayed producer request expiration.
kafka.follower.expires_per_second
The rate of request expiration on followers.
kafka.log.flush_rate
The log flush rate.
kafka.messages_in
The incoming message rate.
kafka.net.bytes_in
The incoming byte rate.
kafka.net.bytes_out
The outgoing byte rate.
kafka.net.bytes_rejected
The rejected byte rate.
kafka.producer.available_buffer_bytes
The total amount of buffer memory that is not being used, including unallocated memory and memory in the free list.
kafka.producer.batch_size_avg
The average number of bytes sent per partition per-request.
kafka.producer.batch_size_max
The maximum number of bytes sent per partition per-request.
kafka.producer.buffer_bytes_total
The maximum amount of buffer memory the client can use.
kafka.producer.bufferpool_wait_time
The fraction of time an appender waits for space allocation.
kafka.producer.bytes_out
The rate of bytes going out for the producer.
kafka.producer.compression_rate
The average compression rate of record batches for a topic.
kafka.producer.compression_rate_avg
The average compression rate of record batches.
kafka.producer.delayed_requests
The number of producer requests delayed.
kafka.producer.expires_per_seconds
The rate of producer request expiration.
kafka.producer.io_wait
The producer I/O wait time.
kafka.producer.message_rate
The producer message rate.
The age of the current producer metadata being used, in seconds.
kafka.producer.record_error_rate
The average number of retried record sends for a topic per second.
kafka.producer.record_queue_time_avg
The average time that record batches spent in the record accumulator, in
milliseconds.
kafka.producer.record_queue_time_max
The maximum amount of time record batches can spend in the record
accumulator, in milliseconds.
kafka.producer.record_retry_rate
The average number of retried record sends for a topic per second.
kafka.producer.record_send_rate
The average number of records sent per second for a topic.
kafka.producer.record_size_avg
The average record size.
kafka.producer.record_size_max
The maximum record size.
kafka.producer.records_per_request
The average number of records sent per second.
kafka.producer.request_latency_avg
The average request latency of the producer.
kafka.producer.request_latency_max
The maximum request latency in milliseconds.
kafka.producer.request_rate
The number of producer requests per second.
kafka.producer.requests_in_flight
The current number of in-flight requests awaiting a response.
kafka.producer.response_rate
The number of producer responses per second.
kafka.producer.throttle_time_avg
The average time a request was throttled by a broker, in milliseconds.
kafka.producer.throttle_time_max
The maximum time a request was throttled by a broker, in milliseconds.
kafka.producer.waiting_threads
The number of user threads blocked waiting for buffer memory to enqueue
their records.
kafka.replication.isr_expands
The rate of replicas joining the ISR pool.
kafka.replication.isr_shrinks
The rate of replicas leaving the ISR pool.
kafka.replication.leader_elections
The leader election rate.
kafka.replication.unclean_leader_elections
The unclean leader election rate.
kafka.replication.under_replicated_partitions
The number of under-replicated partitions.
kafka.request.fetch.failed
The number of client fetch request failures.
kafka.request.fetch.failed_per_second
The rate of client fetch request failures per second.
kafka.request.fetch.time.99percentile
The time for fetch requests for the 99th percentile.
kafka.request.fetch.time.avg
The average time per fetch request.
kafka.request.handler.avg.idle.pct
The average fraction of time the request handler threads are idle.
The time for metadata requests for 99th percentile.
The average time for a metadata request.
kafka.request.offsets.time.99percentile
The time for offset requests for the 99th percentile.
kafka.request.offsets.time.avg
The average time for an offset request.
kafka.request.produce.failed
The number of failed produce requests.
kafka.request.produce.failed_per_second
The rate of failed produce requests per second.
kafka.request.produce.time.99percentile
The time for produce requests for the 99th percentile.
kafka.request.produce.time.avg
The average time for a produce request.
The time for update metadata requests for the 99th percentile
The average time for a request to update metadata.
6.3.2.3 - Consul Metrics
Contents
6.3.2.3.1 - Base Consul Metrics
See Application Integrations for more information.
consul.catalog.nodes_critical
Number of nodes with service status `critical` from those registered.
consul.catalog.nodes_passing
Number of nodes with service status `passing` from those registered.
consul.catalog.nodes_up
Number of nodes.
consul.catalog.nodes_warning
Number of nodes with service status `warning` from those registered.
consul.catalog.services_critical
Total critical services on nodes.
consul.catalog.services_passing
Total passing services on nodes.
consul.catalog.services_up
Total services registered on nodes.
consul.catalog.services_warning
Total warning services on nodes.
consul.catalog.total_nodes
Number of nodes registered in the Consul cluster.
consul.net.node.latency.max
Maximum latency from this node to all others.
Median latency from this node to all others.
consul.net.node.latency.min
Minimum latency from this node to all others.
consul.net.node.latency.p25
p25 latency from this node to all others.
consul.net.node.latency.p75
p75 latency from this node to all others.
consul.net.node.latency.p90
p90 latency from this node to all others.
consul.net.node.latency.p95
p95 latency from this node to all others.
consul.net.node.latency.p99
p99 latency from this node to all others.
consul.peers
Number of peers in the peer set.
6.3.2.3.2 - Consul StatsD Metrics
See Application Integrations for more information.
consul.memberlist.msg.suspect
Number of times an agent suspects another as failed while probing during
gossip protocol.
consul.raft.apply
Number of raft transactions occurring.
consul.raft.commitTime.95percentile
The p95 time it takes to commit a new entry to the raft log on the
leader.
consul.raft.commitTime.avg
The average time it takes to commit a new entry to the raft log on the
leader.
consul.raft.commitTime.count
The number of samples of raft.commitTime.
consul.raft.commitTime.max
The max time it takes to commit a new entry to the raft log on the
leader.
The median time it takes to commit a new entry to the raft log on the
leader.
consul.raft.leader.dispatchLog.95percentile
The p95 time it takes for the leader to write log entries to disk.
consul.raft.leader.dispatchLog.avg
The average time it takes for the leader to write log entries to disk.
consul.raft.leader.dispatchLog.count
The number of samples of raft.leader.dispatchLog.
consul.raft.leader.dispatchLog.max
The max time it takes for the leader to write log entries to disk.
The median time it takes for the leader to write log entries to disk.
P95 time elapsed since the leader was last able to check its lease with
followers.
Average time elapsed since the leader was last able to check its lease
with followers.
The number of samples of raft.leader.lastContact.
Max time elapsed since the leader was last able to check its lease with
followers.
Median time elapsed since the leader was last able to check its lease
with followers.
consul.raft.state.candidate
The number of initiated leader elections.
consul.raft.state.leader
Number of completed leader elections.
consul.runtime.alloc_bytes
Current bytes allocated by the Consul process.
consul.runtime.free_count
Cumulative count of heap objects freed.
consul.runtime.heap_objects
Number of objects allocated on the heap.
consul.runtime.malloc_count
Cumulative count of heap objects allocated.
Number of running goroutines.
consul.runtime.sys_bytes
Total size of the virtual address space reserved by the Go runtime.
consul.runtime.total_gc_pause_ns
Cumulative nanoseconds in GC stop-the-world pauses since Consul started.
consul.runtime.total_gc_runs
Number of completed GC cycles.
consul.serf.events
Incremented when an agent processes a serf event.
consul.serf.member.flap
Number of times an agent is marked dead and then quickly recovers.
consul.serf.member.join
Incremented when an agent processes a join event.
6.3.2.4 - Couchbase Metrics
See Application Integrations for more information.
couchbase.by_bucket.avg_bg_wait_time
The average background wait time.
couchbase.by_bucket.avg_disk_commit_time
The average disk commit time.
couchbase.by_bucket.avg_disk_update_time
The average disk update time.
couchbase.by_bucket.bg_wait_total
The total background wait time.
couchbase.by_bucket.bytes_read
The number of bytes read.
couchbase.by_bucket.bytes_written
The number of bytes written.
couchbase.by_bucket.cas_badval
The number of compare and swap bad values.
couchbase.by_bucket.cas_hits
The number of compare and swap hits.
couchbase.by_bucket.cas_misses
The number of compare and swap misses.
couchbase.by_bucket.cmd_get
The number of get operations.
couchbase.by_bucket.cmd_set
The number of set operations.
couchbase.by_bucket.couch_docs_actual_disk_size
The size of the couchbase docs on disk.
couchbase.by_bucket.couch_docs_data_size
The data size of the couchbase docs.
couchbase.by_bucket.couch_docs_disk_size
Couch docs total size in bytes.
couchbase.by_bucket.couch_docs_fragmentation
The percentage of couchbase docs fragmentation.
couchbase.by_bucket.couch_spatial_data_size
The size of object data for spatial views.
couchbase.by_bucket.couch_spatial_disk_size
The amount of disk space occupied by spatial views.
couchbase.by_bucket.couch_spatial_ops
Spatial operations.
couchbase.by_bucket.couch_total_disk_size
The total disk size for couchbase.
couchbase.by_bucket.couch_views_data_size
The size of object data for views.
couchbase.by_bucket.couch_views_disk_size
The amount of disk space occupied by views.
couchbase.by_bucket.couch_views_fragmentation
The view fragmentation.
couchbase.by_bucket.couch_views_ops
View operations.
couchbase.by_bucket.cpu_idle_ms
CPU idle milliseconds.
couchbase.by_bucket.cpu_utilization_rate
CPU utilization percentage.
couchbase.by_bucket.curr_connections
Current bucket connections.
couchbase.by_bucket.curr_items
Number of active items in memory.
couchbase.by_bucket.curr_items_tot
Total number of items.
couchbase.by_bucket.decr_hits
Decrement hits.
couchbase.by_bucket.decr_misses
Decrement misses.
couchbase.by_bucket.delete_hits
Delete hits.
couchbase.by_bucket.delete_misses
Delete misses.
couchbase.by_bucket.disk_commit_count
Disk commits.
couchbase.by_bucket.disk_update_count
Disk updates.
couchbase.by_bucket.disk_write_queue
Disk write queue depth.
couchbase.by_bucket.ep_bg_fetched
Disk reads per second.
couchbase.by_bucket.ep_cache_miss_rate
Cache miss rate.
couchbase.by_bucket.ep_cache_miss_ratio
Cache miss ratio.
couchbase.by_bucket.ep_dcp_2i_backoff
Number of backoffs for indexes DCP connections.
couchbase.by_bucket.ep_dcp_2i_count
Number of indexes DCP connections.
couchbase.by_bucket.ep_dcp_2i_items_remaining
Number of indexes items remaining to be sent.
couchbase.by_bucket.ep_dcp_2i_items_sent
Number of indexes items sent.
couchbase.by_bucket.ep_dcp_2i_producer_count
Number of indexes producers.
couchbase.by_bucket.ep_dcp_2i_total_bytes
Number of bytes per second being sent for indexes DCP connections.
couchbase.by_bucket.ep_dcp_fts_backoff
Number of backoffs for fts DCP connections.
couchbase.by_bucket.ep_dcp_fts_count
Number of fts DCP connections.
couchbase.by_bucket.ep_dcp_fts_items_remaining
Number of fts items remaining to be sent.
couchbase.by_bucket.ep_dcp_fts_items_sent
Number of fts items sent.
couchbase.by_bucket.ep_dcp_fts_producer_count
Number of fts producers.
couchbase.by_bucket.ep_dcp_fts_total_bytes
Number of bytes per second being sent for fts DCP connections.
couchbase.by_bucket.ep_dcp_other_backoff
Number of backoffs for other DCP connections.
couchbase.by_bucket.ep_dcp_other_count
Number of other DCP connections.
couchbase.by_bucket.ep_dcp_other_items_remaining
Number of other items remaining to be sent.
couchbase.by_bucket.ep_dcp_other_items_sent
Number of other items sent.
couchbase.by_bucket.ep_dcp_other_producer_count
Number of other producers.
couchbase.by_bucket.ep_dcp_other_total_bytes
Number of bytes per second being sent for other DCP connections.
couchbase.by_bucket.ep_dcp_replica_backoff
Number of backoffs for replica DCP connections.
couchbase.by_bucket.ep_dcp_replica_count
Number of replica DCP connections.
couchbase.by_bucket.ep_dcp_replica_items_remaining
Number of replica items remaining to be sent.
couchbase.by_bucket.ep_dcp_replica_items_sent
Number of replica items sent.
couchbase.by_bucket.ep_dcp_replica_producer_count
Number of replica producers.
couchbase.by_bucket.ep_dcp_replica_total_bytes
Number of bytes per second being sent for replica DCP connections.
couchbase.by_bucket.ep_dcp_views_backoff
Number of backoffs for views DCP connections.
couchbase.by_bucket.ep_dcp_views_count
Number of views DCP connections.
couchbase.by_bucket.ep_dcp_views_items_remaining
Number of views items remaining to be sent.
couchbase.by_bucket.ep_dcp_views_items_sent
Number of views items sent.
couchbase.by_bucket.ep_dcp_views_producer_count
Number of views producers.
couchbase.by_bucket.ep_dcp_views_total_bytes
Number of bytes per second being sent for views DCP connections.
couchbase.by_bucket.ep_dcp_xdcr_backoff
Number of backoffs for xdcr DCP connections.
couchbase.by_bucket.ep_dcp_xdcr_count
Number of xdcr DCP connections.
couchbase.by_bucket.ep_dcp_xdcr_items_remaining
Number of xdcr items remaining to be sent.
couchbase.by_bucket.ep_dcp_xdcr_items_sent
Number of xdcr items sent.
couchbase.by_bucket.ep_dcp_xdcr_producer_count
Number of xdcr producers.
couchbase.by_bucket.ep_dcp_xdcr_total_bytes
Number of bytes per second being sent for xdcr DCP connections.
couchbase.by_bucket.ep_diskqueue_drain
Total Drained items on disk queue.
couchbase.by_bucket.ep_diskqueue_fill
Total enqueued items on disk queue.
couchbase.by_bucket.ep_diskqueue_items
Total number of items waiting to be written to disk.
couchbase.by_bucket.ep_flusher_todo
Number of items currently being written.
couchbase.by_bucket.ep_item_commit_failed
Number of times a transaction failed to commit due to storage errors.
couchbase.by_bucket.ep_kv_size
Total amount of user data cached in RAM in this bucket.
couchbase.by_bucket.ep_max_size
The maximum amount of memory this bucket can use.
couchbase.by_bucket.ep_mem_high_wat
Memory usage high water mark for auto-evictions.
couchbase.by_bucket.ep_mem_low_wat
Memory usage low water mark for auto-evictions.
Total amount of item metadata consuming RAM in this bucket.
couchbase.by_bucket.ep_num_non_resident
Number of non-resident items.
Number of delete operations per second for this bucket as the target for
XDCR.
Number of delRetMeta operations per second for this bucket as the target
for XDCR.
Number of read operations per second for this bucket as the target for
XDCR.
Number of set operations per second for this bucket as the target for
XDCR.
Number of setRetMeta operations per second for this bucket as the target
for XDCR.
couchbase.by_bucket.ep_num_value_ejects
Number of times item values got ejected from memory to disk.
couchbase.by_bucket.ep_oom_errors
Number of times unrecoverable OOMs happened while processing operations.
couchbase.by_bucket.ep_ops_create
Create operations.
couchbase.by_bucket.ep_ops_update
Update operations.
couchbase.by_bucket.ep_overhead
Extra memory used by transient data like persistence queues or
checkpoints.
couchbase.by_bucket.ep_queue_size
Number of items queued for storage.
couchbase.by_bucket.ep_resident_items_rate
Number of resident items.
couchbase.by_bucket.ep_tap_replica_queue_drain
Total drained items in the replica queue.
couchbase.by_bucket.ep_tap_total_queue_drain
Total drained items in the queue.
couchbase.by_bucket.ep_tap_total_queue_fill
Total enqueued items in the queue.
couchbase.by_bucket.ep_tap_total_total_backlog_size
Number of remaining items for replication.
couchbase.by_bucket.ep_tmp_oom_errors
Number of times recoverable OOMs happened while processing operations.
couchbase.by_bucket.ep_vb_total
Total number of vBuckets for this bucket.
couchbase.by_bucket.evictions
Number of evictions.
couchbase.by_bucket.get_hits
Number of get hits.
couchbase.by_bucket.get_misses
Number of get misses.
couchbase.by_bucket.hibernated_requests
Number of streaming requests now idle.
couchbase.by_bucket.hibernated_waked
Rate of streaming request wakeups.
couchbase.by_bucket.hit_ratio
Hit ratio.
couchbase.by_bucket.incr_hits
Number of increment hits.
couchbase.by_bucket.incr_misses
Number of increment misses.
couchbase.by_bucket.mem_actual_free
Free memory.
couchbase.by_bucket.mem_actual_used
Used memory.
couchbase.by_bucket.mem_free
Free memory.
couchbase.by_bucket.mem_total
Total available memory.
couchbase.by_bucket.mem_used (deprecated)
Engine’s total memory usage.
couchbase.by_bucket.mem_used_sys
System memory usage.
couchbase.by_bucket.misses
Total number of misses.
couchbase.by_bucket.ops
Total number of operations.
couchbase.by_bucket.page_faults
Number of page faults.
couchbase.by_bucket.replication_docs_rep_queue
couchbase.by_bucket.rest_requests
Number of HTTP requests.
couchbase.by_bucket.swap_total
Total amount of swap available.
couchbase.by_bucket.swap_used
Amount of swap used.
couchbase.by_bucket.vb_active_eject
Number of items per second being ejected to disk from active vBuckets.
couchbase.by_bucket.vb_active_itm_memory
Amount of active user data cached in RAM in this bucket.
Amount of active item metadata consuming RAM in this bucket.
couchbase.by_bucket.vb_active_num
Number of active items.
couchbase.by_bucket.vb_active_num_non_resident
Number of non resident vBuckets in the active state for this bucket.
couchbase.by_bucket.vb_active_ops_create
New items per second being inserted into active vBuckets in this bucket.
couchbase.by_bucket.vb_active_ops_update
Number of items updated on active vBucket per second for this bucket.
couchbase.by_bucket.vb_active_queue_age
Sum of disk queue item age in milliseconds.
couchbase.by_bucket.vb_active_queue_drain
Total drained items in the queue.
couchbase.by_bucket.vb_active_queue_fill
Number of active items per second being put on the active item disk
queue.
couchbase.by_bucket.vb_active_queue_size
Number of active items in the queue.
couchbase.by_bucket.vb_active_resident_items_ratio
Number of resident items.
couchbase.by_bucket.vb_avg_active_queue_age
Average age in seconds of active items in the active item queue.
couchbase.by_bucket.vb_avg_pending_queue_age
Average age in seconds of pending items in the pending item queue.
couchbase.by_bucket.vb_avg_replica_queue_age
Average age in seconds of replica items in the replica item queue.
couchbase.by_bucket.vb_avg_total_queue_age
Average age of items in the queue.
couchbase.by_bucket.vb_pending_curr_items
Number of items in pending vBuckets.
couchbase.by_bucket.vb_pending_eject
Number of items per second being ejected to disk from pending vBuckets.
couchbase.by_bucket.vb_pending_itm_memory
Amount of pending user data cached in RAM in this bucket.
Amount of pending item metadata consuming RAM in this bucket.
couchbase.by_bucket.vb_pending_num
Number of pending items.
couchbase.by_bucket.vb_pending_num_non_resident
Number of non resident vBuckets in the pending state for this bucket.
couchbase.by_bucket.vb_pending_ops_create
Number of pending create operations.
couchbase.by_bucket.vb_pending_ops_update
Number of items updated on pending vBucket per second for this bucket.
couchbase.by_bucket.vb_pending_queue_age
Sum of disk pending queue item age in milliseconds.
couchbase.by_bucket.vb_pending_queue_drain
Total drained pending items in the queue.
couchbase.by_bucket.vb_pending_queue_fill
Total enqueued pending items on disk queue.
couchbase.by_bucket.vb_pending_queue_size
Number of pending items in the queue.
couchbase.by_bucket.vb_pending_resident_items_ratio
Number of resident pending items.
couchbase.by_bucket.vb_replica_curr_items
Number of in memory items.
couchbase.by_bucket.vb_replica_eject
Number of items per second being ejected to disk from replica vBuckets.
couchbase.by_bucket.vb_replica_itm_memory
Amount of replica user data cached in RAM in this bucket.
Total metadata memory.
couchbase.by_bucket.vb_replica_num
Number of replica vBuckets.
couchbase.by_bucket.vb_replica_num_non_resident
Number of non resident vBuckets in the replica state for this bucket.
couchbase.by_bucket.vb_replica_ops_create
Number of replica create operations.
couchbase.by_bucket.vb_replica_ops_update
Number of items updated on replica vBucket per second for this bucket.
couchbase.by_bucket.vb_replica_queue_age
Sum of disk replica queue item age in milliseconds.
couchbase.by_bucket.vb_replica_queue_drain
Total drained replica items in the queue.
couchbase.by_bucket.vb_replica_queue_fill
Total enqueued replica items on disk queue.
couchbase.by_bucket.vb_replica_queue_size
Replica items in disk queue.
couchbase.by_bucket.vb_replica_resident_items_ratio
Number of resident replica items.
couchbase.by_bucket.vb_total_queue_age
Sum of disk queue item age in milliseconds.
couchbase.by_bucket.xdc_ops
Number of cross-datacenter replication operations.
couchbase.by_node.couch_docs_actual_disk_size
Couch docs total size on disk in bytes.
couchbase.by_node.couch_docs_data_size
Couch docs data size in bytes.
couchbase.by_node.couch_views_actual_disk_size
Couch views total size on disk in bytes.
couchbase.by_node.couch_views_data_size
Couch views data size on disk in bytes.
couchbase.by_node.curr_items
Number of active items in memory.
couchbase.by_node.curr_items_tot
Total number of items.
couchbase.by_node.vb_replica_curr_items
Number of in memory items.
couchbase.hdd.free
Free hard disk space.
couchbase.hdd.quota_total
Hard disk quota.
couchbase.hdd.total
Total hard disk space.
couchbase.hdd.used
Used hard disk space.
couchbase.hdd.used_by_data
Hard disk used for data.
couchbase.query.cores
couchbase.query.cpu_sys_percent
couchbase.query.cpu_user_percent
couchbase.query.gc_num
couchbase.query.gc_pause_percent
couchbase.query.gc_pause_time
couchbase.query.memory_system
couchbase.query.memory_total
couchbase.query.memory_usage
couchbase.query.request_active_count
couchbase.query.request_completed_count
couchbase.query.request_per_sec_15min
couchbase.query.request_per_sec_1min
couchbase.query.request_per_sec_5min
couchbase.query.request_prepared_percent
couchbase.query.request_time_80percentile
couchbase.query.request_time_95percentile
couchbase.query.request_time_99percentile
couchbase.query.request_time_mean
couchbase.query.total_threads
couchbase.ram.quota_total
RAM quota.
couchbase.ram.total
The total RAM available.
couchbase.ram.used
The amount of RAM in use.
couchbase.ram.used_by_data
The amount of RAM used for data.
6.3.2.5 - Elasticsearch Metrics
See Application Integrations for more information.
All Elasticsearch metrics have the type gauge.
elasticsearch.active_primary_shards
The number of active primary shards in the cluster.
elasticsearch.active_shards
The number of active shards in the cluster.
elasticsearch.breakers.fielddata.estimated_size_in_bytes
The estimated size in bytes of the field data circuit breaker.
elasticsearch.breakers.fielddata.overhead
The constant multiplier for byte estimations of the field data circuit
breaker.
elasticsearch.breakers.fielddata.tripped
The number of times the field data circuit breaker has tripped.
elasticsearch.breakers.parent.estimated_size_in_bytes
The estimated size in bytes of the parent circuit breaker.
elasticsearch.breakers.parent.overhead
The constant multiplier for byte estimations of the parent circuit
breaker.
elasticsearch.breakers.parent.tripped
The number of times the parent circuit breaker has tripped.
elasticsearch.breakers.request.estimated_size_in_bytes
The estimated size in bytes of the request circuit breaker.
elasticsearch.breakers.request.overhead
The constant multiplier for byte estimations of the request circuit
breaker.
elasticsearch.breakers.request.tripped
The number of times the request circuit breaker has tripped.
elasticsearch.breakers.inflight_requests.tripped
The number of times the inflight circuit breaker has tripped.
elasticsearch.breakers.inflight_requests.overhead
The constant multiplier for byte estimations of the inflight circuit
breaker.
elasticsearch.breakers.inflight_requests.estimated_size_in_bytes
The estimated size in bytes of the inflight circuit breaker.
elasticsearch.cache.field.evictions
The total number of evictions from the field data cache.
elasticsearch.cache.field.size
The size of the field cache.
elasticsearch.cache.filter.count
The number of items in the filter cache.
elasticsearch.cache.filter.evictions
The total number of evictions from the filter cache.
elasticsearch.cache.filter.size
The size of the filter cache.
elasticsearch.cluster_status
The Elasticsearch cluster health as a number: red = 0, yellow = 1, green = 2.
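Because the health status is exposed numerically, dashboards and alerts can threshold on it directly. A minimal sketch of the mapping (the `status_to_number` helper is illustrative, not part of Sysdig or Elasticsearch):

```python
# Numeric values reported by elasticsearch.cluster_status:
# red = 0, yellow = 1, green = 2.
CLUSTER_STATUS = {"red": 0, "yellow": 1, "green": 2}

def status_to_number(color: str) -> int:
    """Return the numeric cluster_status value for a health color."""
    return CLUSTER_STATUS[color.lower()]

# Any value below 2 means the cluster is degraded, so an alert
# condition such as status_to_number(health) < 2 catches both
# yellow and red states.
```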
elasticsearch.docs.count
The total number of documents in the cluster across all shards.
elasticsearch.docs.deleted
The total number of documents deleted from the cluster across all
shards.
elasticsearch.fielddata.evictions
The total number of evictions from the fielddata cache.
elasticsearch.fielddata.size
The size of the fielddata cache.
elasticsearch.flush.total
The total number of index flushes to disk since start.
elasticsearch.flush.total.time
The total time spent flushing the index to disk.
elasticsearch.fs.total.available_in_bytes
The total number of bytes available to this Java virtual machine on this
file store.
elasticsearch.fs.total.disk_io_op
The total I/O operations on the file store.
elasticsearch.fs.total.disk_io_size_in_bytes
Total bytes used for all I/O operations on the file store.
elasticsearch.fs.total.disk_read_size_in_bytes
The total bytes read from the file store.
elasticsearch.fs.total.disk_reads
The total number of reads from the file store.
elasticsearch.fs.total.disk_write_size_in_bytes
The total bytes written to the file store.
elasticsearch.fs.total.disk_writes
The total number of writes to the file store.
elasticsearch.fs.total.free_in_bytes
The total number of unallocated bytes in the file store.
elasticsearch.fs.total.total_in_bytes
The total size in bytes of the file store.
elasticsearch.get.current
The number of get requests currently running.
elasticsearch.get.exists.time
The total time spent on get requests where the document existed.
elasticsearch.get.exists.total
The total number of get requests where the document existed.
elasticsearch.get.missing.time
The total time spent on get requests where the document was missing.
elasticsearch.get.missing.total
The total number of get requests where the document was missing.
elasticsearch.get.time
The total time spent on get requests.
elasticsearch.get.total
The total number of get requests.
elasticsearch.http.current_open
The number of current open HTTP connections.
elasticsearch.http.total_opened
The total number of opened HTTP connections.
elasticsearch.id_cache.size
The size of the ID cache.
elasticsearch.indexing.delete.current
The number of documents currently being deleted from an index.
elasticsearch.indexing.delete.time
The total time spent deleting documents from an index.
elasticsearch.indexing.delete.total
The total number of documents deleted from an index.
elasticsearch.indexing.index.current
The number of documents currently being indexed to an index.
elasticsearch.indexing.index.time
The total time spent indexing documents to an index.
elasticsearch.indexing.index.total
The total number of documents indexed to an index.
elasticsearch.indices.count
The number of indices in the cluster.
elasticsearch.indices.indexing.index_failed
The number of failed indexing operations.
elasticsearch.indices.indexing.throttle_time
The total time indexing waited due to throttling.
elasticsearch.indices.query_cache.evictions
The number of query cache evictions.
elasticsearch.indices.query_cache.hit_count
The number of query cache hits.
elasticsearch.indices.query_cache.memory_size_in_bytes
The memory used by the query cache.
elasticsearch.indices.query_cache.miss_count
The number of query cache misses.
elasticsearch.indices.recovery.current_as_source
The number of ongoing recoveries for which a shard serves as a source.
elasticsearch.indices.recovery.current_as_target
The number of ongoing recoveries for which a shard serves as a target.
elasticsearch.indices.recovery.throttle_time
The total time recoveries waited due to throttling.
elasticsearch.indices.request_cache.evictions
The number of request cache evictions.
elasticsearch.indices.request_cache.hit_count
The number of request cache hits.
elasticsearch.indices.request_cache.memory_size_in_bytes
The memory used by the request cache.
elasticsearch.indices.request_cache.miss_count
The number of request cache misses.
elasticsearch.indices.segments.count
The number of segments in an index shard.
elasticsearch.indices.segments.doc_values_memory_in_bytes
The memory used by doc values.
elasticsearch.indices.segments.fixed_bit_set_memory_in_bytes
The memory used by fixed bit set.
elasticsearch.indices.segments.index_writer_max_memory_in_bytes
The maximum memory used by the index writer.
elasticsearch.indices.segments.index_writer_memory_in_bytes
The memory used by the index writer.
elasticsearch.indices.segments.memory_in_bytes
The memory used by index segments.
elasticsearch.indices.segments.norms_memory_in_bytes
The memory used by norms.
elasticsearch.indices.segments.stored_fields_memory_in_bytes
The memory used by stored fields.
elasticsearch.indices.segments.term_vectors_memory_in_bytes
The memory used by term vectors.
elasticsearch.indices.segments.terms_memory_in_bytes
The memory used by terms.
elasticsearch.indices.segments.version_map_memory_in_bytes
The memory used by the segment version map.
elasticsearch.indices.translog.operations
The number of operations in the transaction log.
elasticsearch.indices.translog.size_in_bytes
The size of the transaction log.
elasticsearch.initializing_shards
The number of shards that are currently initializing.
elasticsearch.merges.current
The number of currently active segment merges.
elasticsearch.merges.current.docs
The number of documents across segments currently being merged.
elasticsearch.merges.current.size
The size of the segments currently being merged.
elasticsearch.merges.total
The total number of segment merges.
elasticsearch.merges.total.docs
The total number of documents across all merged segments.
elasticsearch.merges.total.size
The total size of all merged segments.
elasticsearch.merges.total.time
The total time spent on segment merging.
elasticsearch.number_of_data_nodes
The number of data nodes in the cluster.
elasticsearch.number_of_nodes
The total number of nodes in the cluster.
elasticsearch.pending_tasks_priority_high
The number of high priority pending tasks.
elasticsearch.pending_tasks_priority_urgent
The number of urgent priority pending tasks.
elasticsearch.pending_tasks_time_in_queue
The average time spent by tasks in the queue.
elasticsearch.pending_tasks_total
The total number of pending tasks.
elasticsearch.process.open_fd
The number of opened file descriptors associated with the current
process, or -1 if not supported.
elasticsearch.refresh.total
The total number of index refreshes.
elasticsearch.refresh.total.time
The total time spent on index refreshes.
elasticsearch.relocating_shards
The number of shards that are relocating from one node to another.
elasticsearch.search.fetch.current
The number of search fetches currently running.
elasticsearch.search.fetch.open_contexts
The number of active searches.
elasticsearch.search.fetch.time
The total time spent on the search fetch.
elasticsearch.search.fetch.total
The total number of search fetches.
elasticsearch.search.query.current
The number of currently active queries.
elasticsearch.search.query.time
The total time spent on queries.
elasticsearch.search.query.total
The total number of queries.
elasticsearch.store.size
The total size in bytes of the store.
elasticsearch.thread_pool.bulk.active
The number of active threads in the bulk pool.
elasticsearch.thread_pool.bulk.queue
The number of queued threads in the bulk pool.
elasticsearch.thread_pool.bulk.threads
The total number of threads in the bulk pool.
elasticsearch.thread_pool.bulk.rejected
The number of rejected threads in the bulk pool.
elasticsearch.thread_pool.fetch_shard_started.active
The number of active threads in the fetch shard started pool.
elasticsearch.thread_pool.fetch_shard_started.threads
The total number of threads in the fetch shard started pool.
elasticsearch.thread_pool.fetch_shard_started.queue
The number of queued threads in the fetch shard started pool.
elasticsearch.thread_pool.fetch_shard_started.rejected
The number of rejected threads in the fetch shard started pool.
elasticsearch.thread_pool.fetch_shard_store.active
The number of active threads in the fetch shard store pool.
elasticsearch.thread_pool.fetch_shard_store.threads
The total number of threads in the fetch shard store pool.
elasticsearch.thread_pool.fetch_shard_store.queue
The number of queued threads in the fetch shard store pool.
elasticsearch.thread_pool.fetch_shard_store.rejected
The number of rejected threads in the fetch shard store pool.
elasticsearch.thread_pool.flush.active
The number of active threads in the flush queue.
elasticsearch.thread_pool.flush.queue
The number of queued threads in the flush pool.
elasticsearch.thread_pool.flush.threads
The total number of threads in the flush pool.
elasticsearch.thread_pool.flush.rejected
The number of rejected threads in the flush pool.
elasticsearch.thread_pool.force_merge.active
The number of active threads for force merge operations.
elasticsearch.thread_pool.force_merge.threads
The total number of threads for force merge operations.
elasticsearch.thread_pool.force_merge.queue
The number of queued threads for force merge operations.
elasticsearch.thread_pool.force_merge.rejected
The number of rejected threads for force merge operations.
elasticsearch.thread_pool.generic.active
The number of active threads in the generic pool.
elasticsearch.thread_pool.generic.queue
The number of queued threads in the generic pool.
elasticsearch.thread_pool.generic.threads
The total number of threads in the generic pool.
elasticsearch.thread_pool.generic.rejected
The number of rejected threads in the generic pool.
elasticsearch.thread_pool.get.active
The number of active threads in the get pool.
elasticsearch.thread_pool.get.queue
The number of queued threads in the get pool.
elasticsearch.thread_pool.get.threads
The total number of threads in the get pool.
elasticsearch.thread_pool.get.rejected
The number of rejected threads in the get pool.
elasticsearch.thread_pool.index.active
The number of active threads in the index pool.
elasticsearch.thread_pool.index.queue
The number of queued threads in the index pool.
elasticsearch.thread_pool.index.threads
The total number of threads in the index pool.
elasticsearch.thread_pool.index.rejected
The number of rejected threads in the index pool.
elasticsearch.thread_pool.listener.active
The number of active threads in the listener pool.
elasticsearch.thread_pool.listener.queue
The number of queued threads in the listener pool.
elasticsearch.thread_pool.listener.threads
The total number of threads in the listener pool.
elasticsearch.thread_pool.listener.rejected
The number of rejected threads in the listener pool.
elasticsearch.thread_pool.management.active
The number of active threads in the management pool.
elasticsearch.thread_pool.management.queue
The number of queued threads in the management pool.
elasticsearch.thread_pool.management.threads
The total number of threads in the management pool.
elasticsearch.thread_pool.management.rejected
The number of rejected threads in the management pool.
elasticsearch.thread_pool.merge.active
The number of active threads in the merge pool.
elasticsearch.thread_pool.merge.queue
The number of queued threads in the merge pool.
elasticsearch.thread_pool.merge.threads
The total number of threads in the merge pool.
elasticsearch.thread_pool.merge.rejected
The number of rejected threads in the merge pool.
elasticsearch.thread_pool.percolate.active
The number of active threads in the percolate pool.
elasticsearch.thread_pool.percolate.queue
The number of queued threads in the percolate pool.
elasticsearch.thread_pool.percolate.threads
The total number of threads in the percolate pool.
elasticsearch.thread_pool.percolate.rejected
The number of rejected threads in the percolate pool.
elasticsearch.thread_pool.refresh.active
The number of active threads in the refresh pool.
elasticsearch.thread_pool.refresh.queue
The number of queued threads in the refresh pool.
elasticsearch.thread_pool.refresh.threads
The total number of threads in the refresh pool.
elasticsearch.thread_pool.refresh.rejected
The number of rejected threads in the refresh pool.
elasticsearch.thread_pool.search.active
The number of active threads in the search pool.
elasticsearch.thread_pool.search.queue
The number of queued threads in the search pool.
elasticsearch.thread_pool.search.threads
The total number of threads in the search pool.
elasticsearch.thread_pool.search.rejected
The number of rejected threads in the search pool.
elasticsearch.thread_pool.snapshot.active
The number of active threads in the snapshot pool.
elasticsearch.thread_pool.snapshot.queue
The number of queued threads in the snapshot pool.
elasticsearch.thread_pool.snapshot.threads
The total number of threads in the snapshot pool.
elasticsearch.thread_pool.snapshot.rejected
The number of rejected threads in the snapshot pool.
elasticsearch.thread_pool.write.active
The number of active threads in the write pool.
elasticsearch.thread_pool.write.queue
The number of queued threads in the write pool.
elasticsearch.thread_pool.write.threads
The total number of threads in the write pool.
elasticsearch.thread_pool.write.rejected
The number of rejected threads in the write pool.
elasticsearch.transport.rx_count
The total number of packets received in cluster communication.
elasticsearch.transport.rx_size
The total size of data received in cluster communication.
elasticsearch.transport.server_open
The number of connections opened for cluster communication.
elasticsearch.transport.tx_count
The total number of packets sent in cluster communication.
elasticsearch.transport.tx_size
The total size of data sent in cluster communication.
elasticsearch.unassigned_shards
The number of shards that are unassigned to a node.
elasticsearch.delayed_unassigned_shards
The number of shards whose allocation has been delayed.
jvm.gc.collection_count
The total number of garbage collections run by the JVM.
jvm.gc.collection_time
The total time spent on garbage collection in the JVM.
jvm.gc.collectors.old.collection_time
The total time spent in major GCs in the JVM that collect old generation
objects.
jvm.gc.collectors.old.count
The total count of major GCs in the JVM that collect old generation
objects.
jvm.gc.collectors.young.collection_time
The total time spent in minor GCs in the JVM that collect young
generation objects.
jvm.gc.collectors.young.count
The total count of minor GCs in the JVM that collect young generation
objects.
jvm.gc.concurrent_mark_sweep.collection_time
The total time spent on “concurrent mark & sweep” GCs in the JVM.
jvm.gc.concurrent_mark_sweep.count
The total count of “concurrent mark & sweep” GCs in the JVM.
jvm.gc.par_new.collection_time
The total time spent on “parallel new” GCs in the JVM.
jvm.gc.par_new.count
The total count of “parallel new” GCs in the JVM.
jvm.mem.heap_committed
The amount of memory guaranteed to be available to the JVM heap.
jvm.mem.heap_in_use
The proportion of the JVM heap currently in use, expressed as a value
between 0 and 1.
jvm.mem.heap_max
The maximum amount of memory that can be used by the JVM heap.
jvm.mem.heap_used
The amount of memory in bytes currently used by the JVM heap.
jvm.mem.non_heap_committed
The amount of memory guaranteed to be available to JVM non-heap.
jvm.mem.non_heap_used
The amount of memory in bytes currently used by the JVM non-heap.
jvm.mem.pools.young.used
The amount of memory in bytes currently used by the Young Generation
heap region.
jvm.mem.pools.young.max
The maximum amount of memory that can be used by the Young Generation
heap region.
jvm.mem.pools.old.used
The amount of memory in bytes currently used by the Old Generation heap
region.
jvm.mem.pools.old.max
The maximum amount of memory that can be used by the Old Generation heap
region.
jvm.mem.pools.survivor.used
The amount of memory in bytes currently used by the Survivor Space.
jvm.mem.pools.survivor.max
The maximum amount of memory that can be used by the Survivor Space.
jvm.threads.count
The number of active threads in the JVM.
jvm.threads.peak_count
The peak number of threads used by the JVM.
elasticsearch.index.health
The status of the index.
elasticsearch.index.docs.count
The number of documents in the index.
elasticsearch.index.docs.deleted
The number of deleted documents in the index.
elasticsearch.index.primary_shards
The number of primary shards in the index.
elasticsearch.index.replica_shards
The number of replica shards in the index.
elasticsearch.index.primary_store_size
The store size of primary shards in the index.
elasticsearch.index.store_size
The store size of primary and replica shards in the index.
6.3.2.6 - etcd Metrics
See Application Integrations for more information.
etcd.leader.counts.fail
Rate of failed Raft RPC requests.
etcd.leader.counts.success
Rate of successful Raft RPC requests.
etcd.leader.latency.avg
Average latency to each peer in the cluster.
etcd.leader.latency.current
Current latency to each peer in the cluster.
etcd.leader.latency.max
Maximum latency to each peer in the cluster.
etcd.leader.latency.min
Minimum latency to each peer in the cluster.
etcd.leader.latency.stddev
Standard deviation latency to each peer in the cluster.
etcd.self.recv.appendrequest.count
Rate of append requests this node has processed.
etcd.self.recv.bandwidthrate
Rate of bytes received.
etcd.self.recv.pkgrate
Rate of packets received.
etcd.self.send.appendrequest.count
Rate of append requests this node has sent.
etcd.self.send.bandwidthrate
Rate of bytes sent.
etcd.self.send.pkgrate
Rate of packets sent.
etcd.store.compareanddelete.fail
Rate of failed compare-and-delete requests.
etcd.store.compareanddelete.success
Rate of successful compare-and-delete requests.
etcd.store.compareandswap.fail
Rate of failed compare-and-swap requests.
etcd.store.compareandswap.success
Rate of successful compare-and-swap requests.
etcd.store.create.fail
Rate of failed create requests.
etcd.store.create.success
Rate of successful create requests.
etcd.store.delete.fail
Rate of failed delete requests.
etcd.store.delete.success
Rate of successful delete requests.
etcd.store.expire.count
Rate of expired keys.
etcd.store.gets.fail
Rate of failed get requests.
etcd.store.gets.success
Rate of successful get requests.
etcd.store.sets.fail
Rate of failed set requests.
etcd.store.sets.success
Rate of successful set requests.
etcd.store.update.fail
Rate of failed update requests.
etcd.store.update.success
Rate of successful update requests.
etcd.store.watchers
Rate of watchers.
6.3.2.7 - fluentd Metrics
See Application Integrations for more information.
fluentd.buffer_queue_length
The length of the plugin buffer queue for this plugin.
fluentd.buffer_total_queued_size
The size of the buffer queue for this plugin.
fluentd.retry_count
The number of retries for this plugin.
6.3.2.8 - Go Metrics
See Application Integrations for more information.
go_expvar.memstats.alloc
The number of bytes allocated and not yet freed.
go_expvar.memstats.frees
The cumulative number of heap objects freed.
go_expvar.memstats.heap_alloc
The number of bytes of allocated heap objects.
go_expvar.memstats.heap_idle
The number of bytes in idle spans.
go_expvar.memstats.heap_inuse
The number of bytes in non-idle spans.
go_expvar.memstats.heap_objects
The total number of allocated objects.
go_expvar.memstats.heap_released
The number of bytes released to the OS.
go_expvar.memstats.heap_sys
The number of bytes obtained from the system.
go_expvar.memstats.lookups
The number of pointer lookups.
go_expvar.memstats.mallocs
The number of mallocs.
go_expvar.memstats.num_gc
The number of garbage collections.
go_expvar.memstats.pause_ns.avg
The average of recent GC pause durations.
go_expvar.memstats.pause_ns.count
The number of submitted GC pause durations.
go_expvar.memstats.pause_ns.max
The max GC pause duration.
go_expvar.memstats.pause_ns.median
The median GC pause duration.
go_expvar.memstats.pause_total_ns
The total GC pause duration over the lifetime of the process.
go_expvar.memstats.total_alloc
The bytes allocated (even if freed).
6.3.2.9 - HTTP Metrics
See Application Integrations for more information.
http.ssl.days_left
The number of days until the SSL certificate expires.
network.http.response_time
The response time of a HTTP request to a specified URL.
6.3.2.10 - HAProxy Metrics
See Application Integrations for more information.
haproxy.backend_hosts
The number of backend hosts.
haproxy.backend.bytes.in_rate
The rate of bytes in on backend hosts.
haproxy.backend.bytes.out_rate
The rate of bytes out on backend hosts.
haproxy.backend.connect.time
The average connect time over the last 1024 requests.
haproxy.backend.denied.req_rate
The number of requests denied due to security concerns.
haproxy.backend.denied.resp_rate
The number of responses denied due to security concerns.
haproxy.backend.errors.con_rate
The rate of requests that encountered an error trying to connect to a
backend server.
haproxy.backend.errors.resp_rate
The rate of responses aborted due to error.
haproxy.backend.queue.current
The number of requests without an assigned backend.
haproxy.backend.queue.time
The average queue time over the last 1024 requests.
haproxy.backend.response.1xx
The backend HTTP responses with 1xx code.
haproxy.backend.response.2xx
The backend HTTP responses with 2xx code.
haproxy.backend.response.3xx
The backend HTTP responses with 3xx code.
haproxy.backend.response.4xx
The backend HTTP responses with 4xx code.
haproxy.backend.response.5xx
The backend HTTP responses with 5xx code.
haproxy.backend.response.other
The backend HTTP responses with another code (protocol error).
haproxy.backend.response.time
The average response time over the last 1024 requests (0 for TCP).
haproxy.backend.session.current
The number of active backend sessions.
haproxy.backend.session.limit
The configured backend session limit.
haproxy.backend.session.pct
The percentage of sessions in use. The formula used for this metric is
backend.session.current / backend.session.limit * 100.
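That formula is plain percentage arithmetic; a hedged sketch of the computation (the `session_pct` function is illustrative, not an agent API):

```python
def session_pct(current: int, limit: int) -> float:
    """Compute haproxy.backend.session.pct:
    backend.session.current / backend.session.limit * 100."""
    if limit == 0:
        # No session limit configured; avoid division by zero.
        return 0.0
    return current / limit * 100.0

# 150 active sessions against a limit of 200 is 75% utilization.
```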
haproxy.backend.session.rate
The number of backend sessions created per second.
haproxy.backend.session.time
The average total session time over the last 1024 requests.
haproxy.backend.uptime
The number of seconds since the last UP<->DOWN transition.
haproxy.backend.warnings.redis_rate
The number of times a request was redispatched to another server.
haproxy.backend.warnings.retr_rate
The number of times a connection to a server was retried.
haproxy.count_per_status
The number of hosts by status (UP/DOWN/NOLB/MAINT).
haproxy.frontend.bytes.in_rate
The rate of bytes in on frontend hosts.
haproxy.frontend.bytes.out_rate
The rate of bytes out on frontend hosts.
haproxy.frontend.denied.req_rate
The number of requests denied due to security concerns.
haproxy.frontend.denied.resp_rate
The number of responses denied due to security concerns.
haproxy.frontend.errors.req_rate
The rate of request errors.
haproxy.frontend.requests.rate
The number of HTTP requests per second.
haproxy.frontend.response.1xx
The frontend HTTP responses with 1xx code.
haproxy.frontend.response.2xx
The frontend HTTP responses with 2xx code.
haproxy.frontend.response.3xx
The frontend HTTP responses with 3xx code.
haproxy.frontend.response.4xx
The frontend HTTP responses with 4xx code.
haproxy.frontend.response.5xx
The frontend HTTP responses with 5xx code.
haproxy.frontend.response.other
The frontend HTTP responses with another code (protocol error).
haproxy.frontend.session.current
The number of active frontend sessions.
haproxy.frontend.session.limit
The configured frontend session limit.
haproxy.frontend.session.pct
The percentage of sessions in use. The formula used for this metric is
frontend.session.current / frontend.session.limit * 100.
haproxy.frontend.session.rate
The number of frontend sessions created per second.
Agent 9.6.0 Additional HAProxy Metrics
haproxy.backend.requests.tot_rate
The rate of total HTTP requests.
haproxy.frontend.connections.rate
The number of connections per second.
haproxy.frontend.connections.tot_rate
The rate of total connections.
haproxy.frontend.requests.intercepted
The number of intercepted requests per second.
haproxy.frontend.requests.tot_rate
The rate of total HTTP requests.
6.3.2.11 - Jenkins Metrics
See Application Integrations for more information.
jenkins.job.duration
The duration of a job, measured in seconds.
jenkins.job.success
The status of a successful job.
jenkins.job.failure
The status of a failed job.
6.3.2.12 - Lighttpd Metrics
See Application Integrations for more information.
lighttpd.net.bytes
The total number of bytes sent and received.
lighttpd.net.bytes_per_s
The number of bytes sent and received per second.
lighttpd.net.hits
The total number of hits since the start.
lighttpd.net.request_per_s
The number of requests per second.
The number of active connections.
The number of idle connections.
The amount of time the server has been up and running.
6.3.2.13 - Memcached Metrics
See Application Integrations for more information.
memcache.avg_item_size
The average size of an item.
memcache.bytes
The current number of bytes used by this server to store items.
memcache.bytes_read_rate
The rate of bytes read from the network by this server.
memcache.bytes_written_rate
The rate of bytes written to the network by this server.
memcache.cas_badval_rate
The rate at which keys are compared and swapped where the comparison
(original) value did not match the supplied value.
memcache.cas_hits_rate
The rate at which keys are compared and swapped and found present.
memcache.cas_misses_rate
The rate at which keys are compared and swapped and not found present.
memcache.cmd_flush_rate
The rate of flush_all commands.
memcache.cmd_get_rate
The rate of get commands.
memcache.cmd_set_rate
The rate of set commands.
memcache.connection_structures
The number of connection structures allocated by the server.
memcache.curr_connections
The number of open connections to this server.
memcache.curr_items
The current number of items stored by the server.
memcache.delete_hits_rate
The rate at which delete commands result in items being removed.
memcache.delete_misses_rate
The rate at which delete commands result in no items being removed.
memcache.evictions_rate
The rate at which valid items are removed from cache to free memory for
new items.
memcache.fill_percent
The amount of memory being used by the server for storing items as a
percentage of the max allowed.
memcache.get_hit_percent
The percentage of requested keys that are found present since the start
of the Memcached server.
memcache.get_hits_rate
The rate at which keys are requested and found present.
memcache.get_misses_rate
The rate at which keys are requested and not found.
memcache.items.age
The age of the oldest item in the LRU.
memcache.items.crawler_reclaimed_rate
The rate at which items were freed by the LRU crawler.
memcache.items.direct_reclaims_rate
The rate at which worker threads had to directly pull LRU tails to find
memory for a new item.
memcache.items.evicted_nonzero_rate
The rate at which items with an explicit, nonzero expire time had to be
evicted from the LRU before expiring.
memcache.items.evicted_rate
The rate at which items had to be evicted from the LRU before expiring.
memcache.items.evicted_time
The number of seconds since the last access for the most recent item
evicted from this class.
memcache.items.evicted_unfetched_rate
The rate at which valid items were evicted from the LRU without ever
being touched after being set.
memcache.items.expired_unfetched_rate
The rate at which expired items were reclaimed from the LRU without ever
being touched after being set.
memcache.items.lrutail_reflocked_rate
The rate at which items were found to be refcount-locked in the LRU tail.
memcache.items.moves_to_cold_rate
The rate at which items were moved from HOT or WARM into COLD.
memcache.items.moves_to_warm_rate
The rate at which items were moved from COLD to WARM.
memcache.items.moves_within_lru_rate
The rate at which active items were bumped within HOT or WARM.
memcache.items.number
The number of items presently stored in this slab class.
memcache.items.number_cold
The number of items presently stored in the COLD LRU.
memcache.items.number_hot
The number of items presently stored in the HOT LRU.
memcache.items.number_noexp
The number of items presently stored in the NOEXP class.
memcache.items.number_warm
The number of items presently stored in the WARM LRU.
memcache.items.outofmemory_rate
The rate at which the underlying slab class was unable to store a new
item.
memcache.items.reclaimed_rate
The rate at which entries were stored using memory from an expired
entry.
memcache.items.tailrepairs_rate
The rate at which Memcached self-healed a slab with a refcount leak.
memcache.limit_maxbytes
The number of bytes this server is allowed to use for storage.
memcache.listen_disabled_num_rate
The rate at which the server has reached the max connection limit.
memcache.pointer_size
The default size of pointers on the host OS (generally 32 or 64).
memcache.rusage_system_rate
The fraction of time the CPU spent executing kernel code on behalf of
this server process.
memcache.rusage_user_rate
The fraction of time the CPU spent executing user code for this server
process.
memcache.slabs.active_slabs
The total number of slab classes allocated.
memcache.slabs.cas_badval_rate
The rate at which CAS commands failed to modify a value due to a bad CAS
ID.
memcache.slabs.cas_hits_rate
The rate at which CAS commands modified this slab class.
memcache.slabs.chunk_size
The amount of space each chunk uses.
memcache.slabs.chunks_per_page
The number of chunks that exist within one page.
memcache.slabs.cmd_set_rate
The rate at which set requests stored data in this slab class.
memcache.slabs.decr_hits_rate
The rate at which decr commands modified this slab class.
memcache.slabs.delete_hits_rate
The rate at which delete commands succeeded in this slab class.
memcache.slabs.free_chunks
The number of chunks not yet allocated to items or freed via delete.
memcache.slabs.free_chunks_end
The number of free chunks at the end of the last allocated page.
memcache.slabs.get_hits_rate
The rate at which get requests were serviced by this slab class.
memcache.slabs.incr_hits_rate
The rate at which incrs commands modified this slab class.
memcache.slabs.mem_requested
The number of bytes requested to be stored in this slab.
memcache.slabs.total_chunks
The total number of chunks allocated to the slab class.
memcache.slabs.total_malloced
The total amount of memory allocated to slab pages.
memcache.slabs.total_pages
The total number of pages allocated to the slab class.
memcache.slabs.touch_hits_rate
The rate of touches serviced by this slab class.
memcache.slabs.used_chunks
The number of chunks that have been allocated to items.
memcache.slabs.used_chunks_rate
The rate at which chunks have been allocated to items.
memcache.threads
The number of threads used by the current Memcached server process.
memcache.total_connections_rate
The rate at which connections to this server are opened.
memcache.total_items
The total number of items stored by this server since it started.
memcache.uptime
The number of seconds this server has been running.
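The Memcached counters above are exposed through the memcached text protocol, whose `stats` command returns one `STAT <name> <value>` line per counter, terminated by `END`. As a minimal sketch (the sample values below are illustrative only), such a response can be parsed into a dictionary:

```python
def parse_memcached_stats(payload: str) -> dict:
    """Parse a memcached text-protocol `stats` response.

    Each line has the form `STAT <name> <value>`; the response ends
    with `END`. Numeric values are converted when possible.
    """
    stats = {}
    for line in payload.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            name, value = parts[1], parts[2]
            try:
                stats[name] = float(value) if "." in value else int(value)
            except ValueError:
                stats[name] = value  # non-numeric stats, e.g. version
    return stats


# Sample response fragment (values are illustrative only).
sample = """STAT uptime 4000
STAT threads 4
STAT pointer_size 64
STAT limit_maxbytes 67108864
END"""

stats = parse_memcached_stats(sample)
print(stats["pointer_size"])  # 64
```

The same `stats` family of commands (`stats items`, `stats slabs`) backs the `memcache.items.*` and `memcache.slabs.*` groups listed above.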
6.3.2.14 - Mesos/Marathon Metrics
6.3.2.14.1 - Mesos Agent Metrics
See Application Integrations for more information.
mesos.slave.cpus_percent
The percentage of CPUs allocated to the slave.
mesos.slave.cpus_total
The total number of CPUs.
mesos.slave.cpus_used
The number of CPUs allocated to the slave.
mesos.slave.disk_percent
The percentage of disk space allocated to the slave.
mesos.slave.disk_total
The total disk space available.
mesos.slave.disk_used
The amount of disk space allocated to the slave.
mesos.slave.executors_registering
The number of executors registering.
mesos.slave.executors_running
The number of executors currently running.
mesos.slave.executors_terminated
The number of terminated executors.
mesos.slave.executors_terminating
The number of terminating executors.
mesos.slave.frameworks_active
The number of active frameworks.
mesos.slave.invalid_framework_messages
The number of invalid framework messages.
mesos.slave.invalid_status_updates
The number of invalid status updates.
mesos.slave.mem_percent
The percentage of memory allocated to the slave.
mesos.slave.mem_total
The total memory available.
mesos.slave.mem_used
The amount of memory allocated to the slave.
mesos.slave.recovery_errors
The number of errors encountered during slave recovery.
mesos.slave.tasks_failed
The number of failed tasks.
mesos.slave.tasks_finished
The number of finished tasks.
mesos.slave.tasks_killed
The number of killed tasks.
mesos.slave.tasks_lost
The number of lost tasks.
mesos.slave.tasks_running
The number of running tasks.
mesos.slave.tasks_staging
The number of staging tasks.
mesos.slave.tasks_starting
The number of starting tasks.
mesos.slave.valid_framework_messages
The number of valid framework messages.
mesos.slave.valid_status_updates
The number of valid status updates.
mesos.state.task.cpu
The amount of CPU allocated to the task.
mesos.state.task.disk
The disk space available for the task.
mesos.state.task.mem
The amount of memory used by the task.
mesos.stats.registered
Defines whether this slave is registered with a master.
mesos.stats.system.cpus_total
The total number of CPUs available.
mesos.stats.system.load_1min
The average load for the last minute.
mesos.stats.system.load_5min
The average load for the last five minutes.
mesos.stats.system.load_15min
The average load for the last 15 minutes.
mesos.stats.system.mem_free_bytes
The amount of free memory.
mesos.stats.system.mem_total_bytes
The total amount of memory.
mesos.stats.uptime_secs
The current uptime for the slave.
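The `*_percent` metrics above are the ratio of a resource's used amount to its total capacity. As a sketch, they can be derived from a Mesos agent's `/metrics/snapshot`-style JSON, which uses flat keys such as `slave/cpus_used` (the sample values here are illustrative only):

```python
def slave_resource_percents(snapshot: dict) -> dict:
    """Derive <resource>_percent values from a Mesos agent
    /metrics/snapshot-style dict with flat keys like "slave/cpus_used".

    percent = used / total, as a fraction between 0 and 1; resources
    with zero reported capacity yield 0.0.
    """
    out = {}
    for res in ("cpus", "mem", "disk"):
        used = snapshot.get(f"slave/{res}_used", 0.0)
        total = snapshot.get(f"slave/{res}_total", 0.0)
        out[f"{res}_percent"] = used / total if total else 0.0
    return out


# Illustrative snapshot fragment.
snapshot = {
    "slave/cpus_used": 2.0, "slave/cpus_total": 8.0,
    "slave/mem_used": 4096.0, "slave/mem_total": 16384.0,
    "slave/disk_used": 0.0, "slave/disk_total": 0.0,
}
print(slave_resource_percents(snapshot))
```

The master-side `mesos.cluster.*_percent` metrics in the next section follow the same used-over-total shape.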
6.3.2.14.2 - Mesos Master Metrics
See Application Integrations for more information.
mesos.cluster.cpus_percent
The percentage of CPUs allocated to the cluster.
mesos.cluster.cpus_total
The total number of CPUs.
mesos.cluster.cpus_used
The number of CPUs used by the cluster.
mesos.cluster.disk_percent
The percentage of disk space allocated to the cluster.
mesos.cluster.disk_total
The total amount of disk space.
mesos.cluster.disk_used
The amount of disk space used by the cluster.
mesos.cluster.dropped_messages
The number of dropped messages.
mesos.cluster.event_queue_dispatches
The number of dispatches in the event queue.
mesos.cluster.event_queue_http_requests
The number of HTTP requests in the event queue.
mesos.cluster.event_queue_messages
The number of messages in the event queue.
mesos.cluster.frameworks_active
The number of active frameworks.
mesos.cluster.frameworks_connected
The number of connected frameworks.
mesos.cluster.frameworks_disconnected
The number of disconnected frameworks.
mesos.cluster.frameworks_inactive
The number of inactive frameworks.
mesos.cluster.gpus_total
The total number of GPUs.
mesos.cluster.invalid_framework_to_executor_messages
The number of invalid messages between the framework and the executor.
mesos.cluster.invalid_status_update_acknowledgements
The number of invalid status update acknowledgements.
mesos.cluster.invalid_status_updates
The number of invalid status updates.
mesos.cluster.mem_percent
The percentage of memory allocated to the cluster.
mesos.cluster.mem_total
The total amount of memory available.
mesos.cluster.mem_used
The amount of memory the cluster is using.
mesos.cluster.outstanding_offers
The number of outstanding resource offers.
mesos.cluster.slave_registrations
The number of slaves able to rejoin the cluster after a disconnect.
mesos.cluster.slave_removals
The number of slaves that have been removed for any reason, including
maintenance.
mesos.cluster.slave_reregistrations
The number of slaves that have re-registered.
mesos.cluster.slave_shutdowns_canceled
The number of slave shutdown processes that have been cancelled.
mesos.cluster.slave_shutdowns_scheduled
The number of slaves that have failed health checks and are scheduled
for removal.
mesos.cluster.slaves_active
The number of active slaves.
mesos.cluster.slaves_connected
The number of connected slaves.
mesos.cluster.slaves_disconnected
The number of disconnected slaves.
mesos.cluster.slaves_inactive
The number of inactive slaves.
mesos.cluster.tasks_error
The number of cluster tasks that resulted in an error.
mesos.cluster.tasks_failed
The number of failed cluster tasks.
mesos.cluster.tasks_finished
The number of completed cluster tasks.
mesos.cluster.tasks_killed
The number of killed cluster tasks.
mesos.cluster.tasks_lost
The number of lost cluster tasks.
mesos.cluster.tasks_running
The number of cluster tasks currently running.
mesos.cluster.tasks_staging
The number of cluster tasks currently staging.
mesos.cluster.tasks_starting
The number of cluster tasks starting.
mesos.cluster.valid_framework_to_executor_messages
The number of valid messages between the framework and the executor.
mesos.cluster.valid_status_update_acknowledgements
The number of valid status update acknowledgements.
mesos.cluster.valid_status_updates
The number of valid status updates.
mesos.framework.cpu
The CPU of the Mesos framework.
mesos.framework.disk
The total disk space of the Mesos framework, measured in mebibytes.
mesos.framework.mem
The total memory of the Mesos framework, measured in mebibytes.
mesos.registrar.queued_operations
The number of queued operations.
mesos.registrar.registry_size_bytes
The size of the Mesos registry in bytes.
mesos.registrar.state_fetch_ms
The Mesos registry’s read latency, in milliseconds.
mesos.registrar.state_store_ms
The Mesos registry’s write latency, in milliseconds.
mesos.registrar.state_store_ms.count
The number of registry writes recorded.
mesos.registrar.state_store_ms.max
The maximum write latency for the registry, in milliseconds.
mesos.registrar.state_store_ms.min
The minimum write latency for the registry, in milliseconds.
mesos.registrar.state_store_ms.p50
The median registry write latency, in milliseconds.
mesos.registrar.state_store_ms.p90
The 90th percentile registry write latency, in milliseconds.
mesos.registrar.state_store_ms.p95
The 95th percentile registry write latency, in milliseconds.
mesos.registrar.state_store_ms.p99
The 99th percentile registry write latency, in milliseconds.
mesos.registrar.state_store_ms.p999
The 99.9th percentile registry write latency, in milliseconds.
mesos.registrar.state_store_ms.p9999
The 99.99th percentile registry write latency, in milliseconds.
mesos.role.cpu
The CPU capacity of the configured role.
mesos.role.disk
The total disk space available to the Mesos role, in mebibytes.
mesos.role.mem
The total memory available to the Mesos role, in mebibytes.
mesos.stats.elected
Defines whether this is the elected master or not.
mesos.stats.system.cpus_total
The total number of CPUs in the system.
mesos.stats.system.load_1min
The average load for the last minute.
mesos.stats.system.load_5min
The average load for the last five minutes.
mesos.stats.system.load_15min
The average load for the last fifteen minutes.
mesos.stats.system.mem_free_bytes
The total amount of free system memory, in bytes.
mesos.stats.system.mem_total_bytes
The total cluster memory in bytes.
mesos.stats.uptime_secs
The current uptime of the cluster.
6.3.2.14.3 - Marathon Metrics
See Application Integrations for more information.
marathon.apps
The total number of applications.
marathon.backoffFactor
The multiplication factor for the delay between each consecutive failed
task. This value is multiplied by the value of marathon.backoffSeconds
each time the task fails until the maximum delay is reached, or the task
succeeds.
marathon.backoffSeconds
The period of time between attempts to run a failed task. This value is
multiplied by marathon.backoffFactor for each consecutive task failure,
until either the task succeeds or the maximum delay is reached.
marathon.cpus
The number of CPUs configured for each application instance.
marathon.disk
The amount of disk space configured for each application instance.
marathon.instances
The number of instances of a specific application.
marathon.mem
The total amount of configured memory for each instance of a specific
application.
marathon.tasksRunning
The number of tasks running for a specific application.
marathon.tasksStaged
The number of tasks staged for a specific application.
6.3.2.15 - MongoDB Metrics
See Application Integrations for more information.
Metrics Introduced with Agent v9.7.0
The following metrics are supported by Sysdig Agent v9.7.0 and above.
Metric Name | Description |
---|
mongodb.tcmalloc.generic.current_allocated_bytes | The number of bytes used by the application. |
mongodb.tcmalloc.generic.heap_size | Bytes of system memory reserved by TCMalloc. |
mongodb.tcmalloc.tcmalloc.aggressive_memory_decommit | Status of aggressive memory de-commit mode. |
mongodb.tcmalloc.tcmalloc.central_cache_free_bytes | The number of free bytes in the central cache. |
mongodb.tcmalloc.tcmalloc.current_total_thread_cache_bytes | The number of bytes used across all thread caches. |
mongodb.tcmalloc.tcmalloc.max_total_thread_cache_bytes | The upper limit on the total number of bytes stored across all per-thread caches. |
mongodb.tcmalloc.tcmalloc.pageheap_free_bytes | The number of bytes in free mapped pages in page heap. |
mongodb.tcmalloc.tcmalloc.pageheap_unmapped_bytes | The number of bytes in free unmapped pages in page heap. |
mongodb.tcmalloc.tcmalloc.spinlock_total_delay_ns | Gives the spinlock delay time. |
mongodb.tcmalloc.tcmalloc.thread_cache_free_bytes | The number of free bytes in thread caches. |
mongodb.tcmalloc.tcmalloc.transfer_cache_free_bytes | The number of free bytes that are waiting to be transferred between the central cache and a thread cache. |
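The dotted names in this table mirror the nested `tcmalloc` section of MongoDB's `serverStatus` output. As a sketch (the sample values below are illustrative, not real server output), flattening such a nested document yields metric names of exactly this shape:

```python
def flatten(prefix: str, node: dict) -> dict:
    """Flatten a nested dict into dotted metric names, e.g.
    {"generic": {"heap_size": 1}} -> {"<prefix>.generic.heap_size": 1}."""
    flat = {}
    for key, value in node.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            flat.update(flatten(name, value))
        else:
            flat[name] = value
    return flat


# Illustrative fragment shaped like serverStatus()["tcmalloc"].
tcmalloc = {
    "generic": {"current_allocated_bytes": 123456, "heap_size": 262144},
    "tcmalloc": {"pageheap_free_bytes": 4096},
}
metrics = flatten("mongodb.tcmalloc", tcmalloc)
print(metrics["mongodb.tcmalloc.generic.heap_size"])  # 262144
```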
mongodb.asserts.msgps
Number of message assertions raised per second.
mongodb.asserts.regularps
Number of regular assertions raised per second.
mongodb.asserts.rolloversps
Number of times that the rollover counters roll over per second. The
counters roll over to zero every 2^30 assertions.
mongodb.asserts.userps
Number of user assertions raised per second.
mongodb.asserts.warningps
Number of warnings raised per second.
mongodb.backgroundflushing.average_ms
Average time for each flush to disk.
mongodb.backgroundflushing.flushesps
Number of times the database has flushed all writes to disk.
mongodb.backgroundflushing.last_ms
Amount of time that the last flush operation took to complete.
mongodb.backgroundflushing.total_ms
Total amount of time that the `mongod` processes have spent writing
(i.e. flushing) data to disk.
mongodb.connections.available
Number of unused available incoming connections the database can
provide.
mongodb.connections.current
Number of connections to the database server from clients.
mongodb.connections.totalcreated
Total number of connections created.
mongodb.cursors.timedout
Total number of cursors that have timed out since the server process
started.
mongodb.cursors.totalopen
Number of cursors that MongoDB is maintaining for clients.
mongodb.dbs
Total number of existing databases.
mongodb.dur.commits
Number of transactions written to the journal during the last journal
group commit interval.
mongodb.dur.commitsinwritelock
Count of the commits that occurred while a write lock was held.
mongodb.dur.compression
Compression ratio of the data written to the journal.
mongodb.dur.earlycommits
Number of times MongoDB requested a commit before the scheduled journal
group commit interval.
mongodb.dur.journaledmb
Amount of data written to journal during the last journal group commit
interval.
mongodb.dur.timems.commits
Amount of time spent for commits.
mongodb.dur.timems.commitsinwritelock
Amount of time spent for commits that occurred while a write lock was
held.
mongodb.dur.timems.dt
Amount of time over which MongoDB collected the `dur.timeMS` data.
mongodb.dur.timems.preplogbuffer
Amount of time spent preparing to write to the journal.
mongodb.dur.timems.remapprivateview
Amount of time spent remapping copy-on-write memory mapped views.
mongodb.dur.timems.writetodatafiles
Amount of time spent writing to data files after journaling.
mongodb.dur.timems.writetojournal
Amount of time spent writing to the journal.
mongodb.dur.writetodatafilesmb
Amount of data written from journal to the data files during the last
journal group commit interval.
Number of page faults per second that require disk operations.
mongodb.fsynclocked
Number of fsyncLock operations performed on a mongo instance.
mongodb.globallock.activeclients.readers
Count of the active client connections performing read operations.
mongodb.globallock.activeclients.total
Total number of active client connections to the database.
mongodb.globallock.activeclients.writers
Count of active client connections performing write operations.
mongodb.globallock.currentqueue.readers
Number of operations that are currently queued and waiting for the read
lock.
mongodb.globallock.currentqueue.total
Total number of operations queued waiting for the lock.
mongodb.globallock.currentqueue.writers
Number of operations that are currently queued and waiting for the write
lock.
mongodb.globallock.locktime
Time since the database last started that the globalLock has been held.
mongodb.globallock.ratio
Ratio of the time that the globalLock has been held to the total time
since it was created.
mongodb.globallock.totaltime
Time since the database last started and created the global lock.
mongodb.indexcounters.accessesps
Number of times that operations have accessed indexes per second.
mongodb.indexcounters.hitsps
Number of times per second that an index has been accessed and mongod is
able to return the index from memory.
mongodb.indexcounters.missesps
Number of times per second that an operation attempted to access an
index that was not in memory.
mongodb.indexcounters.missratio
Ratio of index hits to misses.
mongodb.indexcounters.resetsps
Number of times per second the index counters have been reset.
mongodb.locks.collection.acquirecount.exclusiveps
Number of times the collection lock type was acquired in the Exclusive
(X) mode.
mongodb.locks.collection.acquirecount.intent_exclusiveps
Number of times the collection lock type was acquired in the Intent
Exclusive (IX) mode.
mongodb.locks.collection.acquirecount.intent_sharedps
Number of times the collection lock type was acquired in the Intent
Shared (IS) mode.
mongodb.locks.collection.acquirecount.sharedps
Number of times the collection lock type was acquired in the Shared (S)
mode.
mongodb.locks.collection.acquirewaitcount.exclusiveps
Number of times the collection lock type acquisition in the Exclusive
(X) mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.collection.acquirewaitcount.sharedps
Number of times the collection lock type acquisition in the Shared (S)
mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.collection.timeacquiringmicros.exclusiveps
Wait time for the collection lock type acquisitions in the Exclusive (X)
mode.
mongodb.locks.collection.timeacquiringmicros.sharedps
Wait time for the collection lock type acquisitions in the Shared (S)
mode.
mongodb.locks.database.acquirecount.exclusiveps
Number of times the database lock type was acquired in the Exclusive (X)
mode.
mongodb.locks.database.acquirecount.intent_exclusiveps
Number of times the database lock type was acquired in the Intent
Exclusive (IX) mode.
mongodb.locks.database.acquirecount.intent_sharedps
Number of times the database lock type was acquired in the Intent Shared
(IS) mode.
mongodb.locks.database.acquirecount.sharedps
Number of times the database lock type was acquired in the Shared (S)
mode.
mongodb.locks.database.acquirewaitcount.exclusiveps
Number of times the database lock type acquisition in the Exclusive (X)
mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.database.acquirewaitcount.intent_exclusiveps
Number of times the database lock type acquisition in the Intent
Exclusive (IX) mode encountered waits because the locks were held in a
conflicting mode.
mongodb.locks.database.acquirewaitcount.intent_sharedps
Number of times the database lock type acquisition in the Intent Shared
(IS) mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.database.acquirewaitcount.sharedps
Number of times the database lock type acquisition in the Shared (S)
mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.database.timeacquiringmicros.exclusiveps
Wait time for the database lock type acquisitions in the Exclusive (X)
mode.
mongodb.locks.database.timeacquiringmicros.intent_exclusiveps
Wait time for the database lock type acquisitions in the Intent
Exclusive (IX) mode.
mongodb.locks.database.timeacquiringmicros.intent_sharedps
Wait time for the database lock type acquisitions in the Intent Shared
(IS) mode.
mongodb.locks.database.timeacquiringmicros.sharedps
Wait time for the database lock type acquisitions in the Shared (S)
mode.
mongodb.locks.global.acquirecount.exclusiveps
Number of times the global lock type was acquired in the Exclusive (X)
mode.
mongodb.locks.global.acquirecount.intent_exclusiveps
Number of times the global lock type was acquired in the Intent
Exclusive (IX) mode.
mongodb.locks.global.acquirecount.intent_sharedps
Number of times the global lock type was acquired in the Intent Shared
(IS) mode.
mongodb.locks.global.acquirecount.sharedps
Number of times the global lock type was acquired in the Shared (S)
mode.
mongodb.locks.global.acquirewaitcount.exclusiveps
Number of times the global lock type acquisition in the Exclusive (X)
mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.global.acquirewaitcount.intent_exclusiveps
Number of times the global lock type acquisition in the Intent Exclusive
(IX) mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.global.acquirewaitcount.intent_sharedps
Number of times the global lock type acquisition in the Intent Shared
(IS) mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.global.acquirewaitcount.sharedps
Number of times the global lock type acquisition in the Shared (S) mode
encountered waits because the locks were held in a conflicting mode.
mongodb.locks.global.timeacquiringmicros.exclusiveps
Wait time for the global lock type acquisitions in the Exclusive (X)
mode.
mongodb.locks.global.timeacquiringmicros.intent_exclusiveps
Wait time for the global lock type acquisitions in the Intent Exclusive
(IX) mode.
mongodb.locks.global.timeacquiringmicros.intent_sharedps
Wait time for the global lock type acquisitions in the Intent Shared
(IS) mode.
mongodb.locks.global.timeacquiringmicros.sharedps
Wait time for the global lock type acquisitions in the Shared (S) mode.
Number of times the metadata lock type was acquired in the Exclusive (X)
mode.
Number of times the metadata lock type was acquired in the Shared (S)
mode.
mongodb.locks.mmapv1journal.acquirecount.intent_exclusiveps
Number of times the MMAPv1 storage engine lock type was acquired in the
Intent Exclusive (IX) mode.
mongodb.locks.mmapv1journal.acquirecount.intent_sharedps
Number of times the MMAPv1 storage engine lock type was acquired in the
Intent Shared (IS) mode.
mongodb.locks.mmapv1journal.acquirewaitcount.intent_exclusiveps
Number of times the MMAPv1 storage engine lock type acquisition in the
Intent Exclusive (IX) mode encountered waits because the locks were held
in a conflicting mode.
mongodb.locks.mmapv1journal.acquirewaitcount.intent_sharedps
Number of times the MMAPv1 storage engine lock type acquisition in the
Intent Shared (IS) mode encountered waits because the locks were held in
a conflicting mode.
mongodb.locks.mmapv1journal.timeacquiringmicros.intent_exclusiveps
Wait time for the MMAPv1 storage engine lock type acquisitions in the
Intent Exclusive (IX) mode.
mongodb.locks.mmapv1journal.timeacquiringmicros.intent_sharedps
Wait time for the MMAPv1 storage engine lock type acquisitions in the
Intent Shared (IS) mode.
mongodb.locks.oplog.acquirecount.intent_exclusiveps
Number of times the oplog lock type was acquired in the Intent Exclusive
(IX) mode.
mongodb.locks.oplog.acquirecount.sharedps
Number of times the oplog lock type was acquired in the Shared (S) mode.
mongodb.locks.oplog.acquirewaitcount.intent_exclusiveps
Number of times the oplog lock type acquisition in the Intent Exclusive
(IX) mode encountered waits because the locks were held in a conflicting
mode.
mongodb.locks.oplog.acquirewaitcount.sharedps
Number of times the oplog lock type acquisition in the Shared (S) mode
encountered waits because the locks were held in a conflicting mode.
mongodb.locks.oplog.timeacquiringmicros.intent_exclusiveps
Wait time for the oplog lock type acquisitions in the Intent Exclusive
(IX) mode.
mongodb.locks.oplog.timeacquiringmicros.sharedps
Wait time for the oplog lock type acquisitions in the Shared (S) mode.
mongodb.mem.bits
Target architecture of the mongod binary, either 64 or 32 bits.
mongodb.mem.mapped
Amount of mapped memory by the database.
mongodb.mem.mappedwithjournal
The amount of mapped memory, including the memory used for journaling.
mongodb.mem.resident
Amount of memory currently used by the database process.
mongodb.mem.virtual
Amount of virtual memory used by the mongod process.
mongodb.metrics.commands.count.failed
Number of times count failed
mongodb.metrics.commands.count.total
Number of times count executed
mongodb.metrics.commands.createIndexes.failed
Number of times createIndexes failed
mongodb.metrics.commands.createIndexes.total
Number of times createIndexes executed
mongodb.metrics.commands.delete.failed
Number of times delete failed
mongodb.metrics.commands.delete.total
Number of times delete executed
mongodb.metrics.commands.eval.failed
Number of times eval failed
mongodb.metrics.commands.eval.total
Number of times eval executed
mongodb.metrics.commands.findAndModify.failed
Number of times findAndModify failed
mongodb.metrics.commands.findAndModify.total
Number of times findAndModify executed
mongodb.metrics.commands.insert.failed
Number of times insert failed
mongodb.metrics.commands.insert.total
Number of times insert executed
mongodb.metrics.commands.update.failed
Number of times update failed
mongodb.metrics.commands.update.total
Number of times update executed
mongodb.metrics.cursor.open.notimeout
Number of open cursors with the option `DBQuery.Option.noTimeout` set
to prevent timeout after a period of inactivity.
mongodb.metrics.cursor.open.pinned
Number of pinned open cursors.
mongodb.metrics.cursor.open.total
Number of cursors that MongoDB is maintaining for clients.
mongodb.metrics.cursor.timedoutps
Number of cursors that time out, per second.
mongodb.metrics.document.deletedps
Number of documents deleted per second.
mongodb.metrics.document.insertedps
Number of documents inserted per second.
mongodb.metrics.document.returnedps
Number of documents returned by queries per second.
mongodb.metrics.document.updatedps
Number of documents updated per second.
Number of getLastError operations per second with a specified write
concern (i.e. w) that wait for one or more members of a replica set to
acknowledge the write operation.
mongodb.metrics.getlasterror.wtime.totalmillisps
Fraction of time (ms/s) that the mongod has spent performing
getLastError operations with write concern (i.e. w) that wait for one or
more members of a replica set to acknowledge the write operation.
mongodb.metrics.getlasterror.wtimeoutsps
Number of times per second that write concern operations have timed out
as a result of the wtimeout threshold to getLastError.
mongodb.metrics.operation.fastmodps
Number of update operations per second that neither cause documents to
grow nor require updates to the index.
mongodb.metrics.operation.idhackps
Number of queries per second that contain the _id field.
mongodb.metrics.operation.writeconflictsps
Number of times per second that write concern operations have
encountered a conflict.
mongodb.metrics.operation.scanandorderps
Number of queries per second that return sorted results but cannot
perform the sort operation using an index.
mongodb.metrics.queryexecutor.scannedps
Number of index items scanned per second during queries and query-plan
evaluation.
mongodb.metrics.record.movesps
Number of times per second documents move within the on-disk
representation of the MongoDB data set.
mongodb.metrics.repl.apply.batches.numps
Number of batches applied across all databases per second.
mongodb.metrics.repl.apply.batches.totalmillisps
Fraction of time (ms/s) the mongod has spent applying operations from
the oplog.
mongodb.metrics.repl.apply.opsps
Number of oplog operations applied per second.
mongodb.metrics.repl.buffer.count
Number of operations in the oplog buffer.
mongodb.metrics.repl.buffer.maxsizebytes
Maximum size of the buffer.
mongodb.metrics.repl.buffer.sizebytes
Current size of the contents of the oplog buffer.
mongodb.metrics.repl.network.bytesps
Amount of data read from the replication sync source per second.
mongodb.metrics.repl.network.getmores.numps
Number of getmore operations per second.
mongodb.metrics.repl.network.getmores.totalmillisps
Fraction of time (ms/s) required to collect data from getmore
operations.
mongodb.metrics.repl.network.opsps
Number of operations read from the replication source per second.
mongodb.metrics.repl.network.readerscreatedps
Number of oplog query processes created per second.
mongodb.metrics.repl.preload.docs.numps
Number of documents loaded during the pre-fetch stage of replication.
mongodb.metrics.repl.preload.docs.totalmillisps
Amount of time spent loading documents as part of the pre-fetch stage of
replication.
mongodb.metrics.repl.preload.indexes.numps
Number of index entries loaded by members before updating documents as
part of the pre-fetch stage of replication.
mongodb.metrics.repl.preload.indexes.totalmillisps
Amount of time spent loading documents as part of the pre-fetch stage of
replication.
mongodb.metrics.ttl.deleteddocumentsps
Number of documents deleted from collections with a ttl index per
second.
mongodb.metrics.ttl.passesps
Number of times per second the background process removes documents from
collections with a ttl index.
mongodb.network.bytesinps
The number of bytes that reflects the amount of network traffic received
by this database.
mongodb.network.bytesoutps
The number of bytes that reflects the amount of network traffic sent
from this database.
mongodb.network.numrequestsps
Number of distinct requests that the server has received.
mongodb.opcounters.commandps
Total number of commands per second issued to the database.
mongodb.opcounters.deleteps
Number of delete operations per second.
mongodb.opcounters.getmoreps
Number of getmore operations per second.
mongodb.opcounters.insertps
Number of insert operations per second.
mongodb.opcounters.queryps
Total number of queries per second.
mongodb.opcounters.updateps
Number of update operations per second.
mongodb.opcountersrepl.commandps
Total number of replicated commands issued to the database per second.
mongodb.opcountersrepl.deleteps
Number of replicated delete operations per second.
mongodb.opcountersrepl.getmoreps
Number of replicated getmore operations per second.
mongodb.opcountersrepl.insertps
Number of replicated insert operations per second.
mongodb.opcountersrepl.queryps
Total number of replicated queries per second.
mongodb.opcountersrepl.updateps
Number of replicated update operations per second.
mongodb.oplog.logsizemb
Total size of the oplog.
mongodb.oplog.timediff
Oplog window: difference between the first and last operation in the
oplog.
mongodb.oplog.usedsizemb
Total amount of space used by the oplog.
mongodb.replset.health
Member health value of the replica set: conveys if the member is up
(i.e. 1) or down (i.e. 0).
mongodb.replset.replicationlag
Delay between a write operation on the primary and its copy to a
secondary.
mongodb.replset.state
State of a replica that reflects its disposition within the set.
mongodb.replset.votefraction
Fraction of votes a server will cast in a replica set election.
mongodb.replset.votes
The number of votes a server will cast in a replica set election.
mongodb.stats.datasize
Total size of the data held in this database including the padding
factor.
mongodb.stats.indexes
Total number of indexes across all collections in the database.
mongodb.stats.indexsize
Total size of all indexes created on this database.
mongodb.stats.objects
Number of objects (documents) in the database across all collections.
mongodb.stats.storagesize
Total amount of space allocated to collections in this database for
document storage.
mongodb.uptime
Number of seconds that the mongos or mongod process has been active.
mongodb.wiredtiger.cache.bytes_currently_in_cache
Size of the data currently in cache.
mongodb.wiredtiger.cache.failed_eviction_of_pages_exceeding_the_in_memory_maximumps
Number of failed eviction of pages that exceeded the in-memory maximum,
per second.
mongodb.wiredtiger.cache.in_memory_page_splits
In-memory page splits.
Maximum cache size.
mongodb.wiredtiger.cache.maximum_page_size_at_eviction
Maximum page size at eviction.
mongodb.wiredtiger.cache.modified_pages_evicted
Number of modified pages evicted from the cache.
mongodb.wiredtiger.cache.pages_currently_held_in_cache
Number of pages currently held in the cache.
mongodb.wiredtiger.cache.pages_evicted_by_application_threadsps
Number of pages evicted by application threads per second.
mongodb.wiredtiger.cache.pages_evicted_exceeding_the_in_memory_maximumps
Number of pages evicted because they exceeded the cache in-memory
maximum, per second.
mongodb.wiredtiger.cache.tracked_dirty_bytes_in_cache
Size of the dirty data in the cache.
mongodb.wiredtiger.cache.unmodified_pages_evicted
Number of unmodified pages evicted from the cache.
mongodb.wiredtiger.concurrenttransactions.read.available
Number of available read tickets (concurrent transactions) remaining.
mongodb.wiredtiger.concurrenttransactions.read.out
Number of read tickets (concurrent transactions) in use.
mongodb.wiredtiger.concurrenttransactions.read.totaltickets
Total number of read tickets (concurrent transactions) available.
mongodb.wiredtiger.concurrenttransactions.write.available
Number of available write tickets (concurrent transactions) remaining.
mongodb.wiredtiger.concurrenttransactions.write.out
Number of write tickets (concurrent transactions) in use.
mongodb.wiredtiger.concurrenttransactions.write.totaltickets
Total number of write tickets (concurrent transactions) available.
mongodb.collection.size
The total size in bytes of the data in the collection plus the size of
every index on the collection.
mongodb.collection.avgObjSize
The size of the average object in the collection in bytes.
mongodb.collection.count
Total number of objects in the collection.
mongodb.collection.capped
Whether or not the collection is capped.
mongodb.collection.max
Maximum number of documents in a capped collection.
mongodb.collection.maxSize
Maximum size of a capped collection in bytes.
mongodb.collection.storageSize
Total storage space allocated to this collection for document storage.
mongodb.collection.nindexes
Total number of indexes on the collection.
mongodb.collection.indexSizes
Size of index in bytes.
mongodb.collection.indexes.accesses.ops
Number of times the index was used.
mongodb.usage.commands.countps
Number of commands per second.
mongodb.usage.commands.count
Number of commands since server start (deprecated).
mongodb.usage.commands.time
Total time spent performing commands in microseconds.
mongodb.usage.getmore.countps
Number of getmore operations per second.
mongodb.usage.getmore.count
Number of getmore operations since server start (deprecated).
mongodb.usage.getmore.time
Total time spent performing getmore operations in microseconds.
mongodb.usage.insert.countps
Number of inserts per second.
mongodb.usage.insert.count
Number of inserts since server start (deprecated).
mongodb.usage.insert.time
Total time spent performing inserts in microseconds.
mongodb.usage.queries.countps
Number of queries per second.
mongodb.usage.queries.count
Number of queries since server start (deprecated).
mongodb.usage.queries.time
Total time spent performing queries in microseconds.
mongodb.usage.readLock.countps
Number of read locks per second.
mongodb.usage.readLock.count
Number of read locks since server start (deprecated).
mongodb.usage.readLock.time
Total time spent performing read locks in microseconds.
mongodb.usage.remove.countps
Number of removes per second.
mongodb.usage.remove.count
Number of removes since server start (deprecated).
mongodb.usage.remove.time
Total time spent performing removes in microseconds.
mongodb.usage.total.countps
Number of operations per second.
mongodb.usage.total.count
Number of operations since server start (deprecated).
mongodb.usage.total.time
Total time spent performing operations in microseconds.
mongodb.usage.update.countps
Number of updates per second.
mongodb.usage.update.count
Number of updates since server start (deprecated).
mongodb.usage.update.time
Total time spent performing updates in microseconds.
mongodb.usage.writeLock.countps
Number of write locks per second.
mongodb.usage.writeLock.count
Number of write locks since server start (deprecated).
mongodb.usage.writeLock.time
Total time spent performing write locks in microseconds.
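The metrics above with a `countps` suffix are per-second rates derived from the cumulative counters that MongoDB reports. A minimal sketch of that derivation (the function name and sample values are illustrative, not part of the agent's API):

```python
def counter_rate(prev_count, curr_count, interval_seconds):
    """Derive a per-second rate from two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (curr_count - prev_count) / interval_seconds

# e.g. two samples of mongodb.usage.insert.count taken 10 s apart
rate = counter_rate(1200, 1450, 10)  # -> 25.0 inserts per second
```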
6.3.2.16 - MySQL Metrics
See Application Integrations for more information.
mysql.galera.wsrep_cluster_size
The current number of nodes in the Galera cluster.
mysql.innodb.buffer_pool_free
The number of free pages in the InnoDB Buffer Pool.
mysql.innodb.buffer_pool_total
The total number of pages in the InnoDB Buffer Pool.
mysql.innodb.buffer_pool_used
The number of used pages in the InnoDB Buffer Pool.
mysql.innodb.buffer_pool_utilization
The utilization of the InnoDB Buffer Pool.
mysql.innodb.current_row_locks
The number of current row locks.
mysql.innodb.data_reads
The rate of data reads.
mysql.innodb.data_writes
The rate of data writes.
mysql.innodb.mutex_os_waits
The rate of mutex OS waits.
mysql.innodb.mutex_spin_rounds
The rate of mutex spin rounds.
mysql.innodb.mutex_spin_waits
The rate of mutex spin waits.
mysql.innodb.os_log_fsyncs
The rate of fsync writes to the log file.
mysql.innodb.row_lock_time
The fraction of time spent (ms/s) acquiring row locks.
mysql.innodb.row_lock_waits
The number of times per second a row lock had to be waited for.
mysql.net.connections
The rate of connections to the server.
mysql.net.max_connections
The maximum number of connections that have been in use simultaneously
since the server started.
The rate of delete statements.
The rate of delete-multi statements.
The rate of insert statements.
The rate of insert-select statements.
The rate of replace-select statements.
The rate of select statements.
The rate of update statements.
The rate of update-multi statements.
The rate of internal on-disk temporary tables created per second by the
server while executing statements.
The rate of temporary files created per second.
The rate of internal temporary tables created per second by the server
while executing statements.
The percentage of CPU time spent in kernel space by MySQL.
The key cache utilization ratio.
The number of open files.
The number of tables that are open.
The rate of query cache hits.
The rate of queries.
The rate of statements executed by the server.
The rate of slow queries.
The total number of times that a request for a table lock could not be
granted immediately and a wait was needed.
The number of currently open connections.
The number of threads that are not sleeping.
The percentage of CPU time spent in user space by MySQL.
mysql.replication.seconds_behind_master
The lag in seconds between the master and the slave.
mysql.replication.slave_running
A boolean showing if this server is a replication slave that is
connected to a replication master.
mysql.replication.slaves_connected
The number of slaves connected to a replication master.
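The buffer pool gauges in this section are related: `mysql.innodb.buffer_pool_utilization` can be understood as used pages divided by total pages. A sketch of that relationship (the function name and sample numbers are illustrative):

```python
def buffer_pool_utilization(free_pages, total_pages):
    """Utilization of the InnoDB Buffer Pool: used pages / total pages."""
    used_pages = total_pages - free_pages
    return used_pages / total_pages

# e.g. 2048 free pages out of 8192 total
util = buffer_pool_utilization(2048, 8192)  # -> 0.75
```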
6.3.2.17 - NGINX and NGINX Plus Metrics
6.3.2.17.1 - NGINX Metrics
See Application Integrations for more information.
nginx.net.conn_dropped_per_s
The rate of connections dropped.
nginx.net.conn_opened_per_s
The rate of connections opened.
nginx.net.connections
The total number of active connections.
nginx.net.reading
The number of connections reading client requests.
nginx.net.request_per_s
The rate of requests processed.
nginx.net.waiting
The number of keep-alive connections waiting for work.
nginx.net.writing
The number of connections waiting on upstream responses and/or writing
responses back to the client.
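These counters correspond to the values exposed by NGINX's `stub_status` module, which monitoring agents typically scrape. A minimal sketch of enabling the endpoint (the listen port and location path are assumptions, not required values):

```nginx
server {
    listen 8080;
    location /nginx_status {
        stub_status;       # exposes active connections, accepts, handled,
                           # requests, and reading/writing/waiting counts
        allow 127.0.0.1;   # restrict the status page to local scrapes
        deny all;
    }
}
```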
6.3.2.17.2 - NGINX Plus Metrics
See Application Integrations for more information.
nginx.plus.cache.bypass.bytes
The total number of bytes read from the proxied server.
nginx.plus.cache.bypass.bytes_written
The total number of bytes written to the cache.
nginx.plus.cache.bypass.responses
The total number of responses not taken from the cache.
nginx.plus.cache.bypass.responses_written
The total number of responses written to the cache.
nginx.plus.cache.cold
Boolean. Defines whether the cache loader process is still loading data
from the disk into the cache or not.
nginx.plus.cache.expired.bytes
The total number of bytes read from the proxied server.
nginx.plus.cache.expired.bytes_written
The total number of bytes written to the cache.
nginx.plus.cache.expired.responses
The total number of responses not taken from the cache.
nginx.plus.cache.expired.responses_written
The total number of responses written to the cache.
nginx.plus.cache.hit.bytes
The total number of bytes read from the cache.
nginx.plus.cache.hit.responses
The total number of responses read from the cache.
nginx.plus.cache.max_size
The limit on the maximum size of the cache specified in the
configuration.
nginx.plus.cache.miss.bytes
The total number of bytes read from the proxied server.
nginx.plus.cache.miss.bytes_written
The total number of bytes written to the cache.
nginx.plus.cache.miss.responses
The total number of responses not taken from the cache.
nginx.plus.cache.miss.responses_written
The total number of responses written to the cache.
nginx.plus.cache.revalidated.bytes
The total number of bytes read from the cache.
nginx.plus.cache.revalidated.response
The total number of responses read from the cache.
nginx.plus.cache.size
The current size of the cache.
nginx.plus.cache.stale.bytes
The total number of bytes read from the cache.
nginx.plus.cache.stale.responses
The total number of responses read from the cache.
nginx.plus.cache.updating.bytes
The total number of bytes read from the cache.
nginx.plus.cache.updating.responses
The total number of responses read from the cache.
nginx.plus.connections.accepted
The total number of accepted client connections.
nginx.plus.connections.active
The current number of active client connections.
nginx.plus.connections.dropped
The total number of dropped client connections.
nginx.plus.connections.idle
The current number of idle client connections.
nginx.plus.generation
The total number of configuration reloads.
nginx.plus.load_timestamp
Time of the last configuration reload (time since Epoch).
nginx.plus.pid
The ID of the worker process that handled the status request.
nginx.plus.plus.upstream.peers.fails
The total number of unsuccessful attempts to communicate with the
server.
nginx.plus.ppid
The ID of the master process that started the worker process.
nginx.plus.processes.respawned
The total number of abnormally terminated and re-spawned child
processes.
nginx.plus.requests.current
The current number of client requests.
nginx.plus.requests.total
The total number of client requests.
nginx.plus.server_zone.discarded
The total number of requests completed without sending a response.
nginx.plus.server_zone.processing
The number of client requests that are currently being processed.
nginx.plus.server_zone.received
The total amount of data received from clients.
nginx.plus.server_zone.requests
The total number of client requests received from clients.
nginx.plus.server_zone.responses.1xx
The number of responses with 1xx status code.
nginx.plus.server_zone.responses.2xx
The number of responses with 2xx status code.
nginx.plus.server_zone.responses.3xx
The number of responses with 3xx status code.
nginx.plus.server_zone.responses.4xx
The number of responses with 4xx status code.
nginx.plus.server_zone.responses.5xx
The number of responses with 5xx status code.
nginx.plus.server_zone.responses.total
The total number of responses sent to clients.
nginx.plus.server_zone.sent
The total amount of data sent to clients.
nginx.plus.slab.pages.free
The current number of free memory pages.
nginx.plus.slab.pages.used
The current number of used memory pages.
nginx.plus.slab.slots.fails
The number of unsuccessful attempts to allocate memory of the specified
size.
nginx.plus.slab.slots.free
The current number of free memory slots.
nginx.plus.slab.slots.reqs
The total number of attempts to allocate memory of the specified size.
nginx.plus.slab.slots.used
The current number of used memory slots.
nginx.plus.ssl.handshakes
The total number of successful SSL handshakes.
nginx.plus.ssl.handshakes_failed
The total number of failed SSL handshakes.
nginx.plus.ssl.session_reuses
The total number of session reuses during SSL handshake.
nginx.plus.stream.server_zone.connections
The total number of connections accepted from clients.
nginx.plus.stream.server_zone.discarded
The total number of requests completed without sending a response.
nginx.plus.stream.server_zone.processing
The number of client requests that are currently being processed.
nginx.plus.stream.server_zone.received
The total amount of data received from clients.
nginx.plus.stream.server_zone.sent
The total amount of data sent to clients.
nginx.plus.stream.server_zone.sessions.1xx
The number of responses with 1xx status code.
nginx.plus.stream.server_zone.sessions.2xx
The number of responses with 2xx status code.
nginx.plus.stream.server_zone.sessions.3xx
The number of responses with 3xx status code.
nginx.plus.stream.server_zone.sessions.4xx
The number of responses with 4xx status code.
nginx.plus.stream.server_zone.sessions.5xx
The number of responses with 5xx status code.
nginx.plus.stream.server_zone.sessions.total
The total number of responses sent to clients.
nginx.plus.stream.upstream.peers.active
The current number of connections.
nginx.plus.stream.upstream.peers.backup
A boolean value indicating whether the server is a backup server.
nginx.plus.stream.upstream.peers.connections
The total number of client connections forwarded to this server.
nginx.plus.stream.upstream.peers.downstart
The time (time since Epoch) when the server became “unavail”,
“checking”, or “unhealthy”.
nginx.plus.stream.upstream.peers.downtime
Total time the server was in the “unavail” or “checking” or “unhealthy”
states.
nginx.plus.stream.upstream.peers.fails
The total number of unsuccessful attempts to communicate with the
server.
nginx.plus.stream.upstream.peers.health_checks.checks
The total number of health check requests made.
nginx.plus.stream.upstream.peers.health_checks.fails
The number of failed health checks.
nginx.plus.stream.upstream.peers.health_checks.last_passed
Boolean indicating if the last health check request was successful and
passed tests.
nginx.plus.stream.upstream.peers.health_checks.unhealthy
How many times the server became unhealthy (state “unhealthy”).
nginx.plus.stream.upstream.peers.id
The ID of the server.
nginx.plus.stream.upstream.peers.received
The total number of bytes received from this server.
The time (time since Epoch) when the server was last selected to process
a connection.
The total number of bytes sent to this server.
nginx.plus.stream.upstream.peers.unavail
How many times the server became unavailable for client connections
(state “unavail”).
nginx.plus.stream.upstream.peers.weight
Weight of the server.
nginx.plus.stream.upstream.zombies
The current number of servers removed from the group but still
processing active client connections.
nginx.plus.timestamp
Current time since Epoch.
nginx.plus.upstream.keepalive
The current number of idle keepalive connections.
nginx.plus.upstream.peers.active
The current number of active connections.
nginx.plus.upstream.peers.backup
A boolean value indicating whether the server is a backup server.
nginx.plus.upstream.peers.downstart
The time (since Epoch) when the server became “unavail” or “unhealthy”.
nginx.plus.upstream.peers.downtime
Total time the server was in the “unavail” and “unhealthy” states.
nginx.plus.upstream.peers.health_checks.checks
The total number of health check requests made.
nginx.plus.upstream.peers.health_checks.fails
The number of failed health checks.
nginx.plus.upstream.peers.health_checks.last_passed
Boolean indicating if the last health check request was successful and
passed tests.
nginx.plus.upstream.peers.health_checks.unhealthy
How many times the server became unhealthy (state “unhealthy”).
nginx.plus.upstream.peers.id
The ID of the server.
nginx.plus.upstream.peers.received
The total amount of data received from this server.
nginx.plus.upstream.peers.requests
The total number of client requests forwarded to this server.
nginx.plus.upstream.peers.responses.1xx
The number of responses with 1xx status code.
nginx.plus.upstream.peers.responses.1xx_count
The number of responses with 1xx status code (shown as count).
nginx.plus.upstream.peers.responses.2xx
The number of responses with 2xx status code.
nginx.plus.upstream.peers.responses.2xx_count
The number of responses with 2xx status code (shown as count).
nginx.plus.upstream.peers.responses.3xx
The number of responses with 3xx status code.
nginx.plus.upstream.peers.responses.3xx_count
The number of responses with 3xx status code (shown as count).
nginx.plus.upstream.peers.responses.4xx
The number of responses with 4xx status code.
nginx.plus.upstream.peers.responses.4xx_count
The number of responses with 4xx status code (shown as count).
nginx.plus.upstream.peers.responses.5xx
The number of responses with 5xx status code.
nginx.plus.upstream.peers.responses.5xx_count
The number of responses with 5xx status code (shown as count).
nginx.plus.upstream.peers.responses.total
The total number of responses obtained from this server.
The time (since Epoch) when the server was last selected to process a
request (1.7.5).
The total amount of data sent to this server.
nginx.plus.upstream.peers.unavail
How many times the server became unavailable for client requests (state
“unavail”) due to the number of unsuccessful attempts reaching the
max_fails threshold.
nginx.plus.upstream.peers.weight
The weight of the server.
nginx.plus.version
The NGINX version.
6.3.2.18 - NTP Metrics
See Application Integrations for more information.
ntp.offset
The time difference between the local clock and the NTP reference clock,
in seconds.
6.3.2.19 - PGBouncer Metrics
See Application Integrations for more information.
pgbouncer.pools.cl_active
The number of client connections linked to a server connection and able
to process queries.
pgbouncer.pools.cl_waiting
The number of client connections waiting on a server connection.
pgbouncer.pools.maxwait
The age of the oldest unserved client connection.
pgbouncer.pools.sv_active
The number of server connections linked to a client connection.
pgbouncer.pools.sv_idle
The number of server connections idle and ready for a client query.
pgbouncer.pools.sv_login
The number of server connections currently in the process of logging in.
pgbouncer.pools.sv_tested
The number of server connections currently running either
server_reset_query or server_check_query.
pgbouncer.pools.sv_used
The number of server connections idle more than server_check_delay,
needing server_check_query.
pgbouncer.stats.avg_query
The average query duration.
pgbouncer.stats.avg_recv
The average amount of client network traffic received.
pgbouncer.stats.avg_req
The average number of requests per second in the last stat period.
pgbouncer.stats.avg_sent
The average amount of client network traffic sent.
pgbouncer.stats.bytes_received_per_second
The network traffic received, in bytes per second.
pgbouncer.stats.bytes_sent_per_second
The network traffic sent, in bytes per second.
pgbouncer.stats.requests_per_second
The request rate.
pgbouncer.stats.total_query_time
The time spent by PgBouncer actively querying PostgreSQL.
6.3.2.20 - PHP-FPM Metrics
See Application Integrations for more information.
php_fpm.listen_queue.size
The size of the socket queue of pending connections.
php_fpm.processes.active
The total number of active processes.
php_fpm.processes.idle
The total number of idle processes.
php_fpm.processes.max_reached
The number of times the process limit has been reached.
php_fpm.processes.total
The total number of processes.
php_fpm.requests.accepted
The total number of accepted requests.
php_fpm.requests.slow
The total number of slow requests.
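These counters come from PHP-FPM's status page, which must be enabled in the pool configuration before they can be collected. A minimal sketch (the pool name, file path, and status path are assumptions that vary by installation):

```ini
; e.g. /etc/php-fpm.d/www.conf (location varies by distribution)
[www]
pm.status_path = /status   ; enables the status page that exposes these metrics
```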
6.3.2.21 - PostgreSQL Metrics
See Application Integrations for more information.
Metric Name | Type | Description |
---|---|---|
postgresql.seq_scans | gauge | The number of sequential scans initiated on this table. |
postgresql.index_scans | gauge | The number of index scans initiated on this table. |
postgresql.index_rows_fetched | gauge | The number of live rows fetched by index scans. |
postgresql.rows_hot_updated | gauge | The number of rows HOT updated, meaning no separate index update was needed. |
postgresql.live_rows | gauge | The estimated number of live rows. |
postgresql.dead_rows | gauge | The estimated number of dead rows. |
postgresql.index_rows_read | gauge | The number of index entries returned by scans on this index. |
postgresql.table_size | gauge | The total disk space used by the specified table. Includes TOAST, free space map, and visibility map. Excludes indexes. |
postgresql.index_size | gauge | The total disk space used by indexes attached to the specified table. |
postgresql.total_size | gauge | The total disk space used by the table, including indexes and TOAST data. |
postgresql.heap_blocks_read | gauge | The number of disk blocks read from this table. |
postgresql.heap_blocks_hit | gauge | The number of buffer hits in this table. |
postgresql.index_blocks_read | gauge | The number of disk blocks read from all indexes on this table. |
postgresql.index_blocks_hit | gauge | The number of buffer hits in all indexes on this table. |
postgresql.toast_blocks_read | gauge | The number of disk blocks read from this table’s TOAST table. |
postgresql.toast_blocks_hit | gauge | The number of buffer hits in this table’s TOAST table. |
postgresql.toast_index_blocks_read | gauge | The number of disk blocks read from this table’s TOAST table index. |
postgresql.toast_index_blocks_hit | gauge | The number of buffer hits in this table’s TOAST table index. |
postgresql.active_queries | gauge | The number of active queries in this database. |
postgresql.archiver.archived_count | gauge | The number of WAL files that have been successfully archived. |
postgresql.archiver.failed_count | gauge | The number of failed attempts for archiving WAL files. |
postgresql.before_xid_wraparound | gauge | The number of transactions that can occur until a transaction wraparound. |
postgresql.index_rel_rows_fetched | rate | The number of live rows fetched by index scans. |
postgresql.transactions.idle_in_transaction | gauge | The number of ‘idle in transaction’ transactions in this database. |
postgresql.transactions.open | gauge | The number of open transactions in this database. |
postgresql.waiting_queries | gauge | The number of waiting queries in this database. |
postgresql.bgwriter.buffers_alloc | gauge | The number of buffers allocated. |
postgresql.bgwriter.buffers_backend | gauge | The number of buffers written directly by a backend. |
postgresql.bgwriter.buffers_backend_fsync | gauge | The number of times a backend had to execute its own fsync call instead of the background writer. |
postgresql.bgwriter.buffers_checkpoint | gauge | The number of buffers written during checkpoints. |
postgresql.bgwriter.buffers_clean | gauge | The number of buffers written by the background writer. |
postgresql.bgwriter.checkpoints_requested | gauge | The number of requested checkpoints that were performed. |
postgresql.bgwriter.checkpoints_timed | gauge | The number of scheduled checkpoints that were performed. |
postgresql.bgwriter.maxwritten_clean | gauge | The number of times the background writer stopped a cleaning scan due to writing too many buffers. |
postgresql.bgwriter.sync_time | gauge | The total amount of checkpoint processing time spent synchronizing files to disk. |
postgresql.bgwriter.write_time | gauge | The total amount of checkpoint processing time spent writing files to disk. |
postgresql.buffer_hit | gauge | The number of times disk blocks were found in the buffer cache, preventing the need to read from the database. |
postgresql.commits | gauge | The number of transactions that have been committed in this database. |
postgresql.connections | gauge | The number of active connections to this database. |
postgresql.database_size | gauge | The disk space used by this database. |
postgresql.deadlocks | gauge | The number of deadlocks detected in this database. |
postgresql.disk_read | gauge | The number of disk blocks read in this database. |
postgresql.locks | gauge | The number of locks active for this database. |
postgresql.max_connections | gauge | The maximum number of client connections allowed to this database. |
postgresql.percent_usage_connections | gauge | The number of connections to this database as a fraction of the maximum number of allowed connections. |
postgresql.replication_delay | gauge | The current replication delay in seconds. Only available with PostgreSQL 9.1 and newer. |
postgresql.replication_delay_bytes | gauge | The current replication delay in bytes. Only available with PostgreSQL 9.2 and newer. |
postgresql.rollbacks | gauge | The number of transactions that have been rolled back in this database. |
postgresql.rows_fetched | gauge | The number of rows fetched by queries in this database. |
postgresql.rows_inserted | gauge | The number of rows inserted by queries in this database. The metrics can be segmented by ‘db’ or ‘table’ and can be viewed per-relation. |
postgresql.rows_returned | gauge | The number of rows returned by queries in this database. The metrics can be segmented by ‘db’ or ‘table’ and can be viewed per-relation. |
postgresql.rows_updated | gauge | The number of rows updated by queries in this database. |
postgresql.rows_deleted | gauge | The number of rows deleted by queries in this database. The metrics can be segmented by ‘db’ or ‘table’ and can be viewed per-relation. |
postgresql.table.count | gauge | The number of user tables in this database. |
postgresql.temp_bytes | gauge | The amount of data written to temporary files by queries in this database. |
postgresql.temp_files | gauge | The number of temporary files created by queries in this database. |
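`postgresql.percent_usage_connections` in the table above relates two of the other gauges: connections divided by max_connections. A sketch with illustrative numbers (the function name is an assumption for illustration):

```python
def percent_usage_connections(connections, max_connections):
    """Connections in use as a fraction of the allowed maximum."""
    return connections / max_connections

usage = percent_usage_connections(45, 100)  # -> 0.45
```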
6.3.2.22 - RabbitMQ Metrics
See Application Integrations for more information.
rabbitmq.connections
The number of current connections to a given RabbitMQ vhost. Each
connection is tagged as rabbitmq_vhost:<vhost_name>.
rabbitmq.connections.state
The number of connections in the specified connection state.
rabbitmq.exchange.messages.ack.count
The number of messages delivered to clients and acknowledged.
rabbitmq.exchange.messages.ack.rate
The rate of messages delivered to clients and acknowledged per second.
rabbitmq.exchange.messages.confirm.count
The number of messages confirmed.
rabbitmq.exchange.messages.confirm.rate
The rate of messages confirmed per second.
rabbitmq.exchange.messages.deliver_get.count
The sum of messages delivered in acknowledgement mode to consumers, in
no-acknowledgement mode to consumers, in acknowledgement mode in
response to basic.get, and in no-acknowledgement mode in response to
basic.get.
rabbitmq.exchange.messages.deliver_get.rate
The rate per second of the sum of messages delivered in acknowledgement
mode to consumers, in no-acknowledgement mode to consumers, in
acknowledgement mode in response to basic.get, and in no-acknowledgement
mode in response to basic.get.
rabbitmq.exchange.messages.publish_in.count
The number of messages published from channels into this exchange.
rabbitmq.exchange.messages.publish_in.rate
The rate of messages published from channels into this exchange per
second.
rabbitmq.exchange.messages.publish_out.count
The number of messages published from this exchange into queues.
rabbitmq.exchange.messages.publish_out.rate
The rate of messages published from this exchange into queues per
second.
rabbitmq.exchange.messages.publish.count
The number of messages published.
rabbitmq.exchange.messages.publish.rate
The rate of messages published per second.
rabbitmq.exchange.messages.redeliver.count
The number of messages in the deliver_get subset that had the
redelivered flag set.
rabbitmq.exchange.messages.redeliver.rate
The rate per second of messages in the deliver_get subset that had the
redelivered flag set.
rabbitmq.exchange.messages.return_unroutable.count
The number of messages returned to the publisher as unroutable.
rabbitmq.exchange.messages.return_unroutable.rate
The rate of messages returned to the publisher as unroutable per second.
rabbitmq.node.disk_alarm
Defines whether the node has a disk alarm configured.
rabbitmq.node.disk_free
The current free disk space.
rabbitmq.node.fd_used
Used file descriptors.
rabbitmq.node.mem_alarm
Defines whether the node has a memory alarm configured.
rabbitmq.node.mem_used
The total memory used in bytes.
rabbitmq.node.partitions
The number of network partitions this node is seeing.
rabbitmq.node.run_queue
The average number of Erlang processes waiting to run.
rabbitmq.node.running
Defines whether the node is running or not.
rabbitmq.node.sockets_used
The number of file descriptors used as sockets.
rabbitmq.overview.messages.ack.count
The number of messages delivered to clients and acknowledged.
rabbitmq.overview.messages.ack.rate
The rate of messages delivered to clients and acknowledged per second.
rabbitmq.overview.messages.confirm.count
The number of messages confirmed.
rabbitmq.overview.messages.confirm.rate
The rate of messages confirmed per second.
rabbitmq.overview.messages.deliver_get.count
The sum of messages delivered in acknowledgement mode to consumers, in
no-acknowledgement mode to consumers, in acknowledgement mode in
response to basic.get, and in no-acknowledgement mode in response to
basic.get.
rabbitmq.overview.messages.deliver_get.rate
The rate per second of the sum of messages delivered in acknowledgement
mode to consumers, in no-acknowledgement mode to consumers, in
acknowledgement mode in response to basic.get, and in no-acknowledgement
mode in response to basic.get.
rabbitmq.overview.messages.publish_in.count
The number of messages published from channels into this overview.
rabbitmq.overview.messages.publish_in.rate
The rate of messages published from channels into this overview per
second.
rabbitmq.overview.messages.publish_out.count
The number of messages published from this overview into queues.
rabbitmq.overview.messages.publish_out.rate
The rate of messages published from this overview into queues per
second.
rabbitmq.overview.messages.publish.count
The number of messages published.
rabbitmq.overview.messages.publish.rate
The rate of messages published per second.
rabbitmq.overview.messages.redeliver.count
The number of messages in the deliver_get subset that had the
redelivered flag set.
rabbitmq.overview.messages.redeliver.rate
The rate per second of messages in the deliver_get subset that had the
redelivered flag set.
rabbitmq.overview.messages.return_unroutable.count
The number of messages returned to the publisher as unroutable.
rabbitmq.overview.messages.return_unroutable.rate
The rate of messages returned to the publisher as unroutable per second.
rabbitmq.overview.object_totals.channels
The total number of channels.
rabbitmq.overview.object_totals.connections
The total number of connections.
rabbitmq.overview.object_totals.consumers
The total number of consumers.
rabbitmq.overview.object_totals.queues
The total number of queues.
rabbitmq.overview.queue_totals.messages_ready.count
The number of messages ready for delivery.
rabbitmq.overview.queue_totals.messages_ready.rate
The rate of messages ready for delivery.
rabbitmq.overview.queue_totals.messages_unacknowledged.count
The number of unacknowledged messages.
rabbitmq.overview.queue_totals.messages_unacknowledged.rate
The rate of unacknowledged messages.
rabbitmq.overview.queue_totals.messages.count
The total number of messages (ready plus unacknowledged).
rabbitmq.overview.queue_totals.messages.rate
The rate of messages (ready plus unacknowledged).
rabbitmq.queue.active_consumers
The number of active consumers, consumers that can immediately receive
any messages sent to the queue.
rabbitmq.queue.bindings.count
The number of bindings for a specific queue.
rabbitmq.queue.consumer_utilisation
The ratio of time that a queue’s consumers can take new messages.
rabbitmq.queue.consumers
The number of consumers.
rabbitmq.queue.memory
The number of bytes of memory consumed by the Erlang process associated
with the queue, including stack, heap and internal structures.
rabbitmq.queue.messages
The total number of messages in the queue.
rabbitmq.queue.messages_ready
The number of messages ready to be delivered to clients.
rabbitmq.queue.messages_ready.rate
The number of messages ready to be delivered to clients per second.
rabbitmq.queue.messages_unacknowledged
The number of messages delivered to clients but not yet acknowledged.
rabbitmq.queue.messages_unacknowledged.rate
The number of messages delivered to clients but not yet acknowledged per
second.
rabbitmq.queue.messages.ack.count
The number of messages delivered to clients and acknowledged.
rabbitmq.queue.messages.ack.rate
The number of messages delivered to clients and acknowledged per second.
rabbitmq.queue.messages.deliver_get.count
The sum of messages delivered in acknowledgement mode to consumers, in
no-acknowledgement mode to consumers, in acknowledgement mode in
response to basic.get, and in no-acknowledgement mode in response to
basic.get.
rabbitmq.queue.messages.deliver_get.rate
The sum of messages delivered in acknowledgement mode to consumers, in
no-acknowledgement mode to consumers, in acknowledgement mode in
response to basic.get, and in no-acknowledgement mode in response to
basic.get per second.
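The deliver_get description above spells out that the counter is the sum of four delivery modes. A sketch of that composition, using illustrative per-mode values (the variable names are hypothetical, chosen to mirror the four modes listed):

```python
# Illustrative per-mode counters (sample values) that make up deliver_get.
deliver = 400        # acknowledgement mode to consumers
deliver_noack = 250  # no-acknowledgement mode to consumers
get = 80             # acknowledgement mode in response to basic.get
get_noack = 20       # no-acknowledgement mode in response to basic.get

# rabbitmq.queue.messages.deliver_get.count is the sum of all four modes.
deliver_get = deliver + deliver_noack + get + get_noack
print(deliver_get)  # 750
```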
rabbitmq.queue.messages.deliver.count
The number of messages delivered in acknowledgement mode to consumers.
rabbitmq.queue.messages.deliver.rate
The rate of messages delivered in acknowledgement mode to consumers per second.
rabbitmq.queue.messages.publish.count
The number of messages published.
rabbitmq.queue.messages.publish.rate
The rate of messages published per second.
rabbitmq.queue.messages.rate
The total number of messages in the queue per second.
rabbitmq.queue.messages.redeliver.count
The number of messages in deliver_get that had the redelivered flag set.
rabbitmq.queue.messages.redeliver.rate
The rate per second of messages in deliver_get that had the redelivered flag set.
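The redeliver counters above can be combined with deliver_get into a useful derived signal: the fraction of deliveries that were redeliveries, which tends to rise when consumers reject, requeue, or time out on messages. A minimal sketch, assuming metric samples keyed by the names documented above (the sample values are made up):

```python
# Sample metric values, keyed by the metric names documented above.
samples = {
    "rabbitmq.queue.messages.deliver_get.count": 750,
    "rabbitmq.queue.messages.redeliver.count": 75,
}

def redelivery_ratio(metrics):
    """Fraction of deliver_get messages that carried the redelivered flag."""
    total = metrics["rabbitmq.queue.messages.deliver_get.count"]
    redelivered = metrics["rabbitmq.queue.messages.redeliver.count"]
    return redelivered / total if total else 0.0

print(redelivery_ratio(samples))  # 0.1
```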
6.3.2.23 - Supervisord Metrics
See Application Integrations for more information.
supervisord.process.count
The number of supervisord monitored processes.
supervisord.process.uptime
The process uptime.
6.3.2.24 - TCP Metrics
See Application Integrations for more information.
network.tcp.response_time
The response time of a given host and TCP port.
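Conceptually, this metric captures how long it takes to establish a TCP connection to the configured host and port. A self-contained sketch of that measurement (this is an illustration of what the check measures, not the integration's actual implementation):

```python
import socket
import time

def tcp_response_time(host, port, timeout=5.0):
    """Measure the time taken to establish a TCP connection, in seconds.

    Returns None if the connection could not be established within the timeout.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

# Example (illustrative target): measure connect time to a local service.
# rt = tcp_response_time("127.0.0.1", 22)
```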
6.3.2.25 - Varnish Metrics
See Application Integrations for more information.
All Varnish metrics have the type gauge except varnish.n_purgesps,
which has the type rate.
varnish.accept_fail
Accept failures. This metric is only provided by varnish 3.x.
varnish.backend_busy
Maximum number of connections to a given backend.
varnish.backend_conn
Successful connections to a given backend.
varnish.backend_fail
Failed connections for a given backend.
varnish.backend_recycle
Backend connections with keep-alive that are returned to the pool of
connections.
varnish.backend_req
Backend requests.
varnish.backend_retry
Backend connection retries.
varnish.backend_reuse
Recycled connections that were reused.
varnish.backend_toolate
Backend connections closed because they were idle too long.
varnish.backend_unhealthy
Backend connections not tried because the backend was unhealthy.
varnish.bans
Bans in system, including bans superseded by newer bans and bans already
checked by the ban-lurker. This metric is only provided by varnish 4.x.
varnish.bans_added
Bans added to ban list. This metric is only provided by varnish 4.x.
varnish.bans_completed
Bans which are no longer active, either because they got checked by the
ban-lurker or superseded by newer identical bans. This metric is only
provided by varnish 4.x.
varnish.bans_deleted
Bans deleted from ban list. This metric is only provided by varnish 4.x.
varnish.bans_dups
Bans replaced by later identical bans. This metric is only provided by
varnish 4.x.
varnish.bans_lurker_contention
Times the ban-lurker waited for lookups. This metric is only provided by
varnish 4.x.
varnish.bans_lurker_obj_killed
Objects killed by ban-lurker. This metric is only provided by varnish
4.x.
varnish.bans_lurker_tested
Bans and objects tested against each other by the ban-lurker. This
metric is only provided by varnish 4.x.
varnish.bans_lurker_tests_tested
Tests and objects tested against each other by the ban-lurker. ‘ban
req.url == foo && req.http.host == bar’ counts as one in ‘bans_tested’
and as two in ‘bans_tests_tested’. This metric is only provided by
varnish 4.x.
varnish.bans_obj
Bans which use obj.* variables. These bans can possibly be washed by
the ban-lurker. This metric is only provided by varnish 4.x.
varnish.bans_obj_killed
Objects killed by bans during object lookup. This metric is only
provided by varnish 4.x
varnish.bans_persisted_bytes
Bytes used by the persisted ban lists. This metric is only provided by
varnish 4.x.
varnish.bans_persisted_fragmentation
Extra bytes accumulated through dropped and completed bans in the
persistent ban lists. This metric is only provided by varnish 4.x.
varnish.bans_req
Bans which use req.* variables. These bans can not be washed by the
ban-lurker. This metric is only provided by varnish 4.x.
varnish.bans_tested
Bans and objects tested against each other during hash lookup. This
metric is only provided by varnish 4.x.
varnish.bans_tests_tested
Tests and objects tested against each other during lookup. ‘ban req.url
== foo && req.http.host == bar’ counts as one in ‘bans_tested’ and as
two in ‘bans_tests_tested’. This metric is only provided by varnish
4.x.
varnish.busy_sleep
Requests sent to sleep without a worker thread because they found a busy
object. This metric is only provided by varnish 4.x.
varnish.busy_wakeup
Requests taken off the busy object sleep list and rescheduled. This
metric is only provided by varnish 4.x.
varnish.cache_hit
Requests served from the cache.
varnish.cache_hitpass
Requests passed to a backend where the decision to pass them was found
in the cache.
varnish.cache_miss
Requests fetched from a backend server.
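cache_hit and cache_miss together describe cache lookup outcomes, and a common derived signal is the cache hit ratio. A minimal sketch with illustrative counter values:

```python
# Illustrative counter values for the cache lookup outcomes above.
cache_hit = 900    # varnish.cache_hit: requests served from the cache
cache_miss = 100   # varnish.cache_miss: requests fetched from a backend

# Hit ratio over all lookups that could have been cache hits.
lookups = cache_hit + cache_miss
hit_ratio = cache_hit / lookups if lookups else 0.0
print(hit_ratio)  # 0.9
```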
varnish.client_conn
Client connections accepted. This metric is only provided by varnish
3.x.
varnish.client_drop
Client connection dropped, no session. This metric is only provided by
varnish 3.x.
varnish.client_drop_late
Client connection dropped late. This metric is only provided by varnish
3.x.
varnish.client_req
Parseable client requests seen.
varnish.client_req_400
Requests that were malformed in some drastic way. This metric is only
provided by varnish 4.x.
varnish.client_req_411
Requests that were missing a Content-Length: header. This metric is only
provided by varnish 4.x.
varnish.client_req_413
Requests that were too big. This metric is only provided by varnish 4.x.
varnish.client_req_417
Requests with a bad Expect: header. This metric is only provided by
varnish 4.x.
varnish.dir_dns_cache_full
DNS director full DNS cache. This metric is only provided by varnish
3.x.
varnish.dir_dns_failed
DNS director failed lookup. This metric is only provided by varnish 3.x.
varnish.dir_dns_hit
DNS director cached lookup hit. This metric is only provided by varnish
3.x.
varnish.dir_dns_lookups
DNS director lookups. This metric is only provided by varnish 3.x.
varnish.esi_errors
Edge Side Includes (ESI) parse errors.
varnish.esi_warnings
Edge Side Includes (ESI) parse warnings.
varnish.exp_mailed
Objects mailed to expiry thread for handling. This metric is only
provided by varnish 4.x.
varnish.exp_received
Objects received by expiry thread for handling. This metric is only
provided by varnish 4.x.
varnish.fetch_1xx
Back end response with no body because of 1XX response (Informational).
varnish.fetch_204
Back end response with no body because of 204 response (No Content).
varnish.fetch_304
Back end response with no body because of 304 response (Not Modified).
varnish.fetch_bad
Back end responses whose body length could not be determined or that
had bad headers.
varnish.fetch_chunked
Back end response bodies that were chunked.
varnish.fetch_close
Fetch wanted close.
varnish.fetch_eof
Back end response bodies with EOF.
varnish.fetch_failed
Back end response fetches that failed.
varnish.fetch_head
Back end HEAD requests.
varnish.fetch_length
Back end response bodies with Content-Length.
varnish.fetch_no_thread
Back end fetches that failed because no thread was available. This
metric is only provided by varnish 4.x.
varnish.fetch_oldhttp
The number of responses served by backends with HTTP < 1.1.
varnish.fetch_zero
Number of responses that have zero length.
varnish.hcb_insert
HCB inserts.
varnish.hcb_lock
HCB lookups with lock.
varnish.hcb_nolock
HCB lookups without lock.
varnish.LCK.backend.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.backend.creat
Created locks.
varnish.LCK.backend.destroy
Destroyed locks.
varnish.LCK.backend.locks
Lock operations.
varnish.LCK.ban.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.ban.creat
Created locks.
varnish.LCK.ban.destroy
Destroyed locks.
varnish.LCK.ban.locks
Lock operations.
varnish.LCK.busyobj.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.busyobj.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.busyobj.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.cli.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.cli.creat
Created locks.
varnish.LCK.cli.destroy
Destroyed locks.
varnish.LCK.cli.locks
Lock operations.
varnish.LCK.exp.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.exp.creat
Created locks.
varnish.LCK.exp.destroy
Destroyed locks.
varnish.LCK.exp.locks
Lock operations.
varnish.LCK.hcb.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.hcb.creat
Created locks.
varnish.LCK.hcb.destroy
Destroyed locks.
varnish.LCK.hcb.locks
Lock operations.
varnish.LCK.hcl.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.hcl.creat
Created locks.
varnish.LCK.hcl.destroy
Destroyed locks.
varnish.LCK.hcl.locks
Lock operations.
varnish.LCK.herder.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.herder.creat
Created locks.
varnish.LCK.herder.destroy
Destroyed locks.
varnish.LCK.herder.locks
Lock operations.
varnish.LCK.hsl.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.hsl.creat
Created locks.
varnish.LCK.hsl.destroy
Destroyed locks.
varnish.LCK.hsl.locks
Lock operations.
varnish.LCK.lru.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.lru.creat
Created locks.
varnish.LCK.lru.destroy
Destroyed locks.
varnish.LCK.lru.locks
Lock operations.
varnish.LCK.mempool.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.mempool.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.mempool.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.nbusyobj.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.nbusyobj.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.nbusyobj.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.objhdr.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.objhdr.creat
Created locks.
varnish.LCK.objhdr.destroy
Destroyed locks.
varnish.LCK.objhdr.locks
Lock operations.
varnish.LCK.pipestat.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.pipestat.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.pipestat.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.sess.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.sess.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.sess.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.sessmem.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.sessmem.creat
Created locks.
varnish.LCK.sessmem.destroy
Destroyed locks.
varnish.LCK.sessmem.locks
Lock operations.
varnish.LCK.sma.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.sma.creat
Created locks.
varnish.LCK.sma.destroy
Destroyed locks.
varnish.LCK.sma.locks
Lock operations.
varnish.LCK.smf.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.smf.creat
Created locks.
varnish.LCK.smf.destroy
Destroyed locks.
varnish.LCK.smf.locks
Lock operations.
varnish.LCK.smp.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.smp.creat
Created locks.
varnish.LCK.smp.destroy
Destroyed locks.
varnish.LCK.smp.locks
Lock operations.
varnish.LCK.sms.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.sms.creat
Created locks.
varnish.LCK.sms.destroy
Destroyed locks.
varnish.LCK.sms.locks
Lock operations.
varnish.LCK.stat.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.stat.creat
Created locks. This metric is only provided by varnish 3.x.
varnish.LCK.stat.destroy
Destroyed locks. This metric is only provided by varnish 3.x.
varnish.LCK.stat.locks
Lock operations. This metric is only provided by varnish 3.x.
varnish.LCK.vbe.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.vbe.creat
Created locks. This metric is only provided by varnish 3.x.
varnish.LCK.vbe.destroy
Destroyed locks. This metric is only provided by varnish 3.x.
varnish.LCK.vbe.locks
Lock operations. This metric is only provided by varnish 3.x.
varnish.LCK.vbp.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.vbp.creat
Created locks.
varnish.LCK.vbp.destroy
Destroyed locks.
varnish.LCK.vbp.locks
Lock operations.
varnish.LCK.vcapace.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.vcapace.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.vcapace.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.vcl.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.vcl.creat
Created locks.
varnish.LCK.vcl.destroy
Destroyed locks.
varnish.LCK.vcl.locks
Lock operations.
varnish.LCK.vxid.creat
Created locks. This metric is only provided by varnish 4.x.
varnish.LCK.vxid.destroy
Destroyed locks. This metric is only provided by varnish 4.x.
varnish.LCK.vxid.locks
Lock operations. This metric is only provided by varnish 4.x.
varnish.LCK.wq.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.wq.creat
Created locks.
varnish.LCK.wq.destroy
Destroyed locks.
varnish.LCK.wq.locks
Lock operations.
varnish.LCK.wstat.colls
Collisions. This metric is only provided by varnish 3.x.
varnish.LCK.wstat.creat
Created locks.
varnish.LCK.wstat.destroy
Destroyed locks.
varnish.LCK.wstat.locks
Lock operations.
varnish.losthdr
HTTP header overflows.
varnish.MEMPOOL.busyobj.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.busyobj.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req0.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.req1.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess0.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.sess1.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.allocs
Allocations. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.frees
Frees. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.live
In use. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.pool
In pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.randry
Pool ran dry. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.recycle
Recycled from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.surplus
Too many for pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.sz_needed
Size allocated. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.sz_wanted
Size requested. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.timeout
Timed out from pool. This metric is only provided by varnish 4.x.
varnish.MEMPOOL.vbc.toosmall
Too small to recycle. This metric is only provided by varnish 4.x.
varnish.MGT.child_died
Child processes that died due to signals. This metric is only provided
by varnish 4.x.
varnish.MGT.child_dump
Child processes that produced core dumps. This metric is only provided
by varnish 4.x.
varnish.MGT.child_exit
Child processes that were cleanly stopped. This metric is only provided
by varnish 4.x.
varnish.MGT.child_panic
Child processes that panicked. This metric is only provided by varnish
4.x.
varnish.MGT.child_start
Child processes that started. This metric is only provided by varnish
4.x.
varnish.MGT.child_stop
Child processes that exited with an unexpected return code. This metric
is only provided by varnish 4.x.
varnish.MGT.uptime
Management process uptime. This metric is only provided by varnish 4.x.
varnish.n_backend
Number of backends.
varnish.n_ban
Active bans. This metric is only provided by varnish 3.x.
varnish.n_ban_add
New bans added. This metric is only provided by varnish 3.x.
varnish.n_ban_dups
Duplicate bans removed. This metric is only provided by varnish 3.x.
varnish.n_ban_obj_test
Objects tested. This metric is only provided by varnish 3.x.
varnish.n_ban_re_test
Regexps tested against. This metric is only provided by varnish 3.x.
varnish.n_ban_retire
Old bans deleted. This metric is only provided by varnish 3.x.
varnish.n_expired
Objects that expired from cache because of TTL.
varnish.n_gunzip
Gunzip operations.
varnish.n_gzip
Gzip operations.
varnish.n_lru_moved
Move operations done on the LRU list.
varnish.n_lru_nuked
Objects forcefully evicted from storage to make room for new objects.
varnish.n_obj_purged
Purged objects. This metric is only provided by varnish 4.x.
varnish.n_object
object structs made.
varnish.n_objectcore
objectcore structs made.
varnish.n_objecthead
objecthead structs made.
varnish.n_objoverflow
Objects overflowing workspace. This metric is only provided by varnish
3.x.
varnish.n_objsendfile
Objects sent with sendfile. This metric is only provided by varnish 3.x.
varnish.n_objwrite
Objects sent with write. This metric is only provided by varnish 3.x.
varnish.n_purges
Purges executed. This metric is only provided by varnish 4.x.
varnish.n_sess
sess structs made. This metric is only provided by varnish 3.x.
varnish.n_sess_mem
sess_mem structs made. This metric is only provided by varnish 3.x.
varnish.n_vampireobject
Unresurrected objects.
varnish.n_vbc
vbc structs made. This metric is only provided by varnish 3.x.
varnish.n_vcl
Total VCLs loaded.
varnish.n_vcl_avail
Available VCLs.
varnish.n_vcl_discard
Discarded VCLs.
varnish.n_waitinglist
waitinglist structs made.
varnish.n_wrk
Worker threads. This metric is only provided by varnish 3.x.
varnish.n_wrk_create
Worker threads created. This metric is only provided by varnish 3.x.
varnish.n_wrk_drop
Dropped work requests. This metric is only provided by varnish 3.x.
varnish.n_wrk_failed
Worker threads not created. This metric is only provided by varnish 3.x.
varnish.n_wrk_lqueue
Work request queue length. This metric is only provided by varnish 3.x.
varnish.n_wrk_max
Worker threads limited. This metric is only provided by varnish 3.x.
varnish.n_wrk_queued
Queued work requests. This metric is only provided by varnish 3.x.
varnish.pools
Thread pools. This metric is only provided by varnish 4.x.
varnish.s_bodybytes
Total body size. This metric is only provided by varnish 3.x.
varnish.s_fetch
Backend fetches.
varnish.s_hdrbytes
Total header size. This metric is only provided by varnish 3.x.
varnish.s_pass
Passed requests.
varnish.s_pipe
Pipe sessions seen.
varnish.s_pipe_hdrbytes
Total request bytes received for piped sessions. This metric is only
provided by varnish 4.x.
varnish.s_pipe_in
Total number of bytes forwarded from clients in pipe sessions. This
metric is only provided by varnish 4.x.
varnish.s_pipe_out
Total number of bytes forwarded to clients in pipe sessions. This metric
is only provided by varnish 4.x.
varnish.s_req
Requests.
varnish.s_req_bodybytes
Total request body bytes received. This metric is only provided by
varnish 4.x.
varnish.s_req_hdrbytes
Total request header bytes received. This metric is only provided by
varnish 4.x.
varnish.s_resp_bodybytes
Total response body bytes transmitted. This metric is only provided by
varnish 4.x.
varnish.s_resp_hdrbytes
Total response header bytes transmitted. This metric is only provided by
varnish 4.x.
varnish.s_sess
Client connections.
varnish.s_synth
Synthetic responses made. This metric is only provided by varnish 4.x.
varnish.sess_closed
Client connections closed.
varnish.sess_conn
Client connections accepted. This metric is only provided by varnish
4.x.
varnish.sess_drop
Client connections dropped due to lack of worker thread. This metric is
only provided by varnish 4.x.
varnish.sess_dropped
Client connections dropped due to a full queue. This metric is only
provided by varnish 4.x.
varnish.sess_fail
Failures to accept a TCP connection. Either the client changed its mind,
or the kernel ran out of some resource like file descriptors. This
metric is only provided by varnish 4.x.
varnish.sess_herd
varnish.sess_linger
This metric is only provided by varnish 3.x.
varnish.sess_pipe_overflow
This metric is only provided by varnish 4.x.
varnish.sess_pipeline
varnish.sess_queued
Client connections queued to wait for a thread. This metric is only
provided by varnish 4.x.
varnish.sess_readahead
varnish.shm_cont
SHM MTX contention.
varnish.shm_cycles
SHM cycles through buffer.
varnish.shm_flushes
SHM flushes due to overflow.
varnish.shm_records
SHM records.
varnish.shm_writes
SHM writes.
varnish.SMA.s0.c_bytes
Total space allocated by this storage.
varnish.SMA.s0.c_fail
Times the storage has failed to provide a storage segment.
varnish.SMA.s0.c_freed
Total space returned to this storage.
varnish.SMA.s0.c_req
Times the storage has been asked to provide a storage segment.
varnish.SMA.s0.g_alloc
Storage allocations outstanding.
varnish.SMA.s0.g_bytes
Space allocated from the storage.
varnish.SMA.s0.g_space
Space left in the storage.
varnish.SMA.Transient.c_bytes
Total space allocated by this storage.
varnish.SMA.Transient.c_fail
Times the storage has failed to provide a storage segment.
varnish.SMA.Transient.c_freed
Total space returned to this storage.
varnish.SMA.Transient.c_req
Times the storage has been asked to provide a storage segment.
varnish.SMA.Transient.g_alloc
Storage allocations outstanding.
varnish.SMA.Transient.g_bytes
Space allocated from the storage.
varnish.SMA.Transient.g_space
Space left in the storage.
varnish.sms_balloc
SMS space allocated.
varnish.sms_bfree
SMS space freed.
varnish.sms_nbytes
SMS outstanding space.
varnish.sms_nobj
SMS outstanding allocations.
varnish.sms_nreq
SMS allocator requests.
varnish.thread_queue_len
Length of session queue waiting for threads. This metric is only
provided by varnish 4.x.
varnish.threads
Number of threads. This metric is only provided by varnish 4.x.
varnish.threads_created
Threads created. This metric is only provided by varnish 4.x.
varnish.threads_destroyed
Threads destroyed. This metric is only provided by varnish 4.x.
varnish.threads_failed
Threads that failed to get created. This metric is only provided by
varnish 4.x.
varnish.threads_limited
Threads that were needed but couldn’t be created because of a thread
pool limit. This metric is only provided by varnish 4.x.
varnish.uptime
Child process uptime.
varnish.vmods
Loaded VMODs. This metric is only provided by varnish 4.x.
varnish.vsm_cooling
Space which will soon (max 1 minute) be freed in the shared memory used
to communicate with tools like varnishstat, varnishlog etc. This metric
is only provided by varnish 4.x.
varnish.vsm_free
Free space in the shared memory used to communicate with tools like
varnishstat, varnishlog etc. This metric is only provided by varnish
4.x.
varnish.vsm_overflow
Data which does not fit in the shared memory used to communicate with
tools like varnishstat, varnishlog etc. This metric is only provided by
varnish 4.x.
varnish.vsm_overflowed
Total data which did not fit in the shared memory used to communicate
with tools like varnishstat, varnishlog etc. This metric is only
provided by varnish 4.x.
varnish.vsm_used
Used space in the shared memory used to communicate with tools like
varnishstat, varnishlog etc. This metric is only provided by varnish
4.x.
varnish.n_purgesps
The rate of purges executed per second. This metric is only provided by varnish 4.x.
6.3.3 - Benchmarks and Compliance
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous StatsD-compatible one. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the legacy Sysdig and Prometheus naming conventions.
Compliance metrics are generated from scheduled CIS Benchmark scans that
occur in Sysdig Secure. These metrics cover aggregate results of the
various CIS Benchmark sections, as well as granular details about how
many running containers are failing specific run-time compliance checks.
6.3.3.1 - Docker/CIS Benchmarks
compliance.docker-bench.container-images-and-build-file.pass_pct
The percentage of successful Docker benchmark tests run on the container
images and build files.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
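The pass-percentage metric is, by construction, derived from the pass and total test counters for the same benchmark section. A minimal sketch of that relationship (the sample counter values are illustrative):

```python
# Illustrative benchmark counters for one Docker benchmark section.
tests_pass = 18   # compliance.docker-bench.container-images-and-build-file.tests_pass
tests_total = 24  # compliance.docker-bench.container-images-and-build-file.tests_total

# pass_pct expresses the passing fraction as a percentage (gauge, value type %).
pass_pct = 100.0 * tests_pass / tests_total if tests_total else 0.0
print(pass_pct)  # 75.0
```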
compliance.docker-bench.container-images-and-build-file.tests_fail
The number of failed Docker benchmark tests run against the container
images and build file.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-images-and-build-file.tests_pass
The number of successful Docker benchmark tests run against the
container images and build file.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-images-and-build-file.tests_total
The total number of tests run against the container images and build
file.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-runtime.pass_pct
The percentage of successful container runtime Docker benchmark tests.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-runtime.tests_fail
The number of failed container runtime benchmark tests.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-runtime.tests_pass
The number of successful container runtime Docker benchmark tests.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.container-runtime.tests_total
The total number of Docker benchmark tests run against container
runtimes.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-caps-added
The number of containers running with added kernel capabilities, that is, without the default kernel capability restrictions in place.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-maxretry-not-set
The number of containers that do not limit the number of restart retries attempted if the container fails.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-mount-prop-shared
The number of containers that use shared mount propagation.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-networking-host
The number of containers that share the host’s network namespace.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-apparmor
The number of containers running without an AppArmor profile.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-cpu-limits
The number of containers running with no CPU limits configured.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-health-check
The number of containers that have no HEALTHCHECK
instruction
configured.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-mem-limits
The number of containers configured to run without memory limitations.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-pids-cgroup-limit
The number of containers that do not use a cgroup
for PIDs.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-restricted-privs
The number of running containers that are not restricted from acquiring additional privileges.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-seccomp
The number of containers that disable the default seccomp
profile.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-securityopts
The number of containers running without SELinux options configured.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-no-ulimit-override
The number of running containers that do not override the default ulimit.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-privileged-ports
The number of containers that have privileged ports mapped into them.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-root-mounted-rw
The number of containers that mount the host’s root filesystem with
read/write privileges.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-running-privileged
The number of containers running with the --privileged
configuration
option set.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sensitive-dirs
The number of containers that have mounted a sensitive directory from
the host.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-docker-sock
The number of containers that share the host’s docker socket.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-host-devs
The number of containers that share one or more host devices.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-host-ipc-ns
The number of containers that share the host’s IPC namespace.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-host-pid-ns
The number of containers that share the host’s PID namespace.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-host-user-ns
The number of containers that share the host’s user namespace.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sharing-host-uts-ns
The number of containers that share the host’s UTS namespace.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-sshd-docker-exec-failures
The number of containers running an SSH daemon.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-unexpected-cgroup
The number of containers running under an unexpected cgroup.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-using-docker0-net
The number of containers using the default Docker bridge network docker0.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.c-wildcard-bound-port
The number of containers that do not bind incoming traffic to a specific
interface.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration.pass_pct
The percentage of successful Docker benchmark tests run against the
Docker daemon configuration.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration.tests_fail
The number of benchmark tests run against the Docker daemon
configuration that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration.tests_pass
The number of benchmark tests run against the Docker daemon
configuration that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration.tests_total
The total number of benchmark tests run against the Docker daemon
configuration.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration-files.pass_pct
The percentage of successful Docker benchmark tests run against the
Docker daemon configuration files.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration-files.tests_fail
The number of benchmark tests run against the Docker daemon
configuration files that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration-files.tests_pass
The number of benchmark tests run against the Docker daemon
configuration files that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-daemon-configuration-files.tests_total
The total number of benchmark tests run against the Docker daemon
configuration files.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-security-operations.pass_pct
The percentage of benchmark tests run against Docker security operations
that were successful.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-security-operations.tests_fail
The number of benchmark tests run against Docker security operations
that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-security-operations.tests_pass
The number of benchmark tests run against Docker security operations
that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-security-operations.tests_total
The total number of benchmark tests run against Docker security
operations.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-swarm-configuration.pass_pct
The percentage of benchmark tests run against the Docker swarm
configuration that were successful.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-swarm-configuration.tests_fail
The number of benchmark tests run against the Docker swarm configuration
that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-swarm-configuration.tests_pass
The number of benchmark tests run against the Docker swarm configuration
that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-swarm-configuration.tests_total
The total number of benchmark tests run against the Docker swarm
configuration.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.docker-users
The number of user accounts with permission to access the Docker daemon
socket.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.host-configuration.pass_pct
The percentage of benchmark tests run against the host configuration
that were successful.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.host-configuration.tests_fail
The number of benchmark tests run against the host configuration that
failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.host-configuration.tests_pass
The number of benchmark tests run against the host configuration that
passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.host-configuration.tests_total
The total number of benchmark tests run against the host configuration.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.img-images-using-add
The number of images that use the ADD instruction rather than the COPY instruction in the Dockerfile.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.img-no-healthcheck
The number of images with no HEALTHCHECK
instruction configured.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.img-running-root
The number of images that use the root user.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.img-update-insts-found
The number of images that run a package update step without a package
installation step.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.pass_pct
The percentage of Docker benchmark tests run that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.score
The current pass/fail score for Docker benchmark tests. The score starts
at zero, increments by one for every successful test, and decrements by
one for every test that returns a WARN result or worse.
Metadata | Description |
---|---
Metric Type | Counter |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
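The scoring rule above (start at zero, add one per successful test, subtract one per WARN-or-worse result) can be sketched as follows. This is an illustrative model of the stated calculation, not Sysdig's implementation, and the result strings are assumptions:

```python
def benchmark_score(results: list[str]) -> int:
    """Net pass/fail score: +1 per passing test, -1 per WARN or worse."""
    score = 0
    for result in results:
        if result == "PASS":
            score += 1
        else:  # WARN, FAIL, or any worse outcome
            score -= 1
    return score

# Two passes, one WARN, and one FAIL net out to zero:
print(benchmark_score(["PASS", "PASS", "WARN", "FAIL"]))  # 0
```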
compliance.docker-bench.tests_fail
The total number of Docker benchmark tests that have failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.tests_pass
The total number of Docker benchmark tests that have passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.docker-bench.tests_total
The total number of Docker benchmark tests that have been run.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.3.2 - Kubernetes Benchmarks
compliance.k8s-bench.api-server.pass_pct
The percentage of Kubernetes benchmark tests run on the API server that
passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.api-server.tests_fail
The number of Kubernetes benchmark tests run on the API server that
failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.api-server.tests_pass
The number of Kubernetes benchmark tests run on the API server that
passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.api-server.tests_total
The total number of Kubernetes benchmark tests run on the API server.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.api-server.tests_warn
The number of Kubernetes benchmark tests run on the API server that
returned a result of WARN.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
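Unlike the Docker benchmark sections, each Kubernetes benchmark section also reports a tests_warn count alongside its pass, fail, and total counts. A minimal sketch of how the counts fit together, assuming tests_total is the sum of the pass, fail, and warn counts (an assumption for illustration; the counts shown are not real data):

```python
def summarize_section(tests_pass: int, tests_fail: int, tests_warn: int) -> dict:
    """Combine a section's per-outcome counts into a total and pass percentage."""
    total = tests_pass + tests_fail + tests_warn
    pct = 100.0 * tests_pass / total if total else 0.0
    return {"tests_total": total, "pass_pct": pct}

# e.g. an API server section with 40 passes, 5 failures, 5 warnings:
print(summarize_section(40, 5, 5))  # {'tests_total': 50, 'pass_pct': 80.0}
```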
compliance.k8s-bench.configuration-files.pass_pct
The percentage of Kubernetes benchmark tests run on the configuration
files of non-master nodes that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.configuration-files.tests_fail
The number of Kubernetes benchmark tests run on the configuration files
of non-master nodes that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.configuration-files.tests_pass
The number of Kubernetes benchmark tests run on the configuration files
of non-master nodes that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.configuration-files.tests_total
The total number of Kubernetes benchmark tests run on the configuration
files of non-master nodes.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.configuration-files.tests_warn
The number of Kubernetes benchmark tests run on the configuration files
of non-master nodes that returned a result of WARN.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
The percentage of Kubernetes benchmark tests run on the master node
configuration files that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
The number of Kubernetes benchmark tests run on the master node
configuration files that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
The number of Kubernetes benchmark tests run on the master node
configuration files that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
The total number of Kubernetes benchmark tests run on the master node
configuration files.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
The number of Kubernetes benchmark tests run on the master node
configuration files that returned a result of WARN.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.controller-manager.pass_pct
The percentage of Kubernetes benchmark tests run on the controller
manager that passed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.controller-manager.tests_fail
The number of Kubernetes benchmark tests run on the controller manager
that failed.
Metadata | Description |
---|---
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.controller-manager.tests_pass
The number of Kubernetes benchmark tests run on the controller manager
that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.controller-manager.tests_total
The total number of Kubernetes benchmark tests run on the controller
manager.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.controller-manager.tests_warn
The number of Kubernetes benchmark tests run on the controller manager
that returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.etcd.pass_pct
The percentage of Kubernetes benchmark tests run on the etcd key value
store that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.etcd.tests_fail
The number of Kubernetes benchmark tests run on the etcd key value store
that failed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.etcd.tests_pass
The number of Kubernetes benchmark tests run on the etcd key value store
that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.etcd.tests_total
The total number of Kubernetes benchmark tests run on the etcd key value
store.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.etcd.tests_warn
The number of Kubernetes benchmark tests run on the etcd key value store
that returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.general-security-primitives.pass_pct
The percentage of Kubernetes benchmark tests run on the security
primitives that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.general-security-primitives.tests_fail
The number of Kubernetes benchmark tests run on the security primitives
that failed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.general-security-primitives.tests_pass
The number of Kubernetes benchmark tests run on the security primitives
that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.general-security-primitives.tests_total
The total number of Kubernetes benchmark tests run on the security
primitives.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.general-security-primitives.tests_warn
The number of Kubernetes benchmark tests run on the security primitives
that returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.kubelet.pass_pct
The percentage of Kubernetes benchmark tests run on the non-master node
Kubernetes agent that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.kubelet.tests_fail
The number of Kubernetes benchmark tests run on the non-master node
Kubernetes agent that failed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.kubelet.tests_pass
The number of Kubernetes benchmark tests run on the non-master node
Kubernetes agent that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.kubelet.tests_total
The total number of Kubernetes benchmark tests run on the non-master
node Kubernetes agent.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.kubelet.tests_warn
The number of Kubernetes benchmark tests run on the non-master node
Kubernetes agent that returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.pass_pct
The percentage of Kubernetes benchmark tests that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.scheduler.pass_pct
The percentage of Kubernetes benchmark tests run on the scheduler that
passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Container |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.scheduler.tests_fail
The number of Kubernetes benchmark tests run on the scheduler that
failed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.scheduler.tests_pass
The number of Kubernetes benchmark tests run on the scheduler that
passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.scheduler.tests_total
The total number of Kubernetes benchmark tests run on the scheduler.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.scheduler.tests_warn
The number of Kubernetes benchmark tests run on the scheduler that
returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.tests_fail
The number of Kubernetes benchmark tests that failed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.tests_pass
The number of Kubernetes benchmark tests that passed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.tests_total
The total number of Kubernetes benchmark tests run.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
compliance.k8s-bench.tests_warn
The number of Kubernetes benchmark tests that returned a result of WARN.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Container |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
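The count and percentage metrics above are related: for any scope, the pass percentage can be derived from the pass count and the total. A minimal sketch of that relationship (the values below are illustrative sample data, not output from a live agent):

```python
def pass_pct(tests_pass: int, tests_total: int) -> float:
    """Percentage of benchmark tests that passed; 0.0 when no tests ran."""
    if tests_total == 0:
        return 0.0
    return 100.0 * tests_pass / tests_total

# Illustrative values for a single scope (for example, the scheduler):
scope = {"tests_pass": 42, "tests_fail": 5, "tests_warn": 3, "tests_total": 50}
print(pass_pct(scope["tests_pass"], scope["tests_total"]))  # 84.0
```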
6.3.4 - Containers
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the legacy, StatsD-compatible Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
This topic introduces you to the Container metrics.
container.count
The number of containers in the infrastructure.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
container.id
The container’s identifier.
For Docker containers, this value is a 12-character hexadecimal string.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Container |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
container.image
The name of the image used to run the container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Container |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
container.name
The name of the container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Container |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
container.type
The type of container (for example, Docker, LXC, or Mesos).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Container |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cpu.quota.used.percent
The percentage of CPU quota a container actually used over a defined
period of time.
CPU quotas are a common way of creating a CPU limit for a container. A
container can only spend its quota of time on CPU cycles across a given
time period. The default time period is 100ms.
Unlike CPU shares, the CPU quota is a hard limit on the amount of CPU
the container can use. For this reason, quota usage should not exceed
100% for an extended period of time, although a container may briefly
exceed its quota in short bursts.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
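The quota model above can be made concrete with a small calculation. This is a sketch of the arithmetic only, with illustrative names and numbers, not the agent's internal implementation:

```python
def cpu_quota_used_percent(cpu_time_used_us: float,
                           quota_us_per_period: float,
                           periods_elapsed: int) -> float:
    """CPU time consumed, as a percentage of the total quota granted
    across the observed enforcement periods. May briefly exceed 100."""
    total_quota_us = quota_us_per_period * periods_elapsed
    return 100.0 * cpu_time_used_us / total_quota_us

# A quota of 50ms per 100ms period, observed over 10 periods (1s):
# the container used 400ms of CPU time against a 500ms total quota.
print(cpu_quota_used_percent(400_000, 50_000, 10))  # 80.0
```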
cpu.shares.count
The amount of CPU shares assigned to the container’s cgroup. CPU shares
represent a relative weight used by the kernel to distribute CPU cycles
across different containers. Each container receives its own allocation
of CPU cycles, based on the ratio of share allocation for the container
versus the total share allocation for all containers. For example, if an
environment has three containers, each with 1024 shares, then each will
receive 1/3 of the CPU cycles.
The default value for a container is 1024.
Defining a CPU shares count is a common way to create a CPU limit for a
container.
The CPU shares count is not a hard limit. A container can consume more
than its allocation, as long as the CPU has cycles that are not being
consumed by the container they were originally allocated to.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
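The relative-weight arithmetic described above can be sketched as follows (a hypothetical helper for illustration, not part of any Sysdig API):

```python
def share_allocation(shares_by_container: dict[str, int]) -> dict[str, float]:
    """Fraction of CPU cycles each container is entitled to under
    contention, based on its cpu.shares weight."""
    total = sum(shares_by_container.values())
    return {name: weight / total for name, weight in shares_by_container.items()}

# Three containers with the default weight of 1024 are each entitled
# to one third of the CPU cycles.
print(share_allocation({"a": 1024, "b": 1024, "c": 1024}))
# Doubling one container's shares shifts the split to 1/2, 1/4, 1/4.
print(share_allocation({"a": 2048, "b": 1024, "c": 1024}))
```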
cpu.shares.used.percent
The percentage of a container’s allocated CPU shares that are used. CPU
shares are a common way of creating a CPU limit for a container, as they
represent a relative weight used by the kernel to distribute CPU cycles
across different containers. Each container receives its own allocation
of CPU cycles, according to the ratio of share count vs the total number
of shares claimed by all containers. For example, in an infrastructure
with three containers, each with 1024 shares, each container receives
1/3 of the CPU cycles.
A container can use more CPU cycles than allocated if the CPU has cycles
that are not being consumed by the container they were originally
allocated to. This means that the value of cpu.shares.used.percent can
exceed 100%.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
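Because a container can borrow cycles left idle by others, this percentage is measured against the container's own entitlement and can exceed 100. A sketch of that calculation, with illustrative names and values:

```python
def cpu_shares_used_percent(cores_used: float, shares: int,
                            total_shares: int, total_cores: float) -> float:
    """CPU actually consumed, as a percentage of the CPU implied by
    the container's fraction of the total share weight."""
    entitled_cores = total_cores * shares / total_shares
    return 100.0 * cores_used / entitled_cores

# A container holding 1024 of 4096 shares on a 4-core host is entitled
# to 1 core; using 1.5 cores (borrowing idle cycles) reports 150%.
print(cpu_shares_used_percent(1.5, 1024, 4096, 4.0))  # 150.0
```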
memory.limit.bytes
The RAM limit assigned to a container. The default value is 0, which indicates that no limit is set.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
memory.limit.used.percent
The percentage of the memory limit used by a container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
swap.limit.bytes
The swap limit assigned to a container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
swap.limit.used.percent
The percentage of swap limit used by the container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5 - Cloud Provider
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the legacy, StatsD-compatible Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
At this time, all cloudProvider metrics are AWS-related.
cloudProvider.account.id
The cloud provider instance account number.
This metric is useful if there are multiple accounts linked with Sysdig
Monitor.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.availabilityZone
The AWS Availability Zone where the entity or entities are located. Each
availability zone is an isolated subsection of an AWS region. See
cloudProvider.region.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.host.ip.private
The private IP address allocated by the cloud provider for the instance.
This address can be used for communication between instances in the same
network.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.host.ip.public
The public IP address of the selected host.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.host.name
The name of the host as reported by the cloud provider.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.id
The ID number as assigned and reported by the cloud provider.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.instance.type
The instance type reported by the cloud provider (for example, m1.xlarge).
This metric is useful for segmenting instances and comparing their
resource usage and saturation. Use it as a grouping criterion in the
Explore table to quickly review AWS usage on a per-instance-type basis,
or to compare metrics such as CPU usage, request counts, and network
utilization across instance types.
Use this grouping criterion in conjunction with the host.count metric to
easily create a report on how many instances of each type you have.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.name
The name of the cloud provider (for example, AWS or Rackspace).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.region
The region the cloud provider host (or group of hosts) is located in.
Use this grouping criterion in conjunction with the host.count metric to
easily create a report on how many instances you have in each region.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.resource.endPoint
The DNS name at which the resource can be accessed.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.resource.name
The cloud provider service name (for example, Amazon EC2 or Amazon ELB).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.resource.type
The cloud provider service type (for example, INSTANCE, LOAD_BALANCER,
or DATABASE).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cloudProvider.status
Resource status.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
6.3.5.1 - AWS
For information about how Sysdig licensing affects the AWS metrics
displayed in the Monitor UI, see About AWS Cloudwatch
Licensing .
Contents
6.3.5.1.1 - Elasticache
Amazon ElastiCache is a cloud-caching service that increases the
performance, speed, and redundancy with which applications can retrieve
data by providing an in-memory database caching system.
aws.elasticache.CPUUtilization
The percentage of CPU utilization.
When utilization is high and your main workload consists of read
requests, scale your cache cluster out by adding read replicas. If the
main workload consists of write requests, scale up by using a larger
cache instance type.
For more information, refer to the
ElastiCache
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elasticache.FreeableMemory
The amount of memory considered free, or that could be made available,
for use by the node.
For more information, refer to the
ElastiCache
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elasticache.NetworkBytesIn
The number of bytes the host has read from the network.
For more information, refer to the
ElastiCache
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elasticache.NetworkBytesOut
The number of bytes the host has written to the network.
For more information, refer to the
ElastiCache
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elasticache.SwapUsage
The amount of swap space used on the host.
If swap is being utilized, the node probably needs more memory than is
available and cache performance may be negatively impacted. Consider
adding more nodes or using larger ones to reduce or eliminate swapping.
For more information, refer to the
ElastiCache
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5.1.2 - Elastic Application Load Balancing (ALB)
Application Load Balancer is best suited for load balancing of HTTP and
HTTPS traffic and provides advanced request routing targeted at the
delivery of modern application architectures, including microservices
and containers. For more information, refer to the Elastic Application
Load
Balancer
documentation.
aws.alb.ActiveConnectionCount
The total number of concurrent TCP connections active from clients to
the load balancer and from the load balancer to the targets.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.ClientTLSNegotiationErrorCount
The number of TLS connections initiated by the client that did not
establish a session with the load balancer.
Possible causes include a mismatch of ciphers or protocols.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.ConsumedLCUs
The number of load balancer capacity units (LCU) used by the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_ELB_4XX_Count
The number of HTTP 4XX client error codes that originate from the load
balancer. Client errors are generated when requests are malformed or
incomplete. These requests have not been received by the target.
This count does not include any response codes generated by the targets.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_ELB_5XX_Count
The number of HTTP 5XX server error codes that originate from the load
balancer.
This count does not include any response codes generated by the targets.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_Target_2XX_Count
The number of HTTP 2XX response codes generated by the target.
This count does not include any response codes generated by the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_Target_3XX_Count
The number of HTTP 3XX response codes generated by the target.
This count does not include any response codes generated by the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_Target_4XX_Count
The number of HTTP 4XX response codes generated by the target.
This count does not include any response codes generated by the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HTTPCode_Target_5XX_Count
The number of HTTP 5XX response codes generated by the target.
This count does not include any response codes generated by the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.HealthyHostCount
The number of targets that are considered healthy.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.IPv6ProcessedBytes
The total number of bytes processed by the load balancer over IPv6.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.IPv6RequestCount
The total number of IPv6 requests received by the load balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.NewConnectionCount
The total number of new TCP connections established from clients to the
load balancer and from the load balancer to targets.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.ProcessedBytes
The total number of bytes processed by the load balancer over IPv4 and
IPv6.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.RejectedConnectionCount
The number of connections that were rejected because the load balancer
had reached its maximum number of connections.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.RequestCount
The number of requests processed over IPv4 and IPv6. This count only
includes the requests with a response generated by a target of the load
balancer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.RequestCountPerTarget
The average number of requests received by each target in a target
group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.RuleEvaluations
The number of rules processed by the load balancer given a request rate
averaged over an hour.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.TargetConnectionErrorCount
The number of connections that were not successfully established between
the load balancer and target.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.TargetResponseTime
The time elapsed, in seconds, after the request leaves the load balancer
until a response from the target is received.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.TargetTLSNegotiationErrorCount
The number of TLS connections initiated by the load balancer that did
not establish a session with the target.
Possible causes include a mismatch of ciphers or protocols.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.alb.UnHealthyHostCount
The number of targets that are considered unhealthy.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
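The ALB counters above are typically consumed with the Sum time aggregation, which makes ratio-style health indicators easy to derive. As a minimal illustration (plain Python, not a Sysdig API call), a target error rate over a window can be computed from the Sum of `aws.alb.HTTPCode_Target_5XX_Count` and `aws.alb.RequestCount`:

```python
def target_error_rate(sum_5xx: float, sum_requests: float) -> float:
    """Fraction of requests in a window that produced a target 5XX response.

    Both inputs are Sum-aggregated counts over the same time window.
    """
    if sum_requests == 0:
        return 0.0  # no traffic in the window; treat as healthy
    return sum_5xx / sum_requests

# Example: 12 target 5XX responses out of 4,800 requests in one window
rate = target_error_rate(12, 4800)
print(f"{rate:.2%}")  # → 0.25%
```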
6.3.5.1.3 - Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides
secure, resizable compute capacity in the cloud. It is designed to make
web-scale cloud computing easier for developers.
aws.ec2.CPUCreditBalance
The CPU credit balance of an instance, based on what has accrued since
it started. For more information, refer to the Elastic Compute
Cloud
metric definition table.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.CPUCreditUsage
The CPU credit usage by the instance. For more information, refer to the
Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.CPUUtilization
The percentage of allocated EC2 compute units currently in use on the
instance. For more information, refer to the Elastic Compute
Cloud
metric definition documentation.
This metric identifies the processing power required to run an
application upon a selected instance.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.DiskReadBytes
The total bytes read from all ephemeral disks available to the instance.
This metric is used to determine the volume of the data the application
reads from the disk and can be used to determine the speed of the
application.
The number reported is the number of bytes read during the specified
period. For basic (five-minute) monitoring, divide this number by 300
to find bytes/second. For detailed (one-minute) monitoring, divide it
by 60.
For more information, refer to the Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
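The divide-by-period rule above can be captured in a small helper (a sketch in plain Python; the period lengths are the basic and detailed monitoring intervals cited in the text):

```python
# CloudWatch reporting periods, per the text above
BASIC_PERIOD_S = 300     # basic (five-minute) monitoring
DETAILED_PERIOD_S = 60   # detailed (one-minute) monitoring

def bytes_per_second(reported_bytes: float, period_s: int = BASIC_PERIOD_S) -> float:
    """Convert a per-period byte total (e.g. aws.ec2.DiskReadBytes) to a rate."""
    return reported_bytes / period_s

print(bytes_per_second(3_000_000))                     # basic → 10000.0 B/s
print(bytes_per_second(3_000_000, DETAILED_PERIOD_S))  # detailed → 50000.0 B/s
```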
aws.ec2.DiskReadOps
Total completed read operations from all ephemeral disks available to
the instance in a specified period of time. For more information, refer
to the Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.DiskWriteBytes
The total bytes written to all ephemeral disks available to the
instance. This metric is used to determine the volume of the data the
application writes to the disk and can be used to determine the speed of
the application.
The number reported is the number of bytes written during the specified
period. For basic (five-minute) monitoring, divide this number by 300
to find bytes/second. For detailed (one-minute) monitoring, divide it
by 60.
For more information, refer to the Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.DiskWriteOps
The completed write operations to all ephemeral disks available to the
instance in a specified period of time. If your instance uses Amazon EBS
volumes, see Amazon EBS Metrics. For more information, refer to the
Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.NetworkIn
The number of bytes received on all network interfaces by the instance.
For more information, refer to the Elastic Compute
Cloud
metric definition documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.ec2.NetworkOut
The number of bytes sent out on all network interfaces by the instance.
For more information, refer to the Elastic Compute
Cloud
metric definition documentation.
This metric identifies the volume of outgoing network traffic to an
application on a single instance.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5.1.4 - Elastic Container Service (ECS)
Amazon Elastic Container Service (Amazon ECS) is a highly scalable,
high-performance container orchestration service that supports Docker
containers and allows you to easily run and scale containerized
applications on AWS. Amazon ECS eliminates the need for you to install
and operate your own container orchestration software, manage and scale
a cluster of virtual machines, or schedule containers on those virtual
machines.
ecs.clusterName
The name of the cluster. For more information, refer to the AWS
CloudFormation
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
ecs.serviceName
The name of the Elastic Container Service (Amazon ECS) service. For more
information, refer to the AWS
CloudFormation
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
ecs.taskFamilyName
The name of the task definition family. For more information, refer to
the AWS
CloudFormation
documentation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | CloudProvider |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
6.3.5.1.5 - Elastic Load Balancing (ELB)
Elastic Load Balancing automatically distributes incoming application
traffic across multiple targets, such as Amazon EC2 instances,
containers, IP addresses, and Lambda functions.
aws.elb.BackendConnectionErrors
The number of errors encountered by the load balancer while attempting
to connect to your application.
For high error counts, look for network-related issues or verify that
your back-end servers are operating correctly; the ELB is having trouble
connecting to them.
For more information, refer to the Elastic Load
Balancing
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HealthyHostCount
The number of healthy instances bound to the load balancer.
Hosts are declared healthy if they meet the threshold for the number of
consecutive health checks that are successful. Hosts that have failed
more health checks than the value of the unhealthy threshold are
considered unhealthy. If cross-zone is enabled, the count of the number
of healthy instances is calculated for all Availability Zones.
For more information, refer to the Elastic Load
Balancing
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
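The healthy/unhealthy threshold behavior described above can be sketched as a small state check (illustrative only; the threshold values are configured on the load balancer, not in Sysdig):

```python
def host_state(consecutive_successes: int, consecutive_failures: int,
               healthy_threshold: int, unhealthy_threshold: int) -> str:
    """Classify a host per the consecutive health-check thresholds above."""
    if consecutive_successes >= healthy_threshold:
        return "healthy"
    if consecutive_failures >= unhealthy_threshold:
        return "unhealthy"
    return "unchanged"  # state stays as-is until a threshold is crossed

print(host_state(3, 0, healthy_threshold=3, unhealthy_threshold=2))  # → healthy
```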
aws.elb.HTTPCode_Backend_2XX
The number of HTTP 2XX response codes generated by back-end instances.
This metric does not include any response codes generated by the load
balancer.
The 2XX class status codes represent successful actions (e.g., 200-OK,
201-Created, 202-Accepted, 203-Non-Authoritative Info).
For more information, refer to the Elastic Load
Balancing
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HTTPCode_Backend_3XX
The number of HTTP 3XX response codes generated by back-end instances.
This metric does not include any response codes generated by the load
balancer.
The 3XX class status code indicates that the user agent requires action
(e.g., 301-Moved Permanently, 302-Found, 305-Use Proxy, 307-Temporary
Redirect).
For more information, refer to the Elastic Load
Balancing
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HTTPCode_Backend_4XX
The number of HTTP 4XX response codes generated by back-end instances.
This metric does not include any response codes generated by the load
balancer. For more information, refer to the Elastic Load
Balancing
documentation.
The 4XX class status code represents client errors (e.g., 400-Bad
Request, 401-Unauthorized, 403-Forbidden, 404-Not Found).
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HTTPCode_Backend_5XX
The number of HTTP 5XX response codes generated by back-end instances.
This metric does not include any response codes generated by the load
balancer. For more information, refer to the Elastic Load
Balancing
documentation.
The 5XX class status code represents back-end server errors (e.g.,
500-Internal Server Error, 501-Not Implemented, 503-Service
Unavailable).
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HTTPCode_ELB_4XX
The number of HTTP 4XX client error codes generated by the load
balancer when the listener is configured to use HTTP or HTTPS
protocols. For more information, refer to the Elastic Load
Balancing
documentation.
Client errors are generated when a request is malformed or is
incomplete.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.HTTPCode_ELB_5XX
The number of HTTP 5XX server error codes generated by the load
balancer when the listener is configured to use HTTP or HTTPS
protocols. This metric does not include any responses generated by
back-end instances. For more information, refer to the Elastic Load
Balancing
documentation.
The metric is reported if there are no back-end instances that are
healthy or registered to the load balancer, or if the request rate
exceeds the capacity of the instances or the load balancers.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.Latency
The time, in seconds, that backend requests take to process. For more
information, refer to the Elastic Load
Balancing
documentation.
Latency metrics from the ELB are good indicators of the overall
performance of your application.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.RequestCount
The number of requests handled by the load balancer. For more
information, refer to the Elastic Load
Balancing
documentation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.SpilloverCount
The total number of requests that were rejected because the surge
queue was full. For more information, refer to the Elastic Load
Balancing
documentation.
Positive numbers indicate some requests are not being forwarded to any
server. Clients are not notified that their request was dropped.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.SurgeQueueLength
The total number of requests that are pending submission to a
registered instance. For more information, refer to the Elastic Load
Balancing
documentation.
Positive numbers indicate clients are waiting for their requests to be
forwarded to a server for processing.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.elb.UnHealthyHostCount
The number of unhealthy instances bound to the load balancer. For
more information, refer to the Elastic Load
Balancing
documentation.
Hosts are declared healthy if they meet the threshold for the number of
consecutive health checks that are successful. Hosts that have failed
more health checks than the value of the unhealthy threshold are
considered unhealthy.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5.1.6 - DynamoDB
DynamoDB is a fully managed proprietary NoSQL database service that
supports key-value and document data structures and is offered by Amazon
as part of the Amazon Web Services portfolio. Amazon CloudWatch
aggregates the DynamoDB metrics at one-minute intervals.
In DynamoDB, provisioned throughput requirements are specified in terms
of capacity units: read capacity units and write capacity units. One
read capacity unit represents one strongly consistent read per second
for items up to 4 KB in size. One write capacity unit represents one
write per second for items up to 1 KB in size. Larger items require
more capacity. To calculate the required number of read or write
capacity units, estimate the number of reads or writes required per
second and multiply by the item size rounded up to the nearest KB.
For more information, see the Amazon
DynamoDB
documentation.
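The capacity-unit arithmetic above can be sketched as follows (plain Python; the 4 KB read and 1 KB write item-size limits come from the text):

```python
import math

def read_capacity_units(reads_per_second: int, item_size_kb: float) -> int:
    """RCUs for strongly consistent reads: one unit per 4 KB (rounded up) per read/s."""
    return reads_per_second * math.ceil(item_size_kb / 4)

def write_capacity_units(writes_per_second: int, item_size_kb: float) -> int:
    """WCUs: one unit per 1 KB (rounded up) per write/s."""
    return writes_per_second * math.ceil(item_size_kb)

# 10 strongly consistent reads/s of 6 KB items → 2 units per read → 20 RCUs
print(read_capacity_units(10, 6))     # → 20
# 10 writes/s of 1.5 KB items → 2 units per write → 20 WCUs
print(write_capacity_units(10, 1.5))  # → 20
```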
aws.dynamodb.ConditionalCheckFailedRequests
The number of failed attempts to perform conditional writes.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ConsumedReadCapacityUnits
The amount of read capacity units consumed over the defined time period.
Amazon CloudWatch aggregates the metrics at one-minute intervals. Use
the Sum aggregation to calculate the consumed throughput. For example,
get the Sum value over a span of one minute, and divide it by the number
of seconds in a minute (60) to calculate the
average ConsumedReadCapacityUnits
per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
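The Sum-then-divide guidance above amounts to the following (a sketch; the 60-second divisor matches the one-minute CloudWatch aggregation described in the text):

```python
# Sum of ConsumedReadCapacityUnits for three consecutive one-minute periods
minute_sums = [300.0, 420.0, 360.0]

# Average consumed capacity units per second for each minute
per_second = [s / 60 for s in minute_sums]
print(per_second)  # → [5.0, 7.0, 6.0]
```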
aws.dynamodb.ConsumedWriteCapacityUnits
The amount of write capacity units consumed over the specified time
interval. Amazon CloudWatch aggregates the metrics at one-minute
intervals. Use the Sum aggregation to calculate the consumed throughput.
For example, get the Sum value over a span of one minute, and divide it
by the number of seconds in a minute (60) to calculate the
average ConsumedWriteCapacityUnits
per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ProvisionedReadCapacityUnits
The number of read capacity units provisioned for a table or a global
secondary index.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ProvisionedWriteCapacityUnits
The number of write capacity units provisioned for a table or a global
secondary index.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ReadThrottleEvents
The number of DynamoDB requests that exceed the amount of read capacity
units provisioned.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ReturnedBytes.GetRecords
The number of bytes returned by the GetRecords operation during the
specified time period.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ReturnedItemCount
The number of items returned by query or scan operations during the
specified time period.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ReturnedRecordsCount.GetRecords
The number of stream records returned by GetRecords operations during
the specified time period.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.SuccessfulRequestLatency
The latency of successful requests to DynamoDB or Amazon DynamoDB
Streams during the specified time period, measured in milliseconds.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.SystemErrors
The number of requests made to DynamoDB or Amazon DynamoDB Streams that
resulted in an HTTP 500 status code during the specified time period.
HTTP 500 usually indicates an internal service error.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.ThrottledRequests
The number of requests to DynamoDB that exceed the provisioned
throughput limits on a resource, such as a table or an index.
ThrottledRequests is incremented by one if any event within a request
exceeds a provisioned throughput limit.
If any individual read or write event within the batch is throttled,
the ReadThrottleEvents or WriteThrottleEvents metric is incremented,
respectively.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
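The increment rules above (one ThrottledRequests bump per request, one Read/WriteThrottleEvents bump per throttled event) can be mirrored in a short illustrative sketch; this is bookkeeping logic for explanation only, not agent code:

```python
from collections import Counter

def record_batch(events):
    """Update throttle metrics for one batch request.

    `events` is a list of (kind, throttled) pairs, kind in {"read", "write"}.
    """
    m = Counter()
    for kind, throttled in events:
        if throttled:
            # each throttled event bumps its per-kind metric
            m["ReadThrottleEvents" if kind == "read" else "WriteThrottleEvents"] += 1
    if m:
        # the request counts once, no matter how many events were throttled
        m["ThrottledRequests"] = 1
    return dict(m)

print(record_batch([("read", True), ("write", True), ("read", False)]))
# → {'ReadThrottleEvents': 1, 'WriteThrottleEvents': 1, 'ThrottledRequests': 1}
```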
aws.dynamodb.UserErrors
The number of requests to DynamoDB or Amazon DynamoDB Streams that
returned an HTTP 400 status code during the specified time period. HTTP
400 usually indicates a client-side error.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.dynamodb.WriteThrottleEvents
The number of requests to DynamoDB that exceed the provisioned write
capacity units for a table or a global secondary index.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5.1.7 - Relational Database Service (RDS)
Amazon Relational Database Service (Amazon RDS) is a managed SQL
database service provided by Amazon Web Services (AWS). Amazon RDS
supports an array of database engines to store and organize data and
helps with database management tasks, such as migration, backup,
recovery, and patching.
aws.rds.BinLogDiskUsage
The amount of disk space occupied by binary logs on the master. Applies
to MySQL read replicas.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.CPUUtilization
The percentage of CPU utilization.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.DatabaseConnections
The number of database connections in use.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.DiskQueueDepth
The number of outstanding I/Os (read/write requests) waiting to access
the disk.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.FreeableMemory
The amount of available random access memory, in bytes.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.FreeStorageSpace
The amount of available storage space in bytes.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.NetworkReceiveThroughput
The incoming (Receive) network traffic on the DB instance, including
both customer database traffic and Amazon RDS traffic used for
monitoring and replication. The metric is measured in bytes per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.NetworkTransmitThroughput
The outgoing (Transmit) network traffic on the DB instance, including
both customer database traffic and Amazon RDS traffic used for
monitoring and replication. The metric is measured in bytes per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.ReadIOPS
The average number of read I/O operations per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.ReadLatency
The average time, in seconds, taken per read I/O operation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.ReadThroughput
The average number of bytes read from disk per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.ReplicaLag
The amount of time, in nanoseconds, a Read Replica DB instance lags
behind the source DB instance.
This metric applies to MySQL read replicas.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.SwapUsage
The amount of swap space used by the database, in bytes.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Byte |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.WriteIOPS
The average number of write I/O operations per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.WriteLatency
The average amount of time taken per write I/O operation.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.rds.WriteThroughput
The average number of bytes written to disk per second.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.5.1.8 - Simple Queue Service (SQS)
Amazon Simple Queue Service (Amazon SQS) is a pay-per-use web service
for storing messages in transit between computers. Developers use SQS to
build distributed applications with decoupled components without having
to deal with the overhead of creating and maintaining message queues.
For more information, see Amazon SQS
Resources.
aws.sqs.ApproximateNumberOfMessagesDelayed
The number of messages in the queue that are delayed and not yet available
for reading. Messages are delayed when the queue is configured as a delay
queue or when a message is sent with a delay parameter.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Avg |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.ApproximateNumberOfMessagesNotVisible
The number of undelivered messages. These messages are still in the
queue, on their way to a client (in flight), but have not yet been
deleted or have not yet reached the destination.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Avg |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.ApproximateNumberOfMessagesVisible
The number of messages available for retrieval from the queue. These are
the messages which have not yet been locked by an SQS worker.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Avg |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
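As a toy illustration (hypothetical numbers, not AWS or Sysdig code), the three Approximate* gauges above partition the messages currently in a queue:

```python
# Hypothetical snapshot of one SQS queue's message states.
visible = 120    # aws.sqs.ApproximateNumberOfMessagesVisible: retrievable now
in_flight = 30   # aws.sqs.ApproximateNumberOfMessagesNotVisible: received, not yet deleted
delayed = 5      # aws.sqs.ApproximateNumberOfMessagesDelayed: waiting out a delay

# Every message in the queue falls into exactly one of the three states.
total_in_queue = visible + in_flight + delayed
print(total_in_queue)  # 155
```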
aws.sqs.NumberOfEmptyReceives
The number of ReceiveMessage
API calls that did not return a message.
This metric is populated every 5 minutes.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.NumberOfMessagesDeleted
The number of messages deleted from the queue. Amazon SQS counts every
successful deletion that uses a valid receipt handle, including duplicate
deletions, toward the NumberOfMessagesDeleted metric. Therefore, this
number can include duplicate deletions.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.NumberOfMessagesReceived
The number of messages returned by calls to the ReceiveMessage
API
action.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.NumberOfMessagesSent
The number of messages added to a queue.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
aws.sqs.SentMessageSize
The size, in bytes, of messages added to a queue. SentMessageSize does not
appear as an available metric in the CloudWatch console until at least one
message is sent to the corresponding queue.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.6 - Deprecated Metrics and Labels
Below is the list of metrics and labels that are discontinued with the introduction of the new metric store. We made an effort not to deprecate any metrics or labels used in existing alerts, but if you encounter any issues, contact Sysdig Support.
We have automatically mapped all net.*.request.time.worst metrics to net.*.request.time, because the maximum aggregation gives equivalent results and was almost exclusively the aggregation used with these metrics.
Deprecated Metrics
The following metrics are no longer supported.
net.request.time.file
net.request.time.file.percent
net.request.time.local
net.request.time.local.percent
net.request.time.net
net.request.time.net.percent
net.request.time.nextTiers
net.request.time.nextTiers.percent
net.request.time.processing
net.request.time.processing.percent
net.request.time.worst.in
net.request.time.worst.out
net.incomplete.connection.count.total
net.http.request.time.worst
net.mongodb.request.time.worst
net.sql.request.time.worst
net.link.clientServer.bytes
net.link.delay.perRequest
net.link.serverClient.bytes
capacity.estimated.request.stolen.count
capacity.estimated.request.total.count
capacity.stolen.percent
capacity.total.percent
capacity.used.percent
Deprecated Labels
The following labels are no longer supported:
net.connection.client
net.connection.client.pid
net.connection.direction
net.connection.endpoint.tcp
net.connection.udp.inverted
net.connection.errorCode
net.connection.l4proto
net.connection.server
net.connection.server.pid
net.connection.state
net.role
cloudProvider.resource.endPoint
host.container.mappings
host.ip.all
host.ip.private
host.ip.public
host.server.port
host.isClientServer
host.isInstrumented
host.isInternal
host.procList.main
proc.id
proc.name.client
proc.name.server
program.environment
program.usernames
mesos_cluster
mesos_node
mesos_pid
In addition to this list, composite labels ending with the ‘.label’ string are no longer supported. For example, kubernetes.service.label is deprecated, but kubernetes.service.label.* labels are still supported.
6.3.7 - File
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
file.bytes.in
The number of bytes read from the file. By default, this metric displays
the total value for the defined scope. For example, if the scope is set
to a group of machines, the metric value will be the total value for the
whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.bytes.out
The number of bytes written to the file. By default, this metric
displays the total value for the defined scope. For example, if the
scope is set to a group of machines, the metric value will be the total
value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.bytes.total
The total number of bytes written to, and read from, the file. By
default, this metric displays the total value for the defined scope. For
example, if the scope is set to a group of machines, the metric value
will be the total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.error.open.count
The number of errors that occurred when opening files. By default, this
metric displays the total value for the defined scope. For example, if
the scope is set to a group of machines, the metric value will be the
total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.error.total.count
The number of errors encountered by file system calls, such as open(),
close(), and create(). By default, this metric displays the total value
for the defined scope. For example, if the scope is defined as a group of
machines, the metric value will be the total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.iops.in
The number of file read operations per second. This metric is calculated
by measuring the actual number of read requests made by a process. By
default, this metric displays the total value for the defined scope. For
example, if the scope is set to a group of machines, the metric value
will be the total value for the whole group.
The value of file.iops.in can differ from the value shown by other tools,
which usually interpolate it from the number of bytes read from and
written to the file system.
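As a toy illustration (hypothetical numbers, not Sysdig agent code) of why counting actual read requests can disagree with byte-based estimates:

```python
# Bytes returned by three read() calls observed in one second.
read_sizes = [4096, 131072, 512]

# Sysdig-style measurement: count the actual read requests.
actual_iops = len(read_sizes)

# Byte-based estimate: some tools divide total bytes by an assumed block size.
assumed_block_size = 4096
interpolated_iops = sum(read_sizes) / assumed_block_size

print(actual_iops, interpolated_iops)  # 3 33.125
```

A single large read inflates the byte-based estimate, which is why the two figures diverge.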
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.iops.out
The number of file write operations per second. This metric is
calculated by measuring the actual number of write requests made by a
process. By default, this metric displays the total value for the
defined scope. For example, if the scope is set to a group of machines,
the metric value will be the total value for the whole group.
The value of file.iops.out can differ from the value shown by other tools,
which usually interpolate it from the number of bytes read from and
written to the file system.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.iops.total
The number of file read and write operations per second. This metric is
calculated by measuring the actual number of read/write requests made by
a process. By default, this metric displays the total value for the
defined scope. For example, if the scope is set to a group of machines,
the metric value will be the total value for the whole group.
The value of file.iops.total can differ from the value shown by other
tools, which usually interpolate it from the number of bytes read from and
written to the file system.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.name
The name of the file.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
file.open.count
The number of times the file has been opened.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.time.in
The time spent reading the file. By default, this metric displays the
total value for the defined scope. For example, if the scope is set to a
group of machines, the metric value will be the total value for the
whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.time.out
The time spent writing to the file. By default, this metric displays the
total value for the defined scope. For example, if the scope is set to a
group of machines, the metric value will be the total value for the
whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
file.time.total
The time spent during file I/O. By default, this metric displays the
total value for the defined scope. For example, if the scope is set to a
group of machines, the metric value will be the total value for the
whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.8 - File System
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
fs.used.percent
Specifies what percentage of the file system has been used.
Metric Type | Gauge |
Value Type | Percent |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.free.percent
Specifies what percentage of the file system is free.
Metric Type | Gauge |
Value Type | Percent |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.bytes.free
The number of bytes free in the file system.
Metric Type | Gauge |
Value Type | Byte |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.bytes.used
The number of bytes used in the file system.
Metric Type | Gauge |
Value Type | Byte |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.bytes.total
The size of the file system.
Metric Type | Gauge |
Value Type | Byte |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
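As a toy check (hypothetical values, not Sysdig agent code), the fs.* byte and percent metrics relate as follows for a single file system:

```python
# Hypothetical file system: 100 GiB total, 37 GiB used.
fs_bytes_total = 100 * 1024**3                    # fs.bytes.total
fs_bytes_used = 37 * 1024**3                      # fs.bytes.used
fs_bytes_free = fs_bytes_total - fs_bytes_used    # fs.bytes.free

# The percent metrics are the byte metrics normalized by the total size.
fs_used_percent = 100.0 * fs_bytes_used / fs_bytes_total   # fs.used.percent
fs_free_percent = 100.0 * fs_bytes_free / fs_bytes_total   # fs.free.percent

print(fs_used_percent, fs_free_percent)  # 37.0 63.0
```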
fs.inodes.total.count
The number of inodes in the file system.
Metric Type | Gauge |
Value Type | Integer |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.inodes.used.count
The number of inodes used in the file system.
Metric Type | Gauge |
Value Type | Integer |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.inodes.used.percent
The percentage of file system inodes in use.
Metric Type | Gauge |
Value Type | Percent |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.root.used.percent
Percentage of root filesystem usage.
Metric Type | Gauge |
Value Type | Percent |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
fs.largest.used.percent
The usage percentage of the largest file system.
Metric Type | Gauge |
Value Type | Percent |
Scope | Host, Container |
Segment By | agent.tag cloudProvider.account.id cloudProvider.availabilityZone cloudProvider.region cloudProvider.tag container.id container.image container.name ecs.clusterName ecs.serviceName ecs.taskFamilyName fs.device fs.mountDir fs.type host.hostName host.mac |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Average, Rate, Sum, Minimum, Maximum |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Average, Sum, Minimum, Maximum |
6.3.9 - Host
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
agent.id
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
agent.mode
For more information on agent modes, see Configure Agent
Modes.
Metadata | Description |
---|
Metric Type | String |
Value Type | String |
Segment By | Host |
Default Time Aggregation | concat |
Available Time Aggregation Formats | concat, distinct, count |
Default Group Aggregation | concat |
Available Group Aggregation Formats | concat, distinct, count |
agent.version
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
cpu.core
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.container.mappings
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.count
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Host, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
host.domain
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.hostName
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.ip.all
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.ip.private
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.ip.public
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.isClientServer
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.isInstrumented
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.isInternal
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.mac
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.procList.main
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
host.uname
host.uname
provides the following system information:
kernel name
kernel release number
kernel version
machine hardware name
Agents send this metric along with a number of labels that map to the
uname information. host.uname is supported on agent versions 10.1 and
above.
Metrics Details
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | See Segmentation Details. |
Default Time Aggregation | Average |
Available Time Aggregation | Average, Rate, Sum, Min, Max, Rate of Change |
Default Group Aggregation | Average |
Available Group Rollup | Average, Sum, Min, Max |
Segmentation Details
The labels are given below:
Label | Description | Mapping to the uname tooling | Example |
---|
host.uname.kernel.name | The kernel name | uname -s | Linux |
host.uname.kernel.release | The kernel release | uname -r | 5.4.0-31-generic |
host.uname.kernel.version | The kernel version | uname -v | #35-Ubuntu SMP Thu May 7 20:20:34 UTC 2020 |
host.machine | The hardware name of the machine | uname -m | x86_64 |
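The mapping in the table above can be reproduced locally. The sketch below is illustrative only: it uses Python's os.uname() (the same information the uname tool prints) to build the four label values that the agent derives itself.

```python
# Build the host.uname label values from the local system's uname
# information, mirroring the table above. Illustrative only: the agent
# derives these labels itself; this just shows what each one maps to.
import os

u = os.uname()
labels = {
    "host.uname.kernel.name": u.sysname,     # uname -s, e.g. Linux
    "host.uname.kernel.release": u.release,  # uname -r
    "host.uname.kernel.version": u.version,  # uname -v
    "host.machine": u.machine,               # uname -m, e.g. x86_64
}
for label, value in labels.items():
    print(label, "=", value)
```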
Example: Kernel Versions in the Infrastructure
The image depicts host.uname being segmented by host.uname.kernel.version. The resulting dashboard gives the distribution of kernel versions in the infrastructure.
Count Limits Metrics
The count limits metrics report the upper limit on the number of metrics of the same type. The values these metrics report can be changed by modifying the dragent.yaml file.
Metric Name | Configuration Parameter in the dragent.yaml file | Default Value |
---|
metricCount.limit.appCheck | app_checks_limit | 500 |
metricCount.limit.statsd | statsd.limit | 100 |
metricCount.limit.jmx | jmx.limit | 500 |
metricCount.limit.prometheus | prometheus.max_metrics | 3000 |
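As a sketch, assuming the configuration parameters listed in the table above, raising the per-type limits in dragent.yaml might look like the following (the chosen values are arbitrary examples):

```yaml
# Hypothetical dragent.yaml fragment raising the per-type metric limits.
app_checks_limit: 1000
statsd:
  limit: 200
jmx:
  limit: 1000
prometheus:
  max_metrics: 5000
```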
metricCount.appCheck
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
metricCount.jmx
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
metricCount.statsd
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
metricCount.prometheus
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
6.3.10 - JVM
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels as opposed to the previous StatsD-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
jvm.class.loaded
The number of classes currently loaded in the JVM. By default, this
metric shows the total value of the selected scope. For example, if
applied to a group of machines, the value will be the total value for
the whole group.
jvm.class.unloaded
The total number of classes unloaded from the JVM since the start of execution.
jvm.gc.ConcurrentMarkSweep.count
The number of times the Concurrent Mark-Sweep garbage collector has run.
jvm.gc.ConcurrentMarkSweep.time
The total time the Concurrent Mark-Sweep garbage collector has run.
jvm.gc.Copy.count
The number of times the copy garbage collector has run.
jvm.gc.Copy.time
The total time the copy garbage collector has run.
jvm.gc.G1_Old_Generation.count
The number of times the G1 old generation garbage collector has run.
jvm.gc.G1_Old_Generation.time
The total time the G1 old generation garbage collector has run.
jvm.gc.G1_Young_Generation.count
The number of times the G1 young generation garbage collector has run.
jvm.gc.G1_Young_Generation.time
The total time the G1 young generation garbage collector has run.
jvm.gc.global.time
The total time the garbage collection has run.
jvm.gc.MarkSweepCompact.count
The number of times the mark-sweep-compact garbage collector has run.
jvm.gc.MarkSweepCompact.time
The total time the mark-sweep-compact garbage collector has run.
jvm.gc.PS_MarkSweep.count
The number of times the parallel scavenge Mark-Sweep old generation
garbage collector has run.
jvm.gc.PS_MarkSweep.time
The total time the parallel scavenge Mark-Sweep old generation garbage
collector has run.
jvm.gc.PS_Scavenge.count
The number of times the parallel eden/survivor space garbage collector
has run.
jvm.gc.PS_Scavenge.time
The total time the parallel eden/survivor space garbage collector has
run.
jvm.gc.ParNew.count
The number of times the parallel garbage collector has run.
jvm.gc.ParNew.time
The total time the parallel garbage collector has run.
jvm.gc.scavenge.time
The total time the scavenge collector has run.
jvm.heap.committed
The amount of memory that is currently allocated to the JVM for heap
memory. Heap memory is the storage area for Java objects. By default,
this metric shows the total value of the selected scope. For example, if
applied to a group of machines, the value will be the total value for
the whole group.
The JVM may release memory to the system and Heap Committed could
decrease below Heap Init; but Heap Committed can never increase above
Heap Max.
jvm.heap.init
The initial amount of memory that the JVM requests from the operating system for heap memory during startup (defined by the -Xms option). The value of Heap Init may be undefined. By default, this metric shows the total value of the selected scope. For example, if applied to a group of machines, the value will be the total value for the whole group.
The JVM may request additional memory from the operating system and may
also release memory to the system over time.
jvm.heap.max
The maximum size allocation of heap memory for the JVM (defined by the -Xmx option). By default, this metric shows the total value of the selected scope. For example, if applied to a group of machines, the value will be the total value for the whole group.
Any memory allocation attempt that would exceed this limit will cause an
OutOfMemoryError exception to be thrown.
jvm.heap.used
The amount of allocated heap memory (that is, Heap Committed) currently in use. By default, this metric shows the total value of the selected scope. For example, if applied to a group of machines, the value will be the total value for the whole group.
Heap memory is the storage area for Java objects.
An object in the heap that is referenced by another object is 'live', and will remain in the heap as long as it continues to be referenced.
Objects that are no longer referenced are garbage and will be cleared
out of the heap to reclaim space.
jvm.heap.used.percent
The ratio between Heap Used and Heap Committed. By default, this metric
shows the total value of the selected scope. For example, if applied to
a group of machines, the value will be the total value for the whole
group.
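The ratio can be illustrated with a quick calculation (the byte values below are made up for the example):

```python
# jvm.heap.used.percent is the ratio of Heap Used to Heap Committed,
# expressed as a percentage. The sample values are illustrative.
heap_used = 384 * 1024**2       # 384 MiB currently in use
heap_committed = 512 * 1024**2  # 512 MiB allocated to the JVM

heap_used_percent = 100.0 * heap_used / heap_committed
print(heap_used_percent)  # 75.0
```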
jvm.nonHeap.committed
The amount of memory that is currently allocated to the JVM for non-heap
memory. By default, this metric shows the total value of the selected
scope. For example, if applied to a group of machines, the value will be
the total value for the whole group.
Non-heap memory is used by Java to store loaded classes and other
meta-data.
The JVM may release memory to the system and Non-Heap Committed could
decrease below Non-Heap Init; but Non-Heap Committed can never increase
above Non-Heap Max.
jvm.nonHeap.init
The initial amount of memory that the JVM requests from the operating
system for non-heap memory during startup. By default, this metric shows
the total value of the selected scope. For example, if applied to a
group of machines, the value will be the total value for the whole
group.
The value of Non-Heap Init may be undefined.
The JVM may request additional memory from the operating system and may
also release memory to the system over time.
jvm.nonHeap.max
The maximum size allocation of non-heap memory for the JVM. This memory
is used by Java to store loaded classes and other meta-data. By default,
this metric shows the total value of the selected scope. For example, if
applied to a group of machines, the value will be the total value for
the whole group.
jvm.nonHeap.used
The amount of allocated non-heap memory (Non-Heap Committed) currently
in use. By default, this metric shows the total value of the selected
scope. For example, if applied to a group of machines, the value will be
the total value for the whole group.
Non-heap memory is used by Java to store loaded classes and other
meta-data.
jvm.nonHeap.used.percent
The ratio between Non-Heap Used and Non-Heap Committed. By default, this
metric shows the total value of the selected scope. For example, if
applied to a group of machines, the value will be the total value for
the whole group.
jvm.thread.count
The current number of live daemon and non-daemon threads. By default,
this metric shows the total value of the selected scope. For example, if
applied to a group of machines, the value will be the total value for
the whole group.
jvm.thread.daemon
The current number of live daemon threads. By default, this metric shows
the total value of the selected scope. For example, if applied to a
group of machines, the value will be the total value for the whole
group.
Daemon threads are used for background supporting tasks and are only
needed while normal threads are executing.
6.3.11 - Prometheus Metrics Types
Sysdig Monitor transforms Prometheus metrics into usable, actionable
entries in two ways:
Calculated Metrics
The Prometheus metrics that are scraped by the Sysdig agent and transformed into the traditional StatsD model are called calculated metrics. In calculated metrics, the delta with respect to the previous value is stored. This delta is what Sysdig uses on the classic backend for metrics analysis and visualization. While generating calculated metrics, gauge metrics are kept as they are, but counter metrics are transformed.
Prometheus calculated metrics cannot be used in PromQL.
The Histogram and Summary metrics are transformed into different formats, called Prometheus histogram and summary metrics respectively. The transformations include:
All of the quantiles are transformed into different metrics, with the quantile added as a suffix.
The count and sum of these summary metrics are exposed as separate metrics with slightly changed names: _ (underscore) in the name is replaced with a period (.).
For more information, see Mapping Classic Metrics and PromQL Metrics.
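A minimal sketch of the renaming rule, using a hypothetical helper and assuming a plain numeric quantile suffix (the exact suffix format is not specified here):

```python
# Hypothetical helper illustrating the renaming rule described above:
# underscores become periods, and a quantile is appended as a suffix.
# The suffix format is an assumption for illustration.
def to_calculated_name(prom_name, quantile=None):
    name = prom_name.replace("_", ".")
    return name + "." + quantile if quantile else name

print(to_calculated_name("http_request_duration_seconds", "0.99"))
# http.request.duration.seconds.0.99
print(to_calculated_name("jvm_gc_time"))
# jvm.gc.time
```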
Prometheus calculated metrics (legacy metrics) are scheduled to be
deprecated in the coming months.
Raw Metrics
In Sysdig parlance, the Prometheus metrics that are scraped (by the
Sysdig agent), collected, sent, stored, visualized, and presented
exactly as Prometheus exposes them are called raw metrics. Raw metrics
are used with PromQL.
A Sysdig counter is a StatsD-type counter, where the difference in value is kept but not the raw value of the counter, whereas Prometheus raw metrics are counters that are always monotonically increasing. A rate function needs to be applied to Prometheus raw metrics to make sense of them.
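As an illustrative sketch (the sample values are made up), the difference can be shown with a few counter samples: a calculated (StatsD-style) metric keeps only the deltas, while a rate over the raw monotonic counter divides the total change by the elapsed time.

```python
# (timestamp_seconds, raw counter value) samples from a monotonically
# increasing Prometheus counter; the values are made up.
samples = [(0, 100), (10, 160), (20, 250)]

# Calculated (StatsD-style) metric: only the delta between consecutive
# samples is kept, not the raw counter value.
deltas = [v2 - v1 for (_, v1), (_, v2) in zip(samples, samples[1:])]

# Raw metric: apply a rate function, i.e. divide the total change by
# the elapsed time, to make sense of the monotonic counter.
(t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
rate = (v_last - v_first) / (t_last - t_first)

print(deltas)  # [60, 90]
print(rate)    # 7.5
```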
Time Aggregations Over Prometheus Metrics
The following time aggregations are supported for both the metric types:
Average: Returns an average of a set of data points, keeping all the
labels.
Maximum and Minimum: Returns a maximal or minimal value, keeping all
the labels.
Sum: Returns a sum of the values of data points, keeping all the
labels.
Rate (timeAvg
): Returns a sum of changes to the counter across
data points in a given time period and divides by time, keeping all
the labels as they are. For Prometheus raw metrics, timeAvg
is
calculated by taking the difference and dividing it by time.
Prometheus Calculated Metrics
Prometheus calculated metrics are treated as gauges by Sysdig, so the Average, Maximum, Minimum, and Sum time aggregations are available. Rate (timeAvg) is not available because it is not applicable to gauge metrics.
Prometheus Raw Metrics
For the gauge type, the Average, Maximum, Minimum, and Sum aggregations are available.
For the counter type, all of the time aggregations, including Rate, are available.
6.3.12 - Kubernetes
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels as opposed to the previous StatsD-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
Contents
6.3.12.1 - Kubernetes State
kubernetes.hpa.replicas.min
The lower limit for the number of pods that can be set by the
Horizontal Pod
Autoscaler.
The default value is 1.
The lower limit determines the minimum number of replicas that the
autoscaler can periodically adjust in a replication controller or
deployment to the target specified by the user in order to match the
observed average CPU utilization.
Metric Type: Gauge
Segmented by:
kubernetes.hpa.replicas.max
The upper limit for the number of pods that can be set by the
Horizontal Pod
Autoscaler.
This value cannot be smaller than that of kubernetes.hpa.replicas.min.
The upper limit determines the maximum number of replicas that the autoscaler can periodically adjust in a replication controller or deployment to the target specified by the user in order to match the observed average CPU utilization.
Metric Type: Gauge
Segmented by:
kubernetes.hpa.replicas.current
The current number of replicas of pods managed by the Horizontal Pod
Autoscaler.
Metric Type: Gauge
Segmented by:
kubernetes.hpa.replicas.desired
The desired number of replicas of pods managed by the Horizontal Pod
Autoscaler.
Metric Type: Gauge
Segmented by:
kubernetes.resourcequota.configmaps.hard
The number of config maps that can be created in each Kubernetes
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.configmaps.used
The current number of config maps in each Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.limits.cpu.hard
The total CPU limit across all pods in a non-terminal state in the
cluster, determined by adding each pod’s CPU limit together.
Metric Type: Gauge - Integer
kubernetes.resourcequota.limits.cpu.used
The current amount of CPU used across all cluster pods in a non-terminal
state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.limits.memory.hard
The total memory limit across all cluster pods in a non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.limits.memory.used
The current amount of memory used across all cluster pods in a
non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.persistentvolumeclaims.hard
The maximum number of persistent volume claims that can exist in the
Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.persistentvolumeclaims.used
The current number of persistent volume claims that exist in the
Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.cpu.hard
The maximum number of CPU cores assigned in the namespace or at the
resource quota scope level. Across all the pods in a non-terminal state,
the sum of CPU requests cannot exceed this value.
Metric Type: Gauge - Integer
Segmented by:
kubernetes.cluster
kubernetes.namespace
kubernetes.resourcequota
kubernetes.resourcequota.memory.hard
The maximum memory assigned in the namespace or at the resource quota
scope level. Across all the pods in a non-terminal state, the sum of
memory requests cannot exceed this value.
Metric Type: Gauge - Integer
Segmented by:
kubernetes.cluster
kubernetes.namespace
kubernetes.resourcequota
kubernetes.resourcequota.pods.hard
The maximum number of pods in a non-terminal state that can exist in the
Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.pods.used
The current number of pods in a non-terminal state that exist in the Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.replicationcontrollers.hard
The maximum number of replication controllers that can exist in the
Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.replicationcontrollers.used
The current number of replication controllers that exist in the Kubernetes namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.cpu.hard
The maximum number of CPU requests allowed across all cluster pods in a
non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.cpu.used
The current number of CPU requests across all cluster pods in a
non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.memory.hard
The maximum number of memory requests allowed across all cluster pods in
a non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.memory.used
The current total number of memory requests across all cluster pods in a
non-terminal state.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.storage.hard
The maximum number of storage requests allowed across all persistent
volume claims in the cluster.
Metric Type: Gauge - Integer
kubernetes.resourcequota.requests.storage.used
The current total number of storage requests across all persistent
volume claims.
Metric Type: Gauge - Integer
kubernetes.resourcequota.resourcequotas.hard
The maximum number of resource quotas that can exist in the Kubernetes
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.resourcequotas.used
The current number of resource quotas that exist in the Kubernetes
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.secrets.hard
The maximum number of secrets that can exist in the namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.secrets.used
The current number of secrets that exist in the namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.hard
The maximum number of services that can exist in the namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.used
The current number of services that exist in the namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.loadbalancers.hard
The maximum number of load balancer services that can exist in the
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.loadbalancers.used
The current number of load balancer services that exist in the
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.nodeports.hard
The maximum number of node port services that can exist in the
namespace.
Metric Type: Gauge - Integer
kubernetes.resourcequota.services.nodeports.used
The current number of node port services that exist in the namespace.
Metric Type: Gauge - Integer
kubernetes.daemonSet.pods.desired
The number of nodes that should be running the daemon pod.
kubernetes.daemonSet.pods.misscheduled
The number of nodes that are running a daemon pod but are not supposed to.
kubernetes.daemonSet.pods.ready
The number of nodes that should be running the daemon pod and have one
or more of the daemon pod running and ready.
kubernetes.daemonSet.pods.scheduled
The number of nodes that are running at least one daemon pod and are supposed to.
kubernetes.deployment.replicas.available
The number of available pods per deployment.
kubernetes.deployment.replicas.desired
The number of desired pods per deployment.
kubernetes.deployment.replicas.paused
The number of paused pods per deployment. These pods will not be
processed by the deployment controller.
kubernetes.deployment.replicas.running
The number of running pods per deployment.
kubernetes.deployment.replicas.unavailable
The number of unavailable pods per deployment.
kubernetes.deployment.replicas.updated
The number of updated pods per deployment.
kubernetes.job.completions
The desired number of successfully finished pods that the job should be
run with.
kubernetes.job.numFailed
The number of pods which reached Phase Failed.
kubernetes.job.numSucceeded
The number of pods which reached Phase Succeeded.
kubernetes.job.parallelism
The maximum desired number of pods that the job should run at any given
time.
kubernetes.job.status.active
The number of actively running pods.
kubernetes.namespace.count
The number of namespaces.
kubernetes.namespace.deployment.count
The number of deployments per namespace.
kubernetes.namespace.job.count
The number of jobs per namespaces.
kubernetes.namespace.pod.status.count
Supported by Sysdig Agent 9.5.0 and above.
The metric gives the number of pods in each aggregate state per
Namespace. This is the value that the kubectl get pods
command returns
in the STATUS
column. This metric does not represent the
pod condition or the pod phase.
Segmentable by kubernetes.namespace.name
and
kubernetes.namespace.pod.status.name
.
Due to performance implications, Sysdig Monitor shows only a subset of
the pod aggregate statuses. The statuses displayed on the UI are:
Evicted
DeadlineExceeded
Error
ContainerCreating
CrashLoopBackOff
Pending
Running
To view other statuses, override the default list by adding the following property in dragent.yaml:
k8s_pod_status_reason_strings:
- Pending
- ImagePullBackOff
kubernetes.namespace.pod.running.count
Required: Agent v9.6.0 and above.
The number of all running pods in a Namespace. The metric also takes free pods into account, that is, pods that do not belong to any controller. Therefore, its value is not the sum of (statefulset|daemonset|deployment).pod.running.count.
Metric Type: Gauge
Segmented by: Namespace
kubernetes.namespace.replicaSet.count
The number of replicaSets per namespace.
kubernetes.namespace.service.count
The number of services per namespace.
kubernetes.node.allocatable.cpuCores
The CPU resources of a node that are available for scheduling.
kubernetes.node.allocatable.memBytes
The memory resources of a node that are available for scheduling.
kubernetes.node.allocatable.pods
The pod resources of a node that are available for scheduling.
kubernetes.node.capacity.cpuCores
The maximum CPU resources of the node.
kubernetes.node.capacity.memBytes
The maximum memory resources of the node.
kubernetes.node.capacity.pods
The maximum number of pods of the node.
kubernetes.node.diskPressure
The number of nodes with disk pressure.
kubernetes.node.memoryPressure
The number of nodes with memory pressure.
kubernetes.node.networkUnavailable
The number of nodes with network unavailable.
kubernetes.node.outOfDisk
The number of nodes that are out of disk space.
kubernetes.node.ready
The number of nodes that are ready.
kubernetes.node.unschedulable
The number of nodes unavailable to schedule new pods.
kubernetes.pod.containers.waiting
The number of containers waiting for a pod.
kubernetes.pod.resourceLimits.cpuCores
The limit on CPU cores to be used by a container.
kubernetes.pod.resourceLimits.memBytes
The limit on memory to be used by a container in bytes.
kubernetes.pod.resourceRequests.cpuCores
The number of CPU cores requested by containers in the pod.
kubernetes.pod.resourceRequests.memBytes
The number of memory bytes requested by containers in the pod.
kubernetes.pod.status.ready
The number of pods ready to serve requests.
kubernetes.replicaSet.replicas.fullyLabeled
The number of fully labeled pods per ReplicaSet.
kubernetes.replicaSet.replicas.ready
The number of ready pods per ReplicaSet.
kubernetes.statefulset.replicas
The desired number of pods per StatefulSet.
kubernetes.statefulset.status.replicas
The total number of pods created by the StatefulSet.
kubernetes.statefulset.status.replicas.current
The number of pods created by the current version of the StatefulSet.
kubernetes.statefulset.status.replicas.ready
The number of ready pods created by this StatefulSet.
kubernetes.statefulset.status.replicas.updated
The number of pods updated to the new version of this StatefulSet.
6.3.12.2 - Resource Usage
Compatibility Mapping
Before using Kubernetes resource metrics, review their compatibility
with Sysdig components. The newly supported Kubernetes metrics are not
available to older versions of Sysdig Agent.
Note also that you must edit the agent config file, dragent.yaml, to
enable these metrics. See Enable Kube State Metrics Collection with
K8s_extra_resources.
Metric Name | Agent | Platform |
---|
PVC metrics | 0.89.3 and beyond | Release 2172 |
Resource Quota metrics | 0.87.1 and beyond | Release 2172 |
HPA metrics | 0.79.0 and beyond | Release 2172 |
Kubernetes Resource Metrics
Metric | Description | Metric Type | Segment By |
---|
kubernetes.persistentvolumeclaim.storage | The storage capacity requested by the persistent volume claim. kubernetes.persistentvolumeclaim.storage provides Sysdig users with a single overarching metric for persistent volume claims (PVCs), rather than a series of metrics that often repeat or duplicate information. Each Kubernetes PVC metric is mapped to a kubernetes.persistentvolumeclaim.storage label, which can then be used to segment the overarching metric. See Using Labels for more information on segmenting metrics. | Gauge | kubernetes.namespace.name, kubernetes.persistentvolumeclaim.label.accessmode, kubernetes.persistentvolumeclaim.label.app, kubernetes.persistentvolumeclaim.label.status.phase, kubernetes.persistentvolumeclaim.label.storage, kubernetes.persistentvolumeclaim.label.storageclassname, kubernetes.persistentvolumeclaim.label.volumename |
kubernetes.pod.restart.count | The cumulative number of container restarts for the pod over its lifetime. This metric is not useful for alerts. Sysdig recommends using kubernetes.pod.restart.rate instead. | Counter - Integer | Kubernetes |
kubernetes.pod.restart.rate | The number of container restarts for the pod within the defined scope/time period. | Gauge - Integer | Kubernetes |
kubernetes.replicaSet.replicas.desired | The number of replica pods the replicaSet is configured to maintain. | Gauge - Integer | Kubernetes |
kubernetes.replicaSet.replicas.running | The current number of replica pods running in the replicaSet. | Gauge - Integer | Kubernetes |
kubernetes.replicationController.replicas.desired | The number of replica pods the replicationController is configured to maintain. | Gauge - Integer | Kubernetes |
kubernetes.replicationController.replicas.running | The current number of replica pods running in the replication controller. | Gauge - Integer | Kubernetes |
6.3.13 - Network
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels as opposed to the previous StatsD-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between Sysdig legacy and Prometheus naming conventions.
net.bytes.in
Inbound network bytes. By default, this metric displays the total value
for the defined scope. For example, if the scope is set to a group of
machines, the metric value will be the total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.bytes.out
Outbound network bytes. By default, this metric displays the total value
for the defined scope. For example, if the scope is set to a group of
machines, the metric value will be the total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.bytes.total
Total network bytes. By default, this metric displays the total value
for the defined scope. For example, if the scope is set to a group of
machines, the metric value will be the total value for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.client.ip
The client IP address.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.connection.count.in
The number of currently established client (inbound) connections.
This metric is especially useful when segmented by port, process, or
protocol.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Protocol, Port, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.connection.count.out
The number of currently established server (outbound) connections.
This metric is especially useful when segmented by port, process, or
protocol.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Port, Protocol, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.connection.count.total
The number of currently established connections. This value may exceed
the sum of the inbound and outbound metrics since it represents client
and server inter-host connections as well as internal only connections.
This metric is especially useful when segmented by port, process, or
protocol.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Port, Protocol, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.error.count
The number of errors encountered by network system calls, such as
connect()
, send()
, and recv()
. By default, this metric displays
the total value for the defined scope. For example, if the scope is
defined as a group of machines, the metric value will be the total value
for the whole group.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.http.error.count
net.http.error.count
is a heuristic metric.
The number of failed HTTP requests, determined by the total number of
4xx/5xx status codes.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
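The 4xx/5xx rule above can be sketched as a simple count (the status list is made up for the example):

```python
# Count failed HTTP requests: any response with a 4xx or 5xx status
# code contributes to the error count. Sample statuses are made up.
statuses = [200, 201, 404, 500, 302, 503]
error_count = sum(1 for s in statuses if 400 <= s < 600)
print(error_count)  # 3
```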
net.http.method
The HTTP request method.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.http.request.count
net.http.request.count
is a heuristic metric.
HTTP request count.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.http.request.time
net.http.request.time
is a heuristic metric.
Average HTTP request time.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.http.request.time.worst
The maximum time for HTTP requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.http.statusCode
The HTTP response status code.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.http.url
The HTTP request URL.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.link.clientServer.bytes
The number of bytes passing through the link from client to server.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.link.delay.perRequest
Average delay in the network link per request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.link.serverClient.bytes
The number of bytes passing through the link from server to client.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Byte |
Segment By | Host |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.local.endpoint
The local endpoint for a connection. This metric is resolved to a
user-friendly host name, if available.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.local.service
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.mongodb.collection
The MongoDB collection.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.mongodb.error.count
net.mongodb.error.count
is a heuristic metric.
The number of failed MongoDB requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.mongodb.operation
The MongoDB operation.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.mongodb.request.count
net.mongodb.request.count
is a heuristic metric.
The total number of MongoDB requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.mongodb.request.time
net.mongodb.request.time
is a heuristic metric.
The average time to complete a MongoDB request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.mongodb.request.time.worst (deprecated)
The maximum time to complete a MongoDB request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.protocol
The network protocol of a request (for example, HTTP or MySQL).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.remote.endpoint
The remote endpoint of a connection. This metric automatically resolves
as a user-friendly host name, if available.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.remote.service
Service (port number) of a remote node.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.request.count
net.request.count
is a heuristic metric.
Total number of network requests.
This value may exceed the sum of inbound and outbound requests, because
this count includes requests over internal connections.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
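The relationship noted above between the total and the inbound/outbound counts can be sketched as follows; the request representation and scope model are assumptions for illustration, not the agent's data model:

```python
# Sketch: a request over a connection internal to the scope is
# counted once in the total but is neither inbound nor outbound,
# so total can exceed inbound + outbound.

def count_requests(requests, scope):
    """requests: (client_host, server_host) pairs; scope: set of hosts."""
    total = inbound = outbound = 0
    for client, server in requests:
        if client in scope or server in scope:
            total += 1
        if server in scope and client not in scope:
            inbound += 1
        if client in scope and server not in scope:
            outbound += 1
    return total, inbound, outbound

scope = {"web-1", "db-1"}
reqs = [("lb-0", "web-1"), ("web-1", "db-1"), ("web-1", "ext-api")]
print(count_requests(reqs, scope))  # (3, 1, 1): total > in + out
```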
net.request.count.in
net.request.count.in
is a heuristic metric.
Number of inbound network requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.count.out
Number of outbound network requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time
net.request.time
is a heuristic metric.
A measure of response time that includes both application and network
latency. On the server side, it is purely a measure of application
latency. It is calculated as the interval between the arrival of the
last request buffer and the departure of the first response buffer.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
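The buffer-based calculation described above can be sketched as follows; the event representation (millisecond timestamps tagged by direction) is a hypothetical format for illustration, not the agent's actual data model:

```python
# Sketch: net.request.time as the interval between the last request
# buffer arriving and the first response buffer departing.

def request_time(events):
    """events: list of (timestamp_ms, direction) tuples,
    where direction is 'request' or 'response'."""
    last_request = max(t for t, d in events if d == "request")
    first_response = min(t for t, d in events if d == "response")
    return first_response - last_request

events = [
    (0, "request"),    # first request buffer arrives
    (4, "request"),    # last request buffer arrives
    (29, "response"),  # first response buffer departs
    (31, "response"),
]
print(request_time(events))  # 25 ms
```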
net.request.time.file (deprecated)
The amount of time for serving a request that is spent doing file I/O.
See also net.request.time.net (network
I/O time) and net.request.time.processing (CPU processing time).
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.file.percent
net.request.time.file.percent
is a heuristic metric.
The percentage of time for serving a request that is spent doing file
I/O. See also net.request.time.net
(network I/O time) and
net.request.time.processing
(CPU processing time).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.in
net.request.time.in
is a heuristic metric.
Average time to serve an inbound request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.local (deprecated)
Average per-request delay introduced by this node when it serves
requests coming from the previous tiers. In other words, this is the
time spent serving incoming requests minus the time spent waiting for
outgoing requests to complete.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.local.percent
net.request.time.local.percent
is a heuristic metric.
The percentage of time spent in the local node versus the next tiers,
when serving requests that come from previous tiers.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.net (deprecated)
The amount of time for serving a request that is spent doing network
I/O. See also net.request.time.file
(file I/O time) and
net.request.time.processing
(CPU processing time).
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.net.percent
net.request.time.net.percent
is a heuristic metric.
The percentage of time for serving a request that is spent doing network
I/O. See also net.request.time.file
(file I/O time) and
net.request.time.processing
(CPU processing time).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.nextTiers (deprecated)
Delay introduced by the successive tiers when serving requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.nextTiers.percent
net.request.time.nextTiers.percent
is a heuristic metric.
The percentage of time spent in the next tiers versus the local node,
when serving requests that come from previous tiers.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.out
net.request.time.out
is a heuristic metric.
Average time spent waiting for an outbound request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.processing (deprecated)
The amount of time for serving a request that is spent doing CPU
processing. See also net.request.time.file
(file I/O time) and
net.request.time.net
(network I/O time).
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.processing.percent
net.request.time.processing.percent
is a heuristic metric.
The percentage of time for serving a request that is spent doing CPU
processing. See also net.request.time.file
(file I/O time) and
net.request.time.net
(network I/O time).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
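The three percentage variants (file, net, and processing) each express one component of request time as a share of the total. A minimal sketch with illustrative millisecond values, not agent data:

```python
# Sketch: deriving the *.percent metrics from the three component
# times of a served request.

def time_breakdown(file_ms, net_ms, processing_ms):
    total = file_ms + net_ms + processing_ms
    return {
        "net.request.time.file.percent": 100.0 * file_ms / total,
        "net.request.time.net.percent": 100.0 * net_ms / total,
        "net.request.time.processing.percent": 100.0 * processing_ms / total,
    }

print(time_breakdown(10, 30, 60))
# file 10%, net 30%, processing 60% of the request time
```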
net.request.time.worst.in
net.request.time.worst.in
is a heuristic metric.
Maximum time to serve an inbound request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.request.time.worst.out
net.request.time.worst.out
is a heuristic metric.
Maximum time spent waiting for an outbound request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.role
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.server.ip
Server IP address.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.server.port
TCP/UDP server port number.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Host |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.sql.error.count
net.sql.error.count
is a heuristic metric.
The number of failed SQL requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.sql.query
The full SQL query. Query strings longer than 512 characters are
truncated.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.sql.query.type
The SQL query type (for example, SELECT, INSERT, or DELETE).
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.sql.request.count
net.sql.request.count
is a heuristic metric.
The number of SQL requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.sql.request.time
net.sql.request.time
is a heuristic metric.
Average time to complete an SQL request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.sql.request.time.worst (deprecated)
The maximum time to complete an SQL request.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | relativeTime |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
net.sql.table
The SQL query table name.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Host |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
net.tcp.queue.len
The length of the TCP request queue.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.14 - Process
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
fd.used.percent
The percentage of used file descriptors out of the maximum available. By
default, this metric displays the average value for the defined scope.
For example, if the scope is set to a group of machines, the metric
value will be the average value for the whole group.
This metric should be monitored carefully and used for alerts: when a
process reaches its file descriptor limit, the process stops operating
correctly and can crash.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
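For a sense of what this metric measures, the value can be approximated for the current process on Linux using standard interfaces (/proc/self/fd and RLIMIT_NOFILE). This is only a sketch of the concept, not how the Sysdig agent collects the metric:

```python
# Sketch: approximating fd.used.percent for the current process on
# Linux, plus the kind of threshold check an alert would apply.
import os
import resource

def fd_used_percent():
    used = len(os.listdir("/proc/self/fd"))        # open descriptors
    soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return 100.0 * used / soft_limit

ALERT_THRESHOLD = 90.0  # alert well before the limit is reached

if os.path.isdir("/proc/self/fd"):  # Linux only
    pct = fd_used_percent()
    print(f"fd.used.percent = {pct:.2f} (alert: {pct > ALERT_THRESHOLD})")
```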
proc.commandLine
Command line used to start the process.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Process |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
proc.count
The number of processes on a host or container, excluding any processes
that do not have .exe or command line parameters in the process table.
These processes are typically kernel or system level, and are usually
identified by square brackets (for example, [kthreadd]).
Because some processes are excluded, the host-level proc.count value
will be lower than the value reported by the ps -ef command on the host.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Container, Process, Kubernetes, Mesos, Swarm, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
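The exclusion rule above (processes without an executable or command-line parameters, typically bracketed kernel threads) can be sketched as a simple filter over ps-style command lines; this is illustrative, not agent code:

```python
# Sketch: proc.count excludes kernel/system processes, which appear
# in the process table with bracketed names (for example, [kthreadd]).

def counted_processes(commands):
    """commands: list of command-line strings as reported by ps."""
    return [c for c in commands if c and not c.startswith("[")]

ps_output = ["/usr/sbin/sshd -D", "[kthreadd]", "[rcu_sched]", "nginx: worker"]
print(len(counted_processes(ps_output)))  # 2: kernel threads excluded
```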
proc.name
Name of the process.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Process |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
proc.name.client
Name of the Client process.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Process |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
proc.name.server
Name of the server process.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | String |
Segment By | Process |
Default Time Aggregation | N/A |
Available Time Aggregation Formats | N/A |
Default Group Aggregation | N/A |
Available Group Aggregation Formats | N/A |
proc.start.count
The number of process starts on a host or container.
Metadata | Description |
---|
Metric Type | Gauge |
Value Type | Integer |
Segment By | Host, CloudProvider |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
6.3.15 - RedisDB Metrics
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
See RedisDB integration
information.
redis.aof.buffer_length
The size of the AOF buffer.
redis.aof.last_rewrite_time
The duration of the last AOF rewrite.
redis.aof.rewrite
A flag indicating that an AOF rewrite operation is in progress.
The biggest input buffer among current client connections.
redis.clients.blocked
The number of connections waiting on a blocking call.
redis.clients.longest_output_list
The longest output list among current client connections.
redis.command.calls
The number of times a Redis command has been called. The commands are
tagged with command (for example, command:append).
redis.command.usec_per_call
The CPU time consumed per Redis command call. The commands are tagged
with command (for example, command:append).
redis.cpu.sys
The system CPU consumed by the Redis server.
redis.cpu.sys_children
The system CPU consumed by the background processes.
redis.cpu.user
The user CPU consumed by the Redis server.
redis.cpu.user_children
The user CPU consumed by the background processes.
redis.expires
The number of keys that have expired.
redis.expires.percent
The percentage of total keys that have been expired.
redis.info.latency_ms
The latency of the Redis INFO command.
redis.key.length
The number of elements in a given key. Each element is tagged by key
(for example, key:mykeyname).
redis.keys
The total number of keys.
redis.keys.evicted
The total number of keys evicted due to the maxmemory limit.
redis.keys.expired
The total number of keys expired from the database.
redis.mem.fragmentation_ratio
The ratio between used_memory_rss and used_memory.
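The ratio can be computed directly from the two Redis INFO fields it names; the sample values below are illustrative:

```python
# Sketch: redis.mem.fragmentation_ratio from the used_memory_rss and
# used_memory fields of the Redis INFO memory section.

def fragmentation_ratio(used_memory_rss, used_memory):
    # > 1.0: the OS-visible RSS exceeds what Redis allocated
    # (fragmentation); < 1.0 can indicate swapping.
    return used_memory_rss / used_memory

print(fragmentation_ratio(used_memory_rss=1_200_000,
                          used_memory=1_000_000))  # 1.2
```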
redis.mem.lua
The amount of memory used by the Lua engine.
redis.mem.maxmemory
The maximum amount of memory allotted to the RedisDB system.
redis.mem.overhead
Sum of all the overheads allocated by Redis for managing its internal
data structures.
Supported by Sysdig Agent v9.7.0 and above.
redis.mem.peak
The peak amount of memory used by Redis.
redis.mem.startup
Amount of memory consumed by Redis while initializing.
Supported by Sysdig Agent v9.7.0 and above.
The amount of memory that Redis allocated as seen by the operating
system.
redis.mem.used
The amount of memory allocated by Redis.
redis.net.clients
The number of connected clients (excluding slaves).
redis.net.commands
The number of commands processed by the server.
redis.net.commands.instantaneous_ops_per_sec
The number of commands processed by the server per second.
redis.net.rejected
The number of rejected connections.
redis.net.slaves
The number of connected slaves.
redis.perf.latest_fork_usec
The duration of the latest fork.
redis.persist
The number of keys persisted. The formula for this metric is
redis.keys - redis.expires.
redis.persist.percent
The percentage of total keys that are persisted.
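The formula above, together with the derived percentage (assumed here to be persisted keys over total keys), in a small worked example:

```python
# Sketch: redis.persist = redis.keys - redis.expires, and the
# derived redis.persist.percent (assumed formula, for illustration).

def persist(keys, expires):
    return keys - expires

def persist_percent(keys, expires):
    return 100.0 * persist(keys, expires) / keys

print(persist(keys=500, expires=125))          # 375 persisted keys
print(persist_percent(keys=500, expires=125))  # 75.0 percent
```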
redis.pubsub.channels
The number of active pubsub channels.
redis.pubsub.patterns
The number of active pubsub patterns.
redis.rdb.bgsave
Determines whether a bgsave is in progress. The value is one if a bgsave
is in progress, and zero at all other times.
redis.rdb.changes_since_last
The number of changes since the last background save.
redis.rdb.last_bgsave_time
The duration of the last bg_save operation.
redis.replication.backlog_histlen
The amount of data in the backlog sync buffer.
redis.replication.delay
The replication delay in offsets.
redis.replication.last_io_seconds_ago
The amount of time since the last interaction with the master.
redis.replication.master_link_down_since_seconds
The amount of time that the master link has been down.
redis.replication.master_repl_offset
The replication offset reported by the master.
redis.replication.slave_repl_offset
The replication offset reported by the slave.
redis.replication.sync
Determines whether a sync is in progress. The value is one if a sync is
in progress, and zero at all other times.
redis.replication.sync_left_bytes
The amount of data left before syncing is complete.
redis.slowlog.micros.95percentile
The 95th percentile of the duration of queries reported in the slow log.
redis.slowlog.micros.avg
The average duration of queries reported in the slow log.
redis.slowlog.micros.count
The rate of queries reported in the slow log.
redis.slowlog.micros.max
The maximum duration of queries reported in the slow log.
The median duration of queries reported in the slow log.
redis.stats.keyspace_hits
The total number of successful lookups in the database.
redis.stats.keyspace_misses
The total number of missed lookups in the database.
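The keyspace_hits and keyspace_misses counters are often combined into a cache hit ratio; the ratio itself is a derived value, not a Sysdig metric:

```python
# Sketch: cache hit ratio derived from redis.stats.keyspace_hits
# and redis.stats.keyspace_misses.

def hit_ratio(keyspace_hits, keyspace_misses):
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0

print(hit_ratio(keyspace_hits=960, keyspace_misses=40))  # 0.96
```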
6.3.16 - Security Policy Metrics
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
Metric | Description | Metric Type | Segment By | Minimum Agent Version |
---|
security.evts.k8s_audit | The total number of policy events from a Kubernetes audit policy. | Gauge | host.mac, host.hostname | 0.86.0 |
security.policy_evts.syscall | The total number of policy events from a syscall policy. | | | |
security.policies.enabled | The number of security policies enabled for a user. | | | |
security.policies.total | The number of security policies that exist for a user. | | | |
security.policy_evts.container | The total number of policy events from a container policy. | | | |
security.policy_evts.falco | The total number of policy events from a Falco policy. | | | |
security.policy_evts.filesystem | The total number of policy events from a filesystem policy. | | | |
security.policy_evts.high | The number of policy events from a policy with high severity. | | | |
security.policy_evts.low | The number of policy events from a policy with low severity. | | | |
security.policy_evts.medium | The number of policy events from a policy with medium severity. | | | |
security.policy_evts.network | The total number of policy events from a network policy. | | | |
security.policy_evts.process | The total number of policy events from a process policy. | | | |
security.policy_evts.total | The total number of policy events across all policy types. | | | |
security_policy_evts.by_name | The number of events triggered with segment name available. | | name, host.mac, host.hostname | |
6.3.17 - System
Note: Sysdig follows the Prometheus-compatible naming convention for both metrics and labels, as opposed to the previous statsd-compatible, legacy Sysdig naming convention. However, this page still shows metrics in the legacy Sysdig naming convention. Until this page is updated, see Metrics and Label Mapping for the mapping between the Sysdig legacy and Prometheus naming conventions.
capacity.estimated.request.stolen.count (deprecated)
The number of requests the node cannot serve due to CPU steal time. This
metric is calculated by measuring the number of requests the machine is
currently serving, and estimating how many more requests could be served
if there were no steal time.
This metric can be used to understand how steal time impacts the ability
to serve user requests.
Metadata | Description |
---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Process |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
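The projection described above can be sketched with simple arithmetic; the assumption that request throughput scales linearly with available CPU is made here for illustration and is not necessarily the agent's exact algorithm:

```python
# Sketch: projecting requests lost to steal time from current
# throughput and CPU usage.

def stolen_request_count(current_requests, cpu_used_pct, cpu_stolen_pct):
    # requests served per percentage point of CPU, multiplied by the
    # CPU percentage taken by the hypervisor
    requests_per_cpu_pct = current_requests / cpu_used_pct
    return requests_per_cpu_pct * cpu_stolen_pct

# serving 400 req/s on 40% CPU while the hypervisor steals 10%:
print(stolen_request_count(400, 40.0, 10.0))  # 100.0 requests lost
```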
capacity.estimated.request.total.count (deprecated)
The estimated number of requests the node serves at full capacity. This
metric is calculated by measuring the number of requests that a machine
is serving, and the resources each request is using, and combining the
values to project how many requests the machine can serve.
This metric can help users determine if/when the infrastructure capacity
should be increased.
Metadata | Description |
---|---|
Metric Type | Counter |
Value Type | Integer |
Segment By | Host, Process |
Default Time Aggregation | Rate |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Sum |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
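The two capacity estimates above can be sketched as a pair of small functions. The helper names and the linear-scaling assumption are illustrative only; the agent's actual calculation is not documented here. The idea: project total capacity by dividing current throughput by the fraction of resources in use, and estimate stolen capacity from the fraction of CPU time stolen.

```python
def estimated_request_total(current_requests: float, used_fraction: float) -> float:
    """Project how many requests the node could serve at full capacity,
    assuming throughput scales linearly with resource usage."""
    if used_fraction <= 0:
        raise ValueError("used_fraction must be positive")
    return current_requests / used_fraction


def estimated_request_stolen(current_requests: float,
                             used_fraction: float,
                             stolen_fraction: float) -> float:
    """Estimate how many additional requests could be served if the CPU
    time currently lost to steal were available for serving requests."""
    if used_fraction <= 0:
        raise ValueError("used_fraction must be positive")
    return current_requests * (stolen_fraction / used_fraction)
```

For example, a node serving 300 requests/s at 60% resource usage projects to 500 requests/s at full capacity, and 12% steal time would cost an estimated 60 requests/s.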
capacity.stolen.percent (deprecated)
The lost service request capacity due to stolen CPU. This metric
reflects the impact on other resource usage capabilities, including disk
I/O and network I/O.
capacity.stolen.percent is non-zero only if cpu.stolen.percent is also
non-zero.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Process |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
capacity.total.percent (deprecated)
The estimated current capacity usage, based on CPU and disk/network
utilization, with CPU stolen time added back in. capacity.total.percent
can be used to show how the system would perform with dedicated CPU
usage.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Process |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
capacity.used.percent (deprecated)
The estimated current capacity usage, based on CPU and disk/network
utilization. This metric is calculated by adding up the resources each
request coming to the machine is using, creating a score that indicates
how saturated the machine's resources are.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, Process |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
cpu.idle.percent
The percentage of time that the CPU(s) were idle and the system did not
have an outstanding disk I/O request. By default, this metric displays
the average value for the defined scope. For example, if the scope is
set to a group of machines, the metric value will be the average value
for the whole group.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
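On Linux, CPU time percentages such as idle, iowait, nice, system, and steal are conventionally derived from the cumulative per-state tick counters in /proc/stat: each percentage is that state's counter delta divided by the total delta between two samples. A minimal sketch of that calculation (not the agent's actual implementation):

```python
from typing import Dict


def cpu_percentages(prev: Dict[str, int], curr: Dict[str, int]) -> Dict[str, float]:
    """Turn two snapshots of /proc/stat-style cumulative tick counters
    (keys such as 'user', 'nice', 'system', 'idle', 'iowait', 'steal')
    into per-state percentages over the sampling interval."""
    deltas = {k: curr[k] - prev[k] for k in curr}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Two hypothetical snapshots taken one sampling interval apart.
prev = {"user": 1000, "nice": 10, "system": 200, "idle": 7000, "iowait": 100, "steal": 0}
curr = {"user": 1400, "nice": 10, "system": 300, "idle": 7500, "iowait": 150, "steal": 50}
pcts = cpu_percentages(prev, curr)
```

Because every tick falls into exactly one state, the resulting percentages sum to 100 over the interval.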
cpu.iowait.percent
The percentage of time that the CPU(s) were idle during which the system
had an outstanding disk I/O request. By default, this metric displays
the average value for the defined scope. For example, if the scope is
set to a group of machines, the metric value will be the average value
for the whole group.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
cpu.nice.percent
The percentage of CPU utilization that occurred while executing at the
user level with nice priority. By default, this metric displays the
average value for the defined scope. For example, if the scope is set to
a group of machines, the metric value will be the average value for the
whole group.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
cpu.stolen.percent
Measures the percentage of time that a virtual machine’s CPU is in a
state of involuntary wait because the physical CPU is shared among
virtual machines. In calculating steal time, the operating system kernel
detects when it has work available but does not have access to the
physical CPU to perform that work.
If the percentage of steal time is consistently high, you may want to
stop and restart the instance (since it will most likely start on
different physical hardware) or upgrade to a virtual machine with more
CPU power. Also see capacity.total.percent to see how steal time
directly impacts the number of server requests that could not be
handled. On AWS EC2, steal time does not depend on the activity of other
virtual machine neighbors; EC2 is simply making sure your instance is
not using more CPU cycles than paid for.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
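The "consistently high" guidance above can be made concrete with a rolling average over recent cpu.stolen.percent samples. The window size and threshold below are illustrative choices, not Sysdig defaults:

```python
from collections import deque


def make_steal_monitor(window: int = 10, threshold: float = 20.0):
    """Return a function that ingests cpu.stolen.percent samples and
    reports True once the rolling average over the last `window` samples
    exceeds `threshold` -- a hint that restarting or resizing the
    instance may be worthwhile."""
    samples = deque(maxlen=window)

    def observe(steal_percent: float) -> bool:
        samples.append(steal_percent)
        # Only alert once a full window of samples is available.
        return len(samples) == window and sum(samples) / window > threshold

    return observe
```

A single spiky sample does not trip the check; only a sustained average above the threshold does.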
cpu.system.percent
The percentage of CPU utilization that occurred while executing at the
system level (kernel). By default, this metric displays the average
value for the defined scope. For example, if the scope is set to a group
of machines, the metric value will be the average value for the whole
group.
Metadata | Description |
---|---|
Metric Type | Gauge |
Value Type | % |
Segment By | Host, CloudProvider |
Default Time Aggregation | Average |
Available Time Aggregation Formats | Avg, Rate, Sum, Min, Max |
Default Group Aggregation | Average |
Available Group Aggregation Formats | Avg, Sum, Min, Max |
cpu.cores.used
The CPU core usage of each container is obtained from cgroups, and is
equal to