Advisor
Advisor brings your metrics, alerts, and events into a focused and curated view to help you operate and troubleshoot Kubernetes infrastructure.
Advisor is available only to SaaS users. The feature is not currently available for on-prem environments.
Advisor presents your infrastructure grouped by cluster, namespace, workload, and pod. You cannot currently configure a custom grouping. Depending on the selection, you will see different curated views and you can switch between the following:
- Advisories
- Triggered alerts
- Events from Kubernetes, container engines, and custom user events
- Cluster usage and capacity
- Key golden signals (requests, latency, errors) derived from system calls
- Kubernetes metrics about the health and status of Kubernetes objects
- Container live logs
- Process and network telemetry (CPU, memory, network connections, etc.)
- Monitoring Integrations
The time window of metrics displayed on Advisor is the last 1 hour of collected data. To see historical values for a metric, drill down to a related dashboard or explore a metric using the Explore UI.
Advisories
Advisories evaluate the thousands of data points being sent by the Sysdig agent, and display a prioritized view of key problems in your infrastructure that affect the health and availability of your clusters and the workloads running on them.
When you select an advisory, relevant information related to the issue is surfaced, such as metrics, events, live logs, and remediation guidance. This enables you to pinpoint and resolve problems faster. Following SRE best practices, advisories focus on underlying causes rather than symptoms, and are conditions you may not necessarily want to alert on.
Example Issues Detected
Issue | Description |
---|
CrashLoopBackOff | A CrashLoopBackOff means that you have a pod starting, crashing, starting again, and then crashing again. This can cause applications to be degraded or unavailable. |
Container Error | A persistent application error resulting in containers being terminated. An application error, or exit code 1, means the container was terminated due to an application problem. |
CPU Throttling | Containers are hitting their CPU limit and being throttled. CPU throttling will not result in the container being killed, but the container will be starved of CPU, resulting in application slowdown. |
OOM Kill | When a container reaches its memory limit, it is terminated with an OOMKilled status, or exit code 137. This can lead to application instability or unavailability. |
Image Pull Error | A container is failing to start because it cannot pull its image. |
Advisories are automatically resolved when the problem is no longer detected. You cannot customize the Advisories evaluated. These are fully managed by Sysdig.
Live Logs
Advisor can display live logs for a container, which is the equivalent of running kubectl logs. This is useful for troubleshooting application errors or problems such as pods in a CrashLoopBackOff state.
When selecting a Pod, a Logs tab will appear. If there are multiple containers within a pod, you can select the container you wish to view logs for. Once requested, logs are streamed for 3 minutes before the session is automatically closed (you can simply re-start streaming if necessary).
Live logs are tailed on-demand and thus not persisted. After a session is closed they are no longer accessible.
Manage User Access to Live Logs
By default, live logs are available to users within the scope of their Sysdig Team. Use Custom Roles to manage live logs permissions.
Live logs are enabled by default in agent 12.7.0 or newer versions. Older versions of the Sysdig agent do not support live logs.
Live logs can be enabled or disabled within the agent configuration.
To turn live logs off globally for a cluster, add the following in the dragent.yaml file:
live_logs:
  enabled: false
If using Helm, this is configured via sysdig.settings. For example:
sysdig:
  # Advanced settings. Any option in here will be directly translated into dragent.yaml in the Configmap
  settings:
    live_logs:
      enabled: false
Troubleshoot Live Logs
If there is a problem with live logs, the following errors will be returned. Contact Sysdig Support for additional help and troubleshooting.
Error Code | Cause |
---|
401 | kubelet doesn’t have the bearer token authorization enabled. |
403 | The sysdig-agent ClusterRole doesn’t have the node/proxy permission. |
YAML Configuration
Advisor can display the YAML configuration for pods, which is the equivalent of running kubectl get pod <pod> -o yaml. This is useful to see the applied configuration of a pod in a raw format, as well as metadata and status. To view the YAML, select a pod in Advisor and open the YAML tab.
Support for viewing YAML config is for pods only. Other object types are not yet supported.
Manage Access to YAML Configuration
By default, displaying YAML configuration is available to users within the scope of their Sysdig Team. Use Custom Roles to manage permissions. The permission for displaying YAML configuration is Advisor - Kubernetes API.
YAML configuration can be enabled in agent 12.9.0 or newer versions. Older versions of the Sysdig agent do not support YAML configuration.
You can use the agent configuration to enable the YAML configuration.
To turn support for YAML configuration on globally for a cluster, add the following in the dragent.yaml file:
k8s_command:
  enabled: true
If you are using Helm, edit sysdig.settings. For example:
sysdig:
  # Advanced settings. Any option in here will be directly translated into dragent.yaml in the Configmap
  settings:
    k8s_command:
      enabled: true
1 - Cost Advisor (Preview)
Cost Advisor provides predictable cost analysis and savings estimates for Kubernetes environments.
Cost Advisor is available only to SaaS users. The feature is not currently available for on-prem environments.

Use Cases
Cost Advisor helps you get insights into the following use cases:
- What is the cost of running compute (eg. EC2 instances) within a Kubernetes cluster?
- What is the cost of the compute required for an application, workload, or namespace?
- How can I reduce the cost of running workloads by rightsizing?
Supported Environments
Currently only AWS is supported. We are actively working on adding support for GCP and Azure.
- The Sysdig Agent is required for Cost Advisor. The agent collects resource usage information that is augmented with billing data. There is no explicit configuration required for Cost Advisor.
- Kubernetes clusters must be running in AWS, GCP, or Azure. Both managed clusters (eg. EKS) and vanilla Kubernetes (eg. KOPS) are supported.
Concepts
Cost Allocation
Cost Allocation is applicable to workloads and their associated namespaces, and displays the current allocated costs depending on resource requirements. Note that it is different from infrastructure costs, as workload cost allocation is calculated independently and can be considered a “logical cost”.
Because workloads can exceed their configured requests (that is, use more than they request but less than their resource limits), Cost Allocation is currently calculated daily by evaluating requests and usage and taking whichever is greater for the given time period.
Cost Allocation considers compute (memory and CPU). In future we will factor in other costs including storage, network / load balancer costs, and other associated infrastructure costs.
Example cost allocation for a workload with requests set to 5 CPU cores and 16 GB of memory, running on a t3.medium with a CPU cost of $0.02/hour and a memory cost of $0.003/hour (on-demand pricing):
Day | Calculation | Cost |
---|
Day 1 | Requested CPU: 5 CPUs ($0.10/hr). Actual CPU usage: 2 CPUs ($0.04/hr). Requested memory: 16 GB ($0.048/hr). Actual memory usage: 6 GB ($0.018/hr). Requests are greater than usage, so actual usage is ignored and requests are used to calculate the cost. | CPU cost: $2.40. Memory cost: $1.15. Daily cost: $3.55 |
Day 2 | Requested CPU: 5 CPUs ($0.10/hr). Actual CPU usage: 12 CPUs ($0.24/hr). Requested memory: 16 GB ($0.048/hr). Actual memory usage: 6 GB ($0.018/hr). Memory requests are greater than usage, but actual CPU usage is higher than requests, so actual CPU usage and memory requests are used. | CPU cost: $5.76. Memory cost: $1.15. Daily cost: $6.91 |
Day 3 | Requested CPU: 5 CPUs ($0.10/hr). Actual CPU usage: 12 CPUs ($0.24/hr). Requested memory: 16 GB ($0.048/hr). Actual memory usage: 25 GB ($0.075/hr). Both actual CPU and memory usage are higher than requests (overcommitted), so actual usage is used for both. | CPU cost: $5.76. Memory cost: $1.80. Daily cost: $7.56 |
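As a rough sketch (not Cost Advisor's actual implementation), the daily rule described above can be expressed in a few lines of Python: for each resource, bill the greater of requests and actual usage at the hourly unit price for 24 hours. The prices and sample values below come from the table above.
HOURS_PER_DAY = 24

def daily_workload_cost(cpu_request, cpu_usage, mem_request_gb, mem_usage_gb,
                        cpu_price_per_core_hour, mem_price_per_gb_hour):
    """Return (cpu_cost, mem_cost, total) for one day, taking the greater of
    requests and usage for each resource, as described above."""
    cpu_billed = max(cpu_request, cpu_usage)        # cores
    mem_billed = max(mem_request_gb, mem_usage_gb)  # GB
    cpu_cost = cpu_billed * cpu_price_per_core_hour * HOURS_PER_DAY
    mem_cost = mem_billed * mem_price_per_gb_hour * HOURS_PER_DAY
    return cpu_cost, mem_cost, cpu_cost + mem_cost

# Day 2 from the table: requests 5 cores / 16 GB, usage 12 cores / 6 GB
print(daily_workload_cost(5, 12, 16, 6, 0.02, 0.003))
# roughly (5.76, 1.15, 6.91) -> CPU $5.76, memory $1.15, daily $6.91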
Efficiency Metrics

Resource Efficiency
Resource Efficiency is a calculation of both CPU and memory requests against usage, producing a single score. This indicates how well a workload is using its requested resources. The resource efficiency posture is put into the following brackets:
Value | Explanation |
---|
0 (no data) | Shown when no CPU or memory requests are configured. |
0-20 | A low value indicates a workload is oversized and may be a good candidate for rightsizing. |
20-70 | Workload resource efficiency could be improved. |
70-120 | Good resource efficiency - improvements could be made, but this is a good score. |
120 or higher | High values (over 120) indicate that the workload may suffer resource starvation or pod eviction, as it is consuming a lot more resources than requested. |
CPU Requests
Average usage of CPU against requests over the last 10 minutes. No requests configured will show zero. Example:
CPU Requests = sum workload CPU usage over the last 10 minutes / sum workload CPU requests
Memory Requests
Average usage of memory against requests over the last 10 minutes. No requests configured will show zero. Example:
Memory Requests = sum workload memory usage over the last 10 minutes / sum workload memory requests
Note that for CPU requests, memory requests, and resource efficiency, the calculation is made at the individual workload level. This means that when looking at a namespace, these values are an aggregate of all the workloads within that namespace.
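For illustration only, the formulas above can be put together in a short Python sketch. The CPU and memory ratios follow the formulas in this section; how they combine into the single Resource Efficiency score is an assumption here (a simple average of the two).
def request_efficiency(usage_sum, requests_sum):
    """Average usage over the last 10 minutes divided by requests, as a percentage.
    Returns 0 when no requests are configured, matching the table above."""
    if requests_sum == 0:
        return 0.0
    return 100.0 * usage_sum / requests_sum

def resource_efficiency(cpu_usage, cpu_requests, mem_usage, mem_requests):
    cpu_pct = request_efficiency(cpu_usage, cpu_requests)
    mem_pct = request_efficiency(mem_usage, mem_requests)
    return (cpu_pct + mem_pct) / 2  # assumed combination, for illustration

# A workload using 0.4 of 2 requested cores and 1 GiB of 4 GiB requested
print(resource_efficiency(0.4, 2, 1, 4))  # 22.5 -> likely oversized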
Cost Savings
Cost Advisor helps teams optimize costs by recommending changes to their infrastructure.
Workload Rightsizing
Cost Advisor surfaces savings to help you prioritize rightsizing the workloads with the highest savings potential.

For all workloads running on your clusters, Cost Advisor evaluates usage against requests. For oversized workloads (where usage is less than requests), you can use Cost Advisor to 1) quantify the cost saving if you were to rightsize requests, and 2) see a recommendation on what values to rightsize workloads to.

Cost Advisor helps to baseline workload costs by recommending CPU and memory requests. The recommendation is calculated by looking at the P95 usage of all unique containers running within a workload over the past 1 day. The recommendation is provided per container (in the case of pods running multiple containers).
Currently the recommendation to achieve savings is based on P95 usage over the past 1 day. Support for customizing the methodology that produces this recommendation is coming soon.
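As a hedged sketch of the idea (not Cost Advisor's exact methodology), a P95-based recommendation and the implied daily saving could look like the following; the sample data and price are hypothetical.
def p95(samples):
    """Simple nearest-rank P95 of a list of usage samples."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(round(0.95 * (len(ordered) - 1))))
    return ordered[idx]

def cpu_rightsizing(usage_samples, current_request_cores, price_per_core_hour):
    """Return (recommended request, estimated saving per day)."""
    recommended = p95(usage_samples)
    saved_cores = max(0.0, current_request_cores - recommended)
    return recommended, saved_cores * price_per_core_hour * 24

# A container requesting 4 cores but mostly using about 1 core
samples = [0.8, 0.9, 1.0, 1.1, 0.95, 1.2, 1.05, 0.85, 1.15, 0.9]
print(cpu_rightsizing(samples, 4, 0.02))  # recommends ~1.2 cores, saves about $1.34/day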
2 - Overview
Overview leverages Sysdig’s unified data platform to monitor, secure,
and troubleshoot your hosts and Kubernetes clusters and workloads.
The module provides a unified view of the health, risk, and capacity of
your Kubernetes infrastructure— a single pane of glass for host machines
as well as Kubernetes Clusters, Nodes, Namespaces, and Workloads across
a multi- and hybrid-cloud environment. You can easily filter by any of
these entities and view associated events and health data.
Overview shows metrics prioritized by event count and severity, allowing
you to get to the root cause of the problem faster. Sysdig Monitor polls
the infrastructure data every 10 minutes and refreshes the metrics and
events on the Overview page with the system health.
Key Benefits
Overview provides the following benefits:
- Show a unified view of the health, risk, resource use, and capacity of your infrastructure environment at scale
- Render metrics, security events, compliance CIS benchmark results, and contextual events in a single location
- Eliminate the need for stand-alone security, monitoring, and forensics tools
- View data on-the-fly by workload or by infrastructure
- Display a contextual live event stream from alerts, Kubernetes, containers, policies, and image scanning results
- Surface entities intelligently based on event count and severity
- Drill down from Clusters to Nodes and Namespaces
- Support infrastructure monitoring of multi- and hybrid-cloud environments
- Expose relevant information based on core operational users:
  - DevOps / Platform Ops
  - Security Analyst
  - Service Owner
Accessing the Overview User Interface
You can access and set the scope of Overview in the Sysdig Monitor UI or
with the URL:
- On-Prem: https://[Sysdig URL]/#/overview
- SaaS: See SaaS Regions and IP Ranges and identify the correct domain URL associated with your Sysdig application and region. For example, the URL for US East is https://app.sysdigcloud.com/#/overview. For other regions, the format is https://<region>.app.sysdig.com/#/overview. Replace <region> with the region where your Sysdig application is hosted. For example, for Sysdig Monitor in the EU, you use https://eu1.app.sysdig.com/#/overview.
Click Overview in the left navigation, then select one of the Kubernetes entities:
About the Overview User Interface
The Overview interface opens to the Clusters Overview page. This section describes the major components of the interface and the navigation options.

Though the default landing page is Clusters Overview, when you have no Kubernetes clusters configured, the Overview tab opens to the Hosts view. In addition, when you reopen the Overview menu, the default view will be your last visited Overview page as it retains the visit history.
Overview Rows
Each row represents a Kubernetes entity: a cluster, node, namespace, or
workload. In the screenshot above, each row shows a Kubernetes cluster.
Navigating rows is easy
Click on the Overview icon in the left navigation and choose an Overview page, or drill down into the next Overview page to explore the next granular level of data. Each Overview page shows 10 rows by default and a maximum of 100 rows. Click Load More to display additional rows if there are more than 10 rows per page.
Ability to select a specific row in an Overview page
Each row contains the scope of the relevant entity that it is showing data for. Clicking a specific row deselects the rest of the rows (for instance, selecting staging deselects all other rows in the screenshot above) to focus on the scope of the selected entity, including the events scoped by that row. Pausing to focus on a single row provides a snapshot of what is going on at that moment with the entity under purview.
Entities are ranked according to the severity and the number of events detected in them
Rows are sorted by the count and severity level of the events
associated with the entity and are displayed in descending order.
The items with the highest number of high severity events are shown
first, followed by medium, low, and info. This organization helps to
highlight events demanding immediate attention and to streamline
troubleshooting efforts, in environments that may include thousands
of entities.
Scope Editor
Scope Editor allows targeting down to a specific entity, such as a particular workload or namespace, in environments that may include thousands of entities. The levels of scope, determined by the Kubernetes hierarchy, progress from Workload up to Cluster, with Cluster at the top level. In smaller environments, using the Scope Editor is equivalent to clicking a single row in an Overview page where no scope has been applied.
Cluster: The highest level
in the hierarchy. The only scope applied to the page is Cluster. It
allows you to select a specific cluster from a list of available ones.
Node: The second level in
the hierarchy. The scope is determined by Cluster and Node. Selection is
narrowed down to a specific node in a selected cluster.
Namespace: The third level
in the hierarchy. The scope is determined by Cluster and Namespace.
Selection is narrowed down to a specific namespace in a selected
cluster.
Workloads: The last entity in the hierarchy. The scope is initially determined by Cluster and Namespace, then the selection is narrowed to a specific Deployment, DaemonSet, or StatefulSet. You cannot select all three options at once.
Time Navigation
The Overview feature is based around time. Sysdig Monitor polls the infrastructure data every 10 seconds and refreshes the metrics and events on the Overview page with the system health. The time range is fixed at 12 hours. However, the gauge and compliance score widgets display the latest data sample, not an aggregation over the entire 12-hour time range.
The Overview feed is always live and cannot be paused.
Unified Stream of Events
The right panel of Overview provides a context-sensitive events
feed.
Click an overview row to see relevant Events on the right. Each event is
intelligently populated with end-to-end metadata to give context and
enable troubleshooting.
Event Types
Overview renders the following event types:
- Alert: See Alerts.
- Custom: Ensure that Custom labels are enabled to view this type of event.
- Containers: Events associated with containers.
- Kubernetes: Events associated with Kubernetes infrastructure.
- Scanning: See Image Scanning.
- Policy: See Policies.

Event Statuses
Overview renders the following alert-generated event statuses:
- Triggered: The alert condition has been met and still persists.
- Resolved: A previously existing alert condition no longer persists.
- Acknowledged: The event has been acknowledged by the intended recipient.
- Un-acknowledged: The event has not been acknowledged by an intended recipient. All events are marked Un-acknowledged by default.
- Silenced: The alert event has been silenced for a specified scope. No alert notifications are sent to the channels during the silenced window.
General Guidelines
First-Time Usage
- If the environment is created for the first time, Sysdig Monitor fetches data and generates the associated pages. The Overview feature is enabled immediately; however, it can take up to 1 hour for the Overview pages to show the necessary data.
- Overview uses time windows in segments of 1H, 6H, and 1D; wait for 1H, 6H, and 1D respectively to see data in the corresponding Overview pages.
- If enough data is not available during the first hour, the “No Data Available” page is displayed until the first hour has passed.
Tuning Overview Data
Sysdig Monitor leverages a caching mechanism to fetch pre-computed data
for the Overview screens.
If pre-computed data is unavailable, the data fetched is non-computed data, which must be calculated before it can be displayed. This additional computation adds delays. Caching is enabled for Overview, but for optimum performance you must wait for the 1H, 6H, and 1D windows the first time you use Overview. After the specified time has passed, the data is automatically cached with every passing minute.
Enabling Overview for On-Prem Deployments
The Overview feature is not available by default on On-Prem deployments.
Use the following API to enable it:
Get the Beta settings as follows:
curl -X GET 'https://<Sysdig URL>/api/on-prem/settings/overviews' \
-H 'Authorization: Bearer <GLOBAL_SUPER_ADMIN_SDC_TOKEN>' \
-H 'X-Sysdig-Product: SDC' -k
Replace <Sysdig URL> with the Sysdig URL associated with
your deployment and <GLOBAL_SUPER_ADMIN_SDC_TOKEN> with
the SDC token associated with your deployment.
Copy the payload and change the desired values in the settings.
Update the settings as follows:
curl -X PUT 'https://<Sysdig URL>/api/on-prem/settings/overview' \
-H 'Authorization: Bearer <GLOBAL_SUPER_ADMIN_SDC_TOKEN>' \
-H 'X-Sysdig-Product: SDC' \
-d '{ "overviews": true, "eventScopeExpansion": true}'
Feature Flags
2.1 - Clusters Data
This topic discusses the Clusters Overview page and helps you understand
its gauge charts and the data displayed on them.
About Clusters Overview
In Kubernetes, a pool of nodes combines its resources to form a more powerful machine: a cluster. The Clusters Overview page provides key metrics indicating the health, risk, capacity, and compliance of each cluster. Your clusters can reside in any cloud or multi-cloud environment of your choice.

Each row in the Clusters page represents a cluster. Clusters are sorted
by the severity of corresponding events in order to highlight the area
that needs attention. For example, a cluster with high severity events
is bubbled up to the top of the page to highlight the issue. You can
further drill down to the Nodes or Namespaces Overview page for
investigating at each level.
In environments where Sysdig Secure is not enabled, Network I/O is shown instead of the Compliance score.
Interpret the Cluster Data
This topic gives insight into the metrics displayed on the Clusters
Overview screen.
Node Ready Status
The chart shows the latest value returned by avg(min(kubernetes.node.ready)).
What Is It?
The number shows the readiness for nodes to accept pods across the entire cluster. The numeric availability indicates the percentage of time the nodes are reported as ready by Kubernetes.
For example:
100% is displayed when 10 out of 10 nodes are ready for the entire
time window, say, for the last one hour.
95% is displayed when 9 out of 10 nodes are ready for the entire
time window and one node is ready only for 50% of the time.
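As a rough sketch, the displayed percentage is the average fraction of the time window for which each node reported Ready (the function name below is illustrative):
def node_ready_status(ready_fractions):
    """ready_fractions: per-node fraction of the window spent Ready (0..1)."""
    return 100.0 * sum(ready_fractions) / len(ready_fractions)

# 9 nodes ready the whole hour, 1 node ready for only half of it
print(node_ready_status([1.0] * 9 + [0.5]))  # 95.0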
The bar chart displays the trend across the selected time window, and
each bar represents a time slice. For example, selecting the last 1-hour
window displays 6 bars, each indicating a 10-minute time slice. Each bar
represents the availability across the time slice (green) or the
unavailability (red).
For instance, the following image shows an average availability of 80%
across the last 1-hour, and each 10-minute time slice shows a constant
availability for the same time window:

What to Expect?
Expect a constant 100% at all times.
What to Do Otherwise?
If the value is less than 100%, determine whether a node is not
available at all, or one or more nodes are partially available.
Drill down either to the Nodes screen in Overview or to the
“Kubernetes Cluster Overview” in Explore to see the list of
nodes and their availability.
Check the Kubernetes Node Overview dashboard in Explore to
identify the problem that Kubernetes reports.
Pods Available vs Desired
The chart shows the latest value returned by sum(avg(kubernetes.namespace.pod.available.count)) / sum(avg(kubernetes.namespace.pod.desired.count)).
What Is It?
The chart displays the ratio between available and desired pods,
averaged across the selected time window, for all the pods in a given
Cluster. The upper bound shows the number of desired pods in the
Cluster.
For instance, the following image shows 42 desired pods are available to
use:

What to Expect?
You should typically expect 100%.
If certain pods take a long time to become available, you might temporarily see a value that is less than 100%. Pulling images, pod initialization, readiness probes, and so on cause such delays.
What to Do Otherwise?
Identify one or more Namespaces that have lower availability. To do so,
drill down to the Namespaces screen, then drill down to the
Workloads screen to identify the unavailable pods.
If the number of unavailable pods is considerably higher (the ratio is
significantly low), check the status of the Nodes. A Node failure will
cause several pods to become unavailable across most of the Namespaces.
Several factors could cause pods to become stuck in the Pending state:
- Pods make requests for resources that exceed what’s available across the nodes (the remaining allocatable resources).
- Pods make requests higher than the availability of every single node. For example, you have 8-core Nodes and you create a pod with a 16-core request. These pods might require reconfiguration and a specific setup related to Node affinity and anti-affinity constraints.
- The Namespace quota is reached. If a quota is enforced at the Namespace level, you may hit the limit independent of the resource availability across the Nodes.
CPU Requests vs Allocatable
The chart shows the latest value returned by sum(avg(kubernetes.pod.resourceRequests.cpuCores)) / sum(avg(kubernetes.node.allocatable.cpuCores)).
What Is It?
The chart displays the ratio between CPU requests configured for all the
pods in a selected Cluster and allocatable CPUs across all the nodes.
The upper bound shows the number of allocatable CPU cores across all the
nodes in the Cluster.
For instance, the image below shows that out of 620 available CPU cores
across all the nodes (allocatable CPUs), 71% is requested by the pods:

What to Expect?
Your resource utilization strategy determines what ratio you can expect.
A healthy ratio falls between 50% and 80%.
Assuming all the nodes have the same amount of allocatable resources, a reasonable upper bound is (node_count - 1) / node_count x 100. For example, the upper bound is 90% if you have 10 nodes. Staying below this percentage protects you against a node becoming unavailable.
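For illustration, the headroom rule above in a couple of lines of Python (names are illustrative; assumes identical nodes):
def safe_request_ratio(node_count):
    """Upper bound on Requests vs Allocatable that still tolerates losing one node."""
    return (node_count - 1) / node_count * 100

print(safe_request_ratio(10))  # 90.0 -> with 10 nodes, aim to stay below ~90%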
What to Do Otherwise?
A lower ratio indicates under-utilized resources (and corresponding cost) in your infrastructure. A higher ratio indicates insufficient resources and a potential risk of being unable to schedule additional pods.
To triage, do the following:
Drill down to the Nodes screen to get insights into how
resources are utilized across all nodes.
Drill down to the Namespaces screen to understand how resources
are requested across Namespaces.
Drill down to Explore and refer to the following dashboards:
Kubernetes CPU Allocation Optimization: Evaluate whether a
significant amount of resources are under-utilized in the
infrastructure.
Kubernetes Workloads CPU Usage and Allocation: Determine
whether pods are properly configured and are using resources as
expected.
Can the Value Be Higher than 100%?
Currently, the ratio accounts only for scheduled pods, while pending pods are excluded from the calculation. This means only pods whose requests fit within the allocatable resources have been scheduled to run on Nodes. Consequently, the ratio cannot be higher than 100%.
In the case of over-commitment (pods requesting for more resources than
what’s available), you can expect a higher Requests vs Allocatable
ratio and a lower Pods Available vs Desired ratio. What it indicates
is that most of the available resources are being used, and what’s left
is not enough to schedule additional pods. Therefore, the Available vs
Desired ratio for pods will decrease.
When your environment has pods that are updated often or that are
deleted and created often (for example, testing Clusters), the total
requests might appear higher than what it is at any given time.
Consequently, the ratio becomes higher across the selected time window,
and you might see a value that is higher than 100%. This error is
rendered due to how the data engine calculates the aggregated ratio.
Drill down to Kubernetes Cluster Overview to see the CPU Cores
Usage vs Requests vs Allocatable time series to correctly evaluate the
trend of the request commitments.
Listed below are some of the factors that could cause pods to become stuck in a Pending state:
- Pods make requests that exceed what’s available across the nodes (the remaining allocatable resources). The Requests vs Allocatable ratio is an indicator of this issue.
- Pods make requests that are higher than the availability of every single Node. For example, you have 8-core Nodes and you create a pod with a 16-core request. These pods might require reconfiguration and a specific setup related to Node affinity and anti-affinity constraints.
- The quota set at the Namespace level is reached. The Requests vs Allocatable ratio may not suggest the problem, but the Pods Available vs Desired ratio would decrease, especially for the specific Namespaces. See the Namespaces screen in Overview.
Memory Requests vs Allocatable
The chart shows the latest value returned by sum(avg(kubernetes.pod.resourceRequests.memBytes)) / sum(avg(kubernetes.node.allocatable.memBytes)).
What Is It?
The chart displays the ratio between memory requests configured for all
the pods in the Cluster and allocatable memory available across all the
Nodes.
The upper bound shows the allocatable memory available across all Nodes.
The value is expressed in bytes, displayed in a specified unit.
For instance, the image below shows that out of 29.7 GiB available
across all Nodes (allocatable memory), 35% is requested by the pods:

What to Expect?
Your resource utilization strategy determines what ratio you can expect.
A healthy ratio falls between 50% and 80%.
Assuming all the nodes have the same amount of allocatable resources, a reasonable upper bound is (node_count - 1) / node_count x 100. For example, the upper bound is 90% if you have 10 nodes. Staying below this ratio protects your system against a node becoming unavailable.
What to Do Otherwise?
A lower ratio indicates under-utilized resources (and corresponding cost) in your infrastructure. A higher ratio indicates insufficient resources and a potential risk of being unable to schedule additional pods.
To troubleshoot, do the following:
Drill down to the Nodes screen to get insights into how
resources are utilized across all the Nodes.
Drill down to the Namespaces screen to understand how resources
are requested across Namespaces.
Drill down to Explore and refer to the following dashboards:
Kubernetes Memory Allocation Optimization: Evaluate whether
a significant amount of resources are under-utilized in the
infrastructure.
Kubernetes Workloads Memory Usage and Allocation: Determine
whether pods are properly configured and are using resources as
expected.
Can the Value be Higher than 100%?
The ratio currently accounts only for scheduled pods, while pending pods are excluded from the calculation. This implies that only pods whose requests fit within the available allocatable resources have been scheduled to run on Nodes. Consequently, the ratio cannot be higher than 100%.
In the case of over-commitment (pods requesting for more resources than
what’s available), expect a higher Requests vs Allocatable ratio and
a lower Pods Available vs Desired ratio. What it indicates is that
most of the available resources have been used and what’s left is not
enough to schedule additional pods. Therefore, the Pods Available vs
Desired ratio will decrease.
When your environment has pods that are updated often or that are
deleted and created often (for example, testing Clusters), the total
requests might appear higher than what it is at any given time.
Consequently, the ratio becomes higher across the selected time window,
and you might see a value that is higher than 100%. This error is
rendered due to how the data engine calculates the aggregated ratio.
Drill down to Kubernetes Cluster Overview to see the Memory
Requests vs Allocatable time series to correctly evaluate the trend
for the request commitments.
Listed below are some of the factors that could cause your pods to become stuck in a Pending state:
- Pods make requests that exceed what’s available across the nodes (the remaining allocatable resources). The Requests vs Allocatable ratio is an indicator of this issue.
- Pods make requests that are higher than the availability of every single Node. For example, you have 8-core nodes and you create a pod with a 16-core request. These pods might require configuration changes and a specific setup related to node affinity and anti-affinity factors.
- The quota set at the Namespace level is reached. The Requests vs Allocatable ratio might not suggest the problem, but the Pods Available vs Desired ratio would decrease, especially for the specific Namespaces. See the Namespaces screen in Overview.
Compliance Score
Docker: The latest value returned by avg(avg(compliance.docker-bench.pass_pct)).
Kubernetes: The latest value returned by avg(avg(compliance.k8s-bench.pass_pct)).
What Is It?
The numbers show the percentage of benchmarks that succeeded in the
selected time window, respectively for Docker and Kubernetes entities.
What to Expect?
If you do not have Sysdig Secure enabled, or you do not have benchmarks
scheduled, then you should expect no data available.
Otherwise, the higher the score, the more compliant your infrastructure
is.
What to Do Otherwise?
If the score is lower than expected, drill down to Docker Compliance
Report or Kubernetes Compliance Report to see further details
about benchmark checks and their results.
You may also want to use the Benchmarks / Results page in Sysdig
Secure to see the history of
checks.
2.2 - Nodes Data
This topic discusses the Nodes Overview page and helps you understand
its gauge charts and the data displayed on them.
About Nodes Overview
A node refers to a worker machine in Kubernetes. A physical machine or
VM can represent a node. The Nodes Overview page provides key metrics
indicating the health, capacity, and compliance of each node in your
cluster.

In environments where Sysdig Secure is not enabled, Network I/O is shown instead of the Compliance score.
Interpret the Nodes Data
This topic gives insight into the metrics displayed on the Nodes
Overview page.
Node Ready Status
The chart shows the latest value returned by avg(min(kubernetes.node.ready)).
What Is It?
The number expresses the Node readiness to accept pods across the Cluster. The numeric availability indicates the percentage of time the Node is reported ready by Kubernetes.
For example:
100% is displayed when a Node is ready for the entire time window,
say, for the last one hour.
95% when the Node is ready for 95% of the time window, say, 57 out
of 60 minutes.
The bar chart displays the trend across the selected time window, and
each bar represents a time slice. For example, selecting “last 1 hour”
displays 6 bars, each indicating a 10-minute time slice. Each bar shows
the availability across the time slice (green) and the unavailability
(red).
For instance, the image below indicates the Node has not been ready for
the entire last 1-hour time window:

What to Expect?
The chart should show a constant 100% at all times.
What to Do Otherwise?
If the number is less than 100%, review the status reported by
Kubernetes. Drill-down to the Kubernetes Node Overview Dashboard in
Explore to see details about the Node readiness:

If the Node Ready Status has an alternating behavior, as shown in
the image, the node is flapping. Flapping indicates that the kubelet is
not healthy. See specific conditions reported by Kubernetes that would
help determine the causes for the Node not being ready. Such conditions
include network issues and memory pressure.
Pods Ready vs Allocatable
The chart reports the latest value of sum(avg(kubernetes.pod.status.ready)) / avg(avg(kubernetes.node.allocatable.pods)).
What Is It?
It is the ratio between available and allocatable pods configured on the
node, averaged across the selected time window.
The Clusters page includes a similar chart named Pods Available vs
Desired. However, the meaning is different:
The Pods Available vs Desired chart for Clusters highlights how
many pods you expect and how many are actually available. See
IsPodAvailable
for a detailed definition.
The Pods Ready vs Allocatable chart for Nodes indicates how many
pods can be scheduled on each Node and how many are actually ready.
The upper bound shows the number of pods you can allocate in the node.
See node
configuration.
For instance, the image below indicates that you can allocate 110 pods
in the Node (default configuration), but only 11 pods are ready:

What to Expect?
The ratio does not relate to resource utilization, but it measures the
pod density on each node. The more pods you have on a single node, the
more effort the kubelet has to put in to manage the pods, the
routing mechanism, and Kubernetes overall.
Given the allocatable is properly set, values lower than 80% indicate a
healthy status.
What to Do Otherwise?
- Review the default maximum pods configuration of the kubelet to allow more pods, especially if the CPU and memory utilization is healthy.
- Add more nodes to allow more pods to be scheduled.
- Review kubelet process performance and Node resource utilization in general. A higher ratio indicates high pressure on the operating system and on Kubernetes itself.
CPU Requests vs Allocatable
The chart shows the latest value returned by sum(avg(kubernetes.pod.resourceRequests.cpuCores)) / sum(avg(kubernetes.node.allocatable.cpuCores)).
What Is It?
The chart shows the ratio between the number of CPU cores requested by
the pods scheduled on the Node and the number of cores available to
pods. The upper bound shows the CPU cores available to pods, which
corresponds to the user-defined configuration for allocatable
CPU.
For instance, the image below shows that the Node has 16 CPU cores
available, out of which, 84% are requested by the pods scheduled on the
Node:

What to Expect?
Expect a value up to 80%.
Assuming all the nodes have the same amount of allocatable resources, a reasonable upper bound is (node_count - 1) / node_count x 100. For example, the upper bound is 90% if you have 10 nodes. Staying below this percentage protects your system against a Node becoming unavailable.
What to Do Otherwise?
A low ratio indicates the Node is underutilized. Drill up to the
corresponding cluster in the Clusters page to determine whether
the number of pods currently running is lower, or if the pods cannot
run for other reasons.
A high ratio indicates a potential risk of being unable to schedule
additional pods on the Node.
Drill down to the Kubernetes Node Overview Dashboard to
evaluate what Namespaces, Workloads, and pods are running.
Additionally, drill up in the Clusters page to evaluate whether
you are over-committing the CPU resource. You might not have enough
resources to fulfill requests, and consequently, pods might not be
able to run on the Node. Consider adding Nodes or replacing Nodes
with additional CPU cores.
Can the Value Be Higher than 100%?
Kubernetes schedules pods on Nodes where sufficient allocatable
resources are available to fulfill the pod request. This means
Kubernetes does not allow having a total request higher than the
allocatable. Consequently, the ratio cannot be higher than 100%.
Over-committing (pods requesting resources higher than the capacity)
results in a high Requests vs Allocatable ratio and a low Pods
Available vs Desired ratio at the Cluster level. What it indicates is
that most of the available resources are being used, consequently,
what’s available is not sufficient to schedule additional pods.
Therefore, Pods Available vs Desired ratio will also decrease.
Memory Requests vs Allocatable
The chart highlights the latest value returned by sum(avg(kubernetes.pod.resourceRequests.memBytes)) / sum(avg(kubernetes.node.allocatable.memBytes)).
What Is It?
The ratio between the number of bytes of memory requested by the pods scheduled on the node and the number of bytes of memory available. The upper bound shows the memory available to pods, which corresponds to the user-defined allocatable memory configuration.
For instance, the image below indicates the node has 62.8 GiB of memory
available, out of which, 37% is requested by the pods scheduled on the
Node:

What to Expect?
A healthy ratio falls under 80%.
Assuming all the nodes have the same amount of allocatable resources, a reasonable upper bound is (node_count - 1) / node_count x 100. For example, the upper bound is 90% if you have 10 nodes. Staying below this percentage protects your system against a node becoming unavailable.
What to Do Otherwise?
A low ratio indicates that the Node is underutilized. Drill up to
the corresponding cluster in the Clusters page to determine
whether the number of pods running is low, or if pods cannot run for
other reasons.
A high ratio indicates a potential risk of being unable to schedule
additional pods on the node.
Drill down to the Kubernetes Node Overview dashboard to
evaluate what Namespaces, Workloads, and pods are running.
Additionally, drill up in the Clusters page to evaluate
whether you are over-committing the memory resource.
Consequently, you don’t have enough resources to fulfill
requests, and pods might not be able to run. Consider adding
nodes or replacing nodes with more memory.
Can the Value be Higher than 100%?
Kubernetes schedules pods on nodes where sufficient allocatable
resources are available to fulfill the pod request. This means
Kubernetes does not allow having a total request higher than the
allocatable. Consequently, the ratio cannot be higher than 100%.
Over-committing (pods requesting more resources than are available) results in a high Requests vs Allocatable ratio at the
Nodes level and a low Pods Available vs Desired ratio at the Cluster
level. What it indicates is that most of the resources are being used,
consequently, what’s available is not sufficient to schedule additional
pods. Therefore, Pods Available vs Desired ratio will also decrease.
Network I/O
The chart shows the latest value returned by avg(avg(net.bytes.total)).
What Is It?
The sparkline shows the trend of network traffic (inbound and outbound)
for a Node. The number indicates the most recent rate of network traffic, expressed in bytes per second.

For reference, the sparklines show the following number of steps
(sampling):
Last hour: 6 steps, each for a 10-minute time slice
Last 6 hours: 12 steps, each for a 30-minute time slice
Last day: 12 steps, each for a 2-hour time slice
What to Expect?
The metric highly depends on what type of applications run on the Node.
You should expect some network activity for Kubernetes related
operations.
Drilling down to the Kubernetes Node Overview Dashboard in
Explore will provide additional details, such as network activity
across pods.
2.3 - Namespaces Data
This topic discusses the Namespaces Overview page and helps you
understand its gauge charts and the data displayed on them.
About Namespaces Overview
Namespaces
are virtual clusters on a physical cluster. They provide logical
separation between the teams and their environments. The Namespaces
Overview page provides key metrics indicating the health, capacity, and
performance of each Namespace in your cluster.

Interpret the Namespaces Data
This topic gives insight into the metrics displayed on the Namespaces
Overview screen.
Pod Restarts
The chart highlights the latest value returned by avg(timeAvg(kubernetes.pod.restart.rate)).
What Is It?
The sparkline shows the trend of pod restarts rate across all the pods
in a selected Namespace. The number shows the most recent rate of
restarts per second.

For instance, the image shows a rate of 0.04 restarts per second for the
last 2-hours, given the selected time window is one day. The trend also
suggests a non-flat pattern (periodic crashes).
Last hour: 6 steps, each for a 10-minute time slice
Last 6 hours: 12 steps, each for a 30-minute time slice
Last day: 12 steps, each for a 2-hour time slice
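As a rough illustration only (not necessarily how the agent computes kubernetes.pod.restart.rate), the rate in each sparkline time slice can be thought of as the restarts observed in that slice divided by the slice length in seconds:
def restart_rate(restart_count, slice_seconds):
    """Restarts per second for one sparkline time slice."""
    return restart_count / slice_seconds

print(restart_rate(3, 600))  # 3 restarts in a 10-minute slice -> 0.005 per second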
What to Expect?
Expect 0 restarts for any pod.
What to Do Otherwise?
A few restarts across the last one hour or larger time windows might not indicate a serious problem. In the event of a restart loop, identify the root cause as follows:
Drill down to the Workloads page in Overview to identify the
Workloads that have been stuck at a restart loop.
Drill down to the Kubernetes Namespace Overview to see a
detailed trend broken down by pods:

Pods Available vs Desired
The chart shows the latest value returned by sum(avg(kubernetes.namespace.pod.available.count)) / sum(avg(kubernetes.namespace.pod.desired.count)).
What Is It?
The chart displays the ratio between available and desired pods,
averaged across the selected time window, in a given Namespace.
The upper bound shows the number of desired pods in the namespace.
For instance, the image below shows 42 desired pods that are available:

What to Expect?
Expect 100% on the chart.
If certain pods take a significant amount of time to become available
due to delays (image pull time, pod initialization, readiness probe) you
might temporarily see a ratio lower than 100%.
What to Do Otherwise?
Identify one or more Workloads that have low availability by
drilling down to the Workloads page.
Once you identify the Workload, drill down to the related dashboard
in Explore. For example, Kubernetes Deployment Overview to
determine the trend and the state of the pods.
For instance, in the following image, the ratio is 98% (3.93 / 4 x
100). The decline is due to an update that caused pods to be
terminated and consequently to be started with a newer version.

CPU Used vs Requests
The chart shows the latest value returned by sum(avg(cpu.cores.used)) / sum(avg(kubernetes.pod.resourceRequests.cpuCores)).
What Is It?
The chart shows the ratio between the total CPU usage across all the
pods in the Namespace and the total CPU requested by all the pods.
The upper bound shows the total CPU requested by all the pods. The value
is expressed as the number of CPU cores.
For instance, the image below shows that the pods in a Namespace request
40 CPU cores, of which only 43% is being used (about 17 cores):

What to Expect?
The value you see depends on the type of Workloads running in the
Namespace.
Typically, values that fall between 80% and 120% are considered healthy. Values higher than 100% are considered healthy for a relatively short amount of time.
For applications whose resource usage is constant (such as background
processes), expect the ratio to be close to 100%.
For “bursty” applications, such as an API server, expect the ratio to be
less than 100%. Note that this value is averaged for the selected time
window, therefore, a usage spike would be compensated by an idle period.
What to Do Otherwise?
A low usage indicates that the application is not properly running (not
executing the expected functions) or the Workload configuration is not
accurate (requests are too high compared to what the pods actually
need).
A high usage indicates that the application is operating with a heavy
load or the workload configuration is not accurate (requests are too low
compared to what pods actually need).
In either case, drill down to the Workloads page to determine the
workload that requires a deeper analysis.
Can the Value Be Higher than 100%?
Yes, it can.
You can configure requests without limits, or requests lower than
the limits. In either case, you are allowing the containers to use
more resources than requested, typically to handle temporary
overloads.
Consider a Namespace with two Workloads with one pod each. Say, one
Workload is configured to request for 1 CPU core and uses 1 CPU core
(ratio of Used vs Request is 100%). The other Workload is
configured without any request and uses 1 CPU core. In this example,
2 CPU cores used to 1 CPU core requested ratio at the Namespace
level is 200%.
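A small Python sketch of that aggregation, showing how a workload with usage but no request pushes the namespace-level ratio above 100% (workload names are illustrative):
workloads = [
    {"name": "with-request", "cpu_used": 1.0, "cpu_requested": 1.0},
    {"name": "no-request",   "cpu_used": 1.0, "cpu_requested": 0.0},
]

used = sum(w["cpu_used"] for w in workloads)
requested = sum(w["cpu_requested"] for w in workloads)
print(f"CPU Used vs Requests: {100 * used / requested:.0f}%")  # 200%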
Memory Used vs Requests
The chart shows the latest value returned by sum(avg(memory.bytes.used)) / sum(avg(kubernetes.pod.resourceRequests.memBytes)).
What Is It?
The chart shows the ratio between the total memory usage across all pods
of the Namespace and the total memory requested by all pods.
The upper bound shows the total memory requested by all the pods,
expressed in a specified unit for bytes.
For instance, the image below shows that all the pods in the Namespace
request 120 GiB, of which only 24% is being used (about 29 GiB):

What to Expect?
It depends on the type of Workloads you run in the Namespace. Typically,
values that fall between 80% and 120% are considered healthy.
Values that are higher than 100% are considered normal for a relatively
short amount of time.
What to Do Otherwise?
A low usage indicates the application is not properly running (not
executing the expected functions) or the workload configuration is not
accurate (high requests compared to what the pods actually need).
A high usage indicates the application is operating with a high load or
the Workload configuration is not accurate (Fewer requests compared to
what the pods actually need).
Given the configured limits for the Workloads and the memory pressure on
the nodes, if the Workloads use more memory than what’s requested they
are at risk of eviction. See Exceed a Container’s
Limit
for more information.
In both cases, you may want to drill down to the Workloads page to
determine which Workload requires a deeper analysis.
Can the Value Be Higher than 100%?
Yes, it can.
You can configure requests without limits, or requests lower than
the limits. In either case, you are allowing the containers to use
more resources than requested, typically to handle temporary
overloads.
Consider a Namespace with two Workloads with one pod each. Say, one
Workload is configured to request for 1 GiB of memory and uses 1 GiB
(Used vs Request ratio is 100%). The other Workload is configured
without any request and uses 1 GiB. In this example, 2 GiB of Memory
Used to 1 GiB Requested ratio at the Namespace level is 200%.
Network I/O
The chart shows the latest value returned by avg(avg(net.bytes.total)).
What Is It?
The sparkline shows the trend of network traffic (inbound and outbound)
for all the pods in the Namespace. The number shows the most recent rate, expressed in bytes per second.
For reference, the sparklines show the following number of steps
(sampling):

Last hour: 6 steps, each for a 10-minute time slice
Last 6 hours: 12 steps, each for a 30-minute time slice
Last day: 12 steps, each for a 2-hour time slice
What to Expect?
The type of applications running in the Namespace determines the metrics.
Drilling down to the Kubernetes Namespace Overview Dashboard in
Explore provides additional details, such as network activity across
pods.
2.4 - Workloads Data
This topic discusses the Workloads Overview page and helps you
understand its gauge charts and the data displayed on them.
About Workloads Overview
Workloads, in Kubernetes terminology, refer to your containerized applications. Workloads comprise Deployments, StatefulSets, and DaemonSets within a Namespace.
In a Cluster, worker nodes run your application workloads, whereas the
master node provides the core Kubernetes services and orchestration for
application workloads. The Workloads Overview page provides the key
metrics indicating health, capacity, and compliance.

Interpret the Workloads Data
This topic gives insight into the metrics displayed on the Workloads
Overview page.
Pod Restarts
The chart displays the latest value returned by sum(timeAvg(kubernetes.pod.restart.rate)).
What Is It?
The sparkline shows the trend of Pod Restarts rate across all the pods
in a selected Workload. The number shows the most recent rate, expressed
in Restarts per Second.
For instance, the image below shows the trend for the last hour. The
number indicates that the rate of pod restarts is less than 0.01 for the
last 10 minutes.

For reference, the sparklines show the following number of steps
(sampling):
Last hour: 6 steps, each for a 10-minute time slice.
Last 6 hours: 12 steps, each for a 30-minute time slice.
Last day: 12 steps, each for a 2-hour time slice.
What to Expect?
A healthy pod will have 0 restarts at any given time.
What to Do Otherwise?
In most cases, fewer restarts in the last hour (or larger time windows)
do not indicate a serious problem. Drill down to the Kubernetes
Overview Dashboard related to the Workload in Explore. For
example, Kubernetes StatefulSet Overview provides a detailed trend
broken down by pods.

In this example, the number of restarts is constant (roughly every 5
minutes) and no pods are ready. This might indicate a crash loop
back-off.
Pods Available vs Desired
The chart shows the latest value returned by sum(avg(kubernetes.deployment.replicas.available)) / sum(avg(kubernetes.deployment.replicas.desired)).
What Is It?
The chart displays the ratio between available and desired pods,
averaged across the selected time window, for all the pods in a given
Workload.
The upper bound shows the number of desired pods in the Workload.
For instance, the image below shows all the 42 desired pods are
available.

What to Expect?
You should typically expect 100%.
If certain pods take a significant amount of time to become available
(image pull time, pod initialization, readiness probe), then you may
temporarily see a ratio lower than 100%.
What to Do Otherwise?
Determine the Workloads that have low availability by drilling down to
the related Dashboard in Explore. For example, the Kubernetes
Deployment Overview helps understand the trend and the state of the
pods.

For instance, the image above shows that the ratio is 98% (3.93 / 4 x
100). The slight decline is due to an update that caused pods to be
terminated and consequently to be started with a newer version.
CPU Used vs Requests
The chart shows the latest value returned by sum(avg(cpu.cores.used)) / sum(avg(kubernetes.pod.resourceRequests.cpuCores)).
What Is It?
The chart shows the ratio between the total CPU usage across all pods of
a selected Workload and the total CPU requested by all the pods.
The upper bound shows the total CPU requested by all the pods. The value
denotes the number of CPU cores.

In this image, the pods in the Workload request 40 CPU cores, of
which 43% is actually used (about 17 cores).
What to Expect?
It depends on the type of workload.
For applications (background processes) whose resource usage is
constant, expect the ratio to be around 100%.
For “bursty” applications, such as an API server, expect the ratio to be
lower than 100%. Note that the value is averaged for the selected time
window, therefore, a usage spike would be compensated by an idle period.
Generally, values between 80% and 120% are considered normal. Values
that are higher than 100% are deemed normal if observed only for a relatively short time.
What to Do Otherwise?
A low usage indicates that the application is not properly running
(not executing the expected functions) or the Workload configuration
is not accurate (requests are too high compared to what the pods
actually need).
A high usage indicates that the load is high for applications or the
Workload configuration is not accurate (low requests compared to
what the pods actually need).
In either case, drill down to the Kubernetes Overview Dashboard
corresponding to the Workload in Explore. For example, the
Kubernetes Deployment Overview Dashboard provides insight into
resource usage and configuration.
Can the Value Be Higher than 100%?
Yes, it can.
Configuring CPU requests without limits or requests lower than
limits is permissible. In these cases, you are allowing the
containers to use more resources than requested, typically to handle
temporary overloads.
Consider a Workload with two containers. Say, one container is
configured to request for 1 CPU core and uses 1 CPU core (Used vs
Request ratio is 100%). The other is configured without any request
and uses 1 CPU core. In this example, the 2 CPU core Used to 1 CPU
core Requested ratio is 200% at the Workload level.
What Does “No Data” Mean?
If the Workload is configured with no requests and limits, then the
Usage vs Requests ratio cannot be computed. In this case, the chart will
show “no data”. Drill down to the Dashboard in Explore to evaluate
the actual usage.
You must always configure requests. Setting requests helps to detect
Workloads that require reconfiguration.
Kubernetes itself might expose Workloads with no requests or limits
configured. For example, the kube-system
Namespace can have Workloads
without requests configured.
Memory Used vs Requests
The chart shows the latest value returned by sum(avg(memory.bytes.used)) / sum(avg(kubernetes.pod.resourceRequests.memBytes)).
What Is It?
The chart shows the ratio between the total memory usage across all the
pods in a Workload and the total memory requested by the Workload.
The upper bound shows the total memory requested by all the pods,
expressed in the specified unit of bytes.

For instance, the image shows that the pods in the selected Workload
request 120 GiB, of which 24% is actually used (about 29 GiB).
What to Expect?
The type of Workload determines the ratio. Values between 80% and 120%
are considered normal. Values that are higher than 100% are deemed normal if observed only for a relatively short time.
What to Do Otherwise?
A low memory usage indicates that the application is not properly
running (not executing the expected functions) or the Workload
configuration is not accurate (requests are too high compared to what
the pods actually need).
A high memory usage indicates that the load is higher for applications
or the Workload configuration is not accurate (low requests compared to
what the pods actually need).
Given the configured limits for the Workloads and the memory pressure on
the nodes, if the Workloads use more memory than what’s requested they
are at risk of eviction. For more information, see Container’s Memory
Limit.
In either case, drill down to the Workloads page to determine the
Workload that requires a deeper analysis.
Can the Value Be Higher than 100%?
Yes, it can.
Configuring memory requests without limits or requests lower than
limits is permissible. In these cases, you are allowing the
containers to use more resources than requested, typically to handle
temporary overloads.
Consider a Workload with two containers. Say, one container is
configured to request for 1 GiB of memory and uses 1 GiB (Used vs
Request ratio is 100%), while the other is configured without any
request and uses 1 GiB of memory. In this example, the 2 GiB of
memory used to 1 GiB requested ratio is 200% at the Workload level.
What Does “No Data” Mean?
If the Workload is configured with no memory requests and limits, then
the Usage vs Requests ratio cannot be computed. In this case, the chart
will show “no data”. Drill down to the Dashboard in Explore to
evaluate the actual usage.
You must configure requests. It helps to detect Workloads that require
reconfiguration.
Kubernetes itself might expose Workloads with no requests or limits
configured. For example, the kube-system
Namespace can have Workloads
without requests configured.
Network I/O
The chart shows the latest value returned by avg(avg(net.bytes.total)).
What Is It?
The sparkline shows the trend of network traffic (inbound and outbound)
for the Workload. The number shows the most recent rate, expressed in
bytes per second in a specific unit.

For reference, the sparklines show the following number of steps
(sampling):
Last hour: 6 steps, each for a 10-minute time slice
Last 6 hours: 12 steps, each for a 30-minute time slice
Last day: 12 steps, each for a 2-hour time slice
What to Expect?
The type of application running in the Workload determines the metrics.
Drill down to the Kubernetes Overview Dashboard corresponding to the
Workload in Explore. For example, the Kubernetes Deployment
Overview Dashboard provides additional details, such as network
activity across pods.