Troubleshooting Metrics

Troubleshooting metrics are agent metrics that provide deep visibility into system behavior. Metrics at 10-second granularity are retained for 7 days, whereas troubleshooting metrics are retained for 4 days. Troubleshooting metrics include program metrics, connection-level network metrics, Kubernetes troubleshooting metrics, HTTP URL metrics, and some SQL metrics. This page lists the available troubleshooting metrics and the labels you can use to segment them.

Program-Level Metrics

Program-level metrics are collected for individual programs and can be segmented by program name or command line.

  • sysdig_program_cpu_cores_used
  • sysdig_program_cpu_cores_used_percent
  • sysdig_program_cpu_used_percent
  • sysdig_program_memory_used_bytes
  • sysdig_program_net_in_bytes
  • sysdig_program_net_out_bytes
  • sysdig_program_net_connection_in_count
  • sysdig_program_net_connection_out_count
  • sysdig_program_net_connection_total_count
  • sysdig_program_net_error_count
  • sysdig_program_net_request_count
  • sysdig_program_net_request_in_count
  • sysdig_program_net_request_out_count
  • sysdig_program_net_request_time
  • sysdig_program_net_request_in_time
  • sysdig_program_net_tcp_queue_len
  • sysdig_program_proc_count
  • sysdig_program_thread_count
  • sysdig_program_up

In addition to the user-defined labels and standard set of labels Sysdig provides, you can use the labels program_cmd_line and program_name to segment program metrics.
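As an illustration of segmenting by these labels, the following minimal sketch builds a PromQL expression that ranks programs by a program-level metric. The helper name top_programs_query is hypothetical, and the sketch assumes you query these metrics through a PromQL-compatible interface; it only builds the query string and does not contact any service.

```python
def top_programs_query(metric: str, k: int = 5, label: str = "program_name") -> str:
    """Return a PromQL expression ranking programs by a program-level metric.

    `label` must be program_name or program_cmd_line, the two
    program-specific labels named on this page.
    """
    if label not in ("program_name", "program_cmd_line"):
        raise ValueError(f"unsupported program label: {label}")
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sum across all other labels, then keep the k largest series.
    return f"topk({k}, sum by ({label}) ({metric}))"


query = top_programs_query("sysdig_program_memory_used_bytes", k=3)
print(query)
```

Passing label="program_cmd_line" instead separates invocations of the same program that were started with different arguments.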

Connection-Level Network Metrics

Connection-level metrics are based on individual TCP connections. Aggregated network traffic metrics, such as those beginning with sysdig_container_net_, do not fall under this category.

  • sysdig_connection_net_in_bytes
  • sysdig_connection_net_out_bytes
  • sysdig_connection_net_total_bytes
  • sysdig_connection_net_connection_in_count
  • sysdig_connection_net_connection_out_count
  • sysdig_connection_net_connection_total_count
  • sysdig_connection_net_request_in_count
  • sysdig_connection_net_request_out_count
  • sysdig_connection_net_request_count
  • sysdig_connection_net_request_in_time
  • sysdig_connection_net_request_out_time
  • sysdig_connection_net_request_time

In addition to the user-defined labels and standard set of labels Sysdig provides, you can use the following labels to segment connection-level metrics: net_local_service, net_remote_service, net_local_endpoint, net_remote_endpoint, net_client_ip, net_server_ip, and net_protocol.
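The sketch below shows one way to filter a connection-level metric by these labels when composing a PromQL query. The helper names and the address 10.0.0.5 are hypothetical, chosen only for illustration; the set of accepted labels is taken from the list above.

```python
CONNECTION_LABELS = frozenset({
    "net_local_service", "net_remote_service",
    "net_local_endpoint", "net_remote_endpoint",
    "net_client_ip", "net_server_ip", "net_protocol",
})


def escape_label_value(value: str) -> str:
    """Escape backslashes and double quotes for a PromQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def connection_selector(metric: str, **matchers: str) -> str:
    """Build a PromQL selector filtered by connection-level labels."""
    unknown = set(matchers) - CONNECTION_LABELS
    if unknown:
        raise ValueError(f"not a connection-level label: {sorted(unknown)}")
    parts = ", ".join(
        f'{name}="{escape_label_value(value)}"'
        for name, value in sorted(matchers.items())
    )
    return f"{metric}{{{parts}}}" if parts else metric


# 10.0.0.5 is a made-up server address used only for illustration.
selector = connection_selector(
    "sysdig_connection_net_total_bytes", net_server_ip="10.0.0.5"
)
# Total bytes exchanged with that server, broken down by client.
query = f"sum by (net_client_ip) ({selector})"
print(query)
```

Rejecting unknown label names early keeps a typo from silently producing a query that matches no series.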

Kubernetes Troubleshooting Metrics

Kubernetes troubleshooting metrics relate to Kubernetes itself, such as the state of pods, containers, and workloads. They are similar to the open-source kube-state-metrics (KSM) Prometheus metrics, but are enriched by Sysdig for easier querying.

  • kube_workload_status_replicas_misscheduled
  • kube_workload_status_replicas_scheduled
  • kube_workload_status_replicas_updated
  • kube_pod_container_status_last_terminated_reason
  • kube_pod_container_status_ready
  • kube_pod_container_status_restarts_total
  • kube_pod_container_status_running
  • kube_pod_container_status_terminated
  • kube_pod_container_status_terminated_reason
  • kube_pod_container_status_waiting
  • kube_pod_container_status_waiting_reason
  • kube_pod_init_container_status_last_terminated_reason
  • kube_pod_init_container_status_ready
  • kube_pod_init_container_status_restarts_total
  • kube_pod_init_container_status_running
  • kube_pod_init_container_status_terminated
  • kube_pod_init_container_status_terminated_reason
  • kube_pod_init_container_status_waiting
  • kube_pod_init_container_status_waiting_reason
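Because troubleshooting metrics are retained for 4 days, a query over a longer window cannot be fully answered from this data. The following sketch builds a PromQL expression for containers that restarted within a given window and rejects windows beyond that retention. The helper name is hypothetical, and the sketch assumes kube_pod_container_status_restarts_total behaves as a monotonically increasing counter, as its open-source counterpart does.

```python
from datetime import timedelta

# Retention of troubleshooting metrics as stated on this page.
TROUBLESHOOTING_RETENTION = timedelta(days=4)


def restart_increase_query(window: timedelta) -> str:
    """Build a PromQL expression for containers restarted within `window`."""
    if window < timedelta(seconds=1):
        raise ValueError("window must be at least one second")
    if window > TROUBLESHOOTING_RETENTION:
        raise ValueError("window exceeds the 4-day troubleshooting retention")
    seconds = int(window.total_seconds())
    # Assumption: the metric is a counter, so increase() is meaningful.
    return f"increase(kube_pod_container_status_restarts_total[{seconds}s]) > 0"


query = restart_increase_query(timedelta(hours=6))
print(query)
```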

HTTP URL Metrics

HTTP URL metrics relate to HTTP requests, aggregated per individual URL.

  • sysdig_host_net_http_url_error_count
  • sysdig_host_net_http_url_request_count
  • sysdig_host_net_http_url_request_time
  • sysdig_container_net_http_url_error_count
  • sysdig_container_net_http_url_request_count
  • sysdig_container_net_http_url_request_time

In addition to the user-defined labels and standard set of labels Sysdig provides, you can use the label net_http_url to segment HTTP URL level metrics.
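One common use of the net_http_url label is to compare errors against requests for each URL. The sketch below composes such a PromQL expression for either the host or the container variant of the metrics. The helper name is hypothetical, and the division assumes the error and request counts are reported over the same intervals.

```python
def url_error_ratio_query(scope: str = "container") -> str:
    """Build a PromQL expression for the per-URL error ratio.

    `scope` selects the host-level or container-level metric family.
    """
    if scope not in ("host", "container"):
        raise ValueError(f"scope must be 'host' or 'container', got {scope!r}")
    prefix = f"sysdig_{scope}_net_http_url"
    # Aggregate both sides by URL so the division matches series one to one.
    errors = f"sum by (net_http_url) ({prefix}_error_count)"
    requests = f"sum by (net_http_url) ({prefix}_request_count)"
    return f"{errors} / {requests}"


query = url_error_ratio_query("host")
print(query)
```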

SQL Query Metrics

SQL query metrics are created per SQL query. The Sysdig agent detects SQL query requests in network traffic and calculates metrics from them. These metrics do not cover queries sent over encrypted connections (for example, TLS).

  • sysdig_host_net_sql_query_error_count
  • sysdig_host_net_sql_query_request_count
  • sysdig_host_net_sql_query_request_time
  • sysdig_host_net_sql_querytype_error_count
  • sysdig_host_net_sql_querytype_request_count
  • sysdig_host_net_sql_querytype_request_time
  • sysdig_container_net_sql_query_error_count
  • sysdig_container_net_sql_query_request_count
  • sysdig_container_net_sql_query_request_time
  • sysdig_container_net_sql_querytype_error_count
  • sysdig_container_net_sql_querytype_request_count
  • sysdig_container_net_sql_querytype_request_time

In addition to the user-defined labels and standard set of labels Sysdig provides, you can use the label net_sql_querytype to segment SQL Query metrics by query type.
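The twelve metric names listed above follow a regular pattern: a scope (host or container), a granularity (query or querytype), and a measure. The sketch below reconstructs the names from those three parts and builds a PromQL expression segmented by net_sql_querytype. The helper name is hypothetical; the pattern is inferred from the list on this page rather than from a published naming rule.

```python
from itertools import product

SCOPES = ("host", "container")
GRANULARITIES = ("query", "querytype")
MEASURES = ("error_count", "request_count", "request_time")


def sql_metric_name(scope: str, granularity: str, measure: str) -> str:
    """Compose a SQL troubleshooting metric name from its three parts."""
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity}")
    if measure not in MEASURES:
        raise ValueError(f"unknown measure: {measure}")
    return f"sysdig_{scope}_net_sql_{granularity}_{measure}"


# Every combination corresponds to one of the metrics listed above.
all_names = [sql_metric_name(*combo) for combo in product(SCOPES, GRANULARITIES, MEASURES)]

# Request count per query type, using the container-level metric.
metric = sql_metric_name("container", "querytype", "request_count")
query = f"sum by (net_sql_querytype) ({metric})"
print(query)
```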