Portworx

Metrics, Dashboards, Alerts and more for Portworx Integration in Sysdig Monitor.

This integration is enabled by default.

Versions supported: > v2.9.1.1

This integration is out-of-the-box, so it doesn’t require any exporter.

This integration has 75 metrics.

Timeseries generated: 1090 timeseries

List of Alerts

| Alert | Description | Format |
|---|---|---|
| [Portworx] No Quorum | Portworx No Quorum. | Prometheus |
| [Portworx] Node Status Not OK | Portworx Node Status Not OK. | Prometheus |
| [Portworx] Offline Nodes | Portworx Offline Nodes. | Prometheus |
| [Portworx] Nodes Storage Full or Down | Portworx Nodes Storage Full or Down. | Prometheus |
| [Portworx] Offline Storage Nodes | Portworx Offline Storage Nodes. | Prometheus |
| [Portworx] Unhealthy Node KVDB | Portworx Unhealthy Node KVDB. | Prometheus |
| [Portworx] Cache read hit rate is low | Portworx Cache read hit rate is low. | Prometheus |
| [Portworx] Cache write hit rate is low | Portworx Cache write hit rate is low. | Prometheus |
| [Portworx] High Read Latency In Disk | Portworx High Read Latency In Disk. | Prometheus |
| [Portworx] High Write Latency In Disk | Portworx High Write Latency In Disk. | Prometheus |
| [Portworx] Low Cluster Capacity | Portworx Low Cluster Capacity. | Prometheus |
| [Portworx] Disk Full In 48H | Portworx Disk Full In 48H. | Prometheus |
| [Portworx] Disk Full In 12H | Portworx Disk Full In 12H. | Prometheus |
| [Portworx] Pool Status Not Online | Portworx Node Status Not Online. | Prometheus |
| [Portworx] High Write Latency In Pool | Portworx High Write Latency In Pool. | Prometheus |
| [Portworx] Pool Full In 48H | Portworx Pool Full In 48H. | Prometheus |
| [Portworx] Pool Full In 12H | Portworx Pool Full In 12H. | Prometheus |
| [Portworx] High Write Latency In Volume | Portworx High Write Latency In Volume. | Prometheus |
| [Portworx] High Read Latency In Volume | Portworx High Read Latency In Volume. | Prometheus |
| [Portworx] License Expiry | Portworx License Expiry. | Prometheus |

List of Dashboards

Portworx Cluster

The dashboard provides information on the status of the Portworx cluster.

Portworx Volumes

The dashboard provides information on the status of the Portworx volumes.

List of Metrics

Metric name
go_build_info
go_gc_duration_seconds
go_gc_duration_seconds_count
go_gc_duration_seconds_sum
go_goroutines
go_memstats_buck_hash_sys_bytes
go_memstats_gc_sys_bytes
go_memstats_heap_alloc_bytes
go_memstats_heap_idle_bytes
go_memstats_heap_inuse_bytes
go_memstats_heap_released_bytes
go_memstats_heap_sys_bytes
go_memstats_lookups_total
go_memstats_mallocs_total
go_memstats_mcache_inuse_bytes
go_memstats_mcache_sys_bytes
go_memstats_mspan_inuse_bytes
go_memstats_mspan_sys_bytes
go_memstats_next_gc_bytes
go_memstats_stack_inuse_bytes
go_memstats_stack_sys_bytes
go_memstats_sys_bytes
go_threads
process_cpu_seconds_total
process_max_fds
process_open_fds
px_cluster_disk_available_bytes
px_cluster_disk_total_bytes
px_cluster_status_nodes_offline
px_cluster_status_nodes_online
px_cluster_status_nodes_storage_down
px_cluster_status_quorum
px_cluster_status_size
px_cluster_status_storage_nodes_decommissioned
px_cluster_status_storage_nodes_offline
px_cluster_status_storage_nodes_online
px_disk_stats_num_reads_total
px_disk_stats_num_writes_total
px_disk_stats_read_bytes_total
px_disk_stats_read_latency_seconds
px_disk_stats_used_bytes
px_disk_stats_write_latency_seconds
px_disk_stats_written_bytes_total
px_kvdb_health_state_node_view
px_network_io_received_bytes_total
px_network_io_sent_bytes_total
px_node_status_license_expiry
px_node_status_node_status
px_pool_stats_available_bytes
px_pool_stats_flushed_bytes_total
px_pool_stats_num_flushes_total
px_pool_stats_num_writes
px_pool_stats_status
px_pool_stats_total_bytes
px_pool_stats_write_latency_seconds
px_pool_stats_written_bytes
px_px_cache_read_hits
px_px_cache_read_miss
px_px_cache_write_hits
px_px_cache_write_miss
px_volume_attached
px_volume_attached_state
px_volume_capacity_bytes
px_volume_currhalevel
px_volume_halevel
px_volume_read_bytes_total
px_volume_read_latency_seconds
px_volume_reads_total
px_volume_replication_status
px_volume_state
px_volume_status
px_volume_usage_bytes
px_volume_write_latency_seconds
px_volume_writes_total
px_volume_written_bytes_total

Preparing the Integration

No preparations are required for this integration.

Installing

The installation of an exporter is not required for this integration.

Monitoring and Troubleshooting Portworx

This document describes important metrics and queries that you can use to monitor and troubleshoot Portworx.

Tracking metrics status

You can track the status of Portworx metrics with the following alerts:

Exporter process is not serving metrics

# [Portworx] Exporter Process Down
absent(px_cluster_status_size{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0

Exporter process is not serving metrics

# [Portworx] Exporter Process Down
absent(px_volume_state{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0
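
The same "no series returned" condition can also be checked outside an alert rule. The following is a minimal sketch, assuming you have already fetched the JSON body of a Prometheus-compatible instant query (`/api/v1/query`) for the bare metric selector, for example `px_cluster_status_size`. The function name `metric_is_absent` and the sample payloads are hypothetical and only illustrate the response shape.

```python
from typing import Any, Mapping


def metric_is_absent(response: Mapping[str, Any]) -> bool:
    """Return True when an instant-query response contains no series.

    Assumption: `response` is the decoded JSON body of a Prometheus-style
    instant query for the metric selector itself (not the absent() expression).
    """
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    # An empty result vector means no series matched the selector,
    # which is the condition the absent() alerts above fire on.
    return len(response["data"]["result"]) == 0


# Hypothetical payloads, shaped like Prometheus instant-query responses.
no_series = {"status": "success", "data": {"resultType": "vector", "result": []}}
one_series = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "px_cluster_status_size"}, "value": [1700000000, "3"]}
        ],
    },
}

print(metric_is_absent(no_series))
print(metric_is_absent(one_series))
```

If the first check is true for both `px_cluster_status_size` and `px_volume_state`, the scrape target is probably not being reached at all, so the agent jobs are a reasonable place to look next.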

Agent Configuration

These are the default agent jobs for this integration:

- job_name: 'portworx-default'
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: (portworx);(.{0}$)
    replacement: portworx
    target_label: __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: "portworx"
  - action: replace
    source_labels: [__address__]
    regex: ([^:]+)(?::\d+)?
    replacement: $1:9001
    target_label: __address__
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: (px_cluster_disk_available_bytes|px_cluster_disk_total_bytes|px_cluster_status_nodes_offline|px_cluster_status_nodes_online|px_cluster_status_nodes_storage_down|px_cluster_status_quorum|px_cluster_status_size|px_cluster_status_storage_nodes_decommissioned|px_cluster_status_storage_nodes_offline|px_cluster_status_storage_nodes_online|px_disk_stats_num_reads_total|px_disk_stats_num_writes_total|px_disk_stats_read_bytes_total|px_disk_stats_read_latency_seconds|px_disk_stats_used_bytes|px_disk_stats_write_latency_seconds|px_disk_stats_written_bytes_total|px_kvdb_health_state_node_view|px_network_io_received_bytes_total|px_network_io_sent_bytes_total|px_node_status_license_expiry|px_node_status_node_status|px_pool_stats_available_bytes|px_pool_stats_flushed_bytes_total|px_pool_stats_num_flushes_total|px_pool_stats_num_writes|px_pool_stats_status|px_pool_stats_total_bytes|px_pool_stats_write_latency_seconds|px_pool_stats_written_bytes|px_px_cache_read_hits|px_px_cache_read_miss|px_px_cache_write_hits|px_px_cache_write_miss|px_volume_attached|px_volume_attached_state|px_volume_capacity_bytes|px_volume_currhalevel|px_volume_halevel|px_volume_read_bytes_total|px_volume_read_latency_seconds|px_volume_reads_total|px_volume_replication_status|px_volume_state|px_volume_status|px_volume_usage_bytes|px_volume_write_latency_seconds|px_volume_writes_total|px_volume_written_bytes_total)
    action: keep


- job_name: 'portworx-openshift-default'
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: (portworx);(.{0}$)
    replacement: portworx
    target_label: __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: "portworx"
  - action: replace
    source_labels: [__address__]
    regex: ([^:]+)(?::\d+)?
    replacement: $1:17001
    target_label: __address__
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: (go_build_info|go_gc_duration_seconds|go_gc_duration_seconds_count|go_gc_duration_seconds_sum|go_goroutines|go_memstats_buck_hash_sys_bytes|go_memstats_gc_sys_bytes|go_memstats_heap_alloc_bytes|go_memstats_heap_idle_bytes|go_memstats_heap_inuse_bytes|go_memstats_heap_released_bytes|go_memstats_heap_sys_bytes|go_memstats_lookups_total|go_memstats_mallocs_total|go_memstats_mcache_inuse_bytes|go_memstats_mcache_sys_bytes|go_memstats_mspan_inuse_bytes|go_memstats_mspan_sys_bytes|go_memstats_next_gc_bytes|go_memstats_stack_inuse_bytes|go_memstats_stack_sys_bytes|go_memstats_sys_bytes|go_threads|process_cpu_seconds_total|process_max_fds|process_open_fds|px_cluster_disk_available_bytes|px_cluster_disk_total_bytes|px_cluster_status_nodes_offline|px_cluster_status_nodes_online|px_cluster_status_nodes_storage_down|px_cluster_status_quorum|px_cluster_status_size|px_cluster_status_storage_nodes_decommissioned|px_cluster_status_storage_nodes_offline|px_cluster_status_storage_nodes_online|px_disk_stats_num_reads_total|px_disk_stats_num_writes_total|px_disk_stats_read_bytes_total|px_disk_stats_read_latency_seconds|px_disk_stats_used_bytes|px_disk_stats_write_latency_seconds|px_disk_stats_written_bytes_total|px_kvdb_health_state_node_view|px_network_io_received_bytes_total|px_network_io_sent_bytes_total|px_node_status_license_expiry|px_node_status_node_status|px_pool_stats_available_bytes|px_pool_stats_flushed_bytes_total|px_pool_stats_num_flushes_total|px_pool_stats_num_writes|px_pool_stats_status|px_pool_stats_total_bytes|px_pool_stats_write_latency_seconds|px_pool_stats_written_bytes|px_px_cache_read_hits|px_px_cache_read_miss|px_px_cache_write_hits|px_px_cache_write_miss|px_volume_attached|px_volume_attached_state|px_volume_capacity_bytes|px_volume_currhalevel|px_volume_halevel|px_volume_read_bytes_total|px_volume_read_latency_seconds|px_volume_reads_total|px_volume_replication_status|px_volume_state|px_volume_status|px_volume_usage_bytes|px_volume_write_latency_seconds|px_volume_writes_total|px_volume_written_bytes_total)
    action: keep
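
To make the relabeling rules in the jobs above easier to follow, here is a minimal Python sketch of three of them: the `__address__` rewrite (port 9001 in `portworx-default`, 17001 in `portworx-openshift-default`), the rule that sets the integration type when the annotation is empty, and the `keep` rule that follows it. It assumes Prometheus-style semantics, where source labels are joined with `;` and the regex must match the whole joined string. The function names are hypothetical, and Python's `re` stands in for the RE2 engine Prometheus uses, which behaves the same for these particular patterns.

```python
import re

# Relabel regexes are anchored on both ends, so fullmatch is the closest analogue.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?")
INTEGRATION_RE = re.compile(r"(portworx);(.{0}$)")


def rewrite_address(address: str, port: int) -> str:
    """Mimic the `replace` rule on __address__: keep the host, force the port."""
    match = ADDRESS_RE.fullmatch(address)
    if match is None:
        return address  # a non-matching replace rule leaves the label unchanged
    return f"{match.group(1)}:{port}"


def integration_type(container_name: str, annotation: str) -> str:
    """Mimic the rule that tags a pod as portworx when the annotation is unset."""
    joined = f"{container_name};{annotation}"  # source labels are joined with ';'
    if INTEGRATION_RE.fullmatch(joined):
        return "portworx"
    return annotation


def is_scraped(container_name: str, annotation: str) -> bool:
    """Mimic the `keep` rule on the integration-type annotation."""
    return integration_type(container_name, annotation) == "portworx"


print(rewrite_address("10.0.0.5:8080", 9001))
print(rewrite_address("10.0.0.5", 17001))
print(is_scraped("portworx", ""))
print(is_scraped("portworx", "other"))
```

Under these assumptions, a container named `portworx` with no integration-type annotation is tagged and kept, while a pod whose annotation is already set to a different value is left alone and then dropped by the `keep` rule.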