Elasticsearch

Metrics, Dashboards, Alerts and more for Elasticsearch Integration in Sysdig Monitor.

This integration is enabled by default.

Versions supported: > v6.8

This integration uses a standalone exporter that is available with a UBI or scratch base image.

This integration has 28 metrics.

Timeseries generated: 400 timeseries

List of Alerts

Alert | Description | Format
[Elasticsearch] Heap Usage Too High | The heap usage is over 90% | Prometheus
[Elasticsearch] Heap Usage Warning | The heap usage is over 80% | Prometheus
[Elasticsearch] Disk Space Low | Disk available less than 20% | Prometheus
[Elasticsearch] Disk Out Of Space | Disk available less than 10% | Prometheus
[Elasticsearch] Cluster Red | Cluster in Red status | Prometheus
[Elasticsearch] Cluster Yellow | Cluster in Yellow status | Prometheus
[Elasticsearch] Relocation Shards | Relocating shards for too long | Prometheus
[Elasticsearch] Initializing Shards | Initializing shards takes too long | Prometheus
[Elasticsearch] Unassigned Shards | Unassigned shards for long time | Prometheus
[Elasticsearch] Pending Tasks | Elasticsearch has a high number of pending tasks | Prometheus
[Elasticsearch] No New Documents | Elasticsearch has no new documents for a period of time | Prometheus
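
The percentage thresholds in these alerts are ratios of two byte counts. The exact alert expressions are not shown in this document, so the following is only a sketch of the arithmetic. It assumes that heap usage is derived from elasticsearch_jvm_memory_used_bytes and elasticsearch_jvm_memory_max_bytes, and disk availability from elasticsearch_filesystem_data_available_bytes and elasticsearch_filesystem_data_size_bytes. percent_of is a hypothetical helper name and the sample values are made up.

```shell
# Hypothetical helper: print $1 as a percentage of $2 with one decimal place.
percent_of() {
  awk -v part="$1" -v total="$2" 'BEGIN {
    if (total == 0) exit 1          # avoid division by zero
    printf "%.1f\n", part * 100 / total
  }'
}

# Made-up sample: 15 GiB available on a 100 GiB data path.
percent_of 16106127360 107374182400
```

With these sample values the available space is 15% of the total, which would fall under the 20% threshold of the Disk Space Low alert but not under the 10% threshold of Disk Out Of Space.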

List of Dashboards

ElasticSearch Cluster

The dashboard provides information on the health status of the Elasticsearch cluster and its resource usage.

ElasticSearch Infra

The dashboard provides information on the CPU, memory, disk, and network usage of Elasticsearch.

List of Metrics

Metric name
elasticsearch_cluster_health_active_primary_shards
elasticsearch_cluster_health_active_shards
elasticsearch_cluster_health_initializing_shards
elasticsearch_cluster_health_number_of_data_nodes
elasticsearch_cluster_health_number_of_nodes
elasticsearch_cluster_health_number_of_pending_tasks
elasticsearch_cluster_health_relocating_shards
elasticsearch_cluster_health_status
elasticsearch_cluster_health_unassigned_shards
elasticsearch_filesystem_data_available_bytes
elasticsearch_filesystem_data_size_bytes
elasticsearch_indices_docs
elasticsearch_indices_indexing_index_time_seconds_total
elasticsearch_indices_indexing_index_total
elasticsearch_indices_merges_total_time_seconds_total
elasticsearch_indices_search_query_time_seconds
elasticsearch_indices_store_throttle_time_seconds_total
elasticsearch_jvm_gc_collection_seconds_count
elasticsearch_jvm_gc_collection_seconds_sum
elasticsearch_jvm_memory_committed_bytes
elasticsearch_jvm_memory_max_bytes
elasticsearch_jvm_memory_used_bytes
elasticsearch_os_load1
elasticsearch_os_load15
elasticsearch_os_load5
elasticsearch_process_cpu_percent
elasticsearch_transport_rx_size_bytes_total
elasticsearch_transport_tx_size_bytes_total

Preparing the Integration

Create the Secrets

Keep in mind:

  • If your Elasticsearch cluster uses basic authentication, the secret that contains the URL must include the username and password.
  • Create the secrets in the same namespace where the exporter will be deployed.
  • Use the same username and password that you use for the API.
  • You can change the name of the secret. If you do, select the new name in the later steps of the integration.

Create the Secret for the URL

Without Authentication
kubectl -n Your-Application-Namespace create secret generic elastic-url-secret \
  --from-literal=url='http://SERVICE:PORT'
With Basic Auth
kubectl -n Your-Application-Namespace create secret generic elastic-url-secret \
  --from-literal=url='https://USERNAME:PASSWORD@SERVICE:PORT'

NOTE: You can use either http or https in the URL.
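
To confirm that the secret holds the URL you intended, you can read it back and decode it. The following is a minimal sketch, assuming the secret name elastic-url-secret and the url key used above. show_secret_url is a hypothetical helper name, not part of the integration.

```shell
# Hypothetical helper: print the decoded "url" key of a secret.
# $1: namespace where the exporter will be deployed
# $2: secret name (elastic-url-secret unless you renamed it)
show_secret_url() {
  kubectl -n "$1" get secret "$2" -o jsonpath='{.data.url}' | base64 --decode
}
```

Calling show_secret_url Your-Application-Namespace elastic-url-secret should print the URL exactly as it was passed to --from-literal. With basic authentication the output includes the password in clear text, so avoid running it in shared terminals or logs.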

Create the Secret for the TLS Certs

If you are using HTTPS with custom certificates, create a secret that contains the certificate files:

kubectl create -n Your-Application-Namespace secret generic elastic-tls-secret \
  --from-file=root-ca.crt=/path/to/tls/ca-cert \
  --from-file=root-ca.key=/path/to/tls/ca-key \
  --from-file=root-ca.pem=/path/to/tls/ca-pem

Installing

Sysdig Monitor provides an automated wizard under Monitoring Integrations. Expert users can instead install the exporter with this Helm chart: https://github.com/sysdiglabs/integrations-charts/tree/main/charts/elasticsearch-exporter

Monitoring and Troubleshooting Elasticsearch

This document describes important metrics and queries that you can use to monitor and troubleshoot Elasticsearch.

Tracking metrics status

You can track the status of Elasticsearch metrics with the following alerts:

Exporter process is not serving metrics

# [Elasticsearch] Exporter Process Down
absent(elasticsearch_cluster_health_status{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0

Exporter process is not serving metrics

# [Elasticsearch] Exporter Process Down
absent(elasticsearch_process_cpu_percent{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0
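
When one of these alerts fires, a quick manual check is to scrape the exporter directly and look for the metric. This is a sketch under assumptions: the endpoint URL is whatever address and port your exporter pod exposes (not specified in this document), and serves_metric is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the endpoint serves at least one sample of a metric.
# $1: exporter metrics URL (assumption, for example http://EXPORTER_HOST:PORT/metrics)
# $2: metric name, such as elasticsearch_cluster_health_status
serves_metric() {
  # Fail early if the endpoint cannot be reached or returns an HTTP error.
  body=$(curl -fsS "$1") || return 1
  # A sample line starts with the metric name followed by "{" or a space.
  printf '%s\n' "$body" | grep -q "^${2}[{ ]"
}
```

A non-zero exit status means either that the endpoint could not be scraped or that the metric is missing from the response, which are the two situations the absent() queries above are meant to catch.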

Agent Configuration

This is the default agent job for this integration:

- job_name: elasticsearch-default
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: "elasticsearch"
  - action: replace
    source_labels: [__address__, __meta_kubernetes_pod_annotation_promcat_sysdig_com_port]
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_target_ns]
    target_label: kube_namespace_name
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_target_workload_type]
    target_label: kube_workload_type
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_target_workload_name]
    target_label: kube_workload_name
  - action: replace
    replacement: true
    target_label: sysdig_omit_source
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: (elasticsearch_cluster_health_active_primary_shards|elasticsearch_cluster_health_active_shards|elasticsearch_cluster_health_initializing_shards|elasticsearch_cluster_health_number_of_data_nodes|elasticsearch_cluster_health_number_of_nodes|elasticsearch_cluster_health_number_of_pending_tasks|elasticsearch_cluster_health_relocating_shards|elasticsearch_cluster_health_status|elasticsearch_cluster_health_unassigned_shards|elasticsearch_filesystem_data_available_bytes|elasticsearch_filesystem_data_size_bytes|elasticsearch_indices_docs|elasticsearch_indices_indexing_index_time_seconds_total|elasticsearch_indices_indexing_index_total|elasticsearch_indices_merges_total_time_seconds_total|elasticsearch_indices_search_query_time_seconds|elasticsearch_indices_store_throttle_time_seconds_total|elasticsearch_jvm_gc_collection_seconds_count|elasticsearch_jvm_gc_collection_seconds_sum|elasticsearch_jvm_memory_committed_bytes|elasticsearch_jvm_memory_max_bytes|elasticsearch_jvm_memory_pool_peak_used_bytes|elasticsearch_jvm_memory_used_bytes|elasticsearch_os_load1|elasticsearch_os_load15|elasticsearch_os_load5|elasticsearch_process_cpu_percent|elasticsearch_transport_rx_size_bytes_total|elasticsearch_transport_tx_size_bytes_total)
    action: keep
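
The __address__ rewrite in relabel_configs can be hard to read at a glance. Prometheus joins the source_labels values with ; and matches the regex against the whole string, so the rule replaces any port in the pod address with the port taken from the promcat_sysdig_com_port annotation label. The sketch below reproduces that substitution with sed for illustration only. The extended regular expression is a hand translation of the pattern in the job, rewrite_address is a hypothetical name, and the sample address and ports are made up.

```shell
# Illustration only: mimic the __address__ relabel rule with sed.
# Input is "<address>;<annotation port>", the way Prometheus joins source_labels.
rewrite_address() {
  printf '%s\n' "$1" | sed -E 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/'
}

rewrite_address '10.0.0.5:9200;9108'   # address that already has a port
rewrite_address '10.0.0.5;9108'        # address without a port
```

Both calls print 10.0.0.5:9108, which shows that the annotation port wins regardless of whether the discovered address already carried one.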