Fluentd

Metrics, Dashboards, Alerts and more for Fluentd Integration in Sysdig Monitor.

This integration is enabled by default.

Versions supported: > v1.12.4

This integration works out of the box, so it does not require an exporter.

This integration has 12 metrics.

Timeseries generated: 640 timeseries

List of Alerts

Alert | Description | Format
[Fluentd] No Input From Container | No Input From Container. This alert does not work in OpenShift. | Prometheus
[Fluentd] High Error Ratio | High Error Ratio. | Prometheus
[Fluentd] High Retry Ratio | High Retry Ratio. | Prometheus
[Fluentd] High Retry Wait | High Retry Wait. | Prometheus
[Fluentd] Low Buffer Available Space | Low Buffer Available Space. | Prometheus
[Fluentd] Buffer Queue Length Increasing | Buffer Queue Length Increasing. | Prometheus
[Fluentd] Buffer Total Bytes Increasing | Buffer Total Bytes Increasing. | Prometheus
[Fluentd] High Slow Flush Ratio | High Slow Flush Ratio. | Prometheus
[Fluentd] No Output Records From Plugin | No Output Records From Plugin. | Prometheus

List of Dashboards

Fluentd

The dashboard provides information on the status of Fluentd.

List of Metrics

Metric name
fluentd_input_status_num_records_total
fluentd_output_status_buffer_available_space_ratio
fluentd_output_status_buffer_queue_length
fluentd_output_status_buffer_total_bytes
fluentd_output_status_emit_count
fluentd_output_status_emit_records
fluentd_output_status_flush_time_count
fluentd_output_status_num_errors
fluentd_output_status_retry_count
fluentd_output_status_retry_wait
fluentd_output_status_rollback_count
fluentd_output_status_slow_flush_count

Preparing the Integration

OpenShift

If you have installed Fluentd using the OpenShift Logging Operator, no further action is required to enable monitoring.

Kubernetes

Enable Prometheus Metrics

For Fluentd to expose Prometheus metrics, enable the following plugins:

  • ‘prometheus’ input plugin
  • ‘prometheus_monitor’ input plugin
  • ‘prometheus_output_monitor’ input plugin

As seen in the official plugin documentation, you can enable them with the following configurations:

<source>
    @type prometheus
    @id in_prometheus
    bind "0.0.0.0"
    port 24231
    metrics_path "/metrics"
</source>

<source>
    @type prometheus_monitor
    @id in_prometheus_monitor
</source>

<source>
    @type prometheus_output_monitor
    @id in_prometheus_output_monitor
</source>

If you deploy Fluentd with the official Helm chart, these plugins are already enabled in its default configuration, so no additional action is needed.
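
Once the plugins are enabled, one way to confirm that the endpoint exposes what this integration expects is to compare the scraped text against the metric list above. The following is a minimal sketch, not part of Sysdig or Fluentd tooling: `missing_metrics` is a hypothetical helper, and the sample payload (including its `plugin_id` and `type` label values) is made up for illustration. In practice the text would come from an HTTP GET against the port and path configured in the `prometheus` source, for example port 24231 and `/metrics`.

```python
import re

# The 12 metric names listed for this integration.
EXPECTED_METRICS = {
    "fluentd_input_status_num_records_total",
    "fluentd_output_status_buffer_available_space_ratio",
    "fluentd_output_status_buffer_queue_length",
    "fluentd_output_status_buffer_total_bytes",
    "fluentd_output_status_emit_count",
    "fluentd_output_status_emit_records",
    "fluentd_output_status_flush_time_count",
    "fluentd_output_status_num_errors",
    "fluentd_output_status_retry_count",
    "fluentd_output_status_retry_wait",
    "fluentd_output_status_rollback_count",
    "fluentd_output_status_slow_flush_count",
}

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format.
_SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def missing_metrics(exposition_text):
    """Return the expected metric names that have no sample in the text."""
    seen = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_NAME.match(line)
        if match:
            seen.add(match.group(1))
    return EXPECTED_METRICS - seen


# Made-up payload for illustration only.
SAMPLE = """\
# TYPE fluentd_output_status_retry_count gauge
fluentd_output_status_retry_count{plugin_id="out_es",type="elasticsearch"} 0
fluentd_output_status_num_errors{plugin_id="out_es",type="elasticsearch"} 2
"""

if __name__ == "__main__":
    for name in sorted(missing_metrics(SAMPLE)):
        print("missing:", name)
```

An empty result means every metric in the list has at least one sample. A non-empty result is only a hint: some metrics may appear only after the corresponding input or output plugin has processed records.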

Installing

The installation of an exporter is not required for this integration.

Monitoring and Troubleshooting Fluentd

This document describes important metrics and queries that you can use to monitor and troubleshoot Fluentd.

Tracking metrics status

You can track the status of Fluentd metrics with the following alert, which fires when the exporter process is not serving metrics:

# [Fluentd] Exporter Process Down
absent(fluentd_output_status_buffer_available_space_ratio{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0

Agent Configuration

These are the default agent jobs for this integration:

- job_name: 'fluentd-default'
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: (fluentd);(.{0}$)
    replacement: fluentd
    target_label: __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: "fluentd"
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
  - action: replace
    source_labels: 
    - __name__
    - tag
    regex: fluentd_input_status_num_records_total;kubernetes.var.log.containers.([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)-[a-zA-Z0-9]+.log
    target_label: input_pod
    replacement: $1
  - action: replace
    source_labels: 
    - __name__
    - tag
    regex: fluentd_input_status_num_records_total;kubernetes.var.log.containers.([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)-[a-zA-Z0-9]+.log
    target_label: input_namespace
    replacement: $2
  - action: replace
    source_labels: 
    - __name__
    - tag
    regex: fluentd_input_status_num_records_total;kubernetes.var.log.containers.([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)-[a-zA-Z0-9]+.log
    target_label: input_container
    replacement: $3

    
- job_name: openshift-fluentd-default
  scheme: https
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: (collector);(.{0}$)
    replacement: collector
    target_label: __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_annotation_promcat_sysdig_com_integration_type
    regex: "collector"
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: (fluentd_output_status_buffer_available_space_ratio|fluentd_output_status_buffer_queue_length|fluentd_output_status_buffer_total_bytes|fluentd_output_status_emit_count|fluentd_output_status_emit_records|fluentd_output_status_flush_time_count|fluentd_output_status_num_errors|fluentd_output_status_retry_count|fluentd_output_status_retry_wait|fluentd_output_status_rollback_count|fluentd_output_status_slow_flush_count)
    action: keep
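
The `metric_relabel_configs` rules in both jobs are regex-driven, so it can help to try them against sample values before changing them. The sketch below reproduces the two regexes in Python as an approximation: Prometheus joins the source label values with `;` and anchors the pattern at both ends, which `re.fullmatch` imitates here, but Prometheus uses RE2 rather than Python's `re`. The function names and the sample tag are hypothetical.

```python
import re

# Regex copied from the metric_relabel_configs of the 'fluentd-default' job.
INPUT_TAG_RE = re.compile(
    r"fluentd_input_status_num_records_total;"
    r"kubernetes.var.log.containers."
    r"([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)_([a-zA-Z0-9 \d\.-]+)"
    r"-[a-zA-Z0-9]+.log"
)

# Metric names from the keep rule of the 'openshift-fluentd-default' job.
OPENSHIFT_KEPT = [
    "fluentd_output_status_buffer_available_space_ratio",
    "fluentd_output_status_buffer_queue_length",
    "fluentd_output_status_buffer_total_bytes",
    "fluentd_output_status_emit_count",
    "fluentd_output_status_emit_records",
    "fluentd_output_status_flush_time_count",
    "fluentd_output_status_num_errors",
    "fluentd_output_status_retry_count",
    "fluentd_output_status_retry_wait",
    "fluentd_output_status_rollback_count",
    "fluentd_output_status_slow_flush_count",
]
OPENSHIFT_KEEP_RE = re.compile("(" + "|".join(OPENSHIFT_KEPT) + ")")


def input_labels(metric_name, tag):
    """Return the labels the three replace rules would add, or {} if no match."""
    # Prometheus concatenates source_labels [__name__, tag] with ';'.
    match = INPUT_TAG_RE.fullmatch(f"{metric_name};{tag}")
    if match is None:
        return {}
    return {
        "input_pod": match.group(1),        # replacement: $1
        "input_namespace": match.group(2),  # replacement: $2
        "input_container": match.group(3),  # replacement: $3
    }


def kept_by_openshift_job(metric_name):
    """Return True if the OpenShift job's keep rule retains this metric."""
    return OPENSHIFT_KEEP_RE.fullmatch(metric_name) is not None


if __name__ == "__main__":
    # Hypothetical tag in the <pod>_<namespace>_<container>-<id>.log shape.
    tag = "kubernetes.var.log.containers.web-7d9f_default_nginx-0a1b2c3d.log"
    print(input_labels("fluentd_input_status_num_records_total", tag))
    print(kept_by_openshift_job("fluentd_input_status_num_records_total"))
```

In this sketch `fluentd_input_status_num_records_total` is not retained by the OpenShift keep rule, which appears consistent with the note that the No Input From Container alert does not work in OpenShift.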