
    Prioritize Designated Containers

    To get the most out of Sysdig Monitor, you may want to customize the way in which container data is prioritized and reported. Use this page to understand the default behavior and sorting rules, and to implement custom behavior when and where you need it. This can help reduce agent and backend load by not monitoring unnecessary containers. If you are encountering backend limits for containers, you can also filter to ensure that the important containers are always reported.

    Overview

    By default, a Sysdig agent will collect metrics from all containers it detects in an environment. When reporting to the Monitor interface, it uses default sorting behavior to prioritize what container information to display first.

    Understand Default Behavior

    Out of the box, the agent chooses the containers with the highest:

    • CPU

    • Memory

    • File IO

    • Net IO

    and allocates approximately 1/4 of the total limit to each stat type.

    Understand Simple Container Filtering

    As of agent version 0.86, it is possible to set a use_container_filter parameter in the agent config file, tag/label specific containers, and set include/exclude rules to push those containers to the top of the reporting hierarchy.

    This is an effective sorting tool when:

    • You can manually mark each container with an include or exclude tag, AND

    • The number of includes is small (say, less than 100)

    In this case, the containers that explicitly match the include rules will take top priority.

    Understand Smart Container Reporting

    In some enterprises, the number of containers is too high to tag with simple filtering rules, and/or the include_all group is too large to ensure that the most-desired containers are consistently reported. As of Sysdig agent version 0.91, you can append another parameter to the agent config file, smart_container_reporting.

    This is an effective sorting tool when:

    • The number of containers is large and you can’t or won’t mark each one with include/exclude tags, AND

    • There are certain containers you would like to always prioritize

    This helps ensure that even when there are thousands of containers in an environment, the most-desired containers are consistently reported.

    Container filtering and smart container reporting affect the monitoring of all the processes/metrics within a container, including StatsD, JMX, app-checks, and built-in metrics.

    Prometheus metrics are attached to processes, rather than containers, and are therefore handled differently.

    The container limit is set in dragent.yaml under containers: limit.
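
    As an illustration of where this setting lives, the fragment below is a minimal sketch of the relevant dragent.yaml section. The value 200 is a hypothetical figure chosen to make the arithmetic easy, not a documented default.

    ```yaml
    # dragent.yaml (sketch; the limit value is an assumed example)
    containers:
      limit: 200   # hypothetical cap on the number of containers reported
      # With the default sorting, roughly 1/4 of this limit (about 50 containers)
      # would go to each of the four stat types: CPU, memory, file IO, net IO.
    ```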

    Understand Sysdig Aggregated Container

    The sysdig_aggregated parameter is automatically activated when smart container reporting is enabled, to capture the most-desired metrics from the containers that were excluded by smart filtering and report them under a single entity. It appears like any other container in the Sysdig Monitor UI, with the name “sysdig_aggregated”.

    Sysdig_aggregated can report on a wide array of metrics; see Sysdig_aggregated Container Metrics. However, because this is not a regular container, certain limitations apply:

    • container_id and container_image do not exist.

    • The aggregated container cannot be segmented by certain metrics that are excluded, such as process.

    • Some default dashboards associated with the aggregated container may have some empty graphs.

    Use Simple Container Filtering

    By default, the filtering feature is turned off. It can be enabled by adding the following line to the agent configuration:

    • use_container_filter: true

    When enabled, the agent will follow include/exclude filtering rules based on:

    • container image

    • container name

    • container label

    • Kubernetes annotation or label

    The default behavior in dragent.default.yaml includes or excludes based on a container label (com.sysdig.report) and/or a Kubernetes pod annotation (sysdig.com/report).

    Container Condition Parameters and Rules

    Parameters

    The condition parameters are described below, each with its pattern name, a description, and an example:

    container.image

      Description: Matches if the process is running inside a container running the specified image.

      Example:
        - include:
            container.image: luca3m/prometheus-java-app

    container.name

      Description: Matches if the process is running inside a container with the specified name.

      Example:
        - include:
            container.name: my-java-app

    container.label.*

      Description: Matches if the process is running in a container that has a label matching the given value.

      Example:
        - include:
            container.label.class: exporter

    kubernetes.<object>.annotation.* and kubernetes.<object>.label.*

      Description: Matches if the process is attached to a Kubernetes object (Pod, Namespace, etc.) that is marked with the annotation or label matching the given value.

      Example:
        - include:
            kubernetes.pod.annotation.prometheus.io/scrape: true

    all

      Description: Matches all. Use as the last rule to determine default behavior.

      Example:
        - include:
            all

    Rules

    Once enabled (when use_container_filter: true is set), the agent will follow filtering rules from the container_filter section.

    • Each rule is an include or exclude rule which can contain one or more conditions.

    • The first matching rule in the list will determine if the container is included or excluded.

    • The conditions consist of a key name and a value. If the given key for a container matches the value, the rule will be matched.

    • If a rule contains multiple conditions they all need to match for the rule to be considered a match.
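
    The matching rules above (first match wins; all conditions within one rule must match) can be sketched as follows. Every namespace, label, container name, and image pattern in this fragment is hypothetical; only the pattern keys and the trailing-wildcard form are taken from the examples on this page.

    ```yaml
    container_filter:
      # One rule with two conditions: BOTH must match for the rule to apply
      - exclude:
          kubernetes.namespace.name: staging       # hypothetical namespace
          container.label.tier: cache              # hypothetical label
      # Two separate rules: EITHER one can match, and they are checked in order
      - exclude:
          container.name: debug-shell              # hypothetical container name
      - exclude:
          container.image: "example.com/tools*"    # hypothetical image pattern
      # Catch-all: anything not matched above is reported
      - include:
          all
    ```

    In this sketch, a cache container outside the staging namespace is not excluded by the first rule, because only one of its two conditions matches; it falls through to the later rules.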

    Default Configuration

    The dragent.default.yaml contains the following default configuration for container filters:

    use_container_filter: false
    
    container_filter:
      - include:
          container.label.com.sysdig.report: true
      - exclude:
          container.label.com.sysdig.report: false
      - include:
          kubernetes.pod.annotation.sysdig.com/report: true
      - exclude:
          kubernetes.pod.annotation.sysdig.com/report: false
      - include:
          all
    

    Note that the default rules include or exclude containers by a container label and by a Kubernetes pod annotation.

    The examples on this page show how to edit the dragent.yaml file directly. Convert the examples to Docker or Helm commands, if applicable for your situation.

    Enable Container Filtering in the Agent Config File

    Option 1: Use the Default Configuration

    To enable container filtering using the default configuration in dragent.default.yaml (above), follow the steps below.

    1. Apply Labels and/or Annotations to Designated Containers

    To set up, decide which containers should be excluded from automatic monitoring.

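    Apply the container label com.sysdig.report and/or the Kubernetes pod annotation sysdig.com/report to the designated containers. With the default rules, a value of false excludes the container and a value of true includes it explicitly.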
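
    For Kubernetes, one way to apply the annotation is in the pod manifest. The following is a minimal sketch: the pod name, container name, and image are hypothetical, and the value "false" is chosen so that the default exclude rule on sysdig.com/report would match.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: batch-worker                        # hypothetical pod name
      annotations:
        sysdig.com/report: "false"              # annotation values must be strings, so quote it
    spec:
      containers:
        - name: worker                          # hypothetical container name
          image: example.com/batch-worker:1.0   # hypothetical image
    ```

    For workloads managed by a Deployment or a similar controller, the annotation belongs in the pod template metadata so that it reaches the pods themselves.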

    2. Edit the Agent Configuration

    Add the following line to dragent.yaml to turn on the default functionality:

    use_container_filter: true
    

    Option 2: Define Your Own Rules

    You can also edit dragent.yaml to apply your own container filtering rules.

    1. Designate Containers

    To set up, decide which containers should be excluded from automatic monitoring.

    Note the image, name, label, or Kubernetes pod information as appropriate, and build your rule set accordingly.

    2. Edit the Agent Configuration

    For example:

    use_container_filter: true
    
    container_filter:
      - include:
          container.name: my-app
      - include:
          container.label.com.sysdig.report: true
      - exclude:
          kubernetes.namespace.name: kube-system
          container.image: "gcr.io*"
      - include:
          all
    

    The above example shows a container_filter with 3 include rules and 1 exclude rule.

    • If the container name is “my-app” it will be included.

    • Likewise, if the container has a label with the key “com.sysdig.report” and the value “true”, it will be included.

    • If neither of those rules is true, and the container is part of a Kubernetes hierarchy within the “kube-system” namespace and the container image starts with “gcr.io”, it will be excluded.

    • The last rule includes all, so any containers not matching an earlier rule will be monitored and metrics for them will be sent to the backend.
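
    The same first-match logic can also be turned around into an allow-list. The sketch below assumes that the all pattern is accepted under exclude in the same way as under include, which the parameter descriptions suggest but do not state outright.

    ```yaml
    use_container_filter: true

    container_filter:
      # Rule 1: report only containers explicitly labelled for reporting
      - include:
          container.label.com.sysdig.report: true
      # Rule 2: every container that did not match rule 1 is dropped
      # (assumes "all" is valid under exclude)
      - exclude:
          all
    ```

    Because the first matching rule wins, a catch-all rule placed anywhere but last would hide every rule after it.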

    Use Smart Container Reporting

    As of Sysdig agent version 0.91, you can add another parameter to the config file: smart_container_reporting: true

    This enables several new prioritization checks:

    • container_filter (you would enable and set include/exclude rules, as described above)

    • container age

    • high stats

    • legacy patterns

    The sort is modified with the following rules in priority order:

    1. User-specified containers come before others

    2. Containers reported previously should be reported before those which have never been reported

    3. Containers with higher usage by each of the 4 default stats should come before those with lower usage

    Enable Smart Container Reporting and sysdig_aggregated

    1. Set up any simple container filtering rules you need, following either Option 1 or Option 2, above.

    2. Edit the agent configuration:

      smart_container_reporting: true
      
    3. This turns on both smart_container_reporting and sysdig_aggregated. The changes will be visible in the Sysdig Monitor UI.

      See also Sysdig_aggregated Container Metrics.
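
    Putting the steps together, a dragent.yaml that uses both features might look like the sketch below. It reuses the label rules from the default configuration shown earlier and is an illustration only, not a recommended rule set.

    ```yaml
    # dragent.yaml (sketch combining the settings described on this page)
    use_container_filter: true
    smart_container_reporting: true   # also activates sysdig_aggregated

    container_filter:
      - include:
          container.label.com.sysdig.report: true
      - exclude:
          container.label.com.sysdig.report: false
      - include:
          all
    ```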

    Logging

    When the log level is set to DEBUG, the following messages may be found in the logs:

    Message                                               Meaning
    container <id>, no filter configured                  Container filtering is not enabled
    container <id>, include in report                     Container is included
    container <id>, exclude in report                     Container is excluded
    Not reporting thread <thread-id> in container <id>    Process thread is excluded

    See also: Optional: Change the Agent Log Level.
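
    As a rough sketch of what raising the log level can look like in dragent.yaml: the log section and the file_priority key below are an assumption not confirmed on this page, so check them against the linked log-level instructions before use.

    ```yaml
    # dragent.yaml (sketch; key names are assumed, verify against the log-level page)
    log:
      file_priority: debug   # assumed setting that makes the messages above appear in the agent log
    ```

    Debug logging is verbose, so it is usually worth reverting to the previous level once the filter rules behave as expected.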


    Sysdig Aggregated Container Metrics

    Sysdig_aggregated containers can report on the following metrics:

    • tcounters

      • other

        • time_ns

        • time_percentage

        • count

      • io_file

        • time_ns_in

        • time_ns_out

        • time_ns_other

        • time_percentage_in

        • time_percentage_out

        • time_percentage_other

        • count_in

        • count_out

        • count_other

        • bytes_in

        • bytes_out

        • bytes_other

      • io_net

        • time_ns_in

        • time_ns_out

        • time_ns_other

        • time_percentage_in

        • time_percentage_out

        • time_percentage_other

        • count_in

        • count_out

        • count_other

        • bytes_in

        • bytes_out

        • bytes_other

      • processing

        • time_ns

        • time_percentage

        • count

    • reqcounters

      • other

        • time_ns

        • time_percentage

        • count

      • io_file

        • time_ns_in

        • time_ns_out

        • time_ns_other

        • time_percentage_in

        • time_percentage_out

        • time_percentage_other

        • count_in

        • count_out

        • count_other

        • bytes_in

        • bytes_out

        • bytes_other

      • io_net

        • time_ns_in

        • time_ns_out

        • time_ns_other

        • time_percentage_in

        • time_percentage_out

        • time_percentage_other

        • count_in

        • count_out

        • count_other

        • bytes_in

        • bytes_out

        • bytes_other

      • processing

        • time_ns

        • time_percentage

        • count

    • max_transaction_counters

      • time_ns_in

      • time_ns_out

      • count_in

      • count_out

    • resource_counters

      • connection_queue_usage_pct

      • fd_usage_pct

      • cpu_pct

      • resident_memory_usage_kb

      • swap_memory_usage_kb

      • major_pagefaults

      • minor_pagefaults

      • fd_count

      • cpu_shares

      • memory_limit_kb

      • swap_limit_kb

      • count_processes

      • proc_start_count

      • threads_count

    • syscall_errors

      • count

      • count_file

      • count_file_opened

      • count_net

    • protos

      • http

        • server_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

        • client_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

      • mysql

        • server_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

        • client_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

      • postgres

        • server_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

        • client_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

      • mongodb

        • server_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

        • client_totals

          • ncalls

          • time_tot

          • time_max

          • bytes_in

          • bytes_out

          • nerrors

    • names

    • transaction_counters

      • time_ns_in

      • time_ns_out

      • count_in

      • count_out