Create a Custom App Check
Application checks are integrations that allow the Sysdig agent to poll specific metrics exposed by any application. The built-in app checks currently supported are listed on the App Checks main page, and many other Java-based applications are also supported out of the box.
If your application is not already supported, you have a few options:
Use Prometheus, StatsD, or JMX to collect custom metrics.
Send a request to support@sysdig.com, and we'll do our best to add support for your application.
Create your own check by following the instructions below.
If you do write a custom check, let us know. We love hearing about how our users extend Sysdig Monitor, and we can also consider embedding your app check automatically in the Sysdig agent.
See also Understanding the Agent Config Files for details on accessing and editing the agent configuration files in general.
Check Anatomy
Essentially, an app check is a Python class that extends AgentCheck:
from checks import AgentCheck

class MyCustomCheck(AgentCheck):
    # namespaces of the monitored process to join
    # right now we support 'net', 'mnt' and 'uts'
    # put there the minimum necessary namespaces to join
    # usually 'net' is enough. In this case you can also omit the variable
    # NEEDED_NS = ( 'net', )

    # def __init__(self, name, init_config, agentConfig):
    #     '''
    #     Optional, define it if you need custom initialization
    #     remember to accept these parameters and pass them to the superclass
    #     '''
    #     AgentCheck.__init__(self, name, init_config, agentConfig)
    #     self.myvar = None

    def check(self, instance):
        '''
        This function gets called to perform the check.
        Connect to the application, parse the metrics and add them to aggregation using
        superclass methods like `self.gauge(metricname, value, tags)`
        '''
        server_port = instance['port']
        self.gauge("testmetric", 1)
Put this file into /opt/draios/lib/python/checks.custom.d (create the directory if not present) and it will be available to the Sysdig agent. To run your checks, you need to supply configuration information in the agent's config file, dragent.yaml, as is done with bundled checks:
app_checks:
  - name: voltdb # check name, must be unique
    # name of your .py file, if it's the same of the check name you can omit it
    # check_module: voltdb
    pattern: # pattern to match the application
      comm: java
      arg: org.voltdb.VoltDB
    conf:
      port: 21212 # any key value config you need on `check(self, instance_conf)` function
Check Interface Detail
As you can see, the most important piece of the check interface is the check function. The function declaration is:
def check(self, instance)
instance is a dict containing the configuration of the check. It will contain all the attributes found in the conf: section in dragent.yaml plus the following:
name: The check unique name.
ports: An array of all listening ports of the process.
port: The first listening port of the process.
These attributes are available as defaults and allow you to automatically configure your check. Values set in the conf: section take priority over these defaults.
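The precedence between the default attributes and the conf: section can be pictured as a plain dict merge. The following is a minimal sketch, not the agent's actual implementation: build_instance is a hypothetical helper, and the sample ports are made up for illustration.

```python
def build_instance(name, listening_ports, conf):
    """Combine auto-detected defaults with the conf: section (hypothetical helper)."""
    # Defaults described in the text: name, ports, and the first listening port.
    instance = {
        "name": name,
        "ports": list(listening_ports),
        "port": listening_ports[0] if listening_ports else None,
    }
    # Keys from conf: are applied last, so they override the defaults.
    instance.update(conf or {})
    return instance


# The process listens on 8080 and 21212, but conf: pins the port explicitly.
detected = build_instance("voltdb", [8080, 21212], {"port": 21212})
print(detected["port"])  # 21212, taken from conf: rather than the first port
```

With an empty conf: section, the same call would fall back to the first listening port.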
Inside the check function you can call these methods to send metrics:
self.gauge(metric_name, value, tags) # Sample a gauge metric
self.rate(metric_name, value, tags) # Sample a point, with the rate calculated at the end of the check
self.increment(metric_name, value, tags) # Increment a counter metric
self.decrement(metric_name, value, tags) # Decrement a counter metric
self.histogram(metric_name, value, tags) # Sample a histogram metric
self.count(metric_name, value, tags) # Sample a raw count metric
self.monotonic_count(metric_name, value, tags) # Sample an increasing counter metric
The most commonly used are gauge and rate. Besides the self-explanatory metric_name and value parameters, you can also add tags to your metric using this format:
tags = [ "key:value", "key2:value2", "key_without_value"]
tags is an array of strings, where each tag is either a single value or a key:value pair. Tags are useful in Sysdig Monitor for graph segmentation.
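If the values you want to tag with already live in a dict, a small helper can produce the string format shown above. This is only a sketch: build_tags is a hypothetical name, not part of AgentCheck, and the sample keys are invented.

```python
def build_tags(pairs=None, flags=None):
    """Return tags as "key:value" strings followed by valueless tags."""
    # Sort the pairs so the resulting list is stable between runs.
    tags = ["%s:%s" % (key, value) for key, value in sorted((pairs or {}).items())]
    # Tags without a value are appended unchanged.
    tags.extend(flags or [])
    return tags


tags = build_tags({"db": "orders", "role": "primary"}, ["canary"])
print(tags)  # ['db:orders', 'role:primary', 'canary']
```

The resulting list can be passed as the tags argument of any of the metric methods.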
You can also send service checks, which are on/off metrics, using this interface:
self.service_check(name, status, tags)
Where status can be:
AgentCheck.OK
AgentCheck.WARNING
AgentCheck.CRITICAL
AgentCheck.UNKNOWN
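As an illustration, a check might report whether the monitored process accepts TCP connections. The sketch below rests on several assumptions: PortOpenCheck, the myapp.can_connect check name, and the host conf key are hypothetical, and the fallback AgentCheck class is only a stand-in with placeholder status values so the code can run outside the agent.

```python
import socket

try:
    from checks import AgentCheck  # provided by the Sysdig agent
except ImportError:
    class AgentCheck(object):
        """Stand-in for running outside the agent; not the real class."""
        # Placeholder status values, assumed for this sketch only.
        OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

        def __init__(self):
            self.sent = []  # records calls so they can be inspected

        def service_check(self, name, status, tags=None):
            self.sent.append(("service_check", name, status, tags))


class PortOpenCheck(AgentCheck):
    def check(self, instance):
        port = instance["port"]
        host = instance.get("host", "127.0.0.1")  # hypothetical conf key
        tags = ["port:%d" % port]
        try:
            # A successful connect means the application is reachable.
            sock = socket.create_connection((host, port), timeout=1)
            sock.close()
            status = AgentCheck.OK
        except socket.error:
            # Refused or timed-out connections are reported as critical.
            status = AgentCheck.CRITICAL
        self.service_check("myapp.can_connect", status, tags)
```

Reporting the status once per run of check, rather than raising on failure, keeps the on/off signal visible in Sysdig Monitor even while the application is down.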
Testing
To test your check you can launch Sysdig App Checks from the command line to avoid running the full agent and iterate faster:
# from /opt/draios directory
./bin/sdchecks runCheck <check_unique_name> <process_pid> [<process_vpid>] [<process_port>]
check_unique_name: The check name as it appears in the config file.
pid: Process pid seen from host.
vpid: Optional, process pid seen inside the container, defaults to 1.
port: Optional, port where the process is listening, defaults to None.
Example:
./bin/sdchecks runCheck redis 1254 1 6379
5658:INFO:Starting
5658:INFO:Container support: True
5658:INFO:Run AppCheck for {'ports': [6379], 'pid': 5625, 'check': 'redis', 'vpid': 1}
Conf: {'port': 6379, 'socket_timeout': 5, 'host': '127.0.0.1', 'name': 'redis', 'ports': [6379]}
Metrics: # metrics array
Checks: # metrics check
Exception: None # exceptions
The output is intentionally raw to allow you to better debug what the check is doing.