Agent Install: Mesos | Marathon | DCOS
Marathon is the container orchestration platform for Mesosphere’s Datacenter Operating System (DC/OS) and Apache Mesos.
This guide describes how to install the Sysdig agent container on each underlying host in your Mesos cluster. Once installed, the agent will automatically connect to the Mesos and Marathon APIs to pull relevant metadata about the environment and will begin monitoring all of your hosts, apps, containers, and frameworks.
Standard Installation Instructions
Review the Host Requirements for Agent Installation.
In this three-part installation, you:
Deploy the Sysdig agent on all Mesos Agent (aka “Slave”) nodes, either automatically or by creating and posting a .json file to the leader Marathon API server.
Deploy the Sysdig agent on the Mesos Master nodes.
Special configuration steps: modify the Sysdig agent config file to monitor Marathon instances.
Deploy the Sysdig agent on your Mesos Agent nodes
Preferred Option: Automatic install (DC/OS 1.11+)
If you’re using DC/OS 1.8 or higher, then you can find Sysdig in the Mesosphere Universe marketplace and install it from there.
It will automatically deploy the Sysdig agent container on each of your Mesos Agent nodes as a Marathon app.
Proceed to Deploy the Sysdig Agent.
Alternate Option: Post a .json file
If you are using a version of DC/OS earlier than 1.8 then:
Create a JSON file for Marathon, in the following format.
The COLLECTOR address comes from your own environment in on-prem installations. For SaaS installations, use the collector endpoint listed for your region.
COLLECTOR_PORT, SECURE, and CHECK_CERT are used in environments with Sysdig’s on-premises backend installed.
{
  "backoffFactor": 1.15,
  "backoffSeconds": 1,
  "constraints": [
    [ "hostname", "UNIQUE" ]
  ],
  "container": {
    "docker": {
      "forcePullImage": true,
      "image": "sysdig/agent",
      "parameters": [],
      "privileged": true
    },
    "type": "DOCKER",
    "volumes": [
      { "containerPath": "/host/var/run/docker.sock", "hostPath": "/var/run/docker.sock", "mode": "RW" },
      { "containerPath": "/host/dev", "hostPath": "/dev", "mode": "RW" },
      { "containerPath": "/host/proc", "hostPath": "/proc", "mode": "RO" },
      { "containerPath": "/host/boot", "hostPath": "/boot", "mode": "RO" },
      { "containerPath": "/host/lib/modules", "hostPath": "/lib/modules", "mode": "RO" },
      { "containerPath": "/host/usr", "hostPath": "/usr", "mode": "RO" }
    ]
  },
  "cpus": 1,
  "deployments": [],
  "disk": 0,
  "env": {
    "ACCESS_KEY": "YOUR-ACCESS-KEY-HERE",
    "CHECK_CERT": "false",
    "SECURE": "true",
    "TAGS": "example_tag:example_value",
    "name": "sdc-agent",
    "pid": "host",
    "role": "monitoring",
    "shm-size": "350m"
  },
  "executor": "",
  "gpus": 0,
  "id": "/sysdig-agent",
  "instances": 1,
  "killSelection": "YOUNGEST_FIRST",
  "labels": {},
  "lastTaskFailure": {
    "appId": "/sysdig-agent",
    "host": "YOUR-HOST",
    "message": "Container exited with status 70",
    "slaveId": "1fa6f2fc-95b0-445f-8b97-7f91c1321250-S2",
    "state": "TASK_FAILED",
    "taskId": "sysdig-agent.3bb0759d-3fa3-11e9-b446-c60a7a2ee871",
    "timestamp": "2019-03-06T00:03:16.234Z",
    "version": "2019-03-06T00:01:57.182Z"
  },
  "maxLaunchDelaySeconds": 3600,
  "mem": 850,
  "networks": [
    { "mode": "host" }
  ],
  "portDefinitions": [
    { "name": "default", "port": 10101, "protocol": "tcp" }
  ],
  "requirePorts": false,
  "tasks": [
    {
      "appId": "/sysdig-agent",
      "healthCheckResults": [],
      "host": "YOUR-HOST-IP",
      "id": "sysdig-agent.0d5436f4-3fa4-11e9-b446-c60a7a2ee871",
      "ipAddresses": [
        { "ipAddress": "YOUR-HOST-IP", "protocol": "IPv4" }
      ],
      "localVolumes": [],
      "ports": [ 4764 ],
      "servicePorts": [],
      "slaveId": "1fa6f2fc-95b0-445f-8b97-7f91c1321250-S2",
      "stagedAt": "2019-03-06T00:09:04.232Z",
      "startedAt": "2019-03-06T00:09:06.912Z",
      "state": "TASK_RUNNING",
      "version": "2019-03-06T00:09:04.182Z"
    }
  ],
  "tasksHealthy": 0,
  "tasksRunning": 1,
  "tasksStaged": 0,
  "tasksUnhealthy": 0,
  "unreachableStrategy": {
    "expungeAfterSeconds": 0,
    "inactiveAfterSeconds": 0
  },
  "upgradeStrategy": {
    "maximumOverCapacity": 1,
    "minimumHealthCapacity": 1
  },
  "version": "2019-03-06T00:09:04.182Z",
  "versionInfo": {
    "lastConfigChangeAt": "2019-03-06T00:09:04.182Z",
    "lastScalingAt": "2019-03-06T00:09:04.182Z"
  }
}
See Table 1: Environment Variables for Agent Config File for the Sysdig name:value definitions.
Complete the “cpus”, “mem” and “labels” (i.e. Marathon labels) entries to fit the capacity and requirements of the cluster environment.
Post the created .json file to the leader Marathon API server:
$ curl -X POST http://$(hostname -i):8080/v2/apps -d @sysdig.json -H "Content-type: application/json"
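The POST step can also be scripted so that a malformed file is caught before anything is sent to Marathon. The following Python sketch is only an illustration under stated assumptions: `build_marathon_post` is a hypothetical helper name, the Marathon URL and file path are placeholders, and only the `/v2/apps` endpoint and the JSON content type are taken from the curl command above.

```python
import json
import urllib.request


def build_marathon_post(marathon_url: str, app_json_path: str) -> urllib.request.Request:
    """Load the app definition, fail early on invalid JSON, and build the POST request.

    Hypothetical helper: it only prepares the request; nothing is sent here.
    """
    with open(app_json_path, encoding="utf-8") as f:
        app = json.load(f)  # raises json.JSONDecodeError if the file is malformed
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        f"{marathon_url.rstrip('/')}/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (placeholder URL and file name), run on a host that can reach Marathon:
#   req = build_marathon_post("http://localhost:8080", "sysdig.json")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Separating request construction from sending keeps the validation step usable on a workstation that cannot reach the cluster.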
Deploy the Sysdig Agent
After deploying the agent to the Mesos Agent nodes, you will install agents on each of the Mesos Master nodes as well.
If any cluster node has both Mesos Master and Mesos Agent roles, do not perform this installation step on that node. It will already have a Sysdig agent installed from the Mesos Agent node deployment described above. Running duplicate Sysdig agents on a node will cause errors.
Use the Agent Install: Non-Orchestrated instructions to install the agent directly on each of your Mesos Master nodes.
When the Sysdig agent is successfully installed on the master nodes, it will automatically connect to the local Mesos and Marathon (if available) API servers via http://localhost:5050 and http://localhost:8080 respectively, to collect cluster configuration and current state metadata in addition to host metrics.
Special Configuration Steps
In certain situations, you may need to add configuration to the dragent.yaml file:
If the Sysdig agent cannot be run directly on the Mesos API server.
If the API server is protected with a username/password.
Descriptions and examples are shown below.
If the Sysdig Agent Cannot Run On the Mesos API Server
Mesos allows multiple masters. If the API server cannot be instrumented with a Sysdig agent, delegate ONE other node that has an agent installed to receive infrastructure information from the API server remotely.
NOTE: If you manually configure the agent to point to a master with a static configuration file entry, then automatic detection/following of leader changes will no longer be enabled.
Add the following Mesos parameter to the delegated agent’s dragent.yaml file to allow it to connect to the remote API server and authenticate, either by:
a. Directly editing dragent.yaml on the host, or
b. Converting the YAML code to a single-line format and adding it as an ADDITIONAL_CONF argument in a Docker command.
See Understanding the Agent Config Files for details.
Specify the API server’s connection method, address, and port. Also specify credentials if necessary.
YAML example:
mesos_state_uri: http://[acct:passwd@][hostname][:port]
marathon_uris:
- http://[acct:passwd@][hostname][:port]
Although marathon_uris: is an array, currently only a single “root” Marathon framework per cluster is supported. Do not configure multiple side-by-side “root” Marathon frameworks on the same cluster, or the agent will not function properly. The only supported multiple-Marathon configuration is one “root” Marathon with other Marathon frameworks running as its apps.
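Option b above, converting the YAML to a single line for ADDITIONAL_CONF, can be sketched in Python as follows. This is a minimal sketch, not Sysdig tooling: `to_single_line` is a hypothetical helper, the host names are placeholders, and it assumes the agent container expands literal `\n` sequences in ADDITIONAL_CONF back into line breaks. Check Understanding the Agent Config Files for the exact format your agent version expects.

```python
def to_single_line(yaml_text: str) -> str:
    """Join YAML lines with a literal backslash-n so the text fits in one argument.

    Blank lines are dropped; indentation is preserved because it is
    significant in YAML. Assumes the consumer expands "\\n" into newlines.
    """
    lines = [line.rstrip() for line in yaml_text.splitlines() if line.strip()]
    return "\\n".join(lines)


# Placeholder values; substitute your own API server address and ports.
sample = """\
mesos_state_uri: http://mesos.example.internal:5050
marathon_uris:
  - http://mesos.example.internal:8080
"""

single_line = to_single_line(sample)
# The result can then be passed to the container, for example:
#   docker run ... -e ADDITIONAL_CONF="<single_line value>" ... sysdig/agent
```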
If the Mesos API server requires authentication
If the agent is installed on the API server but the API server uses a different port or requires authentication, those parameters must be explicitly specified.
Add the following Mesos parameters to the API server’s dragent.yaml to make it connect to the API server and authenticate with any unique account and password, either by:
a. Directly editing dragent.yaml on the host, or
b. Converting the YAML code to a single-line format and adding it as an ADDITIONAL_CONF argument in a Docker command.
See Understanding the Agent Config Files for details.
Specify the API server’s protocol, user credentials, and port:
mesos_state_uri: http://[username:password@][hostname][:port]
marathon_uris:
- http://[acct:passwd@][hostname][:port]
*HTTPS protocol is also supported.
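The bracketed URI template above can be filled in programmatically. The following Python sketch is illustrative only: `build_uri` and `render_mesos_config` are hypothetical helper names, the address and credentials are placeholders, and no escaping is applied to special characters in the credentials.

```python
def build_uri(hostname, port=None, username=None, password=None, scheme="http"):
    """Assemble scheme://[username:password@]hostname[:port] from its parts."""
    creds = f"{username}:{password}@" if username and password else ""
    port_part = f":{port}" if port is not None else ""
    return f"{scheme}://{creds}{hostname}{port_part}"


def render_mesos_config(mesos_uri, marathon_uri):
    """Render the two dragent.yaml entries, with a single Marathon URI."""
    return (
        f"mesos_state_uri: {mesos_uri}\n"
        "marathon_uris:\n"
        f"  - {marathon_uri}\n"
    )


# Placeholder address and credentials; 5050 and 8080 are the default ports
# mentioned earlier in this guide.
fragment = render_mesos_config(
    build_uri("10.0.0.5", 5050, "acct", "passwd"),
    build_uri("10.0.0.5", 8080, "acct", "passwd"),
)
```

The rendered fragment can be appended to dragent.yaml or converted to a single line for ADDITIONAL_CONF.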
Troubleshooting: Turning Off Metadata Reception
To temporarily turn off auto-detection and reporting of your Mesos infrastructure in a designated agent while troubleshooting:
Comment out the Mesos parameter entries in the agent’s dragent.yaml file.
Example parameters to disable: mesos_state_uri, marathon_uris
If the agent is running on the API server (Master node) and auto-detecting a default configuration, you can add the line:
mesos_autodetect: false
either directly in the dragent.yaml file or as an ADDITIONAL_CONF parameter in a Docker command.
Restart the agent.
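Commenting out the Mesos entries can be scripted when several agents need the same change. The sketch below is a hypothetical helper, not part of the Sysdig agent: it works on the text of a dragent.yaml file, comments out the named top-level keys together with their indented or list continuation lines, and leaves everything else untouched. The sample file contents are placeholders.

```python
def comment_out_mesos_params(config_text, keys=("mesos_state_uri", "marathon_uris")):
    """Return config_text with the given top-level keys (and their values) commented out."""
    out = []
    in_block = False
    for line in config_text.splitlines():
        stripped = line.lstrip()
        # A new top-level key starts at column 0 and is not a list item or comment.
        is_top_level = bool(stripped) and line[0] not in " \t-#"
        if is_top_level:
            in_block = any(stripped.startswith(key + ":") for key in keys)
        if in_block and stripped and not stripped.startswith("#"):
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Placeholder configuration used only to demonstrate the helper.
sample = (
    "customerid: YOUR-ACCESS-KEY-HERE\n"
    "mesos_state_uri: http://10.0.0.5:5050\n"
    "marathon_uris:\n"
    "  - http://10.0.0.5:8080\n"
    "tags: example_tag:example_value\n"
)
disabled = comment_out_mesos_params(sample)
```

After writing the modified text back to dragent.yaml, restart the agent as described above.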