Working with the Data API

The data API provides access to the label and metric data captured by Sysdig agents and stored in the Sysdig datastores. Sysdig agents capture process, network, system, and other infrastructure data at a 1-second resolution and send it to the Sysdig worker service at a 10-second resolution.

The data API allows you to fetch data at the native 10-second resolution or lower. You specify the resolution of the returned data via the sampling parameter. Each resolution has a different data retention period. Because data is natively captured at a 1-second resolution, you must specify a time aggregation for each metric.

Similarly, data associated with individual pods, processes, and so on can be aggregated at a higher level, such as the host or namespace level. This type of aggregation depends on the list of labels you specify. For instance, you could aggregate data at the host level by specifying a host label. For this to work, you must specify a group aggregation.
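As a minimal sketch of the idea above, the request body below rolls per-container CPU usage up to the host level. It assumes the request fields described later in this section, and uses "host.hostName" as an illustrative host label:

```python
# Hedged sketch: a request body that aggregates per-container CPU
# usage up to the host level. "host.hostName" is an illustrative
# label name; group "sum" adds up the per-container values within
# each host.
request = {
    "last": 600,
    "sampling": 600,
    "metrics": [
        {"id": "host.hostName"},  # label: segments the result per host
        {
            "id": "cpu.cores.used",
            "aggregations": {"time": "avg", "group": "sum"},
        },
    ],
}
print(request["metrics"][1]["aggregations"]["group"])  # sum
```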

To learn more about data aggregation, see Data Aggregation.

General Guidelines for Using the Data API

  • The maximum number of samples you can fetch via the data API is 600. Consequently, the larger the time window for the data retrieval, the lower the resolution will be.

  • The API enforces a response timeout of 30 seconds. Larger time windows, more metrics, or multiple segments (more columns and rows) can increase response time.

  • Some labels might not apply to certain entities. When those labels are retrieved, the label value will be null.

  • Due to a current limitation, the same metric name cannot be specified more than once, regardless of the aggregations.
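The first two guidelines interact: with a 600-sample cap, the smallest usable sampling value grows with the time window. A minimal sketch, assuming the cap and the 10-second native grid (the rounding rule is an assumption):

```python
import math

# Hedged sketch: given the 600-sample cap and the 10-second native
# resolution, estimate the smallest sampling value (in seconds) that
# keeps a query window under the cap. Rounding up to a multiple of
# 10 is an assumption based on the native resolution.
MAX_SAMPLES = 600
NATIVE_RESOLUTION_S = 10

def min_sampling(window_s: int) -> int:
    """Smallest sampling (seconds) that yields at most MAX_SAMPLES points."""
    raw = math.ceil(window_s / MAX_SAMPLES)
    # round up to the native 10-second grid
    return max(NATIVE_RESOLUTION_S,
               math.ceil(raw / NATIVE_RESOLUTION_S) * NATIVE_RESOLUTION_S)

print(min_sampling(600))    # a 10-minute window fits at native resolution
print(min_sampling(86400))  # a 1-day window needs coarser sampling
```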

REST Resource: Data

See Sysdig REST API Conventions for generic conventions and authentication.

Request Variables

last

Specifies the time window as an offset back from the current time, expressed in seconds.

start, end

An alternative to last for specifying the time window. Both timestamps are expressed in seconds.

sampling

Data resolution expressed in seconds. Sampling returns a single aggregated value across the entire time window. Its value can be one of the following:

  • end - start

  • last

filter

Specifies the scope of the data to be returned. A simple filter expression has the form:

(not) label operator value

The filter can also be a set of expressions, for instance, expr1 AND expr2 OR expr3.

  • label: Any label that allows for segmentation

  • operator: Supported operators are:

    • =

    • !=

    • in

    • contains

    • starts with

  • value is one of the following:

    • single value

    • list of values

    • null (supported with = and != operators only)

For example:

POST /api/data

{
  "filter": "kubernetes.node.name = 'n1'",
  ...
}

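A compound filter can be built by joining simple expressions with the boolean operators described above; a small sketch with illustrative label names:

```python
# Hedged sketch: composing a compound filter from simple expressions.
# The label names and values are illustrative.
clauses = [
    "kubernetes.namespace.name = 'prod'",
    "container.name starts with 'web'",
]
filter_expr = " AND ".join(clauses)
print(filter_expr)
```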
metrics

Specifies the list of labels or metrics, or both, to be returned. Labels require only an ID, whereas metrics require an ID as well as time and group aggregations.

Time aggregations: timeAvg (rate), average, minimum, maximum, sum

Group aggregations: average, minimum, maximum, sum

"metrics": [
    {"id": "container.name"},
    {
      "id": "cpu.cores.used",
      "aggregations": { "time": "avg", "group": "avg" }
    }

The first item requested in the query is the container name. This is a segmentation label, and therefore no aggregation criteria are specified. The second item queries the CPU utilization of each container separately; the metric is aggregated as an average.

dataSourceType

Specifies the type of entity for which metrics are retrieved. This is particularly useful when the same metric name, for instance cpu.cores.used, is used for different sources.

Accepted values: host and container.

paging

Specifies the number of rows of data to be returned. By default, rows from 0 to 9 (10 rows) are returned. Pagination is applied to the sorted rows according to the following criteria:

  • If a metric is available, sort by first metric value (descending)

  • If a metric is unavailable, sort by first label value (descending)
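The from/to fields of paging (as used in the sample request later in this section) select a contiguous slice of the sorted rows; a small sketch:

```python
# Hedged sketch: paging selects a contiguous, inclusive slice of the
# sorted rows. This requests the second hundred rows (100-199).
paging = {"from": 100, "to": 199}
print(paging["to"] - paging["from"] + 1)  # number of rows requested
```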

Response Variables

data

Returns a list of data points.

Each data point is uniquely identified by a timestamp and a list of label values.

t: the timestamp, expressed in seconds.

d: the list of values for the requested labels and metrics, in the order given in the request.
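Because the values in d follow the order of the request, they can be paired back with the requested ids; a minimal sketch with an illustrative response:

```python
# Hedged sketch: pairing each data point's "d" values with the ids
# from the request, relying on the documented ordering. The response
# content here is illustrative.
requested = ["container.name", "cpu.cores.used"]
response = {
    "data": [
        {"t": 1582756200, "d": ["web-1", 0.25]},
        {"t": 1582756200, "d": ["web-2", 0.75]},
    ],
    "start": 1582755600,
    "end": 1582756200,
}
rows = [dict(zip(requested, point["d"])) for point in response["data"]]
print(rows[0])  # {'container.name': 'web-1', 'cpu.cores.used': 0.25}
```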

start, end

Returns the actual time window of the returned data.

It may differ from the time window specified in the request (for example, due to timestamp alignment).

Sample Request and Response

In this example, the CPU cores used by containers are retrieved.

POST /api/data

{
  "last": 600,
  "sampling": 600,
  "filter": null,
  "metrics": [
    {
      "id": "cpu.cores.used",
      "aggregations": { "time": "avg", "group": "sum" }
    }
  ],
  "dataSourceType": "container",
  "paging": {
    "from": 0,
    "to": 99
  }
}

A sample response is given below:

{
    "data": [
        {
            "t": 1582756200,
            "d": [
                6.481
            ]
        }
    ],
    "start": 1582755600,
    "end": 1582756200
}

Python Script Library for Data API

The Sysdig Python client provides a programmatic interface to the data API. See Python SDC Client for comprehensive examples and documentation.

Function: sysdig_api.get_data

sysdig_api.get_data(metrics, start_ts, end_ts, sampling_s, filter, paging, datasource_type)

Returns the requested metrics data.

Arguments

start_ts

Start of a query time window. The timestamp is expressed in seconds.

end_ts

End of a query time window. The timestamp is expressed in seconds.

sampling_s

Data resolution expressed in seconds.

filter

Specifies the scope for the data to be returned.

metrics

List of metrics to query.

datasource_type

The source for the metrics.

Accepted values: host and container.

paging

Specifies the number of rows of data to be returned. By default, rows from 0 to 9 (10 rows) are returned.

Successful Return Value

Returns the requested metrics data as JSON.

Sample Script

ok, res = sysdig_api.get_data(
  start_ts = -600,
  end_ts = 0,
  sampling_s = 600,
  filter = None,
  metrics = [
    {
      "id": "cpu.cores.used",
      "aggregations": {
          "time": "avg",
          "group": "sum"
      }
    }
  ],
  datasource_type = "container",
  paging = {"from": 0, "to": 99}
)
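On return, the first element of the pair indicates success; a minimal sketch of handling it, assuming res holds the JSON response on success or an error message otherwise (the response shape mirrors the sample response above):

```python
# Hedged sketch: the client returns an (ok, res) pair; res holds the
# JSON response on success or an error message otherwise. The helper
# sums the first metric across all returned data points.
def sum_first_metric(ok, res):
    if not ok:
        raise RuntimeError(res)
    return sum(point["d"][0] for point in res["data"])

sample = {
    "data": [{"t": 1582756200, "d": [6.481]}],
    "start": 1582755600,
    "end": 1582756200,
}
print(sum_first_metric(True, sample))  # 6.481
```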