SysQL Reference Library

SysQL is a lightweight query language developed by Sysdig, inspired by Cypher. It is designed to interact with the Sysdig Secure datastore, which contains various entities and relationships. SysQL queries are categorized into Risk, Posture, and Vulnerabilities, enabling efficient security analysis. This topic covers the syntax, keywords, and examples to help you effectively use SysQL.

How Does SysQL Work?

Sysdig stores your resource inventory together with the security findings, such as events or vulnerabilities, in the Sysdig datastore. This allows Sysdig to design queries that match complex paths through the connected data. SysQL is available in a web user interface that helps designing the queries in an interactive way with the following widgets:

  • SysQL builder: Enables you to build your query from scratch by defining entities.
  • SysQL editor: Helps you to update your SysQL query, and add more entities.

Additionally, you can use the SysQL Public APIs to build queries and retrieve resources the Sysdig Secure datastore programmatically.

Clauses and Subclauses

SysQL uses the following basic key clauses to structure and filter queries:

ClausesDescription
MATCHDefines the pattern to match in the graph
OPTIONALUsed with MATCH to specify optional patterns
WHEREFilters results based on specified criteria

A Clause is the fundamental component of a SysQL and it defines specific operations or constraints. SysQL is structured using different clauses to manipulate and retrieve graph data efficiently. MATCH and OPTIONAL MATCH are the main clauses in SysQL.

For example: MATCH Vulnerability LIMIT 10;

A Subclause typically refers to a part of a larger SysQL query that performs a specific function or operation within the query. These subclauses help to organize and refine the query logic.

For example: LIMIT 10

MATCH

MATCH is the first clause of every SysQL query. Multiple MATCH clauses combine into a single pattern. MATCH is logically similar to the FROM clause in SQL.

The MATCH clause is the starting point of every SysQL statement. It defines the patterns to search for in the graph and is the main way to retrieve data. Typically, it is used with WHERE to refine results. Multiple MATCH clauses can be combined into one pattern.

Example: Get Vulnerability Data

You can retrieve vulnerability data stored in the graph by executing the following SysQL statement:

MATCH Vulnerability LIMIT 10;

OPTIONAL MATCH

OPTIONAL MATCH functions similarly to MATCH in pattern matching within the graph database. The key difference is that if no matches are found, OPTIONAL MATCH returns null for the missing parts of the pattern instead of failing. It can be considered the SysQL equivalent of an outer join in SQL, where either the entire pattern is matched or null is returned for unmatched elements.

If a relationship is optional, use the OPTIONAL MATCH clause. This works like a SQL outer join—returning the relationship if it exists or null if it does not.

Example: Retrieve all the Affected Kubernetes Workloads

If you want to retrieve all Kubernetes workloads affected by vulnerabilities, and optionally list any runtime events generated by them, issue the following SysQL statement:

MATCH KubeWorkload AS k AFFECTED_BY Vulnerability OPTIONAL MATCH RuntimeEvent GENERATED_BY k;

WHERE

WHERE is not a standalone clause but can be placed after a MATCH clause. It adds constraints to the pattern in a MATCH or OPTIONAL MATCH clause, functioning as an optional component rather than a separate clause.

Example: List Vulnerabilities with Names Ending 123

To list vulnerabilities with names ending in ‘123’ that also affect packages, run the following SysQL statement:

MATCH Vulnerability AS v AFFECTS Package WHERE v.name =~ 'CVE-.\*123';

Comparison Operators

The comparison operators comprise:

  • Equality: =
  • Inequality: <>
  • Less than: <
  • Greater than: >
  • Less than or equal to: <=
  • Greater than or equal to: >=
  • IS NULL
  • IS NOT NULL

Additional Operators

  • STARTS WITH: Performs case-sensitive prefix searching on STRING values.
  • ENDS WITH: Performs case-sensitive suffix searching on STRING values.
  • CONTAINS: Performs case-sensitive inclusion searching in STRING values.
  • =~: Regular expression for matching a pattern.

RETURN

RETURN is logically similar to the SELECT clause in SQL.

The RETURN clause specifies what to include in the query result. It is an essential part of your SysQL statement, where you define the parts of the pattern you are interested in. These can include nodes, relationships, or properties of these elements.

The term “entity” refers to either a node or a relationship within the graph.

Example: Returning an Entity or Node

To return a node, list it in the RETURN statement:

MATCH ​​AWSAccount AS a WHERE a.id = '474668386876' RETURN a;

The query returns the AWS Account entity with the account ID ‘474668386876’.

Example: Returning Relationships

To return a relationship, just include it in the RETURN list.

MATCH AWSAccount AS a IN AWSOrganization AS b WHERE a.id = '474668386876' RETURN IN;

The query returns the IN relationship.

Example: Returning Properties

To return a property, use dot notation:

MATCH AWSAccount AS a WHERE a.id = '474668386876' RETURN a.name;

The query returns the AWS Account name of the account ID ‘474668386876’.

Example: Returning all Elements

To return all elements, run the following query:

MATCH AWSAccount AS a IN AWSOrganization AS b WHERE a.id = '474668386876' RETURN a, IN, b;

The query returns the AWS Account, AWS Organization, and the IN relationship.

Example: Aliasing a Field

To rename a field, use AS :

MATCH AWSAccount AS a WHERE a.id = '474668386876' RETURN a.name AS accountName;

Returns the name of the AWS account, but renames the field as accountName.

DISTINCT

DISTINCT retrieves only the unique records based on the fields selected for output.

MATCH AWSAccount AS a RETURN DISTINCT a;

The query returns node AWSAccount once.

Return Items

You can request for nodes, property references, functions, certain other operators like existential and pattern comprehension in the query result.

MATCH AWSAccount AS a RETURN a ORDER BY a.name;

Performing Aggregations

This is not a GROUP BY clause. All RETURN items that are aggregated serve as grouping keys. A grouping key is essentially a unique tuple (a window or bin), and aggregates are computed independently for each unique grouping key.

ORDER BY

ORDER BY is a sub-clause following RETURN, and it specifies how the output should be sorted.

To return sorted results, use the following syntax:

MATCH AWSAccount AS a RETURN a ORDER BY a.name;

LIMIT/OFFSET

LIMIT constrains the number of records in the output. It accepts any expression that evaluates to a positive integer.

To return a subset of the result, run the following query:

MATCH AWSAccount AS a RETURN a LIMIT 10;

Graph Patterns

A graph pattern is the core component of MATCH, existential operators, and pattern comprehensions.

It consists of a mandatory node followed by an optional, unbounded alternation between edges and nodes.

Node EDGE Node\*

Nodes

Nodes are the primary objects in the graph. They are always represented as nouns and written in title case, such as KubeNode or AWSRegion.

Aliasing

Nodes in the MATCH clause are implicitly aliased with the node label. In a simple query like:

MATCH AWSAccount RETURN AWSAccount;

AWSAccount serves as both the node label and the node alias. To assign a custom alias, use the AS keyword.

MATCH AWSAccount AS a RETURN a;

This is sometimes necessary, as in:

MATCH AWSResource AS parent PARENT_OF AWSResource AS child RETURN parent, child;

If no aliases are introduced using AS:

MATCH AWSResource PARENT_OF AWSResource;

The query would search for an AWSResource connected to itself by a PARENT_OF relationship.

Relationships

Relationships are directed connections between nodes in a graph. They represent the associations between entities (nodes) and have both a type and direction. Each relationship can also have properties associated with it, just like nodes. Relationships are used to model how entities are related to one another in the graph, and they form the backbone of graph data models.

Virtual Relationships

Virtual relationships are not explicitly stored in the database but are inferred or computed at query time. They help simplify queries by representing patterns or connections without adding physical edges to the graph.

Functions and Other Operators

Scalar Functions

SysQL provides scalar functions, which are built-in functions that process a single value and return a single result. These functions enhance data manipulation efficiency and simplify complex calculations within SysQL.

Mathematical Numerical Functions

These functions operate only on numeric expressions and will return an error if used with any other data types. The following are the supported numeric mathematical functions:

abs()

abs() returns the absolute value of the given number.

Syntax: abs(expression)

To retrieve the absolute CVSS score of vulnerability, you can use the following query:

MATCH Vulnerability AS v RETURN abs(v.cvssScore) AS score, v.name;

Aggregating Functions

SysQL provides aggregation capabilities similar to SQL’s GROUP BY for computing aggregated data.

Aggregating functions process a set of values to compute a single aggregated result. Examples include

  • avg(): calculates the average of multiple numeric values
  • min(): finds the smallest numeric or string value within a dataset.

When using an aggregation function on a set of values, it applies the inner expression, for example n.CvssScore, to all records within the same aggregation group.

To sort the result set using aggregations, the aggregation must be included in the RETURN clause so that it can be referenced in ORDER BY.

You can use the DISTINCT operator alongside aggregation to ensure all values are unique before applying the aggregate function.

count()

count() returns the number for values or records. It returns an integer.

Syntax: count(expression)

To retrieve the total count of vulnerabilities with Critical severity, use the following query:

MATCH Vulnerability AS v WHERE v.severity = 'Critical' RETURN count(1) AS criticalVulnerabilityCount;

min()

min() returns the minimum value in a set of values.

Syntax: min(expression)

You can use the following syntax to retrieve the vulnerability with the lowest CVSS score.

MATCH Vulnerability AS v RETURN min(v.cvssScore) AS score, v.name;

max()

max() returns the maximum value in a set of values.

Syntax: max(expression)

You can use the following syntax to retrieve the vulnerability with the highest CVSS score.

MATCH Vulnerability AS v RETURN max(v.cvssScore) AS score, v.name;

avg()

avg() returns the average of a set of numerical values. It returns either an Integer or a Float, based on the expression’s values and whether the calculation results in an overflow.

Syntax: avg(expression)

You can use the following syntax to retrieve the vulnerability with the highest CVSS score.

MATCH Vulnerability AS v RETURN avg(v.cvssScore) AS score, v.name;

collect()

collect() returns a list of values produced by an expression, aggregating multiple records or values into a single list. The resulting list can contain heterogeneous elements, with types determined by the values returned by the expression.

Syntax: avg(expression)

You can use the following syntax to collect the distinct severities in a single list.

MATCH Vulnerability AS v RETURN collect(DISTINCT v.severity) AS severities;

Invalid Query Examples

When using SysQL, certain queries may not return results or could be invalid. Following are a few examples:

QueryReason
MATCH EC2Instance THAT IS AFFECTED BY Vulnerability THAT AFFECTS KubeWorkloadEC2Instances and KubeWorkloads are unrelated in your real infrastructure. They appear related in the graph only due to the vulnerability, but this does not reflect an actual relationship.
MATCH KubeWorkload THAT IS AFFECTED BY Vulnerability THAT AFFECTS KubeWorkloadThis query creates a cyclic traversal, leading to an empty result set.