SysQL Reference Library
How Does SysQL Work?
Sysdig stores your resource inventory together with the security findings, such as events or vulnerabilities, in the Sysdig datastore. This allows Sysdig to design queries that match complex paths through the connected data. SysQL is available in a web user interface that helps designing the queries in an interactive way with the following widgets:
- SysQL builder: Enables you to build your query from scratch by defining entities.
- SysQL editor: Helps you to update your SysQL query, and add more entities.
Additionally, you can use the SysQL Public APIs to build queries and retrieve resources the Sysdig Secure datastore programmatically.
Clauses and Subclauses
SysQL uses the following basic key clauses to structure and filter queries:
Clauses | Description |
---|---|
MATCH | Defines the pattern to match in the graph |
OPTIONAL | Used with MATCH to specify optional patterns |
WHERE | Filters results based on specified criteria |
A Clause is the fundamental component of a SysQL and it defines specific operations or constraints. SysQL is structured using different clauses to manipulate and retrieve graph data efficiently. MATCH
and OPTIONAL MATCH
are the main clauses in SysQL.
For example: MATCH Vulnerability LIMIT 10;
A Subclause typically refers to a part of a larger SysQL query that performs a specific function or operation within the query. These subclauses help to organize and refine the query logic.
For example: LIMIT 10
MATCH
MATCH is the first clause of every SysQL query. Multiple MATCH clauses combine into a single pattern. MATCH is logically similar to the FROM
clause in SQL.
The MATCH
clause is the starting point of every SysQL statement. It defines the patterns to search for in the graph and is the main way to retrieve data. Typically, it is used with WHERE
to refine results. Multiple MATCH
clauses can be combined into one pattern.
Example: Get Vulnerability Data
You can retrieve vulnerability data stored in the graph by executing the following SysQL statement:
MATCH Vulnerability LIMIT 10;
OPTIONAL MATCH
OPTIONAL MATCH functions similarly to MATCH in pattern matching within the graph database. The key difference is that if no matches are found, OPTIONAL MATCH returns null for the missing parts of the pattern instead of failing. It can be considered the SysQL equivalent of an outer join in SQL, where either the entire pattern is matched or null is returned for unmatched elements.
If a relationship is optional, use the OPTIONAL MATCH clause. This works like a SQL outer join—returning the relationship if it exists or null if it does not.
Example: Retrieve all the Affected Kubernetes Workloads
If you want to retrieve all Kubernetes workloads affected by vulnerabilities, and optionally list any runtime events generated by them, issue the following SysQL statement:
MATCH KubeWorkload AS k AFFECTED_BY Vulnerability OPTIONAL MATCH RuntimeEvent GENERATED_BY k;
WHERE
WHERE is not a standalone clause but can be placed after a MATCH clause. It adds constraints to the pattern in a MATCH or OPTIONAL MATCH clause, functioning as an optional component rather than a separate clause.
Example: List Vulnerabilities with Names Ending 123
To list vulnerabilities with names ending in ‘123’ that also affect packages, run the following SysQL statement:
MATCH Vulnerability AS v AFFECTS Package WHERE v.name =~ 'CVE-.\*123';
Comparison Operators
The comparison operators comprise:
- Equality:
=
- Inequality:
<>
- Less than:
<
- Greater than:
>
- Less than or equal to:
<=
- Greater than or equal to:
>=
IS NULL
IS NOT NULL
Additional Operators
STARTS WITH
: Performs case-sensitive prefix searching onSTRING
values.ENDS WITH
: Performs case-sensitive suffix searching onSTRING
values.CONTAINS
: Performs case-sensitive inclusion searching inSTRING
values.=~
: Regular expression for matching a pattern.
RETURN
RETURN is logically similar to the SELECT clause in SQL.
The RETURN
clause specifies what to include in the query result. It is an essential part of your SysQL statement, where you define the parts of the pattern you are interested in. These can include nodes, relationships, or properties of these elements.
The term “entity” refers to either a node or a relationship within the graph.
Example: Returning an Entity or Node
To return a node, list it in the RETURN statement:
MATCH AWSAccount AS a WHERE a.id = '474668386876' RETURN a;
The query returns the AWS Account entity with the account ID ‘474668386876’.
Example: Returning Relationships
To return a relationship, just include it in the RETURN list.
MATCH AWSAccount AS a IN AWSOrganization AS b WHERE a.id = '474668386876' RETURN IN;
The query returns the IN relationship.
Example: Returning Properties
To return a property, use dot notation:
MATCH AWSAccount AS a WHERE a.id = '474668386876' RETURN a.name;
The query returns the AWS Account name of the account ID ‘474668386876’.
Example: Returning all Elements
To return all elements, run the following query:
MATCH AWSAccount AS a IN AWSOrganization AS b WHERE a.id = '474668386876' RETURN a, IN, b;
The query returns the AWS Account, AWS Organization, and the IN relationship.
Example: Aliasing a Field
To rename a field, use AS
MATCH AWSAccount AS a WHERE a.id = '474668386876' RETURN a.name AS accountName;
Returns the name of the AWS account, but renames the field as accountName
.
DISTINCT
DISTINCT
retrieves only the unique records based on the fields selected for output.
MATCH AWSAccount AS a RETURN DISTINCT a;
The query returns node AWSAccount
once.
Return Items
You can request for nodes, property references, functions, certain other operators like existential and pattern comprehension in the query result.
MATCH AWSAccount AS a RETURN a ORDER BY a.name;
Performing Aggregations
This is not a GROUP BY
clause. All RETURN
items that are aggregated serve as grouping keys. A grouping key is essentially a unique tuple (a window or bin), and aggregates are computed independently for each unique grouping key.
ORDER BY
ORDER BY
is a sub-clause following RETURN
, and it specifies how the output should be sorted.
To return sorted results, use the following syntax:
MATCH AWSAccount AS a RETURN a ORDER BY a.name;
LIMIT/OFFSET
LIMIT
constrains the number of records in the output. It accepts any expression that evaluates to a positive integer.
To return a subset of the result, run the following query:
MATCH AWSAccount AS a RETURN a LIMIT 10;
Graph Patterns
A graph pattern is the core component of MATCH
, existential operators, and pattern comprehensions.
It consists of a mandatory node followed by an optional, unbounded alternation between edges and nodes.
Node EDGE Node\*
Nodes
Nodes are the primary objects in the graph. They are always represented as nouns and written in title case, such as KubeNode
or AWSRegion
.
Aliasing
Nodes in the MATCH
clause are implicitly aliased with the node label. In a simple query like:
MATCH AWSAccount RETURN AWSAccount;
AWSAccount
serves as both the node label and the node alias. To assign a custom alias, use the AS
keyword.
MATCH AWSAccount AS a RETURN a;
This is sometimes necessary, as in:
MATCH AWSResource AS parent PARENT_OF AWSResource AS child RETURN parent, child;
If no aliases are introduced using AS
:
MATCH AWSResource PARENT_OF AWSResource;
The query would search for an AWSResource
connected to itself by a PARENT_OF
relationship.
Relationships
Relationships are directed connections between nodes in a graph. They represent the associations between entities (nodes) and have both a type and direction. Each relationship can also have properties associated with it, just like nodes. Relationships are used to model how entities are related to one another in the graph, and they form the backbone of graph data models.
Virtual Relationships
Virtual relationships are not explicitly stored in the database but are inferred or computed at query time. They help simplify queries by representing patterns or connections without adding physical edges to the graph.
Functions and Other Operators
Scalar Functions
SysQL provides scalar functions, which are built-in functions that process a single value and return a single result. These functions enhance data manipulation efficiency and simplify complex calculations within SysQL.
Mathematical Numerical Functions
These functions operate only on numeric expressions and will return an error if used with any other data types. The following are the supported numeric mathematical functions:
abs()
abs()
returns the absolute value of the given number.
Syntax: abs(expression)
To retrieve the absolute CVSS score of vulnerability, you can use the following query:
MATCH Vulnerability AS v RETURN abs(v.cvssScore) AS score, v.name;
Aggregating Functions
SysQL provides aggregation capabilities similar to SQL’s GROUP BY
for computing aggregated data.
Aggregating functions process a set of values to compute a single aggregated result. Examples include
avg()
: calculates the average of multiple numeric valuesmin()
: finds the smallest numeric or string value within a dataset.
When using an aggregation function on a set of values, it applies the inner expression, for example n.CvssScore
, to all records within the same aggregation group.
To sort the result set using aggregations, the aggregation must be included in the RETURN
clause so that it can be referenced in ORDER BY
.
You can use the DISTINCT
operator alongside aggregation to ensure all values are unique before applying the aggregate function.
count()
count()
returns the number for values or records. It returns an integer.
Syntax: count(expression)
To retrieve the total count of vulnerabilities with Critical severity, use the following query:
MATCH Vulnerability AS v WHERE v.severity = 'Critical' RETURN count(1) AS criticalVulnerabilityCount;
min()
min()
returns the minimum value in a set of values.
Syntax: min(expression)
You can use the following syntax to retrieve the vulnerability with the lowest CVSS score.
MATCH Vulnerability AS v RETURN min(v.cvssScore) AS score, v.name;
max()
max()
returns the maximum value in a set of values.
Syntax: max(expression)
You can use the following syntax to retrieve the vulnerability with the highest CVSS score.
MATCH Vulnerability AS v RETURN max(v.cvssScore) AS score, v.name;
avg()
avg()
returns the average of a set of numerical values. It returns either an Integer or a Float, based on the expression’s values and whether the calculation results in an overflow.
Syntax: avg(expression)
You can use the following syntax to retrieve the vulnerability with the highest CVSS score.
MATCH Vulnerability AS v RETURN avg(v.cvssScore) AS score, v.name;
collect()
collect()
returns a list of values produced by an expression, aggregating multiple records or values into a single list. The resulting list can contain heterogeneous elements, with types determined by the values returned by the expression.
Syntax: avg(expression)
You can use the following syntax to collect the distinct severities in a single list.
MATCH Vulnerability AS v RETURN collect(DISTINCT v.severity) AS severities;
Invalid Query Examples
When using SysQL, certain queries may not return results or could be invalid. Following are a few examples:
Query | Reason |
---|---|
MATCH EC2Instance THAT IS AFFECTED BY Vulnerability THAT AFFECTS KubeWorkload | EC2Instances and KubeWorkloads are unrelated in your real infrastructure. They appear related in the graph only due to the vulnerability, but this does not reflect an actual relationship. |
MATCH KubeWorkload THAT IS AFFECTED BY Vulnerability THAT AFFECTS KubeWorkload | This query creates a cyclic traversal, leading to an empty result set. |
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.