AWS Lambda Metrics

Metrics, Dashboards, Alerts and more for AWS Lambda Metrics Integration in Sysdig Monitor.

This integration can be enabled via the Integrate AWS Lambda Telemetry API.

This integration has 4 metrics.

List of Alerts

Alert | Description | Format
[AWS Lambda Metrics] High Function Error Rate | High Function Error Rate. | Prometheus
[AWS Lambda Metrics] Function Timeout | Function reached timeout limit. | Prometheus

List of Dashboards

AWS Lambda Metrics

The dashboard provides information on the AWS Lambda integration.

List of Metrics

Metric name
aws_lambda_duration
aws_lambda_errors
aws_lambda_invocations
aws_lambda_postruntime_extensions_duration

Monitoring and Troubleshooting AWS Lambda Metrics

AWS Lambda is one of the main AWS services for serverless computing. Function duration and execution errors are the main contributors to high costs, because lengthy or repeated executions increase what you pay. This document describes the key metrics and queries that you can use to monitor and troubleshoot AWS Lambda.

To filter by function name, use the function_name label.

Invocations

To monitor the number of invocations, use the aws_lambda_invocations metric.

The following query returns the 10 functions with the most invocations over the last 5 minutes:

topk(10,sum_over_time(aws_lambda_invocations{function_name!=""}[5m]))
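
The same query shape recurs for every metric on this page, so it can help to generate it. The following is a minimal Python sketch, not part of the integration: the helper name top_functions_query and the function name checkout-handler are hypothetical, and the helper only assembles a PromQL string without sending it anywhere.

```python
from typing import Optional


def top_functions_query(
    metric: str,
    agg: str = "sum_over_time",
    k: int = 10,
    window: str = "5m",
    function_name: Optional[str] = None,
) -> str:
    """Build a topk PromQL query over a Lambda metric.

    With no function_name, series that lack the label are skipped
    (function_name!=""); otherwise only the named function is matched.
    """
    if function_name is None:
        matcher = 'function_name!=""'
    else:
        matcher = f'function_name="{function_name}"'
    return f"topk({k},{agg}({metric}{{{matcher}}}[{window}]))"


# Top 10 functions by invocations, as in the query above.
print(top_functions_query("aws_lambda_invocations"))

# Restrict to one function (hypothetical name) using the function_name label.
print(top_functions_query("aws_lambda_invocations", function_name="checkout-handler"))
```

Passing agg="max_over_time" with aws_lambda_duration yields the duration query used later in this document.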

Errors

To monitor failed executions, use the aws_lambda_errors metric. The following query returns the 10 functions with the most errors over the last 5 minutes:

topk(10,sum_over_time(aws_lambda_errors{function_name!=""}[5m]))
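
Raw error counts favor busy functions, so a ratio of errors to invocations is often easier to alert on. The sketch below builds one possible PromQL expression for that ratio. It is an assumption, not the expression used by the bundled High Function Error Rate alert: the helper name, the 5% threshold, and the choice to aggregate by function_name are all illustrative, and it presumes both metrics carry the same function_name label.

```python
def error_rate_alert_query(threshold: float = 0.05, window: str = "5m") -> str:
    """Build a PromQL expression: per-function errors / invocations > threshold."""

    def per_function_total(metric: str) -> str:
        # Sum the samples in the window, then collapse to one series per function
        # so both sides of the division have matching labels.
        return (
            f"sum by (function_name) "
            f'(sum_over_time({metric}{{function_name!=""}}[{window}]))'
        )

    errors = per_function_total("aws_lambda_errors")
    invocations = per_function_total("aws_lambda_invocations")
    # In PromQL, "/" binds tighter than ">", so the ratio is compared as a whole.
    return f"{errors} / {invocations} > {threshold}"


# Assumed threshold: flag functions whose error ratio exceeds 5%.
print(error_rate_alert_query(0.05))
```

Functions with no invocations in the window produce no meaningful ratio, so the threshold and window should be tuned to each workload's traffic.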

Duration

The duration of a serverless function is the main factor in its cost. Use the following query to get the 10 functions with the longest maximum duration over the last 5 minutes:

topk(10,max_over_time(aws_lambda_duration{function_name!=""}[5m]))

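Keep in mind that Lambda functions have a maximum timeout of 15 minutes (900,000 ms). The following alert query fires when a function's maximum duration over the last 5 minutes exceeds 840,000 ms (14 minutes), which warns you before the limit is reached: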

max_over_time(aws_lambda_duration{function_name!=""}[5m]) > 840000
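
The 840000 threshold is the 15-minute maximum timeout minus a one-minute safety margin, expressed in milliseconds. The following sketch makes that arithmetic explicit, so the same alert can be derived for functions configured with a shorter timeout. The helper name and the margin parameter are illustrative assumptions, and the code assumes aws_lambda_duration is reported in milliseconds, as the threshold above implies.

```python
LAMBDA_MAX_TIMEOUT_MS = 15 * 60 * 1000  # 15 minutes = 900000 ms


def timeout_alert_query(
    timeout_ms: int = LAMBDA_MAX_TIMEOUT_MS,
    margin_ms: int = 60_000,
    window: str = "5m",
) -> str:
    """Build an alert query that fires margin_ms before the timeout is hit."""
    if not 0 <= margin_ms < timeout_ms:
        raise ValueError("margin_ms must be non-negative and smaller than timeout_ms")
    threshold = timeout_ms - margin_ms
    return (
        f'max_over_time(aws_lambda_duration{{function_name!=""}}[{window}])'
        f" > {threshold}"
    )


# Default: 900000 ms - 60000 ms = 840000 ms, the threshold used above.
print(timeout_alert_query())

# A function configured with a 30-second timeout and a 5-second margin.
print(timeout_alert_query(timeout_ms=30_000, margin_ms=5_000))
```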

If the functions in your serverless environment are expected to run for only a short time, use the Duration dashboards to monitor the average and maximum execution time of your functions. This data is enough to verify that your Lambda functions behave as expected.

topk(10,max_over_time(aws_lambda_duration{function_name!=""}[5m]))

Also be aware that functions can take longer than necessary and reach the Lambda timeout limit. You can track this with a dedicated alert and a dashboard that shows the maximum duration.

PostRuntime Extensions Duration

The post-runtime extensions duration might not be as important as the function duration. However, if it is higher than usual, analyze your environment and function resources, because it measures the time spent running extension code after the function code has completed, not the function's own compute time. If its average and maximum begin to increase, review your functions and their extensions.

The following query returns the 10 functions with the highest post-runtime extensions duration over the last 5 minutes:

topk(10,sum_over_time(aws_lambda_postruntime_extensions_duration[5m]))
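
If these queries are run through a Prometheus-compatible HTTP API rather than the dashboard, the instant-query result arrives as JSON in the standard Prometheus response format. The sketch below turns such a response into a ranked list of functions. How the API is reached is not covered here, the helper name is hypothetical, and the sample payload, including its function names and values, is made up purely for illustration.

```python
import json


def rank_functions(response_text: str) -> list:
    """Return (function_name, value) pairs from an instant-query response, highest first."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")

    rows = []
    for series in data["result"]:
        name = series["metric"].get("function_name", "<unknown>")
        _timestamp, raw_value = series["value"]  # the value is encoded as a string
        rows.append((name, float(raw_value)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample response with two hypothetical functions.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"function_name": "resize-image"}, "value": [1700000000, "120.5"]},
            {"metric": {"function_name": "checkout-handler"}, "value": [1700000000, "340.0"]},
        ],
    },
})

for name, value in rank_functions(sample):
    print(f"{name}: {value}")
```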