AWS Lambda Metrics
This integration can be enabled via the Integrate AWS Lambda Telemetry API.
This integration has 4 metrics.
List of Alerts
Alert | Description | Format |
---|---|---|
[AWS Lambda Metrics] High Function Error Rate | High Function Error Rate. | Prometheus |
[AWS Lambda Metrics] Function Timeout | Function reached timeout limit. | Prometheus |
List of Dashboards
AWS Lambda Metrics
The dashboard provides information on the AWS Lambda integration.
List of Metrics
Metric name |
---|
aws_lambda_duration |
aws_lambda_errors |
aws_lambda_invocations |
aws_lambda_postruntime_extensions_duration |
Monitoring and Troubleshooting AWS Lambda Metrics
AWS Lambda is one of the main services AWS provides for serverless computing. Function duration and execution errors are the main factors behind high costs: lengthy or repeated executions increase the cost. This document describes important metrics and queries that you can use to monitor and troubleshoot AWS Lambda.
Use the label function_name if you want to filter by function name.
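As a minimal sketch of filtering by function name, the snippet below builds a PromQL selector with a function_name matcher and the corresponding instant-query URL. It assumes a Prometheus-compatible HTTP API that exposes the standard /api/v1/query path. The helper names, the function name "checkout-handler", and the base URL are hypothetical placeholders, and your backend's base path and authentication may differ.

```python
from urllib.parse import urlencode


def selector(metric: str, function_name: str) -> str:
    """Return a PromQL selector restricted to a single Lambda function."""
    # Escape backslashes and double quotes so the label value stays a valid PromQL string.
    escaped = function_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'{metric}{{function_name="{escaped}"}}'


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible HTTP API (assumed path)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# "checkout-handler" and the base URL are placeholders, not real resources.
promql = selector("aws_lambda_invocations", "checkout-handler")
print(promql)
print(query_url("https://prometheus.example.com", promql))
```

The same selector helper works for any of the four metrics listed above, because they are all filtered through the function_name label.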
Invocations
To monitor the number of invocations, use the metric aws_lambda_invocations.
The following query returns the 10 functions with the most invocations:
topk(10,sum_over_time(aws_lambda_invocations{function_name!=""}[5m]))
Errors
The following query returns the 10 functions with the most errors:
topk(10,sum_over_time(aws_lambda_errors{function_name!=""}[5m]))
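Raw error counts are easier to interpret next to invocation counts. The sketch below shows one possible way to relate the two metrics as an error rate per function. The integration's own "High Function Error Rate" alert expression is not shown on this page, so this is only an illustration. The sample totals, function names, and the 5% threshold are hypothetical.

```python
def error_rates(invocations: dict[str, float], errors: dict[str, float]) -> dict[str, float]:
    """Return errors / invocations per function, skipping functions with no invocations."""
    rates = {}
    for name, count in invocations.items():
        if count > 0:  # avoid dividing by zero for idle functions
            rates[name] = errors.get(name, 0.0) / count
    return rates


# Hypothetical 5-minute totals per function, e.g. the results of sum_over_time(...[5m]).
invocations = {"checkout-handler": 200, "image-resizer": 50, "nightly-report": 0}
errors = {"checkout-handler": 4, "image-resizer": 10}

THRESHOLD = 0.05  # assumed 5% error-rate threshold, chosen only for illustration

rates = error_rates(invocations, errors)
noisy = sorted(name for name, rate in rates.items() if rate > THRESHOLD)
print(rates)
print(noisy)
```

A function with few invocations and a handful of errors can rank low in the top 10 by count yet have a high error rate, which is why the ratio can be a useful complement to the query above.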
Duration
The duration of a serverless function is the main contributor to its cost. Use the following query to get the 10 functions with the longest duration:
topk(10,max_over_time(aws_lambda_duration{function_name!=""}[5m]))
Keep in mind that Lambda functions have a maximum timeout of 15 minutes. Use the following alert query to be warned when a function's maximum duration exceeds 840000 ms (14 minutes), shortly before that limit is reached:
max_over_time(aws_lambda_duration{function_name!=""}[5m]) > 840000
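The 840000 threshold corresponds to 14 minutes in milliseconds, one minute short of the 15-minute maximum. The sketch below reproduces that arithmetic and shows how the threshold could be adapted for functions configured with a shorter timeout. It assumes aws_lambda_duration is reported in milliseconds, as the threshold suggests. The helper names, the 60-second margin default, and the 30-second example timeout are illustrative assumptions.

```python
LAMBDA_MAX_TIMEOUT_S = 900  # 15 minutes, the Lambda maximum mentioned above


def alert_threshold_ms(timeout_s: int = LAMBDA_MAX_TIMEOUT_S, margin_s: int = 60) -> int:
    """Return a duration threshold in milliseconds, margin_s seconds before the timeout."""
    if not 0 <= margin_s < timeout_s:
        raise ValueError("margin_s must be non-negative and smaller than timeout_s")
    return (timeout_s - margin_s) * 1000


def alert_query(threshold_ms: int) -> str:
    """Build the alert expression with the given threshold (assumes millisecond units)."""
    return f'max_over_time(aws_lambda_duration{{function_name!=""}}[5m]) > {threshold_ms}'


print(alert_threshold_ms())  # 840000
# Hypothetical function configured with a 30-second timeout and a 3-second margin.
print(alert_query(alert_threshold_ms(timeout_s=30, margin_s=3)))
```

A threshold tied to each function's configured timeout catches short-timeout functions that would never come close to the 14-minute mark.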
If the functions in your serverless environment are expected to run briefly, use the Duration dashboards to monitor the average and maximum time of your function executions. This information gives you enough data to verify that your Lambda functions behave correctly.
topk(10,max_over_time(aws_lambda_duration{function_name!=""}[5m]))
Also be aware that functions can take longer than required and reach the Lambda timeout limit. You can track this with a specific alert and a dashboard that shows the maximum duration.
PostRuntime Extensions Duration
The post-runtime extensions duration might not be as important as the function duration. However, if it is higher than usual, you should analyze your environment and function resources, since it represents the time spent running extension code after the function code has completed, not the computing time of the function itself. Therefore, if its average and maximum begin to increase, you should analyze your functions and Lambda services.
The following query returns the 10 functions with the highest post-runtime extensions duration:
topk(10,sum_over_time(aws_lambda_postruntime_extensions_duration[5m]))