GCP Cloud PostgreSQL
You can enable this integration using the Connect a GCP Account option on the Monitor UI.
This integration has 32 metrics.
List of Alerts
Alert | Description | Format |
---|---|---|
[GCP Cloud PostgreSQL] Database Down | The PostgreSQL database is down. | Prometheus |
[GCP Cloud PostgreSQL] No High Availability | The PostgreSQL database doesn’t have ‘High Availability’ enabled, or there are no available instances in other zones for automatic failover. The service will be down when the instance undergoes maintenance or has any other issues. | Prometheus |
[GCP Cloud PostgreSQL] High CPU Usage | The CPU usage is reaching the limit. | Prometheus |
[GCP Cloud PostgreSQL] High Memory Usage | The memory usage is reaching the limit. | Prometheus |
[GCP Cloud PostgreSQL] Disk Full in 48h | The disk will be full in 48 hours. | Prometheus |
[GCP Cloud PostgreSQL] Disk Full in 12h | The disk will be full in 12 hours. | Prometheus |
[GCP Cloud PostgreSQL] Replica Lag | The PostgreSQL replica is lagging behind its primary instance. This issue is only applicable to replicas. | Prometheus |
[GCP Cloud PostgreSQL] Network Lag | The time taken from primary binary log to IO thread on replica is high. This issue is only applicable to replicas. | Prometheus |
[GCP Cloud PostgreSQL] Low Cache Hit Rate | The cache hit rate is low. | Prometheus |
[GCP Cloud PostgreSQL] DeadLocks In Database | Deadlocks detected in database. | Prometheus |
[GCP Cloud PostgreSQL] High Transaction ID Usage | The transaction ID usage is reaching the limit. | Prometheus |
List of Dashboards
GCP Cloud PostgreSQL
The dashboard provides information on the GCP Cloud PostgreSQL integration.
List of Metrics
Metric name |
---|
gcp_cloudsql_database_database_auto_failover_request_count |
gcp_cloudsql_database_database_available_for_failover |
gcp_cloudsql_database_database_cpu_utilization |
gcp_cloudsql_database_database_disk_bytes_used_by_data_type |
gcp_cloudsql_database_database_disk_read_ops_count |
gcp_cloudsql_database_database_disk_utilization |
gcp_cloudsql_database_database_disk_write_ops_count |
gcp_cloudsql_database_database_instance_state |
gcp_cloudsql_database_database_memory_utilization |
gcp_cloudsql_database_database_network_received_bytes_count |
gcp_cloudsql_database_database_network_sent_bytes_count |
gcp_cloudsql_database_database_postgresql_blocks_read_count |
gcp_cloudsql_database_database_postgresql_deadlock_count |
gcp_cloudsql_database_database_postgresql_num_backends |
gcp_cloudsql_database_database_postgresql_num_backends_by_application |
gcp_cloudsql_database_database_postgresql_num_backends_by_state |
gcp_cloudsql_database_database_postgresql_replication_replica_byte_lag |
gcp_cloudsql_database_database_postgresql_temp_bytes_written_count |
gcp_cloudsql_database_database_postgresql_temp_files_written_count |
gcp_cloudsql_database_database_postgresql_transaction_count |
gcp_cloudsql_database_database_postgresql_transaction_id_count |
gcp_cloudsql_database_database_postgresql_transaction_id_utilization |
gcp_cloudsql_database_database_postgresql_tuple_size |
gcp_cloudsql_database_database_postgresql_tuples_fetched_count |
gcp_cloudsql_database_database_postgresql_tuples_processed_count |
gcp_cloudsql_database_database_postgresql_tuples_returned_count |
gcp_cloudsql_database_database_postgresql_vacuum_oldest_transaction_age |
gcp_cloudsql_database_database_replication_network_lag |
gcp_cloudsql_database_database_replication_replica_lag |
gcp_cloudsql_database_database_replication_state |
gcp_cloudsql_database_database_up |
gcp_cloudsql_database_database_uptime |
Monitoring and Troubleshooting GCP Cloud PostgreSQL
This document describes important metrics and queries that you can use to monitor and troubleshoot GCP Cloud PostgreSQL.
Most of the metrics covered in this document are applicable to all the GCP Cloud SQL integrations (MySQL, PostgreSQL, SQL Server). There is a separate section for MySQL metrics.
The GCP metrics of type DELTA
, which typically have the suffix _count
in their name (but not always), return the cumulative value of 1 minute. They need to be divided by 60 to get the value per second.
Status and Replication
Status
Use the following query to get the status of the database. A value of 1 means UP, and a value of 0 means DOWN.
gcp_cloudsql_database_database_up
Use the following query to get the current serving state of the Cloud SQL instance. The state is stored inside the label state
.
gcp_cloudsql_database_database_instance_state > 0
The state can be one of the following:
- RUNNING: The instance is expected to be running. If an instance experiences unplanned (non-maintenance) downtime, the state will still be RUNNING, but the
Database UP
metric will report 0. - SUSPENDED: The instance is not available, for example, due to problems with billing.
- RUNNABLE: The instance has been stopped by the owner. It is not currently running but is ready to be restarted.
- PENDING_CREATE: The instance is being created.
- MAINTENANCE: The instance is down for maintenance.
- FAILED: The instance creation failed.
- UNKNOWN_STATE: The state of the instance is unknown.
Replication
Failover
The following query returns a value > 0 if the failover operation is available on the instance.
gcp_cloudsql_database_database_available_for_failover
Use the following query to get the number of instance auto-failover requests per second.
gcp_cloudsql_database_database_auto_failover_request_count / 60
State
Use the following query to get the current serving state of replication. The state is stored inside the label state
. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_replication_state > 0
The state can be one of the following:
- RUNNING: Replication is active and running.
- STOPPED: Replication is inactive and stopped.
- ERROR: An error has occurred and replication is stopped.
- SYNCING: The replication is syncing.
- UNSYNCED: The replication is unsynced.
Replica and Network Lag
Use the following query to get the approximate number of seconds the read replica is behind its primary instance. This is only applicable to replicas.
gcp_cloudsql_database_database_replication_replica_lag
The following query indicates time taken from primary binary log to IO thread on replica. This is only applicable to replicas.
gcp_cloudsql_database_database_replication_network_lag
CPU, Memory and Storage
CPU
Use the following query to get the current CPU utilization represented as a percentage of the reserved CPU that is currently in use. The return values are typically numbers between 0 and 1 (but might exceed 1).
gcp_cloudsql_database_database_cpu_utilization
Memory
Use the following query to get the percentage of the memory quota that is currently in use. The return values are numbers between 0 and 1.
gcp_cloudsql_database_database_memory_utilization
Disk
Use the following query to get the percentage of the disk quota that is currently in use. The return values are numbers between 0 and 1.
gcp_cloudsql_database_database_disk_utilization
Use the following query to get the disk usage per data type. The data type is stored inside the label data_type
. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_disk_bytes_used_by_data_type
Use the following queries to get the disk read and write IO operations per second.
gcp_cloudsql_database_database_disk_read_ops_count / 60
gcp_cloudsql_database_database_disk_write_ops_count / 60
Network
Use the following queries to get the bytes received and sent through the network per second.
gcp_cloudsql_database_database_network_received_bytes_count / 60
gcp_cloudsql_database_database_network_sent_bytes_count / 60
PostgreSQL Stats
Backends
Use the following query to get the number of connections to the Cloud SQL PostgreSQL instance.
gcp_cloudsql_database_database_postgresql_num_backends
Use the following query to get the number of connections to the Cloud SQL PostgreSQL instance, grouped by its state. The possible values inside the state
label are idle
, active
, idle_in_transaction
, idle_in_transaction_aborted
, disabled
or fastpath_function_call
. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_postgresql_num_backends_by_state
Use the following query to get the number of connections to the Cloud SQL PostgreSQL instance, grouped by applications. The application is stored inside the label application
.
gcp_cloudsql_database_database_postgresql_num_backends_by_application
Transactions
Use the following query to get the number of transactions per second. The transaction type is stored inside the label transaction_type
.
gcp_cloudsql_database_database_postgresql_transaction_count / 60
Use the following query to get the number of transaction IDs per second. The possible values inside the action
label are assigned
or frozen
. assigned
indicates the number of transaction IDs assigned and consumed by the instance, whereas frozen
indicates the count of transaction IDs replenished by the VACUUM’s freeze operation.
gcp_cloudsql_database_database_postgresql_transaction_id_count / 60
Use the following query to get the current utilization represented as a percentage of transaction IDs consumed by the Cloud SQL PostgreSQL instance. Values are numbers between 0 and 1.
gcp_cloudsql_database_database_postgresql_transaction_id_utilization
Use the following query to get the age of the oldest transaction yet to be vacuumed in the Cloud SQL PostgreSQL instance. The metric is measured as the number of transactions that have happened since the oldest transaction. The possible values of the oldest_transaction_type
label are running
, prepared
, replication_slot
and replica
.
gcp_cloudsql_database_database_postgresql_vacuum_oldest_transaction_age
Tuples
Use the following query to get the number of tuples (rows) in the database. The possible values inside the tuple_state
label are live
or dead
. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_postgresql_tuple_size
Use the following query to get the number of tuples(rows) processed per second for a given database for operations like insert, update, or delete. The operation type is stored inside the label operation_type
. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_postgresql_tuples_processed_count / 60
Use the following query to get the total number of rows fetched per second as a result of queries per database in the PostgreSQL instance.
gcp_cloudsql_database_database_postgresql_tuples_fetched_count / 60
Use the following query to get the total number of rows scanned per second while processing the queries per database in the PostgreSQL instance.
gcp_cloudsql_database_database_postgresql_tuples_returned_count / 60
Disk Read, Deadlocks and Replication
Disk Read
Use the following query to get the number of disk blocks read by this database per second. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_postgresql_blocks_read_count{source="disk"} / 60
Use the following query to get the number of reads from buffer cache by this database per second. According to the GCP documentation, this metric is in BETA launch stage.
gcp_cloudsql_database_database_postgresql_blocks_read_count{source="buffer_cache"} / 60
Using the two previous metrics, calculate the hit cache ratio with the following query.
gcp_cloudsql_database_database_postgresql_blocks_read_count{source="buffer_cache"}
/
(gcp_cloudsql_database_database_postgresql_blocks_read_count{source="buffer_cache"} + ignoring(source) gcp_cloudsql_database_database_postgresql_blocks_read_count{source="disk"})
Use the following query to get the total amount of data in bytes written per second to temporary files by the queries per database.
gcp_cloudsql_database_database_postgresql_temp_bytes_written_count / 60
Use the following query to get the total number of temporary files used per second for writing data while performing algorithms such as join and sort.
gcp_cloudsql_database_database_postgresql_temp_files_written_count / 60
Deadlocks
Use the following query to get the number of deadlocks detected per second for this database. According to the GCP documentation, this metric is in the BETA launch stage.
gcp_cloudsql_database_database_postgresql_deadlock_count / 60
Replication
Use the following query to get the Replication lag in bytes. Reported from the master per replica. The possible values of the replica_lag_type
label are replay_location
, flush_location
, write_location
, and sent_location
.
gcp_cloudsql_database_database_postgresql_replication_replica_byte_lag
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.