Commit 69257ed

Support cascade replication in get_stats_replication_delays
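When monitoring a cluster with cascade replication, the current query in get_stats_replication_delays fails on a replica, because pg_current_xlog_location (pg_current_wal_lsn in PostgreSQL 10 and higher) cannot be called during recovery. On a replica we can instead compute the diff against pg_last_wal_receive_lsn (pg_last_xlog_receive_location before PostgreSQL 10). This emits metrics showing how much replication delay the replica has compared to the upstream node (but not relative to the primary instance of the cluster).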
1 parent da0c678 commit 69257ed
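
To make the "diff" in the commit message concrete: an LSN is printed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit WAL position), and pg_wal_lsn_diff returns the distance between two such positions in bytes. The following is a minimal Python sketch of that arithmetic only. The helper names lsn_to_int and lsn_diff_bytes are hypothetical and not part of postgresql_metrics, and the sample LSN values are made up.

```python
def lsn_to_int(lsn):
    """Convert a textual LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff_bytes(upstream_lsn, replay_lsn):
    """Bytes of WAL between the upstream position and the replayed position."""
    return lsn_to_int(upstream_lsn) - lsn_to_int(replay_lsn)


# Illustrative values only: the upstream has received up to 0/3000060,
# while the downstream replica has replayed up to 0/3000000.
print(lsn_diff_bytes("0/3000060", "0/3000000"))  # 96
```

On a cascading replica the first argument corresponds to pg_last_wal_receive_lsn() on that replica, so the result describes lag relative to it, not relative to the primary.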

1 file changed: +8 -0 lines changed


postgresql_metrics/postgres_queries.py

Lines changed: 8 additions & 0 deletions
@@ -221,6 +221,10 @@ def get_replication_delays(conn):
     sql = ("SELECT client_addr, "
            "pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS bytes_diff "
            "FROM public.pg_stat_repl")
+    if is_in_recovery(conn):
+        # pg_current_xlog_location cannot be called in a replica
+        # use pg_last_xlog_receive_location for monitoring cascade replication
+        sql = sql.replace("pg_current_xlog_location", "pg_last_xlog_receive_location")
     if conn.server_version >= 100000: # PostgreSQL 10 and higher
         sql = sql.replace('_xlog', '_wal')
         sql = sql.replace('_location', '_lsn')
@@ -273,3 +277,7 @@ def get_wal_receiver_status(conn):
         host = CONNINFO_HOST_RE.search(conn_info).groupdict().get('host', 'UNKNOWN')
         host_replication_status.append((host, status))
     return host_replication_status
+
+
+def is_in_recovery(conn):
+    return query(conn, "SELECT pg_is_in_recovery()")[0][0]
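
The string rewriting in the first hunk can be checked without a database. The sketch below reproduces the same replace steps as a pure function and prints the query text for each combination of recovery state and server version. The name build_replication_delay_sql and its parameters are hypothetical; the commit itself reads these values from the connection via is_in_recovery(conn) and conn.server_version.

```python
BASE_SQL = ("SELECT client_addr, "
            "pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS bytes_diff "
            "FROM public.pg_stat_repl")


def build_replication_delay_sql(in_recovery, server_version):
    """Return the query text after the same replacements as in the diff.

    Hypothetical helper for illustration; not part of postgresql_metrics.
    """
    sql = BASE_SQL
    if in_recovery:
        # A replica cannot call pg_current_xlog_location, so compare against
        # the last WAL position it received from its upstream instead.
        sql = sql.replace("pg_current_xlog_location", "pg_last_xlog_receive_location")
    if server_version >= 100000:  # PostgreSQL 10 and higher
        sql = sql.replace("_xlog", "_wal")
        sql = sql.replace("_location", "_lsn")
    return sql


for in_recovery in (False, True):
    for version in (90600, 100000):
        print(in_recovery, version, build_replication_delay_sql(in_recovery, version))
```

Because the recovery check runs before the version-specific renaming, the replacement name is written in the pre-10 spelling and then renamed together with the rest of the query on PostgreSQL 10 and higher.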
