False alarms in Postgres master-slave replication delay monitoring
2022-04-23 07:05:00 【A sunny afternoon】
When monitoring a Postgres database with Prometheus, we received alerts about master-slave replication delay even though the database was actually healthy. The exporter we use is https://github.com/prometheus-community/postgres_exporter, and the alert expression is:
pg_replication_lag > 300
The exporter describes this metric as follows:
# HELP pg_replication_lag Replication lag behind master in seconds
# TYPE pg_replication_lag gauge
pg_replication_lag{server=""}
In fact, this metric measures how long it has been since the standby last replayed a transaction. From https://github.com/prometheus-community/postgres_exporter/blob/master/queries.yaml we can see that the SQL behind this metric is:
SELECT
CASE
WHEN NOT pg_is_in_recovery() THEN 0
ELSE GREATEST(0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
END
AS LAG
The official documentation describes the two functions as follows:
Name | Return Type | Description |
---|---|---|
pg_is_in_recovery() | bool | True if recovery is still in progress. |
pg_last_xact_replay_timestamp() | timestamp with time zone | Get time stamp of last transaction replayed during recovery. This is the time at which the commit or abort WAL record for that transaction was generated on the primary. If no transactions have been replayed during recovery, this function returns NULL. Otherwise, if recovery is still in progress this will increase monotonically. If recovery has completed then this value will remain static at the value of the last transaction applied during that recovery. When the server has been started normally without recovery the function returns NULL. |
The result of this SQL is: 0 on the primary, and on a standby the difference between the current time and the timestamp of the last replayed transaction.
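The behavior of the CASE expression can be mirrored in a few lines of Python. This is only an illustrative sketch, not exporter code: the function name `replication_lag_seconds` and its parameters are made up for this example. It assumes the PostgreSQL rule that GREATEST ignores NULL arguments, so a standby that has not replayed anything yet reports 0.

```python
from datetime import datetime, timezone
from typing import Optional


def replication_lag_seconds(
    in_recovery: bool, now: datetime, last_replay: Optional[datetime]
) -> float:
    """Illustrative mirror of the exporter's CASE expression (hypothetical helper)."""
    if not in_recovery:
        # Primary: WHEN NOT pg_is_in_recovery() THEN 0
        return 0.0
    if last_replay is None:
        # pg_last_xact_replay_timestamp() is NULL; GREATEST(0, NULL) yields 0
        return 0.0
    # Standby: GREATEST(0, now() - last replayed transaction timestamp)
    return max(0.0, (now - last_replay).total_seconds())


now = datetime(2022, 4, 23, 7, 5, 0, tzinfo=timezone.utc)
last = datetime(2022, 4, 23, 6, 55, 0, tzinfo=timezone.utc)
print(replication_lag_seconds(True, now, last))   # 600.0
print(replication_lag_seconds(False, now, last))  # 0.0
```

Note that the value depends only on the clock and on the last replayed transaction; it contains no information about whether the standby is actually behind the primary.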
So there is a special case: if no transactions are committed on the primary, the value of pg_last_xact_replay_timestamp() stays unchanged and the corresponding pg_replication_lag keeps increasing, but this does not mean that master-slave replication has failed.
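To make this special case concrete, here is a small simulation, written under simplifying assumptions: the standby is perfectly healthy and replays every commit instantly, and times are plain seconds. The function `simulate_lag` and the sample values are hypothetical and exist only for this illustration.

```python
from bisect import bisect_right
from typing import List


def simulate_lag(commit_times: List[float], sample_times: List[float]) -> List[float]:
    """Metric value at each sample time, assuming every commit is replayed instantly."""
    commits = sorted(commit_times)
    lags = []
    for t in sample_times:
        idx = bisect_right(commits, t)  # number of commits at or before t
        if idx == 0:
            lags.append(0.0)  # nothing replayed yet: the metric reports 0
        else:
            lags.append(float(t - commits[idx - 1]))
    return lags


samples = [90, 330, 630]
# Idle primary: one commit at t=0, then nothing.
idle = simulate_lag([0], samples)
# Active primary: one commit every 60 seconds.
active = simulate_lag(list(range(0, 601, 60)), samples)

print(idle)    # [90.0, 330.0, 630.0]
print(active)  # [30.0, 30.0, 30.0]
```

With an idle primary the reported value crosses the 300-second alert threshold although replication is fully caught up, while regular commits keep it below the commit interval.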
Therefore, to avoid false alarms, we can create a test table on the primary and update a row in it every minute to keep the database active. That way, if an alert does fire, it really indicates a serious fault in the database. The steps are as follows:
psql (11.7)
Type "help" for help.
postgres=# CREATE DATABASE test;
postgres=# \c test
postgres=# CREATE TABLE test(id INT PRIMARY KEY NOT NULL DEFAULT 1, time varchar(255));
postgres=# INSERT INTO "public"."test"("time") VALUES ('123');
Configure scheduled tasks :
# crontab -e
* * * * * time=`date`;/usr/pgsql-11/bin/psql -h localhost -p 18083 -d test -c "UPDATE public.test SET time = '${time}'"
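The same heartbeat could also be issued from a small script instead of an inline cron command, which avoids shell quoting issues (for example, a literal `%` in a crontab command has special meaning and must be escaped). The sketch below is an assumption-laden example: the helper names `build_heartbeat_command` and `run_heartbeat` are hypothetical, and the psql path, host, port, database, and table are simply the values used above.

```python
import subprocess
from datetime import datetime, timezone
from typing import List

PSQL = "/usr/pgsql-11/bin/psql"  # path taken from the crontab entry above


def build_heartbeat_command(
    now: datetime, host: str = "localhost", port: int = 18083, dbname: str = "test"
) -> List[str]:
    """Build the psql argument list for one heartbeat UPDATE (hypothetical helper)."""
    # A fixed timestamp format contains no quotes, so it is safe inside the SQL literal.
    stamp = now.strftime("%Y-%m-%d %H:%M:%S%z")
    sql = f"UPDATE public.test SET time = '{stamp}'"
    return [PSQL, "-h", host, "-p", str(port), "-d", dbname, "-c", sql]


def run_heartbeat() -> None:
    """Run one heartbeat; raises CalledProcessError if psql exits non-zero."""
    cmd = build_heartbeat_command(datetime.now(timezone.utc))
    subprocess.run(cmd, check=True)


example = build_heartbeat_command(datetime(2022, 4, 23, 7, 5, 0, tzinfo=timezone.utc))
print(example[-1])  # UPDATE public.test SET time = '2022-04-23 07:05:00+0000'
```

Cron would then call a script that invokes `run_heartbeat()` once per minute; passing the arguments as a list means no shell is involved in building the command.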
Copyright notice
This article was written by [A sunny afternoon]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230600558247.html