Case study: deleting a data node online from a KingbaseES V8R3 cluster
2022-04-21 16:01:00 【Warehouse database】
Case description
A KingbaseES V8R3 cluster uses a one-primary, multi-standby architecture. Two of the nodes usually act as the cluster management nodes, and every node can be a data node. A data node that is not a management node can be deleted online. A management node cannot be deleted online; removing one requires redeploying the cluster. This case uses a one-primary, two-standby cluster and walks through deleting a data node that is not a management node.
System host environment
[kingbase@node3 bin]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.7.248 node1 # cluster management node and data node
192.168.7.249 node2 # data node
192.168.7.243 node3 # cluster management node and data node
Cluster architecture
Database version
TEST=# select version();
VERSION
-------------------------------------------------------------------------------------------------------------------------
Kingbase V008R003C002B0270 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
I. Check the cluster status
Note: Before deleting a data node, make sure the cluster is healthy, including the status of every cluster node and the primary/standby streaming replication status.
Cluster node status
[kingbase@node3 bin]$ ./ksql -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0270)
Type "help" for help.
TEST=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.7.243 | 54321 | up | 0.333333 | primary | 0 | false | 0
1 | 192.168.7.248 | 54321 | up | 0.333333 | standby | 0 | true | 0
2 | 192.168.7.249 | 54321 | up | 0.333333 | standby | 0 | false | 0
(3 rows)
Primary/standby streaming replication status
TEST=# select * from sys_stat_replication;
PID | USESYSID | USENAME | APPLICATION_NAME | CLIENT_ADDR | CLIENT_HOSTNAME | CLIENT_PORT | BACKEND_START | BACKEND_XMIN | STATE | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
12316 | 10 | SYSTEM | node249 | 192.168.7.249 | | 39337 | 2021-03-01 12:59:29.003870+08 | | streaming | 0/50001E8 | 0/50001E8 | 0/50001E8 | 0/50001E8 | 3 | potential
15429 | 10 | SYSTEM | node248 | 192.168.7.248 | | 35885 | 2021-03-01 12:59:38.317605+08 | | streaming | 0/50001E8 | 0/50001E8 | 0/50001E8 | 0/50001E8 | 2 | sync
(2 rows)
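The node status check above can be scripted. The following is a minimal sketch and not part of the original procedure: `all_nodes_up` is a hypothetical helper that reads the text of `show pool_nodes;` on standard input and succeeds only if every node row reports `up`. It assumes the column order of the sample output, with the status in the fourth column.

```shell
# Hypothetical helper: succeed only when every node row in the
# "show pool_nodes" output (read from stdin) has status "up".
# Assumes columns: node_id | hostname | port | status | ...
all_nodes_up() {
    awk -F'|' '
        # Data rows are the ones whose first column is a numeric node_id.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            rows++
            status = $4
            gsub(/[ \t]/, "", status)
            if (status != "up") bad++
        }
        # Fail when no node rows were seen or any node is not "up".
        END { exit ((rows > 0 && bad == 0) ? 0 : 1) }
    '
}
```

For example, save the output of `show pool_nodes;` to a file (here a made-up name, `pool_nodes.txt`) and run `all_nodes_up < pool_nodes.txt && echo "cluster nodes OK"` before continuing with the deletion.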
II. Delete the cluster data node
1. Disable the cron job on the data node (the network_rewind.sh scheduled task)
[kingbase@node2 bin]$ cat /etc/cron.d/KINGBASECRON
#*/1 * * * * kingbase . /etc/profile;/home/kingbase/cluster/R6HA/KHA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6HA/KHA/kingbase/bin/…/etc/repmgr.conf >> /home/kingbase/cluster/R6HA/KHA/kingbase/bin/…/kbha.log 2>&1
#*/1 * * * * kingbase /home/kingbase/cluster/kha/db/bin/network_rewind.sh
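Commenting out the cron entry can be done by hand in an editor. As an illustration only, the sketch below shows a hypothetical filter, `comment_out_matching`, that prefixes `#` to every active line containing a given substring and leaves already commented lines alone. It reads the cron file on standard input and writes the result to standard output, so nothing is changed in place.

```shell
# Hypothetical helper: comment out active lines that contain the
# fixed substring given as $1 (for example "network_rewind.sh").
comment_out_matching() {
    awk -v pat="$1" '
        # Match on a plain substring and skip lines that are already comments.
        index($0, pat) > 0 && $0 !~ /^[ \t]*#/ { print "#" $0; next }
        { print }
    '
}
```

A possible use is `comment_out_matching network_rewind.sh < /etc/cron.d/KINGBASECRON > /tmp/KINGBASECRON.new`, followed by a review of the new file before it replaces the original. Files under /etc/cron.d normally require root privileges to modify.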
2. Stop the database service on the data node
[kingbase@node2 bin]$ ./sys_ctl stop -D …/data
waiting for server to shut down… done
server stopped
3. Delete the replication slot on the primary node
TEST=# select * from sys_replication_slots;
SLOT_NAME | PLUGIN | SLOT_TYPE | DATOID | DATABASE | ACTIVE | ACTIVE_PID | XMIN | CATALOG_XMIN | RESTART_LSN | CONFIRMED_FLUSH_LSN
--------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
slot_node243 | | physical | | | f | | | | |
slot_node248 | | physical | | | t | 29330 | 2076 | | 0/70000D0 |
slot_node249 | | physical | | | f | | 2076 | | 0/60001B0 |
(3 rows)
TEST=# select SYS_DROP_REPLICATION_SLOT('slot_node249');
SYS_DROP_REPLICATION_SLOT
(1 row)
TEST=# select * from sys_replication_slots;
SLOT_NAME | PLUGIN | SLOT_TYPE | DATOID | DATABASE | ACTIVE | ACTIVE_PID | XMIN | CATALOG_XMIN | RESTART_LSN | CONFIRMED_FLUSH_LSN
--------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
slot_node243 | | physical | | | f | | | | |
slot_node248 | | physical | | | t | 29330 | 2076 | | 0/70000D0 |
(2 rows)
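In the listing above, slot_node249 shows ACTIVE = f before it is dropped, which is expected once the standby has been stopped. A cautious variant is to drop the slot only while it is inactive. The sketch below is an assumption rather than the article's method: `drop_inactive_slot_sql` is a hypothetical helper that only prints such a statement, and the statement assumes that KingbaseES V8R3 accepts the same select-from-catalog form that PostgreSQL accepts for its equivalent function.

```shell
# Hypothetical helper: print a statement that drops the named slot
# only if sys_replication_slots reports it as inactive.
drop_inactive_slot_sql() {
    # Accept only letters, digits and underscores in the slot name so
    # the name can be embedded safely in the SQL text.
    case "$1" in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid slot name" >&2
            return 1
            ;;
    esac
    printf "select SYS_DROP_REPLICATION_SLOT(SLOT_NAME) from sys_replication_slots where SLOT_NAME = '%s' and not ACTIVE;\n" "$1"
}
```

Running `drop_inactive_slot_sql slot_node249` prints the statement, which can then be pasted into a ksql session on the primary node. If the slot is still active, the query selects no rows and nothing is dropped.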
4. Edit the configuration files (on all management nodes)
1) The HAmodule.conf configuration file (under db/etc and kingbasecluster/etc)
As shown below, the file lists the host names and IP addresses of all cluster nodes; the entries for the deleted node must be removed.
[kingbase@node3 etc]$ cat HAmodule.conf |grep -i all
#IP of all nodes in the cluster.example:KB_ALL_IP="(192.168.28.128 192.168.28.129 )"
KB_ALL_IP=(192.168.7.243 192.168.7.248 192.168.7.249 )
#recoord the names of all nodes.example:ALL_NODE_NAME=1 (node1 node2 node3)
ALL_NODE_NAME=(node243 node248 node249)
After the edit, the host name and IP address of the node being deleted no longer appear in these entries (shown in a figure in the original post).
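The edit to HAmodule.conf amounts to removing one item from each of the two parenthesized lists. The following sketch illustrates that with a hypothetical filter, `remove_from_list`; it is not a tool shipped with KingbaseES. It rewrites only lines that begin with the given key followed by `=(` and leaves comment lines and everything else unchanged.

```shell
# Hypothetical helper: remove value $2 from the list on lines of the
# form KEY=(a b c), where KEY is $1. Reads stdin, writes stdout.
remove_from_list() {
    awk -v key="$1" -v val="$2" '
        index($0, key "=(") == 1 {
            rest = substr($0, length(key) + 3)   # text after "KEY=("
            p = index(rest, ")")
            if (p == 0) { print; next }          # no closing paren: leave as is
            n = split(substr(rest, 1, p - 1), items, /[ \t]+/)
            out = ""
            for (i = 1; i <= n; i++)
                # Compare as strings; skip empty fields and the removed value.
                if (items[i] != "" && (items[i] "") != (val ""))
                    out = out (out == "" ? "" : " ") items[i]
            print key "=(" out ")" substr(rest, p + 1)
            next
        }
        { print }
    '
}
```

For the cluster in this case, one possible invocation is `remove_from_list KB_ALL_IP 192.168.7.249 < HAmodule.conf | remove_from_list ALL_NODE_NAME node249 > HAmodule.conf.new`. The output normalizes the spacing inside the parentheses, so compare the new file with the original before replacing it, and repeat the edit for both copies of the file on every management node.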
2) Edit the kingbasecluster configuration file
As shown below, comment out the deleted node's configuration entries in the file.
[kingbase@node1 etc]$ tail kingbasecluster.conf
backend_hostname1='192.168.7.248'
backend_port1=54321
backend_weight1=1
backend_data_directory1='/home/kingbase/cluster/kha/db/data'
# Comment out the node249 configuration entries
#backend_hostname2='192.168.7.249'
#backend_port2=54321
#backend_weight2=1
#backend_data_directory2='/home/kingbase/cluster/kha/db/data'
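The same change can be expressed as a small text filter. The sketch below is illustrative only: `comment_out_backend` is a hypothetical helper that comments out every `backend_*` setting carrying a given index. It does not renumber the remaining backends. In this case the deleted node has the highest index (2), and the article does not cover removing a node from the middle of the numbering.

```shell
# Hypothetical helper: comment out all backend_<name><index> settings
# for the backend index given as $1. Reads stdin, writes stdout.
comment_out_backend() {
    awk -v idx="$1" '
        # Matches e.g. backend_hostname2=..., but not backend_hostname12=...
        $0 ~ ("^[ \t]*backend_[a-z_]+" idx "[ \t]*=") { print "#" $0; next }
        { print }
    '
}
```

A possible use is `comment_out_backend 2 < kingbasecluster.conf > kingbasecluster.conf.new`, followed by a review of the result on each management node before it replaces the original file.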
III. Restart the cluster and test
Note: In a production environment the cluster does not need to be restarted immediately; restart it at a suitable time.
[kingbase@node3 bin]$ ./kingbase_monitor.sh restart
-----------------------------------------------------------------------
2021-03-01 13:26:44 KingbaseES automation beging…
2021-03-01 13:26:44 stop kingbasecluster [192.168.7.243] …
remove status file /home/kingbase/cluster/kha/run/kingbasecluster/kingbasecluster_status
DEL VIP NOW AT 2021-03-01 13:26:49 ON enp0s3
No VIP on my dev, nothing to do.
2021-03-01 13:26:50 Done…
2021-03-01 13:26:50 stop kingbasecluster [192.168.7.248] …
remove status file /home/kingbase/cluster/kha/run/kingbasecluster/kingbasecluster_status
DEL VIP NOW AT 2021-03-01 13:09:36 ON enp0s3
No VIP on my dev, nothing to do.
2021-03-01 13:26:55 Done…
2021-03-01 13:26:55 stop kingbase [192.168.7.243] …
set /home/kingbase/cluster/kha/db/data down now…
2021-03-01 13:27:01 Done…
2021-03-01 13:27:02 Del kingbase VIP [192.168.7.245/24] …
DEL VIP NOW AT 2021-03-01 13:27:03 ON enp0s3
execute: [/sbin/ip addr del 192.168.7.245/24 dev enp0s3]
Oprate del ip cmd end.
2021-03-01 13:27:03 Done…
2021-03-01 13:27:03 stop kingbase [192.168.7.248] …
set /home/kingbase/cluster/kha/db/data down now…
2021-03-01 13:27:06 Done…
2021-03-01 13:27:07 Del kingbase VIP [192.168.7.245/24] …
DEL VIP NOW AT 2021-03-01 13:09:47 ON enp0s3
No VIP on my dev, nothing to do.
2021-03-01 13:27:07 Done…
…
all stop…
ping trust ip 192.168.7.1 success ping times :[3], success times:[2]
ping trust ip 192.168.7.1 success ping times :[3], success times:[2]
start crontab kingbase position : [3]
Redirecting to /bin/systemctl restart crond.service
ADD VIP NOW AT 2021-03-01 13:27:17 ON enp0s3
execute: [/sbin/ip addr add 192.168.7.245/24 dev enp0s3 label enp0s3:2]
execute: /home/kingbase/cluster/kha/db/bin/arping -U 192.168.7.245 -I enp0s3 -w 1
ARPING 192.168.7.245 from 192.168.7.245 enp0s3
Sent 1 probes (1 broadcast(s))
Received 0 response(s)
start crontab kingbase position : [2]
Redirecting to /bin/systemctl restart crond.service
ping vip 192.168.7.245 success ping times :[3], success times:[3]
ping vip 192.168.7.245 success ping times :[3], success times:[2]
now,there is a synchronous standby.
wait kingbase recovery 5 sec…
start crontab kingbasecluster line number: [6]
Redirecting to /bin/systemctl restart crond.service
start crontab kingbasecluster line number: [3]
Redirecting to /bin/systemctl restart crond.service
…
all started…
…
now we check again
| ip | program| [status]
[ 192.168.7.243]| [kingbasecluster]| [active]
[ 192.168.7.248]| [kingbasecluster]| [active]
[ 192.168.7.243]| [kingbase]| [active]
[ 192.168.7.248]| [kingbase]| [active]
IV. Verify the cluster status
1. Check the streaming replication status
Primary/standby streaming replication status
[kingbase@node3 bin]$ ./ksql -U SYSTEM -W 123456 TEST
ksql (V008R003C002B0270)
Type "help" for help.
TEST=# select * from sys_stat_replication;
PID | USESYSID | USENAME | APPLICATION_NAME | CLIENT_ADDR | CLIENT_HOSTNAME | CLIENT_PORT | BACKEND_START | BACKEND_XMIN | STATE | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
29330 | 10 | SYSTEM | node248 | 192.168.7.248 | | 39484 | 2021-03-01 13:27:19.649897+08 | | streaming | 0/70000D0 | 0/70000D0 | 0/70000D0 | 0/70000D0 | 2 | sync
(1 row)
Replication slot information
TEST=# select * from sys_replication_slots;
SLOT_NAME | PLUGIN | SLOT_TYPE | DATOID | DATABASE | ACTIVE | ACTIVE_PID | XMIN | CATALOG_XMIN | RESTART_LSN | CONFIRMED_FLUSH_LSN
--------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
slot_node243 | | physical | | | f | | | | |
slot_node248 | | physical | | | t | 29330 | 2076 | | 0/70000D0 |
(2 rows)
2. Check the cluster node status
[kingbase@node3 bin]$ ./ksql -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0270)
Type "help" for help.
TEST=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.7.243 | 54321 | up | 0.500000 | primary | 0 | false | 0
1 | 192.168.7.248 | 54321 | up | 0.500000 | standby | 0 | true | 0
(2 rows)
TEST=# select * from sys_stat_replication;
PID | USESYSID | USENAME | APPLICATION_NAME | CLIENT_ADDR | CLIENT_HOSTNAME | CLIENT_PORT | BACKEND_START | BACKEND_XMIN | STATE | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
29330 | 10 | SYSTEM | node248 | 192.168.7.248 | | 39484 | 2021-03-01 13:27:19.649897+08 | | streaming | 0/70001B0 | 0/70001B0 | 0/70001B0 | 0/70001B0 | 2 | sync
(1 row)
V. Delete the installation directory on the data node
[kingbase@node2 cluster]$ rm -rf kha/
VI. Summary
1. Before deleting a cluster data node, make sure the whole cluster (node status and streaming replication) is healthy.
2. Comment out the cron scheduled task on the data node.
3. Stop the database service on the data node.
4. On the primary node, delete the replication slot of the data node.
5. Edit the configuration files on all management nodes (HAmodule.conf and kingbasecluster.conf).
6. Restart the cluster (not required immediately).
7. Verify the cluster status.
8. Delete the installation directory on the data node.
Copyright notice
This article was written by [Warehouse database]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204211545229685.html