Analysis of a case where the GIPC cluster component in a RAC environment fails to correctly identify the heartbeat network state
2022-04-23 13:42:00 【Not dizzy yet】
Recently, a cluster database node in a customer environment failed to start and could not join the cluster. The cluster version is 11.2. The cluster logs made the problem fairly clear: the cluster alert log pointed to the CSSD process log, and the CSSD log reported that there was no network heartbeat: has a disk HB, but no network HB. The problem was checked and handled in the following steps:
1. First, confirm the database heartbeat network IP from the hosts file. At the operating system level, confirm that the heartbeat NIC is in a normal state and that the nodes can reach each other over it with PING and SSH.
2. Use gpnptool get to confirm that the heartbeat network used by the cluster is the one checked in the previous step.
3. In the 11.2 cluster stack, the GIPC process is responsible for monitoring the cluster network state. The GIPC process log showed the heartbeat interface as eth1 - rank 0, which is an abnormal state (the normal state is eth1 - rank 99).
4. Step 1 had already confirmed that the heartbeat network was normal at the host level. Given how the cluster components work, the next step was to trigger the cluster to re-detect the heartbeat network state (usually by killing the GIPC process or restarting the cluster software).
5. This time, neither killing the GIPC process nor restarting the cluster software helped. After the NIC was restarted at the operating system level, the GIPC process correctly identified the NIC state and the cluster started normally.
The relevant logs are as follows :
1. GPnP heartbeat network information at the time of the failure:
[grid@nphisdb1 gpnpd]$gpnptool get
Warning: some command line parameters were defaulted. Resulting command line:
/u01/app/11.2.0/grid_1/bin/gpnptool.bin get -o-
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="4" ClusterUId="a3268b3b769cdf7dbfc43c8ffd69e87f" ClusterName="nphisdb-cluster" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.205.0" Adapter="eth0" Use="public"/><gpnp:Network id="net2" IP="10.10.10.0" Adapter="eth1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/oracleasm/disks" SPFile="+CRS/nphisdb-cluster/asmparameterfile/registry.253.1028034033"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>bjVFpM9uJREXWTWBP6GSC1A11Zw=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>UN5iBJd7mbmW8usjptRlTXtIBf05z76r+MyCNOSlXAGcsTE/zbb2BFeZkH0LMpyF5jbpQUzHE+U3wjUzZl/VsQS+y9QPeANVz1q1E9XDpfsxJwhRyhv0MNtK4/yy9xr9Y/zgTdg6dO2utm2Hy9pyCoDIrQ75gsmnZCtmPrfwR0A=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
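The profile printed by gpnptool get is a single long XML line, so the interconnect entry is easy to miss by eye. As a minimal sketch, assuming the XML line has been separated from the surrounding "Warning:" and "Success." lines, the following Python pulls out the networks marked cluster_interconnect. The function name interconnect_networks and the trimmed SAMPLE string are illustrative only; they are not part of any Oracle tool.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the gpnp prefix declared in the profile above.
GPNP_NS = {"gpnp": "http://www.grid-pnp.org/2005/11/gpnp-profile"}


def interconnect_networks(profile_xml):
    """Return adapter/subnet pairs whose Use attribute is cluster_interconnect."""
    root = ET.fromstring(profile_xml)
    found = []
    for net in root.iterfind(".//gpnp:Network", GPNP_NS):
        if net.get("Use") == "cluster_interconnect":
            found.append({"adapter": net.get("Adapter"), "subnet": net.get("IP")})
    return found


# Trimmed copy of the profile shown above (signature and other sections omitted).
SAMPLE = (
    '<gpnp:GPnP-Profile xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile">'
    '<gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*">'
    '<gpnp:Network id="net1" IP="192.168.205.0" Adapter="eth0" Use="public"/>'
    '<gpnp:Network id="net2" IP="10.10.10.0" Adapter="eth1" Use="cluster_interconnect"/>'
    '</gpnp:HostNetwork></gpnp:Network-Profile></gpnp:GPnP-Profile>'
)

if __name__ == "__main__":
    for entry in interconnect_networks(SAMPLE):
        print(entry["adapter"], entry["subnet"])
```

For the profile in this case the sketch reports eth1 on subnet 10.10.10.0, which matches the heartbeat network confirmed from the hosts file in step 1.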
2. Check the network rank value in the GIPC process log:
2022-03-20 13:30:58.580: [ CLSINET][346261248] Returning NETDATA: 1 interfaces
2022-03-20 13:30:58.580: [ CLSINET][346261248] # 0 Interface 'eth1',ip='10.10.10.1',mac='40-f2-e9-64-24-5e',mask='255.255.255.0',net='10.10.10.0',use='cluster_interconnect'
2022-03-20 13:31:00.903: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 0, avgms 30000000000.000000 [ 32 / 0 / 0 ]
2022-03-20 13:31:01.430: [GIPCDCLT][350463744] gipcdClientThread: req from local client of type gipcdmsgtypeInterfaceMetrics, endp 000000000000046d
2022-03-20 13:31:02.431: [GIPCDCLT][350463744] gipcdClientThread: req from local client of type gipcdmsgtypeInterfaceMetrics, endp 0000000000000199
2022-03-20 13:31:03.432: [GIPCDCLT][350463744] gipcdClientThread: req from local client of type gipcdmsgtypeInterfaceMetrics, endp 000000000000032e
2022-03-20 13:31:03.584: [ CLSINET][346261248] Returning NETDATA: 1 interfaces
2022-03-20 13:31:03.584: [ CLSINET][346261248] # 0 Interface 'eth1',ip='10.10.10.1',mac='40-f2-e9-64-24-5e',mask='255.255.255.0',net='10.10.10.0',use='cluster_interconnect'
2022-03-20 13:31:06.433: [GIPCDCLT][350463744] gipcdClientThread: req from local client of type gipcdmsgtypeInterfaceMetrics, endp 000000000000046d
2022-03-20 13:31:07.434: [GIPCDCLT][350463744] gipcdClientThread: req from local client of type gipcdmsgtypeInterfaceMetrics, endp 0000000000000199
3. Restarting the cluster software did not solve the problem, so the NIC was restarted.
4. Check the GIPC process log again; the rank is back to the normal value of 99:
[grid@nphisdb1 gipcd]$tail -f gipcd.log |grep rank
2022-03-20 13:38:30.626: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.143791 [ 300 / 306 / 306 ]
2022-03-20 13:39:00.634: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 0.628019 [ 204 / 207 / 207 ]
2022-03-20 13:39:30.642: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.564626 [ 153 / 147 / 147 ]
2022-03-20 13:40:00.642: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.052632 [ 119 / 114 / 114 ]
2022-03-20 13:40:30.644: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.016949 [ 121 / 118 / 118 ]
2022-03-20 13:41:00.655: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.636364 [ 115 / 110 / 110 ]
2022-03-20 13:41:30.658: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.071429 [ 117 / 112 / 112 ]
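The rank check in the gipcd.log excerpts above can also be scripted. The following is a rough sketch, not an Oracle utility: it parses gipcdMonitorSaveInfMetrics lines in the format shown in this article and lists interfaces whose rank differs from 99. Treating 99 as the healthy value is an assumption taken from the observations in this case, and the names parse_rank and unhealthy_interfaces are hypothetical.

```python
import re

# Matches the monitor lines shown above, e.g.
# "... gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.143791 [ ... ]"
RANK_LINE = re.compile(
    r"gipcdMonitorSaveInfMetrics: inf\[\s*(\d+)\]\s+(\S+)\s+-\s+rank\s+(\d+),\s+avgms\s+([\d.]+)"
)


def parse_rank(line):
    """Return interface metrics from one gipcd.log line, or None if it is not a rank line."""
    m = RANK_LINE.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group(1)),
        "interface": m.group(2),
        "rank": int(m.group(3)),
        "avgms": float(m.group(4)),
    }


def unhealthy_interfaces(lines, healthy_rank=99):
    """Return (interface, rank) pairs for rank lines that differ from healthy_rank.

    healthy_rank=99 is an assumption based on the normal state seen in this case.
    """
    result = []
    for line in lines:
        info = parse_rank(line)
        if info is not None and info["rank"] != healthy_rank:
            result.append((info["interface"], info["rank"]))
    return result


# Two lines copied from the excerpts above: one before and one after the NIC restart.
BAD = ("2022-03-20 13:31:00.903: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: "
       "inf[ 0] eth1 - rank 0, avgms 30000000000.000000 [ 32 / 0 / 0 ]")
GOOD = ("2022-03-20 13:38:30.626: [GIPCDMON][346261248] gipcdMonitorSaveInfMetrics: "
        "inf[ 0] eth1 - rank 99, avgms 1.143791 [ 300 / 306 / 306 ]")

if __name__ == "__main__":
    print(unhealthy_interfaces([BAD, GOOD]))
```

Run against the two sample lines, only the pre-restart line is reported, with eth1 at rank 0.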
Copyright notice
This article was written by [Not dizzy yet]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230601579652.html