Oracle RAC Database Instance Startup Failure Analysis: IPC Send timeout
2022-04-23 06:02:00 【还不算晕】
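Recently, a customer restarted the database instance on one RAC node and found that startup was very slow. At the same time, the business side reported that applications connected to the surviving RAC node were also affected.
Log analysis showed that Reconfiguration was slow while the database was starting, and that after Reconfiguration the error IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)] was reported, which led to a node eviction within the database instance group:
Wed Apr 13 19:28:02 2022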
Instance termination initiated by instance 2 with reason 1.
Instance 2 received a reconfig event from its cluster manager indicating that this instance is supposed to be down
Please check instance 2's alert log and LMON trace file for more details.
Please also examine the CSS log files.
LMON (ospid: 47523): terminating the instance due to error 481
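The two key lines in the excerpt above (the IPC Send timeout and the LMON termination) can be pulled out of an alert log mechanically. The following is a minimal sketch, assuming the messages are worded exactly as quoted in this article; the function name `parse_eviction_lines` and the regular expressions are illustrative and are not part of any Oracle tool.

```python
import re

# Patterns written against the alert-log lines quoted in this article;
# other Oracle versions may word these messages differently.
SENDER_RE = re.compile(
    r"IPC Send timeout detected\. Sender: ospid (?P<ospid>\d+) "
    r"\[(?P<user>[^@\]]+)@(?P<host>\S+) \((?P<process>[^)]+)\)\]"
)
TERMINATE_RE = re.compile(
    r"(?P<process>\w+) \(ospid: (?P<ospid>\d+)\): "
    r"terminating the instance due to error (?P<error>\d+)"
)


def parse_eviction_lines(lines):
    """Return dicts describing IPC send timeouts and instance terminations."""
    events = []
    for line in lines:
        match = SENDER_RE.search(line)
        if match:
            events.append({"kind": "ipc_send_timeout", **match.groupdict()})
            continue
        match = TERMINATE_RE.search(line)
        if match:
            events.append({"kind": "instance_termination", **match.groupdict()})
    return events


# Lines copied from the log excerpt above.
sample = [
    "IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]",
    "Instance termination initiated by instance 2 with reason 1.",
    "LMON (ospid: 47523): terminating the instance due to error 481",
]
for event in parse_eviction_lines(sample):
    print(event)
```

Run against a full alert log, this kind of filter gives a quick list of which process timed out on which host and which process terminated the instance, before reading the trace files in detail.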
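The next step is therefore to find out why the process hit the IPC Send timeout. Possible causes include a bug or heavy load on the RAC nodes; work through the system information item by item, following the troubleshooting steps in these MOS documents: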
Instance Evicted After LMON to LMON IPC Send timeout Due to Storage Issue (Doc ID 2080029.1)
"ipc send timeout" Precedes Database Instance Crash or Eviction (Doc ID 1951216.1)
While Evicting One of the Instance, the Remaining instances Terminated by LMON with "LMON is running too slowly and in the middle of reconfiguration" (Doc ID 1949505.1)
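The relevant logs are as follows:
1. At 2022-04-13 18:57:29.215 the cluster software on node 1 was restarted manually and came up successfully, and the database instance also started successfully: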
Wed Apr 13 18:58:32 2022
QMNC started with pid=100, OS id=52025
Completed: ALTER DATABASE OPEN /* db agent *//* {1:49652:2} */
2. Node 1 failed while node 2 was going through RECONFIG
-- Node 2:
Wed Apr 13 19:22:26 2022
Starting ORACLE instance (normal)
-- Node 1:
Wed Apr 13 19:28:00 2022
IPC Send timeout detected. Receiver ospid 47526 [
Wed Apr 13 19:28:00 2022
Errors in file /oracle/app/diag/rdbms/testnew/test1/trace/test1_lmd0_47526.trc:
Wed Apr 13 19:28:02 2022
Instance termination initiated by instance 2 with reason 1.
Instance 2 received a reconfig event from its cluster manager indicating that this instance is supposed to be down
Please check instance 2's alert log and LMON trace file for more details.
Please also examine the CSS log files.
LMON (ospid: 47523): terminating the instance due to error 481
System state dump requested by (instance=1, osid=47523 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/diag/rdbms/testnew/test1/trace/test1_diag_47507_20220413192802.trc
Wed Apr 13 19:28:03 2022
ORA-1092 : opitsk aborting process
Instance terminated by LMON, pid = 47523
-- Node 2:
Wed Apr 13 19:28:00 2022
IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]
Receiver: inst 1 binc 429458022 ospid 47526
IPC Send timeout to 1.0 inc 4 for msg type 65521 from opid 11
Wed Apr 13 19:28:02 2022
Communications reconfiguration: instance_number 1
Wed Apr 13 19:28:02 2022
Dumping diagnostic data in directory=[cdmp_20220413192802], requested by (instance=1, osid=47523 (LMON)), summary=[abnormal instance termination].
Reconfiguration started (old inc 4, new inc 8)
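Because the two alert logs are interleaved in time, it helps to merge them into a single timeline and measure how long node 2 had been starting before the timeout fired. The sketch below assumes the timestamp header format seen in these excerpts and an English locale for day and month names; the sample lines are an abridged copy of the logs above, and `parse_entries` is a hypothetical helper.

```python
from datetime import datetime

# Alert-log timestamp header format as it appears in this article,
# e.g. "Wed Apr 13 19:28:00 2022". Parsing %a/%b assumes an English locale.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_entries(node, lines):
    """Attach each message line to the timestamp header line preceding it."""
    entries = []
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            current = datetime.strptime(line, TS_FORMAT)
        except ValueError:
            # Not a timestamp header, so treat it as a message line.
            if current is not None:
                entries.append((current, node, line))
    return entries


# Abridged copies of the node 1 and node 2 excerpts above.
node1 = [
    "Wed Apr 13 19:28:00 2022",
    "IPC Send timeout detected. Receiver ospid 47526 [",
    "Wed Apr 13 19:28:02 2022",
    "LMON (ospid: 47523): terminating the instance due to error 481",
]
node2 = [
    "Wed Apr 13 19:22:26 2022",
    "Starting ORACLE instance (normal)",
    "Wed Apr 13 19:28:00 2022",
    "IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]",
    "Wed Apr 13 19:28:02 2022",
    "Reconfiguration started (old inc 4, new inc 8)",
]

timeline = sorted(parse_entries("node1", node1) + parse_entries("node2", node2))
for ts, node, message in timeline:
    print(ts.strftime("%H:%M:%S"), node, message)

start = next(ts for ts, _, msg in timeline if msg.startswith("Starting ORACLE instance"))
timeout = next(ts for ts, _, msg in timeline if "IPC Send timeout" in msg)
gap_seconds = (timeout - start).total_seconds()
print(f"Seconds from instance start to first IPC send timeout: {gap_seconds:.0f}")
```

With the timestamps quoted here, the gap between 19:22:26 and 19:28:00 is 334 seconds (5 minutes 34 seconds), which matches the impression that the manual startup of node 2 was very slow.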
#############################
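3. What needs to be investigated is why the IPC Send timeout occurred during Reconfiguration while node 2 was starting; this is also why the manual startup of node 2 felt so slow. In addition, node 1 reported an ORA-00240 error when it started at 19:34. The network, the storage, and the load on the nodes at that time should be checked together, with reference to the MOS documents.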
Wed Apr 13 19:34:49 2022
Errors in file /oracle/app/diag/rdbms/testnew/test1/trace/test1_dbw0_247773.trc (incident=168173):
ORA-00240: control file enqueue held for more than 120 seconds
Instance Evicted After LMON to LMON IPC Send timeout Due to Storage Issue (Doc ID 2080029.1)
"ipc send timeout" Precedes Database Instance Crash or Eviction (Doc ID 1951216.1)
While Evicting One of the Instance, the Remaining instances Terminated by LMON with "LMON is running too slowly and in the middle of reconfiguration" (Doc ID 1949505.1)
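As a starting point for the combined check described in step 3, the ORA- errors in an alert log can be tallied so that entries such as ORA-00240 stand out. This is only a sketch under the assumption that errors appear in the forms quoted in this article (`ORA-1092`, `ORA-00240`); `summarize_ora_errors` is a hypothetical helper, not an Oracle utility.

```python
import re
from collections import Counter

# Matches codes such as "ORA-00240" or "ORA-1092" as quoted in this article.
ORA_RE = re.compile(r"\bORA-(\d+)\b")
# Matches the duration reported in the ORA-00240 message text.
HELD_RE = re.compile(r"held for more than (\d+) seconds")


def summarize_ora_errors(lines):
    """Count ORA- codes, padding them to five digits (ORA-1092 -> ORA-01092)."""
    counts = Counter()
    for line in lines:
        for code in ORA_RE.findall(line):
            counts[f"ORA-{int(code):05d}"] += 1
    return counts


# Lines copied from the node 1 excerpts in this article.
sample = [
    "ORA-1092 : opitsk aborting process",
    "ORA-00240: control file enqueue held for more than 120 seconds",
]
counts = summarize_ora_errors(sample)
for code, count in sorted(counts.items()):
    print(code, count)

held = HELD_RE.search(sample[1])
if held:
    print("control file enqueue hold threshold (seconds):", int(held.group(1)))
```

Padding the codes to five digits keeps `ORA-1092` and `ORA-01092` from being counted as different errors, since the alert log prints both forms.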
Copyright notice
This article was written by [还不算晕]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/q947817003/article/details/124163828