Analysis of an abnormal Oracle RAC database instance startup: IPC Send timeout
2022-04-23 06:02:00 【还不算晕】
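Recently, a user restarting the database instance on one node of a RAC cluster found that startup was very slow. At the same time, the business side reported that applications connected to the surviving RAC node were also affected.
Analysis of the logs showed that Reconfiguration was slow while the instance was starting, and that after Reconfiguration the error "IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]" was reported, which led to a node eviction within the database instance group:
Wed Apr 13 19:28:02 2022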
Instance termination initiated by instance 2 with reason 1.
Instance 2 received a reconfig event from its cluster manager indicating that this instance is supposed to be down
Please check instance 2's alert log and LMON trace file for more details.
Please also examine the CSS log files.
LMON (ospid: 47523): terminating the instance due to error 481
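We therefore need to find out why the process hit the IPC Send timeout. The cause may be a bug, or it may be the load on the RAC nodes. Work through the troubleshooting steps in the following MOS documents and check the system information item by item: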
Instance Evicted After LMON to LMON IPC Send timeout Due to Storage Issue (Doc ID 2080029.1)
"ipc send timeout" Precedes Database Instance Crash or Eviction (Doc ID 1951216.1)
While Evicting One of the Instance, the Remaining instances Terminated by LMON with "LMON is running too slowly and in the middle of reconfiguration" (Doc ID 1949505.1)
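One network-side check that can accompany those documents is to look at the operating system's packet-loss counters on each node. The sketch below is not taken from the MOS notes. It assumes Linux hosts (the original does not state the platform) and reads /proc/net/snmp; the helper names parse_proc_net_snmp and interconnect_suspects are hypothetical. Non-zero values do not prove the interconnect caused the timeout; what matters is whether the counters grow around the time of the incident.

```python
from pathlib import Path


def parse_proc_net_snmp(text):
    """Parse the header/value line pairs of /proc/net/snmp.

    Returns {protocol: {counter_name: int_value}}.
    """
    stats = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The file alternates "Proto: name name ..." and "Proto: value value ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, _, names = header.partition(":")
        value_proto, _, numbers = values.partition(":")
        if proto != value_proto:
            continue  # unexpected layout: skip rather than guess
        stats[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


def interconnect_suspects(stats):
    """Pick out counters that often accompany packet loss (an assumption,
    not a rule from the article): IP reassembly failures and UDP errors."""
    wanted = {"Ip": ["ReasmFails"], "Udp": ["InErrors", "RcvbufErrors"]}
    return {
        f"{proto}.{name}": stats[proto][name]
        for proto, names in wanted.items()
        for name in names
        if name in stats.get(proto, {})
    }


if __name__ == "__main__":
    snmp = Path("/proc/net/snmp")
    if snmp.exists():  # Linux only
        found = interconnect_suspects(parse_proc_net_snmp(snmp.read_text()))
        for name, value in found.items():
            print(name, value)
```

Running it twice a few minutes apart on each node and comparing the values gives a rough view of whether losses are still accumulating.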
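The relevant logs are as follows:
1. At 2022-04-13 18:57:29.215 the cluster software on node 1 was restarted manually and came up successfully, and the database instance also started successfully: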
Wed Apr 13 18:58:32 2022
QMNC started with pid=100, OS id=52025
Completed: ALTER DATABASE OPEN /* db agent *//* {1:49652:2} */
2. Node 1 fails while node 2 is in RECONFIG
--Node 2:
Wed Apr 13 19:22:26 2022
Starting ORACLE instance (normal)
--Node 1:
Wed Apr 13 19:28:00 2022
IPC Send timeout detected. Receiver ospid 47526 [
Wed Apr 13 19:28:00 2022
Errors in file /oracle/app/diag/rdbms/testnew/test1/trace/test1_lmd0_47526.trc:
Wed Apr 13 19:28:02 2022
Instance termination initiated by instance 2 with reason 1.
Instance 2 received a reconfig event from its cluster manager indicating that this instance is supposed to be down
Please check instance 2's alert log and LMON trace file for more details.
Please also examine the CSS log files.
LMON (ospid: 47523): terminating the instance due to error 481
System state dump requested by (instance=1, osid=47523 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/diag/rdbms/testnew/test1/trace/test1_diag_47507_20220413192802.trc
Wed Apr 13 19:28:03 2022
ORA-1092 : opitsk aborting process
Instance terminated by LMON, pid = 47523
--Node 2:
Wed Apr 13 19:28:00 2022
IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]
Receiver: inst 1 binc 429458022 ospid 47526
IPC Send timeout to 1.0 inc 4 for msg type 65521 from opid 11
Wed Apr 13 19:28:02 2022
Communications reconfiguration: instance_number 1
Wed Apr 13 19:28:02 2022
Dumping diagnostic data in directory=[cdmp_20220413192802], requested by (instance=1, osid=47523 (LMON)), summary=[abnormal instance termination].
Reconfiguration started (old inc 4, new inc 8)
#############################
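The timestamps in the node 2 excerpt above show how long the startup had been running before the timeout was detected. A minimal sketch of that arithmetic, assuming the alert log uses the "Wed Apr 13 19:28:00 2022" timestamp style seen here and an English locale; first_event_time is a hypothetical helper, and NODE2_LOG contains only lines copied from the excerpt:

```python
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Wed Apr 13 19:28:00 2022"


def parse_ts(line):
    """Return a datetime if the line is an alert-log timestamp, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def first_event_time(log_text, needle):
    """Timestamp of the first message containing `needle`, or None."""
    current = None
    for line in log_text.splitlines():
        ts = parse_ts(line)
        if ts is not None:
            current = ts  # remember the most recent timestamp line
        elif needle in line and current is not None:
            return current
    return None


NODE2_LOG = """\
Wed Apr 13 19:22:26 2022
Starting ORACLE instance (normal)
Wed Apr 13 19:28:00 2022
IPC Send timeout detected. Sender: ospid 53884 [oracle@test2 (LMD0)]
Wed Apr 13 19:28:02 2022
Communications reconfiguration: instance_number 1
"""

start = first_event_time(NODE2_LOG, "Starting ORACLE instance")
timeout = first_event_time(NODE2_LOG, "IPC Send timeout detected")
print(timeout - start)  # prints 0:05:34
```

Node 2 had therefore been starting for 5 minutes 34 seconds when the IPC Send timeout was reported, which matches the impression that the manual startup was very slow.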
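3. We need to find out why the IPC Send timeout occurred during Reconfiguration while node 2 was starting; this is also why the manual startup of node 2 felt so slow. In addition, node 1 reported ORA-00240 when it started at 19:34. The network, the storage, and the node load at that time should all be checked together, with reference to the MOS documents cited in this article.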
Wed Apr 13 19:34:49 2022
Errors in file /oracle/app/diag/rdbms/testnew/test1/trace/test1_dbw0_247773.trc (incident=168173):
ORA-00240: control file enqueue held for more than 120 seconds
Copyright notice
This article was written by [还不算晕]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/q947817003/article/details/124163828