数据库学习日常案例20231219-19C集群ORA-27301 ORA-27300分析处理

发布时间：2023年12月21日

1 问题概述

今日对数据库检查发现如下错误

Errors in file /u01/app/oracle/diag/rdbms/orcldb/orcldb1/incident/incdir_4510778/orcldb1_m004_322434_i4510778.trc:
ORA-00313: open failed for members of log group 6 of thread 1
ORA-00312: online log 6 thread 1: '+REDO/orcldb/ONLINELOG/group_6.266.1070296359'
ORA-17503: ksfdopn:2 Failed to open file +REDO/orcldb/ONLINELOG/group_6.266.1070296359
ORA-15064: communication failure with ASM instance
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:sendmsg failed with status: 105
ORA-27301: OS failure message: No buffer space available
ORA-27302: failure occurred at: sskgxpsnd2
2023-12-19T03:03:32.788217+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldb/orcldb1/incident/incdir_4510778/orcldb1_m004_322434_i4510778.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 7 thread 1: '+DGDATA4/orcldb/ONLINELOG/group_7.262.1066409025'
ORA-17503: ksfdopn:2 Failed to open file +DGDATA4/orcldb/ONLINELOG/group_7.262.1066409025
ORA-15064: communication failure with ASM instance
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:sendmsg failed with status: 105
ORA-27301: OS failure message: No buffer space available
ORA-27302: failure occurred at: sskgxpsnd2
2023-12-19T03:03:42.795004+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldb/orcldb1/incident/incdir_4510778/orcldb1_m004_322434_i4510778.trc:
ORA-00313: open failed for members of log group 8 of thread 2
ORA-00312: online log 8 thread 2: '+REDO/orcldb/ONLINELOG/group_8.267.1070296539'
ORA-17503: ksfdopn:2 Failed to open file +REDO/orcldb/ONLINELOG/group_8.267.1070296539
ORA-15064: communication failure with ASM instance
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:sendmsg failed with status: 105
ORA-27301: OS failure message: No buffer space available
ORA-27302: failure occurred at: sskgxpsnd2
2023-12-19T03:03:52.801743+08:00
?

2 数据库numa和mtu值信息

[root@orcl01 ~]# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
node 0 size: 195001 MB
node 0 free: 1759 MB
node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
node 1 size: 196608 MB
node 1 free: 629 MB
node 2 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
node 2 size: 131072 MB
node 2 free: 613 MB
node 3 cpus: 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
node 3 size: 131072 MB
node 3 free: 465 MB
node distances:
node ? 0 ? 1 ? 2 ? 3?
? 0: ?10 ?21 ?21 ?21?
? 1: ?21 ?10 ?21 ?21?
? 2: ?21 ?21 ?10 ?21?
? 3: ?21 ?21 ?21 ?10?
[root@orcl01 ~]# ip ad sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
? ? link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
? ? inet 127.0.0.1/8 scope host lo
? ? ? ?valid_lft forever preferred_lft forever
? ? inet6 ::1/128 scope host?
? ? ? ?valid_lft forever preferred_lft forever
2: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
? ? link/ether c4:b8:b4:2e:b6:f0 brd ff:ff:ff:ff:ff:ff
3: ens3f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
? ? link/ether 34:0a:98:9c:ed:37 brd ff:ff:ff:ff:ff:ff
? ? inet 10.2.0.101/24 brd 10.2.0.255 scope global ens3f0
? ? ? ?valid_lft forever preferred_lft forever
? ? inet 169.254.22.17/19 brd 169.254.31.255 scope global ens3f0:1
? ? ? ?valid_lft forever preferred_lft forever
? ? inet6 fe80::360a:98ff:fe9c:ed37/64 scope link?
? ? ? ?valid_lft forever preferred_lft forever
4: ens3f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
? ? link/ether 34:0a:98:9c:ed:38 brd ff:ff:ff:ff:ff:ff
5: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
? ? link/ether c4:b8:b4:2e:b6:f1 brd ff:ff:ff:ff:ff:ff
? ? inet 10.228.224.4/26 brd 10.228.224.63 scope global eno3
? ? ? ?valid_lft forever preferred_lft forever
? ? inet 10.228.224.10/26 brd 10.228.224.63 scope global secondary eno3:1
? ? ? ?valid_lft forever preferred_lft forever
? ? inet 10.228.224.12/26 brd 10.228.224.63 scope global secondary eno3:2
? ? ? ?valid_lft forever preferred_lft forever
? ? inet6 2409:8002:5a06:120:10:0:2:d00c/116 scope global deprecated?
? ? ? ?valid_lft forever preferred_lft 0sec
? ? inet6 2409:8002:5a06:120:10:0:2:d00a/116 scope global deprecated?
? ? ? ?valid_lft forever preferred_lft 0sec
? ? inet6 2409:8002:5a06:120:10:0:2:d004/116 scope global?
? ? ? ?valid_lft forever preferred_lft forever
? ? inet6 fe80::c6b8:b4ff:fe2e:b6f1/64 scope link?
? ? ? ?valid_lft forever preferred_lft forever
6: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
? ? link/ether c4:b8:b4:2e:b6:f2 brd ff:ff:ff:ff:ff:ff
7: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
? ? link/ether c4:b8:b4:2e:b6:f3 brd ff:ff:ff:ff:ff:ff
8: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
? ? link/ether 52:54:00:76:94:f3 brd ff:ff:ff:ff:ff:ff
? ? inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
? ? ? ?valid_lft forever preferred_lft forever
9: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
? ? link/ether 52:54:00:76:94:f3 brd ff:ff:ff:ff:ff:ff
[root@orcl01 ~]#?

3 问题分析处理

很早以前遇到过IPC这样的问题无数，

这个就是咱们现在大内存的主机下，缓冲区碎片整理单元有点小，并且回环地址的mtu值有问题。我们按照官方MOS文档指示进行更改即可；

On servers with High Physical Memory, the parameter vm.min_free_kbytes should be set in the order of 0.4% of total Physical Memory. This helps in keeping a larger range of defragmented memory pages available for network buffers reducing the probability of a low-buffer-space conditions.
*** For example, on a server which is having 256GB RAM, the parameter vm.min_free_kbytes should be set to 1073742 ***

On NUMA Enabled Systems, the value of vm.min_free_kbytes should be multiplied by the number of NUMA nodes since the value is to be split across all the nodes.

On NUMA Enabled Systems, the value of vm.min_free_kbytes = n * 0.4% of total Physical Memory. Here ‘n’ is the number of NUMA nodes.

Additionally, the MTU value should be modified as below
#ifconfig lo mtu 16436

To make the change persistent over reboot add the following line in the file /etc/sysconfig/network-scripts/ifcfg-lo :

MTU=16436
Save the file and restart the network service to load the changes

#service network restart

Note : While making the changes in CRS nodes, if network is restarted while CRS is up, it can hung CRS. So cluster services should be stopped prior to the network restart.

数据库日常数据库检查发现报错如下：

?Database Alert log captures below:

mtype: 61 process 14152 failed because of a resource problem in the OS. The OS has most likely run out of buffers (rval: 4)
Errors in file /oracle/admin/diag/rdbms/<DB>/<SID>/trace/<SID>_ora_14152.trc ?(incident=249601):
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:sendmsg failed with status: 105
ORA-27301: OS failure message: No buffer space available
ORA-27302: failure occurred at: sskgxpsnd2
Incident details in: /oracle/admin/diag/rdbms/<DB>/<SID>/incident/incdir_249601/<SID>_ora_14152_i249601.trc

Network communication issue and High Memory consumption would be observed during this time.
Server is running with UEK3 kernel

This happens due to less space available for network buffer reservation.根据查询数据库metalink发现需要对主机参数进行调整优化。 ???????

文章来源:https://blog.csdn.net/oradbm/article/details/135078213
本文来自互联网用户投稿，该文观点仅代表作者本人，不代表本站立场。本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如若内容造成侵权/违法违规/事实不符，请联系我的编程经验分享网邮箱：chenni525@qq.com进行投诉反馈，一经查实，立即删除！