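Handling an oversized crfclust.bdb file

Published: December 30, 2023

Symptoms

During a routine inspection, I found that on a production RAC server the /oracle filesystem had only 8.3G of free space left. Large files had to be cleaned up quickly to keep the disk from filling up and bringing the server down.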

--Check disk space
[root@rac01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_rac01-lv_root
                       50G   11G   37G  23% /
tmpfs                  64G   37G   27G  59% /dev/shm
/dev/sda1             485M   38M  422M   9% /boot
/dev/mapper/vg_rac01-lv_oracle
                       99G   86G  8.3G  92% /oracle
/dev/mapper/vg_rac01-lv_tmp
                       50G  180M   47G   1% /tmp
/dev/mapper/vg_rac01-lv_usr
                       50G  3.0G   44G   7% /usr
/dev/mapper/vg_rac01-lv_var
                       50G  582M   47G   2% /var
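
The manual check above is easy to script. What follows is only a minimal sketch and was not part of the original inspection: the function name fs_over_threshold and the 90% limit are illustrative assumptions. It reads `df -P` output on stdin (the -P option keeps each filesystem on one line, unlike the wrapped output above), so it can also be tried on saved output.

```shell
#!/usr/bin/env bash
# fs_over_threshold LIMIT
# Reads `df -P` output on stdin and prints "MOUNTPOINT USE%" for every
# filesystem whose usage is at or above LIMIT percent.
# Hypothetical helper: the name and output format are illustrative.
# Mount points that contain spaces are not handled.
fs_over_threshold() {
    local limit="$1"
    awk -v limit="$limit" '
        NR == 1 { next }                 # skip the header line
        {
            use = $5
            sub(/%$/, "", use)           # "92%" -> "92"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Example: df -P | fs_over_threshold 90
```

Run from cron, a check like this would have flagged /oracle before it dropped to 8.3G free.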

Analysis

Use the find command to locate the large files

[root@rac01 ~]# find /oracle -type f -size +1024M
/oracle/grid_home/crf/db/rac01/crfloclts.bdb
/oracle/grid_home/crf/db/rac01/crfclust.bdb
/oracle/grid_home/log/diag/tnslsnr/rac01/listener_scan1/trace/listener_scan1.log
/oracle/app/diag/rdbms/orcl/orcl1/trace/alert_orcl1.log
/oracle/app/diag/rdbms/icpsp/icpsp1/trace/alert_icpsp1.log

[root@rac01 ~]# ls -lh /oracle/grid_home/crf/db/rac01/crfloclts.bdb
-rw-r----- 1 root root 1.2G Dec 29 10:50 /oracle/grid_home/crf/db/rac01/crfloclts.bdb
You have mail in /var/spool/mail/root
[root@rac01 ~]# ls -lh /oracle/grid_home/crf/db/rac01/crfclust.bdb
-rw-r----- 1 root root 53G Dec 29 10:50 /oracle/grid_home/crf/db/rac01/crfclust.bdb
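
The find and ls steps above can be combined so that the biggest files come out first, with their sizes. This is a sketch under my own assumptions: the helper name largest_files and its arguments are made up for illustration, and the M size suffix assumes GNU find as on this Linux server. Sizes are printed in KB as reported by du.

```shell
#!/usr/bin/env bash
# largest_files DIR MIN_MB [COUNT]
# Lists regular files under DIR that are larger than MIN_MB megabytes,
# biggest first, as "SIZE_KB<TAB>PATH". Stays on one filesystem (-xdev).
# Hypothetical helper: the name and arguments are illustrative.
largest_files() {
    local dir="$1" min_mb="$2" count="${3:-10}"
    find "$dir" -xdev -type f -size "+${min_mb}M" -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -n "$count"
}

# Example: largest_files /oracle 1024 5
```

With the files listed above, crfclust.bdb would appear at the top, which points straight at the CHM repository.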

Root cause

crfclust.bdb is a Cluster Health Monitor (CHM) file. Its default size is 1G, but on some platforms and versions bugs cause it to grow far beyond that.
Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)
Bug 20186278 – crfclust.bdb Becomes Huge Size Due to Sudden Retention Change (Doc ID 20186278.8)

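Purpose of ora.crf

The ora.crf resource provides CHM. Cluster Health Monitor (CHM) is an Oracle tool that automatically collects operating system resource usage (CPU, memory, swap, processes, I/O, network, and so on). CHM collects data once per second. This data is very helpful for diagnosing node reboots, hangs, instance evictions, and performance problems in a cluster. Users can also use CHM to spot high system load or abnormal memory usage early, before they turn into more serious problems.
crfclust.bdb is the file in which the CRF service of CHM stores its data. By default it keeps data only for a limited period, so it normally does not grow excessively; its default size is 1G. On some platforms and versions, however, bugs cause it to grow much larger.
For example, in 11.2.0.4, bug 10165314 can cause the ora.crf service to generate very large files, which puts pressure on the space available under $GI_HOME. In such cases it may be necessary to delete these files or to stop ora.crf from starting automatically with OHAS.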
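
Since the file is expected to stay around 1G, a simple size check can catch runaway growth early. The sketch below is my own illustration, not an Oracle tool: the helper name file_exceeds is hypothetical, and the limit is passed in bytes. The file is readable only by root, so the check has to run as root.

```shell
#!/usr/bin/env bash
# file_exceeds FILE LIMIT_BYTES
# Returns 0 if FILE is larger than LIMIT_BYTES, 1 if it is not,
# and 2 if FILE is missing or unreadable.
# Hypothetical helper: the name and return codes are illustrative.
file_exceeds() {
    local file="$1" limit="$2" size
    [ -f "$file" ] || return 2
    # wc -c prints the size in bytes; strip padding some platforms add.
    size=$(wc -c < "$file" | tr -d '[:space:]') || return 2
    [ -n "$size" ] || return 2
    [ "$size" -gt "$limit" ]
}

# Example (path taken from the find output in this article):
#   one_gb=$((1024 * 1024 * 1024))
#   file_exceeds /oracle/grid_home/crf/db/rac01/crfclust.bdb "$one_gb" &&
#       echo "crfclust.bdb is larger than 1G"
```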

Resolution steps

Get the CHM path

--Get the Cluster Health Monitor (CHM) repository path
[grid@rac01 bin]$ /oracle/grid_home/bin/oclumon manage -get reppath
CHM Repository Path = /oracle/grid_home/crf/db/rac02
 Done

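In this production environment, the repository path returned above does not exist on the local node:
[grid@rac01 bin]$ cd /oracle/grid_home/crf/db/rac02
-bash: cd: /oracle/grid_home/crf/db/rac02: No such file or directory
In a virtual machine test environment the returned path could be opened. I did not determine the reason for the difference; one possible explanation (an unverified assumption) is that the command reports the repository path of the node hosting the CHM master logger, here rac02. The analysis continues with the local directory instead.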
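
To cope with a reported path that does not exist locally, a script can fall back to the directory named after the local host. This is only a sketch: both helper names are hypothetical, and the fallback layout GRID_HOME/crf/db/HOSTNAME is an assumption based on the directories seen in this article, not documented behavior.

```shell
#!/usr/bin/env bash
# parse_reppath
# Extracts the path from `oclumon manage -get reppath` output on stdin,
# assuming the "CHM Repository Path = ..." line format shown above.
parse_reppath() {
    sed -n 's/^CHM Repository Path = //p'
}

# resolve_chm_dir REPORTED_PATH GRID_HOME [HOST]
# Prints REPORTED_PATH if it exists, otherwise GRID_HOME/crf/db/HOST
# (assumed layout). HOST defaults to the short local hostname.
# Returns 1 if neither directory exists.
resolve_chm_dir() {
    local reported="$1" grid_home="$2" host="${3:-$(hostname -s)}"
    local fallback="$grid_home/crf/db/$host"
    if [ -d "$reported" ]; then
        printf '%s\n' "$reported"
    elif [ -d "$fallback" ]; then
        printf '%s\n' "$fallback"
    else
        echo "no CHM repository directory found" >&2
        return 1
    fi
}

# Example:
#   p=$(/oracle/grid_home/bin/oclumon manage -get reppath | parse_reppath)
#   resolve_chm_dir "$p" /oracle/grid_home
```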

[grid@rac01 bin]$ cd  /oracle/grid_home/crf/db/rac01
[root@wldb01 wldb01]# du -sh
58G     .
[grid@rac01 wldb01]# ls -lhtr
total 58G
-rw-r-----. 1 root root  16M Dec 30 14:35 log.0000047847
-rw-r-----. 1 root root 8.0K Dec 30 14:35 repdhosts.bdb
-rw-r-----. 1 root root  24K Dec 30 14:36 __db.001
-rw-r--r--. 1 root root 115M Dec 30 14:36 wldb01.ldb
-rw-r-----. 1 root root 8.0K Dec 30 14:36 crfconn.bdb
-rw-r-----. 1 root root 329M Dec 30 14:36 crfts.bdb
-rw-r-----. 1 root root 508M Dec 30 14:36 crfloclts.bdb
-rw-r-----. 1 root root  54G Dec 30 14:35 crfclust.bdb
-rw-r-----. 1 root root 392K Dec 30 14:35 __db.002
-rw-r-----. 1 root root  16M Dec 30 14:36 log.0000047848
-rw-r-----. 1 root root 504M Dec 30 14:36 crfhosts.bdb
-rw-r-----. 1 root root 650M Dec 30 14:36 crfcpu.bdb
-rw-r-----. 1 root root 534M Dec 30 14:36 crfalert.bdb
-rw-r-----. 1 root root  56K Dec 30 14:36 __db.006
-rw-r-----. 1 root root 1.2M Dec 30 14:36 __db.005
-rw-r-----. 1 root root 2.1M Dec 30 14:36 __db.004
-rw-r-----. 1 root root 2.6M Dec 30 14:36 __db.003

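Clean up the bdb files

Clean up the two nodes one after the other: finish node 1 first, then do node 2.

--Show the status of all cluster resources. Initialization resources such as ora.cssd, ora.ctssd, and ora.diskmon are not shown.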
[root@rac01 rac01]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       rac01                                        
               ONLINE  ONLINE       rac02                                        
ora.DATA.dg
               ONLINE  ONLINE       rac01                                        
               ONLINE  ONLINE       rac02                                        
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac01                                        
               ONLINE  ONLINE       rac02                                        
ora.asm
               ONLINE  ONLINE       rac01                    Started             
               ONLINE  ONLINE       rac02                    Started             
ora.gsd
               OFFLINE OFFLINE      rac01                                        
               OFFLINE OFFLINE      rac02                                        
ora.net1.network
               ONLINE  ONLINE       rac01                                        
               ONLINE  ONLINE       rac02                                        
ora.ons
               ONLINE  ONLINE       rac01                                        
               ONLINE  ONLINE       rac02                                        
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac02                                        
ora.cvu
      1        ONLINE  ONLINE       rac02                                        
ora.icpsp.db
      1        ONLINE  ONLINE       rac01                    Open                
      2        ONLINE  ONLINE       rac02                    Open                
ora.oc4j
      1        ONLINE  ONLINE       rac02                                        
ora.orcl.db
      1        ONLINE  ONLINE       rac01                    Open                
      2        ONLINE  ONLINE       rac02                    Open                
ora.rac01.vip
      1        ONLINE  ONLINE       rac01                                        
ora.rac02.vip
      1        ONLINE  ONLINE       rac02                                        
ora.scan1.vip
      1        ONLINE  ONLINE       rac02    
      


--Daemon status
-init: this option shows the status of the initialization resources, which typically include base resources such as ora.cssd, ora.ctssd, and ora.diskmon.

[root@rac01 rac01]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac01                    Started             
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac01                                        
ora.crf
      1        ONLINE  ONLINE       rac01                                        
ora.crsd
      1        ONLINE  ONLINE       rac01                                        
ora.cssd
      1        ONLINE  ONLINE       rac01                                        
ora.cssdmonitor
      1        ONLINE  ONLINE       rac01                                        
ora.ctssd
      1        ONLINE  ONLINE       rac01                    OBSERVER            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       rac01                                        
ora.gipcd
      1        ONLINE  ONLINE       rac01                                        
ora.gpnpd
      1        ONLINE  ONLINE       rac01                                        
ora.mdnsd
      1        ONLINE  ONLINE       rac01      
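
For scripting, the table above can be reduced to the state of a single resource. The following is a rough sketch that relies only on the layout visible in these listings (resource name on its own line, instance lines beneath it); the helper name resource_state is hypothetical, and it reports the first instance line only.

```shell
#!/usr/bin/env bash
# resource_state NAME
# Reads `crsctl status res -t` (optionally with -init) output on stdin
# and prints "TARGET STATE" for the first instance of resource NAME.
# Returns 1 if NAME is not listed.
# Hypothetical helper, written against the table layout shown above.
resource_state() {
    local name="$1"
    awk -v name="$name" '
        $1 == name && NF == 1 { found = 1; next }   # line holding only the name
        found {
            if ($1 ~ /^[0-9]+$/) print $2, $3       # "1 ONLINE ONLINE rac01"
            else                 print $1, $2       # "ONLINE ONLINE rac01"
            printed = 1
            exit
        }
        END { exit printed ? 0 : 1 }'
}

# Example: crsctl status res -t -init | resource_state ora.crf
```

Fed with the output of the plain `crsctl status res -t`, the same call returns 1 for ora.crf, because that listing does not include initialization resources.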

      
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'wldb01'
CRS-2677: Stop of 'ora.crf' on 'wldb01' succeeded
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       wldb01                   Started             
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       wldb01                                       
ora.crf
      1        OFFLINE OFFLINE                                                   
ora.crsd
      1        ONLINE  ONLINE       wldb01                                       
ora.cssd
      1        ONLINE  ONLINE       wldb01                                       
ora.cssdmonitor
      1        ONLINE  ONLINE       wldb01                                       
ora.ctssd
      1        ONLINE  ONLINE       wldb01                   ACTIVE:0            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  ONLINE       wldb01                                       
ora.evmd
      1        ONLINE  ONLINE       wldb01                                       
ora.gipcd
      1        ONLINE  ONLINE       wldb01                                       
ora.gpnpd
      1        ONLINE  ONLINE       wldb01                                       
ora.mdnsd
      1        ONLINE  ONLINE       wldb01       

--Delete the file
[root@wldb01 wldb01]# rm -rf crfclust.bdb
or
[root@wldb01 wldb01]# rm -rf *.bdb

--Check disk space
[root@rac01 rac01]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_rac01-lv_root
                       50G   11G   37G  23% /
tmpfs                  64G   37G   27G  58% /dev/shm
/dev/sda1             485M   38M  422M   9% /boot
/dev/mapper/vg_rac01-lv_oracle
                       99G   33G   61G  35% /oracle
/dev/mapper/vg_rac01-lv_tmp
                       50G  180M   47G   1% /tmp
/dev/mapper/vg_rac01-lv_usr
                       50G  3.0G   44G   7% /usr
/dev/mapper/vg_rac01-lv_var
                       50G  582M   47G   2% /var
                       
[root@rac01 rac01]# du -sh 
4.9G

--Start the crf service
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'wldb01'
CRS-2676: Start of 'ora.crf' on 'wldb01' succeeded
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       wldb01                   Started             
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       wldb01                                       
ora.crf
      1        ONLINE  ONLINE       wldb01                                       
ora.crsd
      1        ONLINE  ONLINE       wldb01                                       
ora.cssd
      1        ONLINE  ONLINE       wldb01                                       
ora.cssdmonitor
      1        ONLINE  ONLINE       wldb01                                       
ora.ctssd
      1        ONLINE  ONLINE       wldb01                   ACTIVE:0            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  ONLINE       wldb01                                       
ora.evmd
      1        ONLINE  ONLINE       wldb01                                       
ora.gipcd
      1        ONLINE  ONLINE       wldb01                                       
ora.gpnpd
      1        ONLINE  ONLINE       wldb01                                       
ora.mdnsd
      1        ONLINE  ONLINE       wldb01   
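
The stop, delete, start sequence above can be wrapped in one function so that it runs the same way on each node. This is a sketch rather than a tested production script: the function name cleanup_chm_bdb and the DRY_RUN switch are my own additions, and by default it only prints the commands it would run. Only the crsctl and rm commands already used in this article appear in it.

```shell
#!/usr/bin/env bash
# cleanup_chm_bdb GRID_HOME REPO_DIR
# Stops ora.crf, removes the .bdb files in REPO_DIR, then starts ora.crf.
# With DRY_RUN=1 (the default here) the commands are only printed.
# Hypothetical wrapper; run it as root, on one node at a time.
cleanup_chm_bdb() {
    local grid_home="$1" repo_dir="$2"
    local run=""
    [ "${DRY_RUN:-1}" = "1" ] && run="echo"

    # Refuse to touch a directory that does not look like a CHM repository.
    if [ ! -f "$repo_dir/crfclust.bdb" ]; then
        echo "crfclust.bdb not found in $repo_dir" >&2
        return 1
    fi

    $run "$grid_home/bin/crsctl" stop res ora.crf -init || return 1
    $run rm -f "$repo_dir"/*.bdb
    $run "$grid_home/bin/crsctl" start res ora.crf -init
}

# Example (prints the three commands without running them):
#   cleanup_chm_bdb /oracle/grid_home /oracle/grid_home/crf/db/rac01
# To execute for real:
#   DRY_RUN=0 cleanup_chm_bdb /oracle/grid_home /oracle/grid_home/crf/db/rac01
```

Defaulting to a dry run is deliberate: the paths differ between environments, as the mixed prompts in this article show, so it is worth reading the printed commands before letting them run.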

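If you would rather skip these steps, you can also leave the service running and delete the files directly; crf recreates them automatically (this worked without problems when I tried it, but stopping the service first is still recommended to avoid surprises).

rm -f *.bdb

If you are sure the service is not needed, it can be disabled.

crsctl modify resource "ora.crf" -attr "AUTO_START=0" -init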
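
Before and after changing an attribute, it helps to read back its current value. The sketch below assumes that the resource profile is printed as NAME=value lines, which is how `crsctl status resource ora.crf -p -init` presents it as far as I know; the helper name resource_attr is hypothetical.

```shell
#!/usr/bin/env bash
# resource_attr ATTR
# Reads NAME=value lines on stdin (assumed format of the resource
# profile) and prints the value of ATTR. Returns 1 if ATTR is absent.
# Hypothetical helper: the name is illustrative.
resource_attr() {
    local attr="$1"
    awk -F= -v attr="$attr" '
        $1 == attr {
            print substr($0, length(attr) + 2)   # everything after "ATTR="
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }'
}

# Example:
#   crsctl status resource ora.crf -p -init | resource_attr AUTO_START
```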

Open question

--Problem
The ora.crf resource did not appear in the resource listing.

--Cause
A gap in my fundamentals: I had not understood the difference between the two commands crsctl status res -t and crsctl status res -t -init.

--Resolution
crsctl status res -t shows the status of all cluster resources but leaves out the initialization resources such as ora.cssd, ora.ctssd, and ora.diskmon. Adding -init shows the initialization (daemon) resources instead, and ora.crf is one of them. The two listings shown earlier in the cleanup steps illustrate this: ora.crf is absent from the first and ONLINE on rac01 in the second.

https://www.xifenfei.com/2017/03/high-space-usage-crfclust-bdb.html
https://blog.csdn.net/weixin_43700866/article/details/114382015

Source: https://blog.csdn.net/qq961573863/article/details/135306427