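In Oracle, after an ASM disk or diskgroup is dropped or dismounted, it is generally assumed that all related processes release the file descriptors (disk descriptors) for those ASM disks. In practice, however, processes are often found still holding these disk resources after the disk or diskgroup has been dropped.
The problem is mainly caused by several Oracle ASM bugs, including: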
Bug 11666137 ASM dismounted disks are still held by background processes for long time
Bug 7225720 – ASM does not close open descriptors (Doc ID 7225720.8)
Bug 11785938 – ASM 11.2.0.2 IS NOT RELEASING FILE DESCRIPTORS AFTER DROP DISKGROUP
Although these bugs are all claimed to be fixed in 11.2.0.2, the problem can still occur on 11.2.0.3.
See also:
ASM 11.2.0.2 Is Not Releasing File Descriptors After Drop or Dismount Diskgroup. (Doc ID 1306574.1)
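If the processes that fail to release the resources are foreground processes, you can work around the problem by killing them. If they are critical background processes, you can only wait for them to release the disk descriptors on their own.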
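As a rough illustration of the foreground/background distinction, the sketch below classifies a process by the command name shown in `ps -ef`. It assumes the usual Linux naming conventions (background processes appear as `asm_<name>_<instance>` or `ora_<name>_<SID>`, dedicated server processes as `oracle<SID> (LOCAL=NO)` or `oracle<SID> (DESCRIPTION=(LOCAL=YES)...)`); the function name `classify_process` and the regular expressions are hypothetical helpers, not anything provided by Oracle.

```python
import re

# Assumed naming conventions, based on typical `ps -ef` output on Linux:
#   background process: asm_<name>_<instance> or ora_<name>_<SID>
#   foreground (server) process: oracle<SID> (LOCAL=NO) or (DESCRIPTION=...)
BACKGROUND_RE = re.compile(r"^(asm|ora)_[a-z0-9]+_\S+$")
FOREGROUND_RE = re.compile(r"^oracle\S*\s+\((LOCAL=|DESCRIPTION=)")


def classify_process(cmd: str) -> str:
    """Return 'background', 'foreground', or 'unknown' for a ps command name."""
    cmd = cmd.strip()
    if BACKGROUND_RE.match(cmd):
        return "background"
    if FOREGROUND_RE.match(cmd):
        return "foreground"
    return "unknown"


if __name__ == "__main__":
    for name in ("oracleivrdvl2 (LOCAL=NO)", "asm_rbal_+ASM2", "grep 15114"):
        print(name, "->", classify_process(name))
```

Only processes classified as foreground are candidates for a kill; anything reported as background or unknown should be left alone and checked by hand.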
One SR mentions an unorthodox workaround: create a temporary diskgroup on the dropped disk and then drop that diskgroup. It is not really recommended. From the SR:

The bug # 7225720 should be included in your 11.2 version.
Also, there is a workaround of creating a dummy diskgroup. And when it is dropped, the file descriptors should be gone.
Or, bouncing the ASM instance.
Is the database 11.2 as well as the ASM? Since the patch #7225720 must be on both the database and ASM homes.
Can you please upload the ASM alert.logs and the database alert.log.
As we discussed, this will most likely require testing and a bug.
In the interim, the following information exists in Note #429296.1 (Internal note), explaining this condition. And suggests that you may want to also utilize ASMLIB for your disks;

Why aren’t file handles closed in all processes when Oracle stops using a file?

Stopping use can be due to several reasons:
 * Dropping a tablespace
 * In ASM … an ALTER DISKGROUP … DROP DISK
 * In ASM … a DROP DISKGROUP

Solution
--------
When use of a datafile by Oracle is “completed” by any of the above mentioned methods … There is no mechanism within Oracle to force all sessions holding a handle on the “completed” file to release that handle.

After long detailed discussion with development it was determined that there are only two methods that can provide a consistent release / close of file handle / descriptors:
 * via the implementation of ASMLIB (if using ASM)
 * via the implementation of ODMLIB

ODMLIB’s are written by third party software vendors like Veritas.
ASMLIB’s are written by operating system vendors and as such cannot be provided by Oracle except in the case of Linux as we provide Linux support and have access to the source code.

The only ‘manual’ method that will force the release of all file handles would be to close all processes that have that handle held.
This means all Oracle sessions in all instances and if clustered on all boxes … that are holding that handle.

The problem arises in determining which sessions hold the handle. The ‘lsof’ command can be used to determine which processes have open file handles and then it may be possible to track this back to database processes which if it is not a vital process (background processes are for the most part) it could be killed. Perhaps, the following note can help:

Article-ID: Note 787780.1
Circulation: PUBLISHED (EXTERNAL)
Title: Open Files/Open File Descriptors

If this does not release the handle then shutting down all suspect instances is the only solution.

=== ODM Research ===

Article-ID: Note 883028.1
Circulation: PUBLISHED (EXTERNAL)
Title: New Background Processes introduced by ACFS

Article-ID: Note 402526.1
Circulation: PUBLISHED (EXTERNAL)
Title: Asm Devices Are Still Held Open After Dismount or Drop

Article-ID: Note 787780.1
Circulation: PUBLISHED (EXTERNAL)
Title: Open Files/Open File Descriptors

Article-ID: Note 429296.1
Circulation: MODERATED (EXTERNAL)
Title: Why aren’t file handles closed in all processes when Oracle stops using a file?

ASMCMD> lsdsk -kp /dev/raw/raw21
(nothing)

[oracle@devrac1b ~]$ /usr/sbin/lsof | grep raw21
oracle 15114 oracle 12u CHR 162,21 25309 /dev/raw/raw21
oracle 15150 oracle 12u CHR 162,21 25309 /dev/raw/raw21
oracle 18752 oracle 77u CHR 162,21 25309 /dev/raw/raw21
oracle 18754 oracle 51u CHR 162,21 25309 /dev/raw/raw21
oracle 18760 oracle 53u CHR 162,21 25309 /dev/raw/raw21
oracle 18762 oracle 78u CHR 162,21 25309 /dev/raw/raw21

[oracle@devrac1b ~]$ ps -ef | grep 15114
oracle 2586 15351 0 15:29 pts/2 00:00:00 grep 15114
oracle 15114 1 0 May11 ? 00:00:00 oracleivrdvl2 (LOCAL=NO)

[oracle@devrac1b ~]$ ps -ef | grep 15150
oracle 3493 15351 0 15:29 pts/2 00:00:00 grep 15150
oracle 15150 1 0 May11 ? 00:00:00 oracleivrtst2 (LOCAL=NO)

[oracle@devrac1b ~]$ ps -ef | grep 18752
oracle 3536 15351 0 15:29 pts/2 00:00:00 grep 18752
oracle 18752 1 0 May11 ? 00:00:00 asm_dbw0_+ASM2

[oracle@devrac1b ~]$ ps -ef | grep 18754
oracle 3681 15351 0 15:30 pts/2 00:00:00 grep 18754
oracle 18754 1 0 May11 ? 00:00:00 asm_lgwr_+ASM2

[oracle@devrac1b ~]$ ps -ef | grep 18760
oracle 4538 15351 0 15:30 pts/2 00:00:00 grep 18760
oracle 18760 1 0 May11 ? 00:08:53 asm_rbal_+ASM2

[oracle@devrac1b ~]$ ps -ef | grep 18762
oracle 4726 15351 0 15:30 pts/2 00:00:00 grep 18762
oracle 18762 1 0 May11 ? 00:00:00 asm_gmon_+ASM2

### STEPS TO REPRODUCE ###
SQL> alter diskgroup drop disk
$ lsof

Hdr: 7225720 10.2.0.4 RDBMS 10.2.0.4 ASM PRODID-5 PORTID-197 4693355
Abstract: ASM DOES NOT CLOSE OPEN DESCRIPTORS EVEN AFTER APPLYING THE PATCH 4693355
Fixed-Releases: A2041 A205 B1071 B201 PATCH:A204.RECRAC.2 PATCH:B107.RECRAC.1

Article-ID: Note 7225720.8
Circulation: PUBLISHED (EXTERNAL)
Title: Bug 7225720 – ASM does not close open descriptors
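The manual check in the listing above (run `lsof`, filter on the device, then look up each PID) can be partly automated. The following is a minimal sketch, assuming `lsof` output in its default column layout (COMMAND, PID, USER, FD, ..., NAME) has already been captured as text; `holders_of` is a hypothetical helper name, and the sample input is taken from the listing above.

```python
from collections import defaultdict
from typing import Dict, List


def holders_of(lsof_output: str, device: str) -> Dict[int, List[str]]:
    """Map each PID to the file descriptors it holds open on `device`.

    Assumes the default lsof layout: the PID is the second column, the
    FD is the fourth, and the file name is the last column.
    """
    holders = defaultdict(list)
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything too short to parse.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        if fields[-1] != device:
            continue
        holders[int(fields[1])].append(fields[3])
    return dict(holders)


if __name__ == "__main__":
    sample = """\
oracle 15114 oracle 12u CHR 162,21 25309 /dev/raw/raw21
oracle 18752 oracle 77u CHR 162,21 25309 /dev/raw/raw21
"""
    for pid, fds in sorted(holders_of(sample, "/dev/raw/raw21").items()):
        print(pid, fds)
```

The resulting PIDs still have to be mapped to process names (for example with `ps -p <pid> -o args=`) before deciding whether any of them can safely be killed.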
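oradebug call close can be used to force a file handle to be released. Note, however, that invoking this system call directly bypasses the OPI layer and may cause unknown problems later on, so it is not an ideal solution.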