可能很多朋友都遇到过这样的情况,在UNIX/Linux上定期清理Oracle日志文件夹时可能删除到仍被后台进程open着的trace文件,即某些后台进程一直持有着这些”被已经误删了的“打开文件的描述符(fd),这种情况下文件系统上该文件实际占用的空间是不会被释放的,这就造成使用df命令查看文件系统剩余空间和用du命令查看文件夹空间使用量时数值不一致的问题。此外因为是后台进程持有这些打开文件描述符,所以我们无法像kill服务进程一样来解决该问题(部分后台进程是可以kill的,不建议这样做)。oradebug是sqlplus中威力强大的debug命令,我们可以通过该命令发起多种trace/dump,其中也包括了close_trace事件;close_trace事件可以让指定进程关闭其正持有的trace文件。
下面我们就来演示下相关操作:
[maclean@rh2 ~]$ ps -ef|grep ora_|grep -v grep maclean 7281 1 0 16:35 ? 00:00:00 ora_pmon_PROD maclean 7283 1 0 16:35 ? 00:00:00 ora_psp0_PROD maclean 7285 1 0 16:35 ? 00:00:00 ora_mman_PROD maclean 7287 1 0 16:35 ? 00:00:00 ora_dbw0_PROD maclean 7289 1 0 16:35 ? 00:00:00 ora_lgwr_PROD maclean 7291 1 0 16:35 ? 00:00:00 ora_ckpt_PROD maclean 7293 1 0 16:35 ? 00:00:00 ora_smon_PROD maclean 7295 1 0 16:35 ? 00:00:00 ora_reco_PROD maclean 7297 1 0 16:35 ? 00:00:00 ora_cjq0_PROD maclean 7299 1 0 16:35 ? 00:00:00 ora_mmon_PROD maclean 7301 1 0 16:35 ? 00:00:00 ora_mmnl_PROD maclean 7303 1 0 16:35 ? 00:00:00 ora_d000_PROD maclean 7305 1 0 16:35 ? 00:00:00 ora_s000_PROD maclean 7313 1 0 16:35 ? 00:00:00 ora_qmnc_PROD maclean 7430 1 0 16:35 ? 00:00:00 ora_q000_PROD maclean 7438 1 0 16:36 ? 00:00:00 ora_q001_PROD /* lgwr是著名的Oracle后台进程,在这个启动的实例中其系统进程号为7289*/ [maclean@rh2 ~]$ ls -l /proc/7289/fd /* linux上的proc文件系统可以很方便我们探测进程信息*/ total 0 lr-x------ 1 maclean oinstall 64 Jul 26 16:38 0 -> /dev/null lr-x------ 1 maclean oinstall 64 Jul 26 16:38 1 -> /dev/null lr-x------ 1 maclean oinstall 64 Jul 26 16:38 10 -> /dev/zero lr-x------ 1 maclean oinstall 64 Jul 26 16:38 11 -> /dev/zero lr-x------ 1 maclean oinstall 64 Jul 26 16:38 12 -> /s01/rac10g/rdbms/mesg/oraus.msb lrwx------ 1 maclean oinstall 64 Jul 26 16:38 13 -> /s01/rac10g/dbs/hc_PROD.dat lrwx------ 1 maclean oinstall 64 Jul 26 16:38 14 -> /s01/rac10g/dbs/lkPROD lrwx------ 1 maclean oinstall 64 Jul 26 16:38 15 -> /s01/rac10g/oradata/PROD/controlfile/o1_mf_64q6xphj_.ctl lrwx------ 1 maclean oinstall 64 Jul 26 16:38 16 -> /s01/rac10g/flash_recovery_area/PROD/controlfile/o1_mf_64q6xpms_.ctl lrwx------ 1 maclean oinstall 64 Jul 26 16:38 17 -> /s01/rac10g/oradata/PROD/onlinelog/o1_mf_1_64q6xrsr_.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 18 -> /s01/rac10g/flash_recovery_area/PROD/onlinelog/o1_mf_1_64q6xsoy_.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 19 -> /s01/rac10g/oradata/PROD/onlinelog/o1_mf_2_64q6xths_.log l-wx------ 1 maclean oinstall 64 Jul 26 16:38 2 -> /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc lrwx------ 1 maclean oinstall 64 Jul 26 16:38 20 -> /s01/rac10g/flash_recovery_area/PROD/onlinelog/o1_mf_2_64q6xv9o_.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 21 -> /s01/rac10g/oradata/PROD/onlinelog/o1_mf_3_64q6xw1b_.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 22 -> /s01/rac10g/flash_recovery_area/PROD/onlinelog/o1_mf_3_64q6xwv0_.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 23 -> /s01/rac10g/oradata/PROD/datafile/o1_mf_system_64q6wd5j_.dbf lrwx------ 1 maclean oinstall 64 Jul 26 16:38 24 -> /s01/rac10g/oradata/PROD/datafile/o1_mf_undotbs1_64q6wd7f_.dbf lrwx------ 1 maclean oinstall 64 Jul 26 16:38 25 -> /s01/rac10g/oradata/PROD/datafile/o1_mf_sysaux_64q6wd5m_.dbf lrwx------ 1 maclean oinstall 64 Jul 26 16:38 26 -> /s01/rac10g/oradata/PROD/datafile/o1_mf_users_64q6wd89_.dbf lrwx------ 1 maclean oinstall 64 Jul 26 16:38 27 -> /s01/rac10g/oradata/PROD/datafile/o1_mf_temp_64q6xyox_.tmp lr-x------ 1 maclean oinstall 64 Jul 26 16:38 28 -> /s01/rac10g/rdbms/mesg/oraus.msb lr-x------ 1 maclean oinstall 64 Jul 26 16:38 3 -> /dev/null lr-x------ 1 maclean oinstall 64 Jul 26 16:38 4 -> /dev/null l-wx------ 1 maclean oinstall 64 Jul 26 16:38 5 -> /s01/rac10g/admin/PROD/udump/prod_ora_7279.trc l-wx------ 1 maclean oinstall 64 Jul 26 16:38 6 -> /s01/rac10g/admin/PROD/bdump/alert_PROD.log lrwx------ 1 maclean oinstall 64 Jul 26 16:38 7 -> /s01/rac10g/dbs/lkinstPROD (deleted) lrwx------ 1 maclean oinstall 64 Jul 26 16:38 8 -> /s01/rac10g/dbs/hc_PROD.dat l-wx------ 1 maclean oinstall 64 Jul 26 16:38 9 -> /s01/rac10g/admin/PROD/bdump/alert_PROD.log /*可以看到lgwr进程相关trace文件为/s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc,对应打开文件描述符为2*/ [maclean@rh2 ~]$ ls -lh /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc -rw-r----- 1 maclean oinstall 1.7M Jul 26 16:37 /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc [maclean@rh2 ~]$ rm -f /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc /*尝试删除该trace文件*/ [maclean@rh2 ~]$ ls -l /proc/7289/fd|grep lgwr l-wx------ 1 maclean oinstall 64 Jul 26 16:38 2 -> /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc (deleted) /*文件已处在deleted状态,但lgwr进程仍持有该文件相关的文件描述符,这个时候该文件占有的空间并不会被释放*/ [maclean@rh2 ~]$ lsof|grep lgwr oracle 7289 maclean 2w REG 8,2 1702391 3867134 /s01/rac10g/admin/PROD/bdump/prod_lgwr_7289.trc (deleted) [maclean@rh2 ~]$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 26 17:03:04 2010 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> oradebug setospid 7289; Oracle pid: 6, Unix process pid: 7289, image: oracle@rh2 (LGWR) SQL> oradebug flush; /*写出trace buffer内容到trace文件*/ Statement processed. SQL> oradebug close_trace; Statement processed. /*close_trace能够释放指定Oracle进程正打开着的文件,To close the current trace file use*/ SQL> host [maclean@rh2 ~]$ lsof|grep lgwr [maclean@rh2 ~]$ ls -l /proc/7289/fd/|grep lgwr [maclean@rh2 ~]$ /* 从进程相关的fd文件夹中查找不到原来的trace文件;close_trace命令成功释放了该文件,并回收了磁盘空间。*/
Applies to:
Oracle Server – Enterprise Edition – Version: 10.2.0.1
Information in this document applies to any platform.
Goal
This article outlines how to truncate a trace file generated by a background process without restarting the database and without breaking the logging mechanism.
In the first place you should investigate why this tracing is being written. If it is due to an event setting then this event needs to be disabled otherwise the tracing will just restart after the file has been truncated.
Solution
1. First identify the process id of the background process in question, in this example I will use LGWR background process:
SQL> select pid, program from v$process where program like ‘%LGWR%’;
PID PROGRAM
———- ————————————————
6 oracle@besun21 (LGWR)
2. Secondly, use oradebug to set the orapid and therby attach to the background process
SQL> oradebug setorapid 6
Unix process pid: 21955, image: oracle@besun21 (LGWR)
3. Rename the trace file at the Operating System level
4. Next close the trace file:
SQL> oradebug close_trace
When the background process needs to write to the trace file again it will create a new logfile.
Applies to:
Oracle Server – Enterprise Edition – Version: 9.2.0 to 10.2.0
Oracle Server – Standard Edition – Version: 9.2.0 to 10.2.0
Information in this document applies to any platform.
Oracle Real Application Cluster
Oracle Rdbms (Single Instance)
Goal
The goal of this document is to help recreate the trace files for background processes (like lmon,lms,pmon,ckpt) which have been deleted or renamed to some other file. There are many reasons why customers delete trace files. Some examples are
* Customers have jobs which go in and delete trace files which have not been written into for X days.
* Lack of space on the file system containing these traces
* Accidental as in file to be deleted was ora_lms1_10945.trc but the command given was rm -r ora_lms1*
In the end, it does not matter why these traces are deleted but the importance of these trace files come up when these trace files are requested by Oracle support for any issue that the customer has opened an Service request for.
It is a bad idea to delete trace files of background processes which are still active. If an automated job has been implemented to remove inactive trace files then it is recommended that the job skips the deleting / archiving or moving of files that are open.
Solution
Oracle RDBMS by itself does not delete any trace files. The process is unaware that the file it is writing to is deleted and continues to write to the file descriptor.
The steps below will help recreate the file however if the steps are performed after the issue then those information is gone. For example lets say
* The trace file for lms is deleted at 14:01
* There is some problem at 14:50 for which customer opens an Service request with Oracle Support
* Support requests the lms trace file
* Customer finds the lms trace file is missing
* Customer recreates the lms trace file at 15:05 following the steps in this Note
* However the information that would have helped to understand the issue at time 14:50 is lost since the trace file will only contain information subsequent to 15:05
Steps to recreate a trace file
1. Find out the process id of the process which was supposed to write trace (ps -ef |grep $ORACLE_SID |grep lms) There can be multiple lms so be sure to find the exact process
4. SQL>oradebug close_trace
2. Login to SQL*plus using “/ as sysdba”
3. SQL>oradebug setospid
5. SQL>oradebug flush
6. SQL>exit
7. Now check the file in the bdump location, a new file would have been created
Caution
Oradebug sends an interrupt to the background process so there is always a chance that something could go wrong. The server process may terminate or report an error when when these steps are executed. Therefore it is recommended to use these steps with caution.