On an 11.2.0.2 RAC + ASM system running on Linux x86-64, one node raised the internal error ORA-00600 [kfioTranslateIO03]. The relevant logs follow:
=============================alert.log===============================
adrci> show alert -tail -f
2011-05-30 20:29:12.657000 +08:00
Starting background process RSMN
RSMN started with pid=31, OS id=22084
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /s01/orabase
ALTER DATABASE MOUNT /* db agent *//* {0:7:3} */
This instance was first to mount
2011-05-30 20:29:15.026000 +08:00
Sweep [inc][100831]: completed
Sweep [inc2][100831]: completed
NOTE: Loaded library: System
ORA-15025: could not open disk "/dev/raw/raw1"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/raw/raw2"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/raw/raw3"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/raw/raw5"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
SUCCESS: diskgroup DATA was mounted
NOTE: dependency between database PROD and diskgroup resource ora.DATA.dg is established
Errors in file /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ckpt_22056.trc (incident=104831):
ORA-00600: internal error code, arguments: [kfioTranslateIO03], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /s01/orabase/diag/rdbms/prod/PROD1/incident/incdir_104831/PROD1_ckpt_22056_i104831.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
adrci> show problem
ADR Home = /s01/orabase/diag/rdbms/prod/PROD1:
*************************************************************************
PROBLEM_ID PROBLEM_KEY LAST_INCIDENT
-------------------- ----------------------------------------------------------- --------------------
2 ORA 7445 [kghdmp_new()+1133] 18387
3 ORA 7445 [kghfnd()+2672] 20701
5 ORA 7445 [kcldmp()+246] 28229
6 ORA 7445 [kclxle()+311] 28230
1 ORA 4031 56918
4 ORA 445 90278
7 ORA 600 [kfioTranslateIO03] 108831
adrci> show incident -mode detail -p "incident_id=108831"
ADR Home = /s01/orabase/diag/rdbms/prod/PROD1:
*************************************************************************
**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
INCIDENT_ID 108831
STATUS ready
CREATE_TIME 2011-05-30 20:31:55.484000 +08:00
PROBLEM_ID 7
CLOSE_TIME
FLOOD_CONTROLLED none
ERROR_FACILITY ORA
ERROR_NUMBER 600
ERROR_ARG1 kfioTranslateIO03
ERROR_ARG2
ERROR_ARG3
ERROR_ARG4
ERROR_ARG5
ERROR_ARG6
ERROR_ARG7
ERROR_ARG8
ERROR_ARG9
ERROR_ARG10
ERROR_ARG11
ERROR_ARG12
SIGNALLING_COMPONENT ASM
SIGNALLING_SUBCOMPONENT
SUSPECT_COMPONENT
SUSPECT_SUBCOMPONENT
ECID
IMPACTS 0
PROBLEM_KEY ORA 600 [kfioTranslateIO03]
FIRST_INCIDENT 96831
FIRSTINC_TIME 2011-05-30 20:24:40.372000 +08:00
LAST_INCIDENT 108831
LASTINC_TIME 2011-05-30 20:31:55.484000 +08:00
IMPACT1 0
IMPACT2 0
IMPACT3 0
IMPACT4 0
KEY_NAME ProcId
KEY_VALUE 19.1
KEY_NAME Client ProcId
KEY_VALUE oracle@rh2.oracle.com.22504_139763918456544
KEY_NAME SID
KEY_VALUE 397.1
OWNER_ID 1
INCIDENT_FILE /s01/orabase/diag/rdbms/prod/PROD1/incident/incdir_108831/PROD1_ckpt_22504_i108831.trc
OWNER_ID 1
INCIDENT_FILE /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ckpt_22504.trc
1 rows fetched
===================================trace===================================
adrci> view /s01/orabase/diag/rdbms/prod/PROD1/incident/incdir_108831/PROD1_ckpt_22504_i108831.trc
Dump continued from file: /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ckpt_22504.trc
ORA-00600: internal error code, arguments: [kfioTranslateIO03], [], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 108831 (ORA 600 [kfioTranslateIO03]) ========
----- Beginning of Customized Incident Dump(s) -----
kfioRqSet=0x7f1d524151c0 parent=0x7fffb2642d30 gn=(64.0) cnt=0
size=32768 vxn=0 byte offset=16384 buf offset=0
tried[0]=0 tried[1]=0 tried[2]=0 tried[3]=0 tried[4]=0 tried[5]=0
skipped[0]=0 skipped[1]=0 skipped[2]=0 skipped[3]=0 skipped[4]=0 skipped[5]=0
parent :
DDE: Ending a split invocation on error recording!
----- End of Customized Incident Dump(s) -----
*** 2011-05-30 20:31:55.548
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36 call kgdsdst() 000000000 ? 000000000 ?
7FFFB2634D58 ? 000000001 ?
000000001 ? 000000002 ?
ksedst1()+98 call skdstdst() 000000000 ? 000000000 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
ksedst()+34 call ksedst1() 000000000 ? 000000001 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
dbkedDefDump()+2741 call ksedst() 000000000 ? 000000001 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
ksedmp()+36 call dbkedDefDump() 000000003 ? 000000002 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
ksfdmp()+64 call ksedmp() 000000003 ? 000000002 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
dbgexPhaseII()+1764 call ksfdmp() 000000003 ? 000000002 ?
7FFFB2634D58 ? 000000001 ?
000000000 ? 000000002 ?
dbgexExplicitEndInc call dbgexPhaseII() 7F1D5281F710 ? 7F1D52822500 ?
()+750 7FFFB2640890 ? 000000001 ?
000000000 ? 000000002 ?
dbgeEndDDEInvocatio call dbgexExplicitEndInc 7F1D5281F710 ? 7F1D52822500 ?
nImpl()+767 () 7FFFB2640890 ? 000000001 ?
000000000 ? 000000002 ?
dbgeEndSpltInvokOnR call dbgeEndDDEInvocatio 7F1D5281F710 ? 7F1D52822500 ?
ec()+265 nImpl() 7FFFB2640890 ? 000000001 ?
000000000 ? 000000002 ?
dbgePostErrorKGE()+ call dbgeEndSpltInvokOnR 7F1D5281F710 ? 7F1D52822500 ?
248 ec() 7FFFB2640890 ? 000000001 ?
000000000 ? 000000002 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 000000000 ? 7F1D52830E40 ?
63 000003AE9 ? 000000000 ?
100000000 ? 000000002 ?
kgeade()+351 call dbkePostKGE_kgsf() 00B7C8EA0 ? 7F1D52830E40 ?
000003AE9 ? 000000000 ?
100000000 ? 000000002 ?
kgerelv()+135 call kgeade() 00B7C8EA0 ? 00B7C9050 ?
7F1D52830E40 ? 000003AE9 ?
100000000 ? 000000002 ?
kserecl0()+157 call kgerelv() 00B7C8EA0 ? 7F1D52830E40 ?
000003AE9 ? 00952980C ?
7FFFB2641C10 ? 000000000 ?
kfioErrorRecord()+7 call kserecl0() 00B7C8EA0 ? 7F1D52830E40 ?
6 000003AE9 ? 000000005 ?
7FFFB2641C60 ? 000000000 ?
kfiorq_dump()+129 call kfioErrorRecord() 7FFFB2642D30 ? 7F1D52830E40 ?
000003AE9 ? 000000005 ?
7FFFB2641C60 ? 000000000 ?
kfioRqSetDump()+565 call kfiorq_dump() 7FFFB2642D30 ? 7F1D52830E40 ?
000003AE9 ? 000000005 ?
7FFFB2641C60 ? 000000000 ?
kfioTranslateIO()+3 call kfioRqSetDump() 7F1D524151C0 ? 7F1D52830E40 ?
079 000003AE9 ? 000000005 ?
kfioRqSetPrepare()+ call kfioTranslateIO() 7F1D524151C0 ? 7F1D52415098 ?
1017 7FFFB26421D4 ? 7FFFB26421D0 ?
0D4F338B0 ? 000000000 ?
kfioSubmitIO()+2852 call kfioRqSetPrepare() 7F1D524151C0 ? 7F1D52415098 ?
7FFFB26425D8 ? 7FFFB2642608 ?
0D4F338B0 ? 000000000 ?
kfioRequestPriv()+1 call kfioSubmitIO() 7FFFB2642E10 ? 000000001 ?
94 7FFFB26425D8 ? 7FFFB2642608 ?
0D4F338B0 ? 000000000 ?
kfioRequest()+701 call kfioRequestPriv() 000000000 ? 000000001 ?
7FFFB2642E18 ? 000000001 ?
000000000 ? 000000000 ?
ksfd_kfioRequest()+ call kfioRequest() 7FFFB2642E10 ? 000000001 ?
644 7FFFB2642E18 ? 000000001 ?
000000000 ? 7FFF00000000 ?
ksfd_osmio()+1050 call ksfd_kfioRequest() 7FFFB2642E10 ? 000000001 ?
7FFFB2642E18 ? 000000001 ?
000000000 ? 000000000 ?
ksfd_io()+2717 call ksfd_osmio() 000000001 ?
FFFFFFFFB2642D30 ?
FFFFFFFFB2642D30 ?
0D400A0B0 ? 000008000 ?
7FFFB2643170 ?
ksfdread()+576 call ksfd_io() 0D400A0B0 ? 000000001 ?
7F1D52417E00 ? 000008000 ?
000000000 ? 000000703 ?
kcc_identify_file() call ksfdread() 0D400A0B0 ? 000000001 ?
+309 7F1D52417E00 ? 000008000 ?
000000000 ? 000000703 ?
kcc_identify()+225 call kcc_identify_file() 0D400A0B0 ? 7F1D52417E00 ?
000000000 ? 060019450 ?
060019630 ? 0DAC34670 ?
kccida()+225 call kcc_identify() 000000000 ? 7F1D52417E00 ?
060019630 ? 7FFFB26434A4 ?
000000000 ? 0DAC34670 ?
ksbabs()+771 call kccida() 7FFFB2643B08 ? 7F1D52417E00 ?
060019630 ? 7FFFB26434A4 ?
000000000 ? 0DAC34670 ?
ksbrdp()+971 call ksbabs() 7FFFB2643B08 ? 7F1D52417E00 ?
060019630 ? 7FFFB26434A4 ?
000000000 ? 0DAC34670 ?
adrci> view /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ckpt_22504.trc
NOTE: disk 4 is missing from group 1
Incident 108831 created, dump file: /s01/orabase/diag/rdbms/prod/PROD1/incident/incdir_108831/PROD1_ckpt_22504_i108831.trc
ORA-00600: internal error code, arguments: [kfioTranslateIO03], [], [], [], [], [], [], [], [], [], [], []
=========Start of 'kfiorq = [0x7fffb2642d30]' dumping =========
Status = UNKWOWN
Flags = READ | SYNC
Mirror side = 0
Fib = 0xd4f338b0
Offset = 1
buffer ptr = 0x7f1d52417e00
Rcount = 32768
err_kfiorq = 15081
Inflight disk IO = 0
Completed disk IO = 0
Oracle error = 0
Intended zone = 48
===Dump of all attached kfiodrq's===
=========End of 'kfiorq = [0x7fffb2642d30]' dumping =========
parent :
############# kfiofib = 0xd4f338b0 #################
Diskgroup Name =
File number = 261.747100215
File type = 1
Flags = 10
Blksize = 16384
File size = 1131 blocks
Blk one offset = 1
Redundancy = 17
Physical blocksz = 512
Open name = +DATA/prod/controlfile/current.261.747100215
Fully-qualified nm =+DATA/prod/controlfile/current.261.747100215
Mapid = 2
Slave ID = -1
Connection = 0x(nil)
############################################
Error ORA-600 signaled at ksedsts()+461<-ksf_short_stack()+77<-kge_snap_callstack()+63<-kge_sigtrace_dump()+69<-kgepop()+712<-kgersel()+175
<-kfioTranslateIO()+3138<-kfioRqSetPrepare()+1022<-kfioSubmitIO()+2857<-kfioRequestPriv()+199<-kfioRequest()+706<-ksfd_kfioRequest()+649
<-ksfd_osmio()+1055<-ksfd_io()+2722<-ksfdread()+581<-kcc_identify_file()+314<-kcc_identify()+230<-kccida()+230<-ksbabs()+771<-ksbrdp()+971
<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244<-_start()+36
ERROR: unrecoverable error ORA-600 raised in ASM I/O path; terminating process 22504
----- Abridged Call Stack Trace -----
ksedsts()+461<-kfioRequest()+2157<-ksfd_kfioRequest()+649<-ksfd_osmio()+1055<-ksfd_io()+2722<-ksfdread()+581<-kcc_identify_file()+314<-kcc_identify()+230<-kccida()+230
<-ksbabs()+771<-ksbrdp()+971<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244<-_start()+36
----- End of Abridged Call Stack Trace -----
*** 2011-05-30 20:31:56.271
KSU: Terminating fatal process 'oracle@rh2.oracle.com (CKPT)'
adrci> ips create package
Created package 2 without any contents, correlation level typical
adrci> ips add problem 7 package 2
Added problem 7 to package 2
adrci> ips finalize package 2
Finalized package 2
adrci> ips generate package 2 in /tmp
Generated package 2 in file /tmp/IPSPKG_20110531224208_COM_1.zip, mode complete
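Diagnosis showed that the permissions on the device files /dev/raw/raw* backing the ASM disk group had been changed to 0600. These raw devices are owned by the grid user, so the oracle user could no longer read or write them, which matches the ORA-15025 / "Linux-x86_64 Error: 13: Permission denied" entries in the alert log. Changing the permissions on the device files to 0660 resolved the problem.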
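The permission reasoning behind this diagnosis can be checked with a short script. The following is a minimal sketch in Python, not an Oracle tool. It assumes that the oracle user shares the device's group with grid (the group name depends on the installation and is not stated above), and it only looks at the classic owner/group/other bits, ignoring root, ACLs and capabilities. The function names `mode_allows_rw` and `user_can_rw` are hypothetical helpers introduced for illustration.

```python
import grp
import os
import pwd
import stat


def mode_allows_rw(mode, file_uid, file_gid, uid, gids):
    """Return True if the permission bits in `mode` grant read AND write.

    Follows the usual Unix order: the owner bits apply if the uid matches,
    otherwise the group bits if one of the caller's gids matches, otherwise
    the "other" bits. Root, ACLs and capabilities are deliberately ignored.
    """
    if uid == file_uid:
        need = stat.S_IRUSR | stat.S_IWUSR
    elif file_gid in gids:
        need = stat.S_IRGRP | stat.S_IWGRP
    else:
        need = stat.S_IROTH | stat.S_IWOTH
    return (mode & need) == need


def user_can_rw(path, username):
    """Apply mode_allows_rw to a real file and a real OS account."""
    st = os.stat(path)
    pw = pwd.getpwnam(username)
    # Primary group plus every supplementary group that lists the user.
    gids = {pw.pw_gid} | {
        g.gr_gid for g in grp.getgrall() if username in g.gr_mem
    }
    return mode_allows_rw(
        stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, pw.pw_uid, gids
    )
```

On the affected node, calling `user_can_rw("/dev/raw/raw1", "oracle")` would be expected to return `False` while the mode is 0600 and `True` once it is 0660, provided the group assumption holds. If oracle is not a member of the device's group, 0660 alone would not be enough and the group membership would also need to be corrected.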