On a 10.2.0.1 database running on SPARC Solaris, the alert log recorded the internal error ORA-07445: [_memcmp()+88] [SIGSEGV]. The full log entry is as follows:
Errors in file /global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc:
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] []

/global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /global/oracle1/ORAHOME1/product/10.2/db_1
System name: SunOS
Node name: ora03ud-us
Release: 5.10
Version: Generic_142900-13
Machine: sun4u
Instance name: centDB
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 8749, image: oraclecentDB@ora03ud-us

*** SERVICE NAME:(SYS$USERS) 2011-03-08 04:54:18.528
*** SESSION ID:(1226.58882) 2011-03-08 04:54:18.528
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x10, PC: [0xffffffff7d600ca4, _memcmp()+88]
*** 2011-03-08 04:54:18.533
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] [] []
----- Call Stack Trace -----
ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler <- memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr <- opidrv <- sou2o <- opimai_real <- main <- start
(session) sid: 1226 trans: 0, creator: 3bd522c00, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
O/S info: user: root, term: unknown, ospid: , machine: 7cta-031-eqism
program: JDBC Thin Client
application name: JDBC Thin Client, hash value=0
last wait for 'SQL*Net message from client' blocking sess=0x0 seq=2 wait_time=5705 seconds since wait started=0
driver id=74637000, #bytes=1, =0
Dumping Session Wait History
for 'SQL*Net message from client' count=1 wait_time=5705
driver id=74637000, #bytes=1, =0
for 'SQL*Net message to client' count=1 wait_time=5
driver id=74637000, #bytes=1, =0
temporary object counter: 0
----------------------------------------
Virtual Thread:
kgskvt: 3a6bc1be8, sess: 3bcdc71e8, vc: 0, proc: 0
consumer group cur: (upd? 0), mapped: , orig:
vt_state: 0x0, vt_flags: 0x20, blkrun: 0
is_assigned: 0, in_sched: 0 (0)
vt_active: 0 (pending: 0)
used quanta: 0 (cg: 0)
cpu start time: 0, quantum status: 0x0
quantum checks to skip: 0, check thresh: 0
idle time: 0, active time: 0 (cg: 0)
cpu yields: 0 (cg: 0), waits: 0 (cg: 0), wait time: 0 (cg: 0)
queued time outs: 0, time: 0 (cur 0, cg 0)
calls aborted: 0, num est exec limit hit: 0
undo current: 0k max: 0k
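The ORA-07445 internal error above did not crash the instance. The most recent call stack reads: memcmp kpzgkvl kziaia kpolnb kpolon opiodr ttcpip opitsk opiino opiodr opidrv sou2o opimai_real main start. A Metalink search shows that this stack matches the one documented for Bug 5292883. The Bug Note is reproduced here: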
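The stack comparison described above can also be done mechanically. The following is a minimal sketch, not an Oracle tool: the helper names `parse_stack` and `matches_signature` are hypothetical, and the signature list is simply the five frames quoted in the Bug 5292883 note. It assumes the stack has been copied out of the trace file as a single `<-`-separated string.

```python
import re

# Frames quoted in the Bug 5292883 note, innermost first.
BUG_5292883_SIGNATURE = ["memcmp", "kpzgkvl", "kziaia", "kpolnb", "kpolon"]


def parse_stack(text):
    """Split a flattened 'a <- b <- c' call stack into bare function names."""
    frames = []
    # The separator may be '<-' or '<--', as seen in the bug note.
    for token in re.split(r"<-+", text):
        name = token.strip().lstrip("_")
        # Drop a trailing '()' and anything after it, such as '+88'.
        name = re.sub(r"\(\).*$", "", name).strip()
        if name:
            frames.append(name)
    return frames


def matches_signature(frames, signature=BUG_5292883_SIGNATURE):
    """Return True if the signature appears as a contiguous run of frames."""
    n = len(signature)
    return any(frames[i:i + n] == signature for i in range(len(frames) - n + 1))


trace_stack = (
    "ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler <- "
    "memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon <- opiodr <- ttcpip <- "
    "opitsk <- opiino <- opiodr <- opidrv <- sou2o <- opimai_real <- main <- start"
)
print(matches_signature(parse_stack(trace_stack)))
```

A match on the stack alone only suggests the bug is a candidate; the version and platform details in the note still have to be checked by hand.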
Bug 5292883 Dump from OCI client using OCI7 olog() call

Affects:
Product (Component): Oracle Server (Rdbms)
Range of versions believed to be affected: Versions < 11
Versions confirmed as being affected: 10.2.0.3
Platforms affected: Generic (all / most platforms affected)

Fixed:
This issue is fixed in
10.2.0.2 Patch 10 on Windows Platforms
10.2.0.3 Patch 7 on Windows Platforms
10.2.0.4 (Server Patch Set)
11.1.0.6 (Base Release)

Symptoms:
Process May Dump (ORA-7445) / Abend / Abort
Dump in or under kpzgkvl / kziaia

Related To:
OCI

Description
On a 64 bit machines a dump can occur with the following stack if the client uses the olog() OCI call to connect.
memcmp()<-kpzgkvl()<-kziaia()<--kpolnb()<-kpolon()

Workaround
Use OCIServerAttach and OCISessionBegin instead of olog()

Hdr: 5292883 10.2.0.2.0 RDBMS 10.2.0.2.0 SECURITY PRODID-5 PORTID-197 ORA-7445
Abstract: ORA-7445: EXCEPTION ENCOUNTERED: CORE DUMP [_MEMCMP()+160] WHEN CONNECTING TO I

PROBLEM:
--------
1. Clear description of the problem encountered
OCI client connection fails with ORA-7445 [_memcmp()+160]. At the time of error occurence no connection to database could be established by OCI clients. Only sqlplus sessions were able to connect. The alertlog confirms a lot of ORA-7445 which were logged. Client version is 9.2.0.6.

ALERT LOG
---------
Tue Jun 6 18:59:14 2006
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Tue Jun 6 19:02:32 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] [Address not mapped to object] [0x2000000000] [] []
Tue Jun 6 19:02:32 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not readable
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] [Address not mapped to object] [0x2000000000] [] []
Tue Jun 6 19:04:42 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] [Address not mapped to object] [0x2A00000000] [] []
Tue Jun 6 19:04:42 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not readable
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] [Address not mapped to object] [0x2A00000000] [] []
...

2. Pertinent configuration information (MTS/OPS/distributed/etc)
3 instance RAC database. Errors were logged for 2 of these instances.

3. Indication of the frequency and predictability of the problem
Intermittend occurence. Not reproducible at will.

4. Technical impact on the customer. Include persistent after effects.
No connections possible from ERP application which fails with ORA-7445. More than 500 ERP users affected.

STACK TRACE:
------------
_memcmp()+160 callkpzgkvl()+192 call _memcmp() 2000000002 ? 4000000001229FF0 ? 000000011 ?
kziaia()+480 call kpzgkvl() 9FFFFFFFFFFFADB0 ? 9FFFFFFFFFFFAE18 ? 4000000001229FF0 ? 000000011 ? 000000000 ? 9FFFFFFFFFFF6ED8 ? 9FFFFFFFFFFF6EE0 ? 9FFFFFFFFFFF6ED0 ?
kpolnb()+1344 call kziaia() 9FFFFFFFFFFF8040 ? 9FFFFFFFFFFF6EE0 ? 9FFFFFFFFFFF6ED8 ? 9FFFFFFFFFFF81E0 ? 9FFFFFFFFFFF81D8 ? 9FFFFFFFFFFF81E8 ? 000000000 ? 400000000233B440 ?
kpolon()+336 call kpolnb() 9FFFFFFFFFFF8030 ? 4000000003F47E10 ? 9FFFFFFFFFFF6F80 ? 600000000009DB00 ? 00000820D ?
opiodr()+2064 call kpolon() 000000051 ? 60000000000219A8 ? 9FFFFFFFFFFF81F0 ? 9FFFFFFFFFFF81B0 ? 60000000000AAA50 ? 40000000030BE570 ?
ttcpip()+1824 call opiodr() 60000000000AA3B0 ? 6000000000015DD0 ? 9FFFFFFFFFFFA9A0 ? 6000000000015DD0 ? 9FFFFFFFFFFF82D0 ? 600000000009DB00 ? 00000001A ? 6000000000021838 ?
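No one-off patch for Bug 5292883 is available on 10.2.0.1; the fix is included in 11g and in the 10.2.0.4 patch set. If patching is not an option, the problem can generally be worked around in one of two ways:
1) Keep the username and password within 9 characters.
2) If OCI is used, log in with the OCIServerAttach and OCISessionBegin functions instead of olog().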
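For the first workaround, a pre-check on the credentials before they are handed to the client can catch accounts that would exceed the limit. This is a minimal sketch under stated assumptions: the function name `within_workaround_limit` is hypothetical, the limit is read as "at most 9 characters", and length is counted in characters rather than bytes, which the source does not specify.

```python
# Assumption: "within 9 characters" is read as len <= 9, counted in characters.
MAX_CREDENTIAL_LEN = 9


def within_workaround_limit(username, password, max_len=MAX_CREDENTIAL_LEN):
    """Return True if both credentials fit the length limit of workaround 1."""
    return len(username) <= max_len and len(password) <= max_len


# Hypothetical credentials, used only to illustrate the check.
print(within_workaround_limit("scott", "tiger"))
print(within_workaround_limit("reporting_user", "tiger"))
```

Accounts that fail the check would need to be renamed or given shorter passwords, or the client would have to move to the second workaround.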