A case of the Oracle internal error ORA-07445: [_memcmp()+88] [SIGSEGV]

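A 10.2.0.1 database running on SPARC Solaris logged the internal error ORA-07445: [_memcmp()+88] [SIGSEGV] in its alert log. The alert log entry and the trace file it points to are shown below: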

Errors in file /global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc:
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] []

/global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /global/oracle1/ORAHOME1/product/10.2/db_1
System name: SunOS
Node name: ora03ud-us
Release: 5.10
Version: Generic_142900-13
Machine: sun4u
Instance name: centDB
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 8749, image: oraclecentDB@ora03ud-us

*** SERVICE NAME:(SYS$USERS) 2011-03-08 04:54:18.528
*** SESSION ID:(1226.58882) 2011-03-08 04:54:18.528
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x10, PC: [0xffffffff7d600ca4, _memcmp()+88]
*** 2011-03-08 04:54:18.533
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] [] []
----- Call Stack Trace -----
ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler
  <- memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon
   <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr
   <- opidrv <- sou2o <- opimai_real <- main <- start

(session) sid: 1226 trans: 0, creator: 3bd522c00, flag: (41) USR/- BSY/-/-/-/-/-
     DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
     txn branch: 0
     oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
O/S info: user: root, term: unknown, ospid: , machine: 7cta-031-eqism
     program: JDBC Thin Client
application name: JDBC Thin Client, hash value=0
last wait for 'SQL*Net message from client' blocking sess=0x0 seq=2 wait_time=5705 seconds since wait started=0
      driver id=74637000, #bytes=1, =0
Dumping Session Wait History
 for 'SQL*Net message from client' count=1 wait_time=5705
      driver id=74637000, #bytes=1, =0
 for 'SQL*Net message to client' count=1 wait_time=5
      driver id=74637000, #bytes=1, =0
temporary object counter: 0
 ----------------------------------------
 Virtual Thread:
 kgskvt: 3a6bc1be8, sess: 3bcdc71e8, vc: 0, proc: 0
 consumer group cur: (upd? 0), mapped: , orig:
 vt_state: 0x0, vt_flags: 0x20, blkrun: 0
 is_assigned: 0, in_sched: 0 (0)
 vt_active: 0 (pending: 0)
 used quanta: 0 (cg: 0)
 cpu start time: 0, quantum status: 0x0
 quantum checks to skip: 0, check thresh: 0
 idle time: 0, active time: 0 (cg: 0)
 cpu yields: 0 (cg: 0), waits: 0 (cg: 0), wait time: 0 (cg: 0)
 queued time outs: 0, time: 0 (cur 0, cg 0)
 calls aborted: 0, num est exec limit hit: 0
 undo current: 0k max: 0k

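The ORA-07445 internal error above did not crash the instance. The frames of the call stack at the time of the exception are: memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr <- opidrv <- sou2o <- opimai_real <- main <- start. A Metalink search shows that this call stack matches Bug 5292883. The bug note follows: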

Bug 5292883  Dump from OCI client using OCI7 olog() call
Affects:

Product (Component)	 Oracle Server (Rdbms)
Range of versions believed to be affected	 Versions < 11
Versions confirmed as being affected	 10.2.0.3
Platforms affected	 Generic (all / most platforms affected)
Fixed:

This issue is fixed in	
10.2.0.2 Patch 10 on Windows Platforms
10.2.0.3 Patch 7 on Windows Platforms
10.2.0.4 (Server Patch Set)
11.1.0.6 (Base Release)
Symptoms:

Process May Dump (ORA-7445) / Abend / Abort
Dump in or under kpzgkvl / kziaia
Related To:

OCI
Description

On a 64 bit machines a dump can occur with the following stack
if the client uses the olog() OCI call to connect.
  memcmp()<-kpzgkvl()<-kziaia()<--kpolnb()<-kpolon() 

Workaround
 Use OCIServerAttach and OCISessionBegin instead of olog()

Hdr: 5292883 10.2.0.2.0 RDBMS 10.2.0.2.0 SECURITY PRODID-5 PORTID-197 ORA-7445
Abstract: ORA-7445: EXCEPTION ENCOUNTERED: CORE DUMP [_MEMCMP()+160] WHEN CONNECTING TO I

PROBLEM:
--------
 1. Clear description of the problem encountered

    OCI client connection fails with ORA-7445 [_memcmp()+160]. At the 
    time of error occurence no connection to database could be established 
    by OCI clients. Only sqlplus sessions were able to connect. The alertlog
    confirms a lot of ORA-7445 which were logged. Client version is 9.2.0.6. 

    ALERT LOG
    ---------
    Tue Jun  6 18:59:14 2006
     Submitted all GCS remote-cache requests
     Post SMON to start 1st pass IR
     Fix write in gcs resources
    Reconfiguration complete
    Tue Jun  6 19:02:32 2006
    Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
    ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] 
[Address not mapped to object] [0x2000000000] [] []
    Tue Jun  6 19:02:32 2006
    Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
    ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not 
readable
    ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] 
[Address not mapped to object] [0x2000000000] [] []
    Tue Jun  6 19:04:42 2006
    Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
    ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] 
[Address not mapped to object] [0x2A00000000] [] []
    Tue Jun  6 19:04:42 2006
    Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
    ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not 
readable
    ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV] 
[Address not mapped to object] [0x2A00000000] [] []
    ...    

 2. Pertinent configuration information (MTS/OPS/distributed/etc)  
 
    3 instance RAC database.
    Errors were logged for 2 of these instances.
    
 3. Indication of the frequency and predictability of the problem  
 
    Intermittend occurence.
    Not reproducible at will.
    
 4. Technical impact on the customer. Include persistent after effects.

    No connections possible from ERP application which fails with ORA-7445.
    More than 500 ERP users affected.

STACK TRACE:
------------
_memcmp()+160        call                  
kpzgkvl()+192        call     _memcmp()            2000000002 ?
                                                   4000000001229FF0 ?
                                                   000000011 ?
kziaia()+480         call     kpzgkvl()            9FFFFFFFFFFFADB0 ?
                                                   9FFFFFFFFFFFAE18 ?
                                                   4000000001229FF0 ?
                                                   000000011 ? 000000000 ?
                                                   9FFFFFFFFFFF6ED8 ?
                                                   9FFFFFFFFFFF6EE0 ?
                                                   9FFFFFFFFFFF6ED0 ?
kpolnb()+1344        call     kziaia()             9FFFFFFFFFFF8040 ?
                                                   9FFFFFFFFFFF6EE0 ?
                                                   9FFFFFFFFFFF6ED8 ?
                                                   9FFFFFFFFFFF81E0 ?
                                                   9FFFFFFFFFFF81D8 ?
                                                   9FFFFFFFFFFF81E8 ?
                                                   000000000 ?
                                                   400000000233B440 ?
kpolon()+336         call     kpolnb()             9FFFFFFFFFFF8030 ?
                                                   4000000003F47E10 ?
                                                   9FFFFFFFFFFF6F80 ?
                                                   600000000009DB00 ?
                                                   00000820D ?
opiodr()+2064        call     kpolon()             000000051 ?
                                                   60000000000219A8 ?
                                                   9FFFFFFFFFFF81F0 ?
                                                   9FFFFFFFFFFF81B0 ?
                                                   60000000000AAA50 ?
                                                   40000000030BE570 ?
ttcpip()+1824        call     opiodr()             60000000000AA3B0 ?
                                                   6000000000015DD0 ?
                                                   9FFFFFFFFFFFA9A0 ?
                                                   6000000000015DD0 ?
                                                   9FFFFFFFFFFF82D0 ?
                                                   600000000009DB00 ?
                                                   00000001A ?
                                                   6000000000021838 ?
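
The comparison between the call stack in the trace file and the stack quoted in the bug note can also be done mechanically. The following is a minimal Python sketch of that comparison, not part of the original diagnosis. The helper names parse_stack and contains_signature are hypothetical, and the sketch assumes the stack has already been reduced to a "<-" separated list of function names, as in the trace excerpt above.

```python
import re

# Frame sequence quoted in the Bug 5292883 note, innermost frame first.
BUG_5292883_SIGNATURE = ["memcmp", "kpzgkvl", "kziaia", "kpolnb", "kpolon"]


def parse_stack(text):
    """Split a '<-' separated call stack into bare function names."""
    frames = []
    for part in re.split(r"<-+", text):
        # Drop surrounding whitespace and a leading underscore (_memcmp).
        name = part.strip().lstrip("_")
        # Drop a trailing "()" and any "+offset" that follows it.
        name = re.sub(r"\(\).*$", "", name)
        if name:
            frames.append(name)
    return frames


def contains_signature(frames, signature):
    """Return True if signature occurs as a contiguous run inside frames."""
    n = len(signature)
    return any(frames[i:i + n] == signature
               for i in range(len(frames) - n + 1))


# Call stack copied from the trace file centdb_ora_8749.trc shown earlier.
trace_stack = """
ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler
  <- memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon
   <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr
   <- opidrv <- sou2o <- opimai_real <- main <- start
"""

if __name__ == "__main__":
    print(contains_signature(parse_stack(trace_stack), BUG_5292883_SIGNATURE))
```

A contiguous match is used rather than a simple set comparison because the order of the frames is what identifies the code path; the signal-handler frames above memcmp and the generic opiodr/ttcpip frames below kpolon are ignored.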

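There is no one-off patch for Bug 5292883 on 10.2.0.1; the bug is fixed in the 10.2.0.4 patch set and in 11g. If patching is not an option, the problem can generally be worked around in one of two ways:
1) Keep the length of the username and password within 9 characters
2) If the client uses OCI, log on with the OCIServerAttach and OCISessionBegin functions instead of olog()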
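
As a rough aid for applying the advice above, the following Python sketch checks whether a given release already contains the fix and whether a pair of credentials stays within the 9-character limit. It rests on stated assumptions: the function names are hypothetical, the fixed releases are taken from the bug note for non-Windows platforms only (the Windows patch bundles are ignored), and "within 9 characters" is read as at most 9 characters for the username and for the password separately.

```python
# Assumption: first fixed release on non-Windows platforms, per the bug note.
# Every 11g release is 11.1.0.6 or later, so one lower bound is sufficient.
FIRST_FIXED_RELEASE = (10, 2, 0, 4)

# Assumption: the workaround's limit applies to each credential separately.
MAX_CREDENTIAL_LENGTH = 9


def parse_version(version):
    """Turn '10.2.0.1.0' into (10, 2, 0, 1), keeping the first four fields."""
    return tuple(int(part) for part in version.split("."))[:4]


def has_fix(version):
    """Return True if the release is at or above the first fixed release."""
    return parse_version(version) >= FIRST_FIXED_RELEASE


def credentials_within_limit(username, password):
    """Return True if both credentials satisfy the length workaround."""
    return (len(username) <= MAX_CREDENTIAL_LENGTH
            and len(password) <= MAX_CREDENTIAL_LENGTH)


if __name__ == "__main__":
    # Release string from the trace file header in this case.
    print(has_fix("10.2.0.1.0"))
    print(credentials_within_limit("scott", "tiger"))
```

Such a check only flags configurations that fall outside the workaround; it does not confirm that a given crash is caused by this bug, which still requires matching the call stack.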
