pmon长期持有cache buffers chains导致实例hang住一例

前几日,有客户报一个备用库实例中有多个回话hang住的问题,在hang住前本地有维护人员执行了truncate表的操作。
同事前往客户现场进一步确认了问题,并传回了当时hang状况下的266级systemstate文件。该实例的并发回话数量较少,所以实例状态抓取后总的信息量并不多,这种情形中通过分析systemstate信息往往要好于分析hanganalyze信息。
通过著名的源自于metalink的ass awk脚本可以很快找出各进程的状态,以及重要资源的持有者:
awk -f ass109.awk systemstate.txt

Starting Systemstate 1
…………………………………..
Ass.Awk Version 1.0.9 – Processing systemstate.txt

System State 1
~~~~~~~~~~~~~~~~
1:
2: last wait for ‘pmon timer’
3: waiting for ‘rdbms ipc message’ wait
4: waiting for ‘rdbms ipc message’ wait
5: waiting for ‘latch: cache buffers chains'[Latch 70000048c8c66b0] wait DBW0
6: waiting for ‘latch: cache buffers chains'[Latch 70000048c8c66b0] wait DBW1
7: waiting for ‘latch: cache buffers chains'[Latch 70000048c8c66b0] wait DBW2
8: waiting for ‘rdbms ipc message’ wait
9: waiting for ‘rdbms ipc message’ wait
10: waiting for ‘rdbms ipc message’ wait
11: waiting for ‘enq: RO – fast object reuse'[Enqueue RO-0001000B-00000001] wait
12: waiting for ‘rdbms ipc message’ wait
13: waiting for ‘rdbms ipc message’ wait
14: waiting for ‘rdbms ipc message’ wait
15: waiting for ‘rdbms ipc message’ wait
16: waiting for ‘SQL*Net message from client’ wait
17: waiting for ‘SQL*Net message from client’ wait
18: waiting for ‘SQL*Net message from client’ wait
19: waiting for ‘latch: cache buffers chains'[Latch 70000048c8c66b0] wait
20: waiting for ‘SQL*Net message from client’ wait
21: waiting for ‘SQL*Net message from client’ wait
22: last wait for ‘SQL*Net message from client’ [DEAD]
Cmd: Delete
23: waiting for ‘SQL*Net message from client’ wait
24: waiting for ‘SQL*Net message from client’ wait
25:
26: waiting for ‘SQL*Net message from client’ wait
27: last wait for ‘ksdxexeotherwait’
28: waiting for ‘enq: TX – row lock contention'[Enqueue TX-00120003-00002CC0] wait
Cmd: Update
29: waiting for ‘SQL*Net message from client’ wait
30: waiting for ‘SQL*Net message from client’ wait
31: waiting for ‘SQL*Net message from client’ wait
32: waiting for ‘SQL*Net message from client’ wait
33: waiting for ‘SQL*Net message from client’ wait
34: waiting for ‘Streams AQ: qmn coordinator idle wait’ wait
35: for ‘Streams AQ: waiting for time management or cleanup tasks’ wait
36: waiting for ‘Streams AQ: qmn slave idle wait’ wait
37: waiting for ‘enq: RO – fast object reuse'[Enqueue RO-00010025-00000001] wait
38: waiting for ‘SQL*Net message from client’ wait
39: waiting for ‘SQL*Net message from client’ wait
41: waiting for ‘SQL*Net message from client’ wait
42: last wait for ‘enq: TX – row lock contention’ wait
Blockers
~~~~~~~~

Above is a list of all the processes. If they are waiting for a resource
then it will be given in square brackets. Below is a summary of the
waited upon resources, together with the holder of that resource.
Notes:
~~~~~
o A process id of ‘???’ implies that the holder was not found in the
systemstate.

Resource Holder State
Latch 70000048c8c66b0 ??? Blocker
Enqueue RO-0001000B-00000001 10: waiting for ‘rdbms ipc message’
Enqueue RO-0001000B-00000001 11: 11: is waiting for 10: 11:
Enqueue TX-00120003-00002CC0 22: last wait for ‘SQL*Net message from client’
Enqueue RO-00010025-00000001 10: waiting for ‘rdbms ipc message’
Enqueue RO-00010025-00000001 37: 37: is waiting for 10: 37:

Object Names
~~~~~~~~~~~~
Latch 70000048c8c66b0 holding (efd=14) 70000048c8c66b0 Child ca
Enqueue RO-0001000B-00000001
Enqueue TX-00120003-00002CC0
Enqueue RO-00010025-00000001

首先注意到的是RO队列锁,RO即(REUSE OBJECT),该锁用以协调前台进程与后台进程DBWR和CKPT之间的工作,该队列一般只在drop或truncate对象时可见到。可以看到pid为11和37的进程均在等待pid为10的进程,因为pid=10的进程持有着它们锁需要的RO-0001000B-00000001和RO-00010025-00000001,分析dump文件可以发现该持有进程正是CKPT后台进程;

而该检查点进程也处于BUSY的非空闲等待中,其等待事件为’rdbms ipc message’,即它在等待另一个后台进程给它发送信息。这个时候我们来观察其他忙碌的后台进程可以发现,pid为5,6,7的进程均在等待同一个栓’latch: cache buffers chains'[Latch 70000048c8c66b0];这个三个进程均为DBWR进程,此外还有一个DBW3进程处于’rdbms ipc message’等待中,多个DBWR进程是由于设置了db_writer_processes参数;看起来是ckpt进程准备对需要truncate的对象做对象级别的检查点,以保证该对象所有脏块均已写到磁盘上,所以对dbwr进程发出需要写出的message,继而进入’rdbms ipc message’等待直到dbwr进程完成写出任务,但由于dbwr进程长期无法获取某脏块对应的latch: cache buffers chains,故写出工作一直处于pending状态,这样一个hang链就十分清晰了。

我们来分析’latch: cache buffers chains'[Latch 70000048c8c66b0]这个栓,ass分析systemstate dump时将该栓的holder归为’???’,即无法自dump文件中找到该栓的持有者;进一步直接分析dump文件可以发现:
PROCESS 2:
—————————————-
SO: 70000048f529d40, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=2, calls cur/top: 70000047d0e75a0/70000048f8956d0, flag: (e) SYSTEM
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 504403177803376304 122 2
last post received-location: kslges
last process to post me: 70000048e533a88 1 6
last post sent: 0 0 148
last post sent-location: ktmpsm
last process posted by me: 70000048e535228 1 22
(latch info) wait_event=0 bits=2
holding (efd=14) 70000048c8c66b0 Child cache buffers chains level=1 child#=61214
Location from where latch is held: kcbgcur: kslbegin:
Context saved from call: 336311247
state=busy(exclusive) (val=0x2000000000000002) holder orapid = 2
waiters [orapid (seconds since: put on list, posted, alive check)]:
5 (480, 1278471465, 0)
6 (480, 1278471465, 0)
19 (480, 1278471465, 0)
7 (216, 1278471465, 0)
waiter count=4
Process Group: DEFAULT, pseudo proc: 70000048e5f91d8
O/S info: user: oracle, term: UNKNOWN, ospid: 74476
OSD pid info: Unix process pid: 74476, image: oracle@DR_570 (PMON)

可以看到pid=2的PMON进程holding 该cache buffers chains子栓,而dbwx进程即waiters 5,6,7;

一般情况下pmon是不会去持有类似于cache buffers chains子栓这样的低级栓的,除非在cleanup失败会话或关闭实例情况下。从ass分析信息来看,当时确实有一个进程处于DEAD状态,即22: last wait for ‘SQL*Net message from client’ [DEAD],而该会话最后所做的是delete from “LINC”.”MSDB_ACCGL” t操作。

虽然无法证明,但极有可能是22号进程在进行delete过程中发生会话失败,PMON进程尝试清理该进程,并获取了相关栓。但该进程始终没有被杀死,即便使用OS 命令:kill -9 处理该进程后仍可以在systemstate中找到该进程的信息。据同事分析,当时之前曾有一度PMON的CPU使用率达到100%,之后PMON进程进入’pmon timer’空闲等待,且一直没有释放对应子栓,令DBWR进程处于长期无法获得栓资源的进而hang住的状态。

在MOS上搜索PMON+cache buffers chains可以发现几个PMON长期持有该类子栓且从不释放的Bug,但版本为Oracle 8等较老版本,且都是不能reproduceable的case。谨以录之:

Hdr: 4126734 8.1.7.4.0 RDBMS 8.1.7.4.0 BUFFER CACHE PRODID-5 PORTID-59
Abstract: ORACLE PROCESS GOES TO CPU LOOP WHILE HOLDING “CHILD CACHE BUFFERS CHAINS” LATCH

*** 01/17/05 07:08 pm ***
TAR:
—-
4249496.999

PROBLEM:
——–
1. Clear description of the problem encountered:

Oracle process goes to CPU loop while holding “Child cache buffers chains

level=1” latch. This condition causes other processes to wait on
either “latch free” or “Child cache buffers chains” which completely hang

the instance.

When looping Oracle process is killed, PMON also goes to CPU loop
cleaning
up that process. PMON also hold the same “Child cache buffers chains
level=1” latch while instance continues to hang.

Hang disappear only after instance is bounced.

2. Pertinent configuration information (MTS/OPS/distributed/etc)

11i Apps Install (version 11.5.9.) running on Oracle 8.1.7.4.0 database.

3. Indication of the frequency and predictability of the problem

Process running pl/sql procedure doing dbms_lob.instr() and
dbms_lob.read()
seem to encounter this problem.

4. Sequence of events leading to the problem

5. Technical impact on the customer. Include persistent after effects.

In a bad day, ct encounters 2 occurence of this problem a day causing
loss
of service in peak processing period.

DIAGNOSTIC ANALYSIS:
——————–
1. Got 3 systemstate dumps while hang is occuring
2. Got 3 errorstack dumps of PMON while it is looping in CPU

WORKAROUND:
———–
bounce instance

RELATED BUGS:
————-
bug 2361194

REPRODUCIBILITY:
—————-
– always reproducible on test system at ct site.

TEST CASE:
———-
– unable to reproduce on 8.1.7.4.0 test system in-house

STACK TRACE:
————
PMON stack while looping:
ksedmp ksdxfdmp ksdxcb sspuser _sigreturn kggchk kcbso1 kcbpsod kcbsod
kssxdl kssdch ksudlc kssxdl ksudlp kssxdl ksuxdl ksuxda ksucln ksbrdp
opirip opidrv sou2o main $START$

SUPPORTING INFORMATION:
———————–
Customers running Release 11i of the E-Business Suite will be entitled
to complimentary Extended Maintenance Support through July 31, 2005.

ocfs certification with 11gr2 rac

We are planning to implement 11gR2 RAC on OCFS2 file system. We are going to have ocr files,voting disk files,database files and flash recovery area files on OCFS2 file system.
Generic Note
————————
ocfs2 is certified for oracle 11gr2 but oracle recommends using asm.
please see this for more information http://download.oracle.com/docs/cd/E11882_01/install.112/e10812/storage.htm#CWLIN262
3.1.3 Supported Storage Options
The following table shows the storage options supported for storing Oracle Clusterware and Oracle RAC files.

Note:

For information about OCFS2, refer to the following Web site:
http://oss.oracle.com/projects/ocfs2/
If you plan to install an Oracle RAC home on a shared OCFS2 location, then you must upgrade OCFS2 to at least version 1.4.1, which supports shared writable mmaps.

For OCFS2 certification status, and for other cluster file system support, refer to the Certify page on My Oracle Support.

Table 3-1 Supported Storage Options for Oracle Clusterware and Oracle RAC
Linux x86-64
11gR2 RAC
RAC for LinuxRAC Technologies Compatibility Matrix (RTCM) for Linux platforms.

http://www.oracle.com/technology/products/database/clustering/certify/tech_generic_linux_new.html

Note 183408.1 Raw Devices and Cluster Filesystems With Real Application Clusters
Note 238278.1 How to find the current OCFS or OCFS2 version for Linux
Note 811306.1 RAC Assurance Support Team: RAC Starter Kit and Best Practices (Linux)

x$ksusecst 内部视图详解

9i 中v$session_wait 是Oracle wait interface的一个主要用户接口,而该动态视图的内容来源于x$ksusecst内部视图:


SQL> select view_definition from v$fixed_view_definition where view_name='GV$SESSION_WAIT';

VIEW_DEFINITION
--------------------------------------------------------------------------------
select s.inst_id,s.indx,s.ksussseq,e.kslednam, e.ksledp1,s.ksussp1,s.ksussp1r,e.
ksledp2, s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r, decode(s.ksusstim,
0,0,-1,-1,-2,-2,   decode(round(s.ksusstim/10000),0,-1,round(s.ksusstim/10000)))
, s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',  -1, '
WAITED SHORT TIME', 'WAITED KNOWN TIME')  from x$ksusecst s, x$ksled e where bit
and(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussop
c=e.indx

SQL> desc x$ksusecst
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ADDR                                               RAW(4)
//即 v$session中 saddr 会话的起始地址
 INDX                                               NUMBER
//即 instance_id
 INST_ID                                            NUMBER
//即 sid
 KSSPAFLG                                           NUMBER
 KSUSEFLG                                           NUMBER
//该session是否仍活着, 1 为 alive
 KSUSENUM                                           NUMBER
//另一个固有编号
 KSUSSSEQ                                           NUMBER
// 相当于v$session 视图的SERIAL#列
 KSUSSOPC                                           NUMBER
// 对应x$ksled视图indx列,等待事件列表的一个序列号
 KSUSSP1                                            NUMBER
// 即v$session_wait表的p1列
 KSUSSP1R                                           RAW(4)
// 即v$session_wait表的p1raw
 KSUSSP2                                            NUMBER
// 即v$session_wait表的p2
 KSUSSP2R                                           RAW(4)
// 即v$session_wait表的p2raw
 KSUSSP3                                            NUMBER
// 即v$session_wait表的p3
 KSUSSP3R                                           RAW(4)
// 即v$session_wait表的p3raw
 KSUSSTIM                                           NUMBER
// 即v$session_wait表的wait_time,但单位为微秒
 KSUSEWTM                                           NUMBER
// 即v$session_wait表的seconds_in_wait,单位仍为秒

粗略写了一个可以代替v$session_wait视图的查询语句,过滤了可能出现的空闲等待事件,同时细化wait_time列到us级别:


select s.inst_id,
       s.indx sid,
       s.ksussseq seq#,
       e.kslednam event,
       e.ksledp1 p1text,
       s.ksussp1 p1,
       s.ksussp1r p1raw,
       e.ksledp2 p2text,
       s.ksussp2 p2,
       s.ksussp2r p2raw,
       e.ksledp3 p3text,
       s.ksussp3 p3,
       s.ksussp3r p3raw,
       s.ksusstim wait_time,
       s.ksusewtm seconds_in_wait,
       decode(s.ksusstim,
              0,
              'WAITING',
              -2,
              'WAITED UNKNOWN TIME',
              -1,
              'WAITED SHORT TIME',
              'WAITED KNOWN TIME') state
 from x$ksusecst s, x$ksled e
 where bitand(s.ksspaflg, 1) != 0
   and bitand(s.ksuseflg, 1) != 0
   and s.ksussseq != 0
   and s.ksussopc = e.indx
   and e.kslednam not in ('pmon timer',
                          'VKTM Logical Idle Wait',
                          'VKTM Init Wait for GSGA',
                          'IORM Scheduler Slave Idle Wait',
                          'rdbms ipc message',
                          'i/o slave wait',
                          'VKRM Idle',
                          'wait for unread message on broadcast channel',
                          'wait for unread message on multiple broadcast channels',
                          'class slave wait',
                          'KSV master wait',
                          'PING',
                          'watchdog main loop',
                          'DIAG idle wait',
                          'ges remote message',
                          'gcs remote message',
                          'heartbeat monitor sleep',
                          'SGA: MMAN sleep for component shrink',
                          'MRP redo arrival',
                          'LNS ASYNC archive log',
                          'LNS ASYNC dest activation',
                          'LNS ASYNC end of log',
                          'simulated log write delay',
                          'LGWR real time apply sync',
                          'parallel recovery slave idle wait',
                          'LogMiner builder: idle',
                          'LogMiner builder: branch',
                          'LogMiner preparer: idle',
                          'LogMiner reader: log (idle)',
                          'LogMiner reader: redo (idle)',
                          'LogMiner client: transaction',
                          'LogMiner: other',
                          'LogMiner: activate',
                          'LogMiner: reset',
                          'LogMiner: find session',
                          'LogMiner: internal',
                          'Logical Standby Apply Delay',
                          'parallel recovery coordinator waits for slave cleanup',
                          'parallel recovery control message reply',
                          'parallel recovery slave next change',
                          'PX Deq: Txn Recovery Start',
                          'PX Deq: Txn Recovery Reply',
                          'fbar timer',
                          'smon timer',
                          'PX Deq: Metadata Update',
                          'Space Manager: slave idle wait',
                          'PX Deq: Index Merge Reply',
                          'PX Deq: Index Merge Execute',
                          'PX Deq: Index Merge Close',
                          'PX Deq: kdcph_mai',
                          'PX Deq: kdcphc_ack',
                          'shared server idle wait',
                          'dispatcher timer',
                          'cmon timer',
                          'pool server timer',
                          'JOX Jit Process Sleep',
                          'jobq slave wait',
                          'pipe get',
                          'PX Deque wait',
                          'PX Idle Wait',
                          'PX Deq: Join ACK',
                          'PX Deq Credit: need buffer',
                          'PX Deq Credit: send blkd',
                          'PX Deq: Msg Fragment',
                          'PX Deq: Parse Reply',
                          'PX Deq: Execute Reply',
                          'PX Deq: Execution Msg',
                          'PX Deq: Table Q Normal',
                          'PX Deq: Table Q Sample',
                          'Streams fetch slave: waiting for txns',
                          'Streams: waiting for messages',
                          'Streams capture: waiting for archive log',
                          'single-task message',
                          'SQL*Net message from client',
                          'SQL*Net vector message from client',
                          'SQL*Net vector message from dblink',
                          'PL/SQL lock timer',
                          'Streams AQ: emn coordinator idle wait',
                          'EMON slave idle wait',
                          'Streams AQ: waiting for messages in the queue',
                          'Streams AQ: waiting for time management or cleanup tasks',
                          'Streams AQ: delete acknowledged messages',
                          'Streams AQ: deallocate messages from Streams Pool',
                          'Streams AQ: qmn coordinator idle wait',
                          'Streams AQ: qmn slave idle wait',
                          'Streams AQ: RAC qmn coordinator idle wait',
                          'HS message to agent',
                          'ASM background timer',
                          'auto-sqltune: wait graph update',
                          'WCR: replay client notify',
                          'WCR: replay clock',
                          'WCR: replay paused',
                          'JS external job',
                          'cell worker idle',
                          'SQL*Net message to client');

Recommended Hidden Parameters for 11gR1

Question #1:
==========
_optimizer_cost_based_transformation=false Currently set to false, should we keep or remove it for 11g upgrade?
It is a workarond for several bugs, including ORA-600 bug 6666870 fixed only in 11,2?

ANSWER
=======
_OPTIMIZER_COST_BASED_TRANSFORMATION controls whether or not the
optimizer tries different transformations against a query
using the cost with and without the transformations in order
to determine if a transformation is useful or not.
The parameter can be set to any of:
“exhaustive”, “iterative”, “linear”, “on”, “off”
giving some control over how much effort is given to costing
various transformations.

Cost based transformation can add a high overhead at parse time
but can yeild considerable benefits by way of a better plan
for the statement.

Known bugs
6666870 11.2 OERI:qctcte1 from cost based transformation
8541212 11.2 OERI [qctcte1] with function based index and OLD style join push predicate

Question #2: _undo_autotune=false Currently set to false, should we remove it for 11g upgrade?
Search key: _undo_autotune 11.1.0.7
Bug.8430038/7291739 ORA-1628 MAX # EXTENTS 32765 REACHED FOR ROLLBACK SEGMENT _SYSSMU105_123755639:
Fixed in 11.2
Patch available

If you leave the _undo_autotune=false in the parameter file in 11.1.0.7, then you will have to manually adjust
UNDO_RETENTION, and none of the historical information would be captured in undostats. It is better to remove this
parameter and allow AUM to administer the tuned retention for you.
However, in 11.1.0.7 there is a bug that can occur for which the workaround is to set it to false. This is Bug 7291739.

So my recommendation is to remove the parameter, allowing _undo_autotune to default to true, then install the fix for
Bug 7291739 in 11.1.0.7.

Question #3:
===========
_unnest_subquery=false – Currently NOT set, but recommended by PeopleSoft in note ID 749100.1
“Operating System, RDBMS & Additional Component Patches Required for Installation PeopleTools 8.49”

ANSWER
=======
_UNNEST_SUBQUERY
This parameter controls whether the optimizer attempts to unnest
correlated subqueries or not.

Known bugs
8245217 11.2 Dump [vopcpl] unnesting subquery

How to check and disable Adaptive Cursor Sharing in 11g

_optimizer_adaptive_cursor_sharing=false disables the feature.

There are 2 new columns in V$sql , IS_BIND_SENSITIVE and IS_BIND_AWARE that indicate the status for individual cursors.

1.) The parameter “_optimizer_adaptive_cursor_sharing” can be changed “on the fly”. This means if you issue an ‘alter system set “_optimizer_adaptive_cursor_sharing” = false |true; ‘ will be reflected in any existing session.

Remember, to disable ACS in 11g ,you should also set alter  system set “_optimizer_extended_cursor_sharing_rel”=’NONE’;

The parameter can be set at session or system level.
When set to NONE it stops the code from maintaining the internal statistical data about the binds.

 

And I advise you set “_optimizer_extended_cursor_sharing” = NONE .

 

2.) show parameter will always retrieve non-default settings also for hidden parameters:

sho parameter adapt
_optimizer_adaptive_cursor_sharing boolean FALSE

1.) non-default hidden (=underscore) parameters are shown with “show parameter ”
2.) the setting of hidden (=underscore) parameters are not supposed to be queried by end users.
3.) You may use 10053 tracing for obtaining the information for optimizer related parameters

sqlplus
set lines 200
set null null
set pages 99
set timi on
set time on

alter session set max_dump_file_size=unlimited;
alter session set events ‘10053 trace name context forever, level 1’;

— execute a statement causing a hardparse:

select /* a new comment */ * from dual;

exit

-> Use an editor or an unix command ( ie grep) and search for the _optimizer_adaptive_cursor_sharing parameter in the tracefile.

If you want to restore Optimizer_enabled_features from 11.2.0.1 to 10.2.0.4 , then you set:

alter session set "_optimizer_undo_cost_change" = '10.2.0.4'; -- 11.2.0.1
alter session set "_optimizer_null_aware_antijoin" = false; -- true
alter session set "_optimizer_extend_jppd_view_types" = false; -- true
alter session set "_replace_virtual_columns" = false; -- true
alter session set "_first_k_rows_dynamic_proration" = false; -- true
alter session set "_bloom_pruning_enabled" = false; -- true
alter session set "_optimizer_multi_level_push_pred" = false; -- true
alter session set "_optimizer_group_by_placement" = false; -- true
alter session set "_optimizer_extended_cursor_sharing_rel" = none; -- simple
alter session set "_optimizer_adaptive_cursor_sharing" = false; -- true
alter session set "_optimizer_improve_selectivity" = false ; -- true
alter session set "_optimizer_enable_density_improvements" = false; -- true
alter session set "_optimizer_native_full_outer_join" = off; -- force
alter session set "_optimizer_enable_extended_stats" = false; -- true
alter session set "_nlj_batching_enabled" = 0; -- 1
alter session set "_optimizer_extended_stats_usage_control" = 255; -- 224
alter session set "_bloom_folding_enabled" = false; -- true
alter session set "_optimizer_coalesce_subqueries" = false; -- true
alter session set "_optimizer_fast_pred_transitivity" = false; -- true
alter session set "_optimizer_fast_access_pred_analysis" = false; -- true
alter session set "_optimizer_unnest_disjunctive_subq" = false; -- true
alter session set "_optimizer_unnest_corr_set_subq" = false; -- true
alter session set "_optimizer_distinct_agg_transform" = false; -- true
alter session set "_aggregation_optimization_settings" = 32; -- 0
alter session set "_optimizer_connect_by_elim_dups" = false; -- true
alter session set "_optimizer_eliminate_filtering_join" = false; -- true
alter session set "_connect_by_use_union_all" = old_plan_mode; --true
alter session set "_optimizer_join_factorization" = false; -- true
alter session set "_optimizer_use_cbqt_star_transformation" = false; -- true
alter session set "_optimizer_table_expansion" = false ; -- true
alter session set "_and_pruning_enabled" = false ; -- true
alter session set "_optimizer_distinct_placement" = false ; -- true
alter session set "_optimizer_use_feedback" = false ; -- true
alter session set "_optimizer_try_st_before_jppd" = false ; -- true

DIAG Background process may consume Large PGA Size

Found that background process of diag is occupied high pga memory usage in RAC of node 1.
Value of PGA memory usage is captured by “select sid, name, value from v$statname n, v$sesstat s where n.statistic# = s.statistic# and n.name like ‘%memory%’ and s.sid=481order by sid;”

Why occuried high pga memory usage of background process of diag in node 1??

====================================================================================================
SID/Serial : 481,1
Foreground : PID: 14326 – oracle@askmac.cn (DIAG)
Shadow : PID: 14326 – oracle@askmac.cn (DIAG)
Terminal : UNKNOWN/ UNKNOWN
OS User : oracle on askmac.cn
Ora User :
Status Flags: ACTIVE DEDICATED BACKGROUND
Tran Active : NONE
Login Time : Fri 17:10:26
Last Call : Fri 17:10:27 – 8,251.4 min
Lock/ Latch : NONE/ NONE
Latch Spin : NONE
Current SQL statement:
Previous SQL statement:
Session Waits:
EVENT P2TEXT P2 seconds_in_w
—————————– ———— ———— ————
DIAG idle wait where 1 0
====================================================================================================

RAC-node 1
===========

SID NAME VALUE
———- —————————————————————- ———-
481 session uga memory 180984
481 session uga memory max 180984
481 session pga memory 1647496248
481 session pga memory max 1647496248
481 redo k-bytes read (memory) 0
481 redo k-bytes read (memory) by LNS 0
481 workarea memory allocated 0
481 sorts (memory) 0

RAC-node 2
===========

SID NAME VALUE
———- —————————————————————- ———-
481 session uga memory 180984
481 session uga memory max 180984
481 session pga memory 5950520
481 session pga memory max 5950520
481 redo k-bytes read (memory) 0
481 redo k-bytes read (memory) by LNS 0
481 workarea memory allocated 0
481 sorts (memory) 0

Bug 5092124 : PGA MEMORY FOR DIAG PROCESS LEAKS WHEN DUMPING KST TRACE

1. Please provide the output of the following query:
sql> select a.sid,a.program,b.name,c.value from v$session a,v$sysstat b,v$sesstat c where a.program like ‘%DIAG%’ and a.sid = c.sid and b.name like ‘%pga%’ and b.statistic# = c.statistic#;

2. Provide the output of the following command:
ps -ef | grep diag

3. Perform following test case:

1. Confirm the size of DIAG’s PGA.
.
select a.sid,a.program,b.name,c.value from v$session a,v$sysstat b,v$sesstat c where a.program like ‘%DIAG%’ and a.sid = c.sid and b.name like ‘%pga%’
and b.statistic# = c.statistic#;
.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 798524
169 oracle@jpdel1380 (DIAG) session pga memory max 798524
.
2. Connect 50 sessions via sqlplus.
.
3. Kill one of shadow process.
.
Eg.
% ps -ef | grep rac1022
rac1022 15626 15618 0 20:31 ? 00:00:00 oraclerac10221
(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
.
% kill -11 15626
.
4. DIAG dump KST traces under cdmp_xxxxx directory.
.
5. Confirm the size of DIAG’s PGA.
.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 2699068
169 oracle@jpdel1380 (DIAG) session pga memory max 2699068
.
6. Perform the same steps as 2-5.
Confirm the size of DIAG’s PGA.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 3944252
169 oracle@jpdel1380 (DIAG) session pga memory max 3944252

==> PGA for DIAG process increases.

1. Please provide the output of the following query:
sql> select a.sid,a.program,b.name,c.value from v$session a,v$sysstat b,v$sesstat c where a.program like ‘%DIAG%’ and a.sid = c.sid and b.name like ‘%pga%’ and b.statistic# = c.statistic#;

2. Provide the output of the following command:
ps -ef | grep diag

3. Perform following test case:

1. Confirm the size of DIAG’s PGA.
.
select a.sid,a.program,b.name,c.value from v$session a,v$sysstat b,v$sesstat c where a.program like ‘%DIAG%’ and a.sid = c.sid and b.name like ‘%pga%’
and b.statistic# = c.statistic#;
.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 798524
169 oracle@jpdel1380 (DIAG) session pga memory max 798524
.
2. Connect 50 sessions via sqlplus.
.
3. Kill one of shadow process.
.
Eg.
% ps -ef | grep rac1022
rac1022 15626 15618 0 20:31 ? 00:00:00 oraclerac10221
(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
.
% kill -11 15626
.
4. DIAG dump KST traces under cdmp_xxxxx directory.
.
5. Confirm the size of DIAG’s PGA.
.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 2699068
169 oracle@jpdel1380 (DIAG) session pga memory max 2699068
.
6. Perform the same steps as 2-5.
Confirm the size of DIAG’s PGA.
SID PROGRAM NAME VALUE
—– ———————– ———————- ———-
169 oracle@jpdel1380 (DIAG) session pga memory 3944252
169 oracle@jpdel1380 (DIAG) session pga memory max 3944252

==> PGA for DIAG process increases.

1. AWR report of one hour from all the instances when the pga usage is high by diag.
2. Database alert.log file from all the instances.
3. init.ora or spfile used in the db.
4. output of the following :
show parameter “_trace_buffer”

Ignore gsd resource failed to start above 10g

On : 10.2.0.1 version, Real Application Cluster

When attempting to start gsd resource.
the following error occurs.

ERROR
———————–
Auto-start failed for the CRS resource .

Trac the issue with note:
Tracing GSD, SRVCTL, GSDCTL, VIPCA and SRVCONFIG (Doc ID 178683.1)

Tracing GSD, SRVCTL, GSDCTL, VIPCA and SRVCONFIG

PURPOSE
-------

The Purpose of this document is to assist in debugging SRVCTL, GSD, GSDCTL, VIPCA,
and SRVCONFIG problems.

SCOPE & APPLICATION
-------------------

This document is for support analysts to troubleshoot SRVCTL, GSD, GSDCTL, VIPCA,
and SRVCONFIG issues.

TRACING GSD, SRVCTL, GSDCTL, VIPCA, and SRVCONFIG
------------------------------------------

To provide verbose output for SRVCTL, GSD, GSDCTL, VIPCA, or SRVCONFIG, tracing can
be enabled to provide additional screen output.

--------------------------------------------------------------------------

10g:

Just set the environment variable SRVM_TRACE to true to trace all of the
SRVM files like gsd, srvctl, vipca, and ocrconfig.

--------------------------------------------------------------------------

9i:

To Trace GSD:
-------------
1. vi the gsd.sh file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebingsd.bat file and choose Edit.

2. At the end of the file, look for the following line:

  exec $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.OPSMDaemon $MY_OHOME

3. Add the following just before the -classpath in the 'exec $JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsd.sh file, the string should now look like this:

  exec $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsd.sh:

 [opcbsol1]/u01/home/usupport> gsd.sh
 [main][9:31:8:860] Daemon: argument is /u01/32bit/app/oracle/product/9.0.1
 [main][9:31:8:893] tracing is true; at level 2
 [main][9:31:8:893] trace file is /u01/32bit/app/oracle/product/9.0.1/srvm/log/gsdaemon.log
 cont...

To Trace SRVCTL:
---------------
1. vi the srvctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebinsrvctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.opsctl.OPSCTLDriver "$@"

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvctl:

 [opcbsol1]/u01/home/usupport> srvctl status -p V90321
 [main][9:33:2:968] srvctl: tracing is true at level 2
 [main][9:33:3:38] Going into GetActiveNodes constructor...
 [main][9:33:3:59] Detected Cluster
 [main][9:33:3:60] Cluster existence = true
 [main][9:33:3:95] loaded library
 [main][9:33:3:108] Inside GetActiveNodes.initializeCluster
 [main][9:33:3:264] The status string is: 1
 [main][9:33:3:265] The result string is: Everything ok So Far 1
 cont...

To Trace GSDCTL:
---------------
1. vi the gsdctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebingsdctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.GSDCTLDriver...

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsdctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsdctl:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> gsdctl stat
  [main] [15:41:34:849] [GetActiveNodes.create:Compile]  Going into GetActiveNodes
  [main] [15:41:34:918] [sQueryCluster.:Compile]  Detected Cluster
  [main] [15:41:34:922] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...

To Trace SRVCONFIG:
-------------------
1. vi the srvconfig file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebinsrvconfig.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.rawdevice.RawDeviceUtil $*

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvconfig file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvconfig:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> srvconfig -version
  [main] [16:0:58:395] [RawDeviceUtil.getDeviceName:Compile]
  [main] [16:0:58:454] [sQueryCluster.:Compile]  Detected Cluster
  [main] [16:0:58:457] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...

Failed to start GSD on local node

PROBLEM
-------

AIX 5L cannot successfully start gsd on any node of the cluster.
Get error "Failed to start GSD on local node"

SOLUTION
--------
Ensure that the user (oracle) is added to the HAGSUSER UNIX group.

If the gsd still fails, turn on tracing of the GSD.
Simply turning on GSD tracing, allowed for the GSD to start successfully.

Look at note 178683.1 for how to enable GSD tracing.

LOG FILE
-----------------------
Filename =crsd.log
See the following error:
2009-01-02 08:08:27.838: [ CRSCOMM][12351]32Receive message header caa_clsrecv ret 11
2009-01-02 08:08:27.838: [ CRSCOMM][12351]32Error reading response IOException : Didn't receive header part of message
(File: caa_Message.cpp, line: 711

2009-01-02 08:08:27.838: [ CRSEVT][12351]32invokepeer ret 300
2009-01-02 08:08:27.838: [ CRSRES][12351]32Remote start failed to execute on ccdb_b: X_E2E_NoResponse :
(File: caa_CmdRTI.cpp, line: 507

2009-01-02 08:08:27.839: [ CRSRES][12351][ALERT]32Remote start for `ora.ccdb_b.gsd` failed on member `ccdb_b`
2009-01-02 08:08:27.914: [ OCRMAS][3611]th_master:13: I AM THE NEW OCR MASTER at incar 6. Node Number 1
2009-01-02 08:08:27.915: [ OCRRAW][3611]proprioo: for disk 0 (/dev/ro_ocr_raw), id match (1), my id set
(1731740172,1028247821) total id sets (1), 1st set (1731740172,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2009-01-02 08:08:27.916: [ OCRRAW][3611]rrecovernumpage: numpage on device is not correct (0); recalculate (262075)
2009-01-02 08:08:27.922: [ OCRMAS][3611]th_master: Deleted ver keys from cache (master)
2009-01-02 08:08:30.996: [ CLSVER][527]32Returned from grpstat with event 1
2009-01-02 08:08:30.996: [ CLSVER][527]32Doing grpstat on crs_version group
2009-01-02 08:08:58.400: [ CRSCOMM][13127]32CLEANUP: Searching for connections to failed node ccdb_b
2009-01-02 08:08:58.400: [ CRSEVT][13127]32Processing member leave for ccdb_b, incarnation: 7
2009-01-02 08:08:58.402: [ CRSD][13127]32SM: recovery in process: 8
2009-01-02 08:08:58.402: [ CRSEVT][13127]32Do failover for: ccdb_b
2009-01-02 08:08:58.418: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.435: [ CRSRES][13127]32Not failing resource ora.ccdb_a.gsd because it was locked.
2009-01-02 08:08:58.435: [ CRSRES][13127]32X_RES_Unavailable : Resource ora.ccdb_a.gsd is locked
(File: rti.cpp, line: 976

2009-01-02 08:08:58.438: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.444: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.491: [ CRSRES][13898]32startRunnable: setting CLI values

On the customer ‘s environment other Aix platform got the same issues as this machine .
Due to this reason ,we considered the issue is cause of setups and gsd resource won’t impact the oracle or other applications above the version (10G) .

Work arounds
Manually disable the gsd resource :
1.Use crs_unregister to delete the resource from CRS then CRS won’t attempt to start the gsd resource .
Hard code the during checking the status
2.Hard code the gsd.sh return the status Online ,to show the status Online ;

GSD resource won’t impace the CRS or Database above the version 10g

rac中手动关闭的Listener资源会自动重启?

几个月前,在客户的一次演练测试中,一位乙方项目组的经理指出:在RAC环境中使用lsnrctl命令关闭监听,Oracle CRS会自动将该监听重启。客户对他的这个说法十分重视,同时向我咨询,CRS确实会定期对所有资源进行检查,并可能重新启动以外终止的资源;但手动使用lsnrctl关闭监听绝对不能算在以外终止的范畴当中。这位乙方的项目经理年纪已界中年,项目经验十分丰富,而且说起这个问题来信誓旦旦(十分反感这样的自信),不由得别人不信;当时我向客户具体介绍了CRS重启资源的原理,并阐述了我认为“不会重启”的观点,因为不能排除一些意外因素(我认识的Oracle总是带来惊喜),我的口气并不如乙方项目经理那么肯定,客户负责人也只有将信将疑,并认为可以实际测试一下。

当时的测试记录没有保留,我们来看一下RHEL 5.5上Oracle RAC 10.2.0.5版本中的表现(实际与AIX上10.2.0.4的表现一致):

[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    ONLINE    ONLINE    rh2
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2


[maclean@rh2 ~]$ ps -ef|grep tns
maclean   4098 17071  0 19:35 pts/0    00:00:00 grep tns
maclean  11062     1  0 11:34 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit


[maclean@rh2 ~]$ lsnrctl stop LISTENER_RH2

LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 27-JUN-2010 19:35:46

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rh2-vip)(PORT=1521)(IP=FIRST)))
The command completed successfully


[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    OFFLINE   OFFLINE           // TARGET被置为OFFLINE是不会被重启的
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2


[maclean@rh2 ~]$ crs_start ora.rh2.LISTENER_RH2.lsnr
Attempting to start `ora.rh2.LISTENER_RH2.lsnr` on member `rh2`
Start of `ora.rh2.LISTENER_RH2.lsnr` on member `rh2` succeeded.


[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    ONLINE    ONLINE    rh2
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2


[maclean@rh2 ~]$ ps -ef|grep tns
maclean   4629     1  0 19:37 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit
maclean   5212 17071  0 19:38 pts/0    00:00:00 grep tns


[maclean@rh2 ~]$ kill -9 4629

[maclean@rh2 ~]$ ps -ef|grep tns
maclean   5333 17071  0 19:38 pts/0    00:00:00 grep tns


[maclean@rh2 ~]$ date
Sun Jun 27 19:38:59 EDT 2010


//过10分钟再来看看

[maclean@rh2 ~]$ ps -ef|grep tns
maclean   8655     1  0 19:47 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit
maclean   9252 17071  0 19:48 pts/0    00:00:00 grep tns

[maclean@rh2 ~]$ lsnrctl status LISTENER_RH2

LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 27-JUN-2010 19:48:43

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rh2-vip)(PORT=1521)(IP=FIRST)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_RH2
Version                   TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Start Date                27-JUN-2010 19:47:07
Uptime                    0 days 0 hr. 1 min. 35 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /s01/rac10g/network/admin/listener.ora
Listener Log File         /s01/rac10g/network/log/listener_rh2.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.104)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.103)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
Service "+ASM_XPT" has 1 instance(s).
  Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
Service "S13" has 1 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
Service "racdb" has 2 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
Service "racdbXDB" has 2 instance(s).
  Instance "racdb1", status READY, has 1 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
Service "racdb_XPT" has 2 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
The command completed successfully

LISTENER资源的默认CHECK_INTERVAL为600秒,即10分钟内CRS会检测到LISTENER的意外终止并尝试重新启动该资源。经过上述在生产主机上的测试,乙方的项目经理觉得有些不可思议,同时客户也认同了我的观点。其实如乙方项目经理所作出的那样肯定的论调即使在其他地方也是不少见的,他们大多全通过实践来学习和认识Oracle,这点没有问题,实践可以教会我们大多数东西,但同时如果我们对事物的认识全部来自实践又往往不全面了,乙方项目经理所犯得就是这种错误,可能他在某次case当中遇到过类似的带有错误指导性情况,同时他也没有反复阅读过官方文档并没有在事后去深入了解整个事件的逻辑因果,并凭借着多年的经验果断地为该问题下了十分肯定的结论。
中国企业目前的IT基建大多由集成商完成,在整个it环节中集成商扮演了十分重要的角色;随着阅历的丰富,渐渐发觉集成商处集结着大量如这位经理般,年龄或大或小,经验或多多少,但说起技术来大多没完没了,深怕别人不知道自己会这会那的,他们在发表自己论点的口气决不允许半点质疑!

Find Past Image in RAC Global Cache

Instance 1:

Start dump data blocks tsn: 0 file#:1 minblk 95753 maxblk 95753
Block dump from cache:
Dump of buffer cache at level 4 for tsn=0, rdba=4290057
BH (0x8df55108) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc86000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x8df54308,0xd6cb7438] lru: [0x96ecccd8,0xd6e9a080]
  ckptq: [NULL] fileq: [NULL] objq: [0xcedd4f18,0xcedd4f18] objaq: [0x96eccd10,0xcedd4f08]
  use: [0xd6e1ee70,0xd6e1ee70] wait: [NULL]
  st: READING md: EXCL tch: 0 le: 0x7bfa6000                  reading gc,associated with lock element 0x7bfa6000
  flags: only_sequential_access
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x8df54258) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc72000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x96eccca0,0x8df551b8] lru: [0xd6e9a080,0x8df54a98]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  use: [0xd6e22f30,0xd6e22f30] wait: [NULL]
  st: CR md: SHR tch: 0 le: (nil)
  cr: [scn: 0x0.ac6456],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.ac6456],[sfl: 0x2],[lc: 0x0.ac6451]
  flags: only_sequential_access
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x96eccbf0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x960ec000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x8df54a60,0x8df54308] lru: [0x8df547a8,0x8df551f0]
  obj-flags: object_ckpt_list
  ckptq: [0x8eecba50,0x8df54ca0] fileq: [0xd6eb8370,0x8df679c8] objq: [0xcedd4f28,0xcedd4f28] objaq: [0xcedd4f08,0x8df55228]
  st: PI md: NULL tch: 1 le: 0x7bfa6000                        -- Past Image
  cr: [scn: 0x0.ac6452],[xid: 0x0],[uba: 0x0],[cls: 0x0.ac6452],[sfl: 0x0]
  flags: buffer_dirty remote_transfered
  LRBA: [0x132.a1c.0] LSCN: [0x0.abb823] HSCN: [0x0.ac6451] HSUB: [2]
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x8df549b0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc7c000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x8df53ea0,0x96eccca0] lru: [0x8df54340,0x8df544b8]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.ac643a],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.ac643a],[sfl: 0x0],[lc: 0x0.0]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x8df53df0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc6c000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x8df54480,0x8df54a60] lru: [0x8df544b8,0x8df541c8]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.ac642a],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.ac642a],[sfl: 0x2],[lc: 0x0.ac6426]
  flags: only_sequential_access
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x8df543d0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc74000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x8df54190,0x8df53ea0] lru: [0x8df54a98,0x8df53ed8]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.ac6427],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.ac6427],[sfl: 0x0],[lc: 0x0.0]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0x8df540e0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0x8dc70000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 58,28
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0x92ec39c0,0x8df54480] lru: [0x8df53ed8,0x8eecc870]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.ac641a],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.ac641a],[sfl: 0x0],[lc: 0x0.0]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
Block dump from disk:
buffer tsn: 0 rdba: 0x00417609 (1/95753)
scn: 0x0000.00abb7cf seq: 0x01 flg: 0x06 tail: 0xb7cf0601
frmt: 0x02 chkval: 0xeab4 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00007F85F7BA9A00 to 0x00007F85F7BABA00
7F85F7BA9A00 0000A206 00417609 00ABB7CF 06010000  [.....vA.........]
7F85F7BA9A10 0000EAB4 00000001 0001495E 00ABB7CE  [........^I......]
7F85F7BA9A20 00000000 00030002 00000000 00080012  [................]
7F85F7BA9A30 00000AE2 01804A41 0009010A 00002001  [....AJ....... ..]
7F85F7BA9A40 00ABB7CF 000B0011 00000822 01804BA9  [........"....K..]
7F85F7BA9A50 001200BD 00008000 00ABB7CD 00010100  [................]
7F85F7BA9A60 0014FFFF 1F830376 00001F83 03760001  [....v.........v.]
7F85F7BA9A70 2C000000 0401002C 641E57C3 0301002C  [...,,....W.d,...]
7F85F7BA9A80 2C1E57C3 C3040100 2C641D57 C3030100  [.W.,....W.d,....]
7F85F7BA9A90 002C1D57 57C30401 002C641C 57C30301  [W.,....W.d,....W]
7F85F7BA9AA0 01002C1C 1B57C304 01002C64 1B57C303  [.,....W.d,....W.]
7F85F7BA9AB0 0401002C 641A57C3 0301002C 2C1A57C3  [,....W.d,....W.,]
7F85F7BA9AC0 C3040100 2C641957 C3030100 002C1957  [....W.d,....W.,.]
7F85F7BA9AD0 57C30401 002C6418 57C30301 01002C18  [...W.d,....W.,..]
7F85F7BA9AE0 1757C304 01002C64 1757C303 0401002C  [..W.d,....W.,...]
7F85F7BA9AF0 641657C3 0301002C 2C1657C3 C3040100  [.W.d,....W.,....]
7F85F7BA9B00 00000000 00000000 00000000 00000000  [................]
        Repeat 44 times
7F85F7BA9DD0 012C0000 5CC30401 002C3B0B 5CC30301  [..,....\.;,....\]
7F85F7BA9DE0 01002C0B 0A5CC304 01002C64 0A5CC303  [.,....\.d,....\.]
7F85F7BA9DF0 0401002C 64095CC3 0301002C 2C095CC3  [,....\.d,....\.,]
7F85F7BA9E00 C3040100 2C64085C C3030100 002C085C  [....\.d,....\.,.]
7F85F7BA9E10 5CC30401 002C6407 5CC30301 01002C07  [...\.d,....\.,..]
7F85F7BA9E20 065CC304 01002C64 065CC303 0401002C  [..\.d,....\.,...]
7F85F7BA9E30 64055CC3 0301002C 2C055CC3 C3040100  [.\.d,....\.,....]
7F85F7BA9E40 2C64045C C3030100 002C045C 5CC30401  [\.d,....\.,....\]
7F85F7BA9E50 002C6403 5CC30301 01002C03 025CC304  [.d,....\.,....\.]
7F85F7BA9E60 01002C64 025CC303 0401002C 64015CC3  [d,....\.,....\.d]
7F85F7BA9E70 0201002C 002C5CC3 5BC30401 002C6464  [,....\,....[dd,.]
7F85F7BA9E80 5BC30301 01002C64 635BC304 01002C64  [...[d,....[cd,..]
7F85F7BA9E90 635BC303 0401002C 64625BC3 0301002C  [..[c,....[bd,...]
7F85F7BA9EA0 2C625BC3 C3040100 2C64615B C3030100  [.[b,....[ad,....]
7F85F7BA9EB0 002C615B 5BC30401 002C6460 5BC30301  [[a,....[`d,....[]
7F85F7BA9EC0 01002C60 5F5BC304 01002C64 5F5BC303  [`,....[_d,....[_]
7F85F7BA9ED0 0401002C 645E5BC3 0301002C 2C5E5BC3  [,....[^d,....[^,]
7F85F7BA9EE0 C3040100 2C645D5B C3030100 002C5D5B  [....[]d,....[],.]
7F85F7BA9EF0 5BC30401 002C645C 5BC30301 01002C5C  [...[\d,....[\,..]
7F85F7BA9F00 5B5BC304 01002C64 5B5BC303 0401002C  [..[[d,....[[,...]
7F85F7BA9F10 645A5BC3 0301002C 2C5A5BC3 C3040100  [.[Zd,....[Z,....]
7F85F7BA9F20 2C64595B C3030100 002C595B 5BC30401  [[Yd,....[Y,....[]
7F85F7BA9F30 002C6458 5BC30301 01002C58 575BC304  [Xd,....[X,....[W]
7F85F7BA9F40 01002C64 575BC303 0401002C 64565BC3  [d,....[W,....[Vd]
7F85F7BA9F50 0301002C 2C565BC3 C3040100 2C64555B  [,....[V,....[Ud,]
7F85F7BA9F60 C3030100 002C555B 5BC30401 002C6454  [....[U,....[Td,.]
7F85F7BA9F70 5BC30301 01002C54 535BC304 01002C64  [...[T,....[Sd,..]
7F85F7BA9F80 535BC303 0401002C 64525BC3 0301002C  [..[S,....[Rd,...]
7F85F7BA9F90 2C525BC3 C3040100 2C64515B C3030100  [.[R,....[Qd,....]
7F85F7BA9FA0 002C515B 5BC30401 002C6450 5BC30301  [[Q,....[Pd,....[]
7F85F7BA9FB0 01002C50 4F5BC304 01002C64 4F5BC303  [P,....[Od,....[O]
7F85F7BA9FC0 0401002C 644E5BC3 0301002C 2C4E5BC3  [,....[Nd,....[N,]
7F85F7BA9FD0 C3040100 2C644D5B C3030100 002C4D5B  [....[Md,....[M,.]
7F85F7BA9FE0 5BC30401 002C644C 5BC30301 01002C4C  [...[Ld,....[L,..]
7F85F7BA9FF0 4B5BC304 01002C64 4B5BC303 0401002C  [..[Kd,....[K,...]
7F85F7BAA000 644A5BC3 0301002C 2C4A5BC3 C3040100  [.[Jd,....[J,....]
7F85F7BAA010 2C64495B C3030100 002C495B 5BC30401  [[Id,....[I,....[]
7F85F7BAA020 002C6448 5BC30301 01002C48 475BC304  [Hd,....[H,....[G]
7F85F7BAA030 01002C64 475BC303 0401002C 64465BC3  [d,....[G,....[Fd]
7F85F7BAA040 0301002C 2C465BC3 C3040100 2C64455B  [,....[F,....[Ed,]
7F85F7BAA050 C3030100 002C455B 5BC30401 002C6444  [....[E,....[Dd,.]
7F85F7BAA060 5BC30301 01002C44 435BC304 01002C64  [...[D,....[Cd,..]
7F85F7BAA070 435BC303 0401002C 64425BC3 0301002C  [..[C,....[Bd,...]
7F85F7BAA080 2C425BC3 C3040100 2C64415B C3030100  [.[B,....[Ad,....]
7F85F7BAA090 002C415B 5BC30401 002C6440 5BC30301  [[A,....[@d,....[]
7F85F7BAA0A0 01002C40 3F5BC304 01002C64 3F5BC303  [@,....[?d,....[?]
7F85F7BAA0B0 0401002C 643E5BC3 0301002C 2C3E5BC3  [,....[>d,....[>,]
7F85F7BAA0C0 C3040100 2C643D5B C3030100 002C3D5B  [....[=d,....[=,.]
7F85F7BAA0D0 5BC30401 002C643C 5BC30301 01002C3C  [...[d,....Z>,....Z]
7F85F7BAA6A0 002C643D 5AC30301 01002C3D 3C5AC304  [=d,....Z=,....Z<]
7F85F7BAA6B0 01002C64 3C5AC303 0401002C 643B5AC3  [d,....Z<,....Z;d]
7F85F7BAAC70 59C30301 01002C3E 3D59C304 01002C64  [...Y>,....Y=d,..]
7F85F7BAAC80 3D59C303 0401002C 643C59C3 0301002C  [..Y=,....Yd,....X>]
7F85F7BAB250 0401002C 643D58C3 0301002C 2C3D58C3  [,....X=d,....X=,]
7F85F7BAB260 C3040100 2C643C58 C3030100 002C3C58  [....Xd,....W>,....]
7F85F7BAB830 2C643D57 C3030100 002C3D57 57C30401  [W=d,....W=,....W]
7F85F7BAB840 002C643C 57C30301 01002C3C 3B57C304  [1, wm 32768, RMno 0, reminc 62, dom 0]
 Block header dump:  0x00417609
 Object id on Block? Y
 seg/obj: 0x1495e  csc: 0x00.abb7ce  itc: 2  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0012.008.00000ae2  0x01804a41.010a.09  --U-    1  fsc 0x0000.00abb7cf
0x02   0x0011.00b.00000822  0x01804ba9.00bd.12  C---    0  scn 0x0000.00abb7cd
bdba: 0x00417609
data_block_dump,data header at 0x7f85f7ba9a5c
===============
tsiz: 0x1fa0
hsiz: 0x14
pbl: 0x7f85f7ba9a5c
     76543210
flag=--------
ntab=1
nrow=1
frre=-1
fsbo=0x14
fseo=0x376
avsp=0x1f83
tosp=0x1f83
0xe:pti[0]	nrow=1	offs=0
0x12:pri[0]	offs=0x376
block_row_dump:
tab 0, row 0, @0x376
tl: 8 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 4]  c3 5c 0b 3b
end_of_block_dump
GLOBAL CACHE ELEMENT DUMP (address: 0x7bfa6000):          --lock element
  id1: 0x17609 id2: 0x1 pkey: OBJ#84318 block: (1/95753)
  lock: NG rls: 0x0 acq: 0x3 latch: 7                     -- NULL GLOBAL
  flags: 0x20 fair: 0 recovery: 0 fpin: 'ktswh23: ktsfbkl'
  bscn: 0x0.ac6451 bctx: (nil) write: 0 scan: 0xd000005
  lcp: 0xd36ce7a0 lnk: [0xd36ce7f8,0xd36ce7f8] lch: [0x96eccd20,0x8df55238]
  seq: 34524 hist: 17 146:6 14 8 324 50 38 231 230 227 21 37:2 145:0
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x00080000 state: READING tsn: 0 tsh: 0 mode: EXCL
      pin: 'kduwh01: kdusru'
      addr: 0x8df55108 obj: 84318 cls: DATA bscn: 0x0.0
    flg: 0x08000001 state: PI tsn: 0 tsh: 1
      addr: 0x96eccbf0 obj: 84318 cls: DATA bscn: 0x0.ac6451
      piscn: 0x0.ac6452 clscn: 0x0.ac6452
 GCS SHADOW 0x7bfa6078,1 resp[0xcfe5c5e0,0x17609.1] pkey 84318.0
   grant 0 cvt 2 mdrole 0xc8 st 0x101 lst 0xc0 CONVERTQ rl G1
   master 1 owner 1 sid 0 remote[(nil),0] hist 0x88149530f062c288
   history 0x8.0x5.0xb.0x3.0xf.0x26.0x25.0xa.0x8.0x1.
   cflag 0x0 sender 0 flags 0x10 replay# 0 abast (nil).x0.1 dbmap (nil)
   disk: 0x0000.00000000 write request: 0x0000.00000000
   pi scn: 0x0000.00ac6452 sq[0xcfe5c620,0xcfe5c620]
   msgseq 0x0 updseq 0x0 reqids[22469,0,0] infop 0xcee3d880 lockseq x187
 GCS SHADOW END
 GCS RESOURCE 0xcfe5c5e0 hashq [0xdb595658,0xcfe62578] name[0x17609.1] pkey 84318.0
   grant 0xcecec160 cvt 0x7bfa6078 send 0xcecec160@1,3 write (nil),0@65536
   flag 0x10002 mdrole 0x42 mode 2 scan 0.232 role GLOBAL
   disk: 0x0000.00abb7cf write: 0x0000.00000000 cnt 0x0 hist 0xd7
   xid 0x0000.000.00000000 sid 0 pkwait 0s rmacks 0
   refpcnt 0 weak: 0x0000.00000000
   pkey 84318.0
   hv 119 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 62, dom 0]
   kjga st 0x4, step 0.0.0, cinc 64, rmno 15, flags 0x0
   lb 0, hb 0, myb 32449, drmb 32449, apifrz 0
   GCS SHADOW 0xcecec160,3 resp[0xcfe5c5e0,0x17609.1] pkey 84318.0
     grant 2 cvt 0 mdrole 0xc2 st 0x100 lst 0x40 GRANTQ rl G1
     master 1 owner 2 sid 0 remote[0x61f98bf8,43] hist 0x887299f0510e4126
     history 0x26.0x2.0x39.0x8.0x5.0x3e.0x26.0x39.0x8.0x1.
     cflag 0x0 sender 0 flags 0x10 replay# 0 abast (nil).x0.1 dbmap (nil)
     disk: 0x0000.00000000 write request: 0x0000.00000000
     pi scn: 0x0000.00ac6450 sq[0xcfe5c610,0xcfe5c610]
     msgseq 0x292b updseq 0x0 reqids[19172,0,0] infop (nil) lockseq xc4
   GCS SHADOW END
   GCS SHADOW 0x7bfa6078,1 resp[0xcfe5c5e0,0x17609.1] pkey 84318.0
     grant 0 cvt 2 mdrole 0xc8 st 0x101 lst 0xc0 CONVERTQ rl G1
     master 1 owner 1 sid 0 remote[(nil),0] hist 0x88149530f062c288
     history 0x8.0x5.0xb.0x3.0xf.0x26.0x25.0xa.0x8.0x1.
     cflag 0x0 sender 0 flags 0x10 replay# 0 abast (nil).x0.1 dbmap (nil)
     disk: 0x0000.00000000 write request: 0x0000.00000000
     pi scn: 0x0000.00ac6452 sq[0xcfe5c620,0xcfe5c620]
     msgseq 0x0 updseq 0x0 reqids[22469,0,0] infop 0xcee3d880 lockseq x187
   GCS SHADOW END
 GCS RESOURCE END
2011-06-22 20:51:21.737794 : kjbmbassert [0x17609.1]
2011-06-22 20:51:21.738025 : kjbmsassert(0x17609.1)(2)
End dump data blocks tsn: 0 file#: 1 minblk 95753 maxblk 95753

Instance 2:

Start dump data blocks tsn: 0 file#:1 minblk 95753 maxblk 95753
Block dump from cache:
Dump of buffer cache at level 4 for tsn=0, rdba=4290057
BH (0xafed7940) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf1d8000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xafed7410,0xd9fb48e0] lru: [0xaff06738,0xd52c01d8]
  obj-flags: object_ckpt_list
  ckptq: [0xd52c3d30,0xafed3d08] fileq: [0xd52c3d50,0xaff06660] objq: [0xaff06760,0xcb9c1060] objaq: [0xaff06770,0xcb9c1040]
  st: XCURRENT md: NULL tch: 1 le: 0x61f98b80
  flags: buffer_dirty remote_transfered
  LRBA: [0x55.f6e1.0] LSCN: [0x0.97bdb9] HSCN: [0x0.97bdc9] HSUB: [1]
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xafed7360) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf1d0000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xafed7298,0xafed79f0] lru: [0xd52c01d8,0xafed72d0]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.97bdc7],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.97bdc7],[sfl: 0x0],[lc: 0x0.97bdc7]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xafed71e8) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf1ce000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xafed9460,0xafed7410] lru: [0xafed7448,0xafed9498]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.97bdc5],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.97bdc5],[sfl: 0x0],[lc: 0x0.97bdc5]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xafed93b0) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf1fc000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xb0f18b48,0xafed7298] lru: [0xafed72d0,0xb0f18b80]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.97bdc3],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.97bdc3],[sfl: 0x0],[lc: 0x0.97bdc3]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xb0f18a98) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xb0762000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xafed8148,0xafed9460] lru: [0xafed9498,0xafed8180]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.97bdc1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.97bdc1],[sfl: 0x0],[lc: 0x0.97bdc1]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xafed8098) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf1e2000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xaff06700,0xb0f18b48] lru: [0xb0f18b80,0xafef6d88]
  lru-flags: moved_to_tail
  ckptq: [NULL] fileq: [NULL] objq: [NULL] objaq: [NULL]
  st: CR md: NULL tch: 1 le: (nil)
  cr: [scn: 0x0.97bdbf],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.97bdbf],[sfl: 0x0],[lc: 0x0.97bdbe]
  flags: remote_transfered
  cr pin refcnt: 0 sh pin refcnt: 0
BH (0xaff06650) file#: 1 rdba: 0x00417609 (1/95753) class: 1 ba: 0xaf5d4000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 15,19
  dbwrid: 0 obj: 84318 objn: 84318 tsn: 0 afn: 1 hint: f
  hash: [0xd9fb48e0,0xafed8148] lru: [0xafed3b00,0xafed7a28]
  obj-flags: object_ckpt_list
  ckptq: [0xafed8f48,0xafed96a0] fileq: [0xafed7950,0xd52c3d50] objq: [0xcb9c1060,0xafed7a50] objaq: [0xb5ecd178,0xafed7a60]
  st: PI md: NULL tch: 1 le: 0x61f98b80               --Past Image
  cr: [scn: 0x0.97bdb4],[xid: 0x0],[uba: 0x0],[cls: 0x0.97bdb4],[sfl: 0x0]
  flags: buffer_dirty remote_transfered
  LRBA: [0x55.d365.0] LSCN: [0x0.978da0] HSCN: [0x0.97bdaf] HSUB: [1]
  cr pin refcnt: 0 sh pin refcnt: 0
Block dump from disk:
buffer tsn: 0 rdba: 0x00417609 (1/95753)
scn: 0x0000.00978d91 seq: 0x01 flg: 0x06 tail: 0x8d910601
frmt: 0x02 chkval: 0x81cc type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00002AB4A7A4EA00 to 0x00002AB4A7A50A00
2AB4A7A4EA00 0000A206 00417609 00978D91 06010000  [.....vA.........]
2AB4A7A4EA10 000081CC 00000001 0001495E 00978D8F  [........^I......]
2AB4A7A4EA20 00000000 00030002 00000000 000C0014  [................]
2AB4A7A4EA30 00000391 01801B5D 001300D0 00002001  [....]........ ..]
2AB4A7A4EA40 00978D91 00210005 000010CF 00C0829C  [......!.........]
2AB4A7A4EA50 00270376 00008000 00978D8D 00010100  [v.'.............]
2AB4A7A4EA60 0014FFFF 1F8317E5 00001F83 17E50001  [................]
2AB4A7A4EA70 2C000000 C3040100 2C642D21 C3030100  [...,....!-d,....]
2AB4A7A4EA80 002C2D21 21C30401 002C642C 21C30301  [!-,....!,d,....!]
2AB4A7A4EA90 01002C2C 2B21C304 01002C64 2B21C303  [,,....!+d,....!+]
2AB4A7A4EAA0 0401002C 642A21C3 0301002C 2C2A21C3  [,....!*d,....!*,]
2AB4A7A4EAB0 C3040100 2C642921 C3030100 002C2921  [....!)d,....!),.]
2AB4A7A4EAC0 21C30401 002C6428 21C30301 01002C28  [...!(d,....!(,..]
2AB4A7A4EAD0 2721C304 01002C64 2721C303 0401002C  [..!'d,....!',...]
2AB4A7A4EAE0 642621C3 0301002C 2C2621C3 C3040100  [.!&d,....!&,....]
2AB4A7A4EAF0 2C642521 C3030100 002C2521 21C30401  [!%d,....!%,....!]
2AB4A7A4EB00 00000000 00000000 00000000 00000000  [................]
        Repeat 371 times
2AB4A7A50240 01012C00 4D22C304 01002C5F 4D22C303  [.,...."M_,...."M]
2AB4A7A50250 0401002C 644C22C3 0301002C 2C4C22C3  [,...."Ld,...."L,]
2AB4A7A50260 C3040100 2C644B22 C3030100 002C4B22  [...."Kd,...."K,.]
2AB4A7A50270 22C30401 002C644A 22C30301 01002C4A  [..."Jd,...."J,..]
2AB4A7A50280 4922C304 01002C64 4922C303 0401002C  [.."Id,...."I,...]
2AB4A7A50290 644822C3 0301002C 2C4822C3 C3040100  [."Hd,...."H,....]
2AB4A7A502A0 2C644722 C3030100 002C4722 22C30401  ["Gd,...."G,...."]
2AB4A7A502B0 002C6446 22C30301 01002C46 4522C304  [Fd,...."F,...."E]
2AB4A7A502C0 01002C64 4522C303 0401002C 644422C3  [d,...."E,...."Dd]
2AB4A7A502D0 0301002C 2C4422C3 C3040100 2C644322  [,...."D,...."Cd,]
2AB4A7A502E0 C3030100 002C4322 22C30401 002C6442  [...."C,...."Bd,.]
2AB4A7A502F0 22C30301 01002C42 4122C304 01002C64  [..."B,...."Ad,..]
2AB4A7A50300 4122C303 0401002C 644022C3 0301002C  [.."A,...."@d,...]
2AB4A7A50310 2C4022C3 C3040100 2C643F22 C3030100  [."@,...."?d,....]
2AB4A7A50320 002C3F22 22C30401 002C643E 22C30301  ["?,....">d,...."]
2AB4A7A50330 01002C3E 3D22C304 01002C64 3D22C303  [>,...."=d,...."=]
2AB4A7A50340 0401002C 643C22C3 0301002C 2C3C22C3  [,...."d,....!>,...]
2AB4A7A50910 643D21C3 0301002C 2C3D21C3 C3040100  [.!=d,....!=,....]
2AB4A7A50920 2C643C21 C3030100 002C3C21 21C30401  [!1, wm 32768, RMno 0, reminc 62, dom 0]
Block header dump:  0x00417609
 Object id on Block? Y
 seg/obj: 0x1495e  csc: 0x00.978d8f  itc: 2  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0014.00c.00000391  0x01801b5d.00d0.13  --U-    1  fsc 0x0000.00978d91
0x02   0x0005.021.000010cf  0x00c0829c.0376.27  C---    0  scn 0x0000.00978d8d
bdba: 0x00417609
data_block_dump,data header at 0x2ab4a7a4ea5c
===============
tsiz: 0x1fa0
hsiz: 0x14
pbl: 0x2ab4a7a4ea5c
     76543210
flag=--------
ntab=1
nrow=1
frre=-1
fsbo=0x14
fseo=0x17e5
avsp=0x1f83
tosp=0x1f83
0xe:pti[0]	nrow=1	offs=0
0x12:pri[0]	offs=0x17e5
block_row_dump:
tab 0, row 0, @0x17e5
tl: 8 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 4]  c3 22 4d 5f
end_of_block_dump
GLOBAL CACHE ELEMENT DUMP (address: 0x61f98b80):                  --lock element
  id1: 0x17609 id2: 0x1 pkey: OBJ#84318 block: (1/95753)
  lock: XG rls: 0x0 acq: 0x0 latch: 7                           exclusive global
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kduwh01: kdusru'
  bscn: 0x0.97bafa bctx: (nil) write: 0 scan: 0x0
  lcp: (nil) lnk: [NULL] lch: [0xaff06780,0xafed94e0]
  seq: 40651 hist: 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
    58 58 58 58
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x08000001 state: XCURRENT tsn: 0 tsh: 1 mode: SHR
      addr: 0xafed93b0 obj: 84318 cls: DATA
    flg: 0x08000001 state: PI tsn: 0 tsh: 1
      addr: 0xaff06650 obj: 84318 cls: DATA
      piscn: 0x0.97bdb4 clscn: 0x0.97bdb4
 GCS CLIENT 0x61f98bf8,42 resp[(nil),0x17609.1] pkey 84318.0       only client here,
   grant 2 cvt 0 mdrole 0xc2 st 0x100 lst 0x20 GRANTQ rl G1
   master 1 owner 2 sid 0 remote[0xcecec160,2] hist 0x8f0a41e071e1483c
   history 0x3c.0x10.0x5.0xf.0x7.0x3c.0x10.0x5.0xf.0x1.
   cflag 0x0 sender 1 flags 0x0 replay# 0 abast (nil).x0.1 dbmap (nil)
   disk: 0x0000.00978d9a write request: 0x0000.00000000
   pi scn: 0x0000.0097bdb4 sq[(nil),(nil)]
   msgseq 0x291c updseq 0x0 reqids[7233,0,0] infop (nil) lockseq xbd
   pkey 84318.0
   hv 119 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 62, dom 0]
   kjga st 0x4, step 0.0.0, cinc 64, rmno 15, flags 0x0
   lb 0, hb 0, myb 61121, drmb 32449, apifrz 0
 GCS CLIENT END
2011-06-22 20:34:29.243634 : kjbmbassert [0x17609.1]
2011-06-22 20:34:29.243804 : kjbmsassert(0x17609.1)(1)
End dump data blocks tsn: 0 file#: 1 minblk 95753 maxblk 95753
0x7bfa6000

The cache fusion technology evolves various background processes such as the GCS processes (identified by LMSn) and GES daemon (identified by LMD).

The global cache service (GCS) and global enqueue service (GES) handle the management of the cluster piece of the database software. A global cache element is an Oracle-specific data structure representing a cache fusion resource. There is a 1:1 corresponding relationship between a global cache element and a cache fusion resource in the GCS.

Oracle RAC uses a messaging mechanism to maintain resource statuses. Both GCS and GES use messages containing information to ensure that the current block image can be located. These messages also identify block copies being retained by any specific instance for use by the recovery mechanisms. The recovery-specific information contains sequence numbers to identify the order of changes made to that block since it was read from disk. The global resource directory (GRD) is a repository of information about the current status of resources shared by the instances. The GRD contains two groups of resources: enqueue resources, managed by the GES, and buffer cache resources, managed by the GCS. GCS and GES maintain the contents of the GRD.

The LMSn are the processes that handle remote GCS messages. Oracle RAC software provides for up to 10 GCS processes. The number of LMSn processes varies depending on the number of CPU’s on the node. Oracle by default starts one LMS process for every two CPU’s. LMD is the resource-agent process that manages GES resource requests, such as deadlock detection of GES requests.

GES and GCS

The GES coordinates enqueues that are shared globally.

The GCS is the controlling mechanism that implements cache fusion. It is responsible for block transfers between instances. In RAC, the cache fusion technology manages resources at the global level identified by a three-character lock structure.

Three characters are required to distinguish resources. The first characterizes a traditional resource type: N (Null), S (Shared), or X (Exclusive).

The second represents a role. There are two roles:

  • Local (L): The blocks associated with the resource can be manipulated without further reference to GCS or other instances. For example, when a resource is acquired for the first time, it is acquired with a local role.
  • Global (G): The blocks covered by the resource might not be usable without further information from the GCS or other nodes. For example, if the resource is acquired and it already has dirty buffers on a remote instance, then it takes on a global resource role.

If the resource is in exclusive mode and has a local role, then the following rules apply:

  • Only one instance can have the resource in exclusive mode.
  • All unwritten changes must be in local cache.
  • At checkpoint, instances can write changed blocks to disk without confirmation from GCS.
Mode Definition Description
NL0 Null local 0 The same as N in Oracle OPS with no past image
SL0 Shared local 0 The same as S in Oracle OPS with no past image
XL0 Exclusive local 0 The same as X in Oracle OPS with no past image
NG0 Null global 0 Global N lock and the Instance owns current block image.
SG0 Shared global 0 Instance owns current block image and the resource can be shared with the other nodes, there is no past image. Can write current image.
XG0 Exclusive global 0 Instance owns current block image for modification. Can write current image.
NG1 Null global 1 Instance owns past block image. Can write PI image.
SG1 Shared global 1 Instance owns past block image and the resource can be shared with the other nodes. Can write current and PI images.
XG1 Exclusive global 1 Instance owns past block image for modification. Can write current and PI images.

Writing Block and Recovery Considerations

For recovery purposes, instances that have past images will keep these past images in their buffer cache until notified by the master instance of the resource to release them. A block written record (BWR) is placed in its redo log buffer when an instance writes a block covered by a global resource or when it is told it can free a PI buffer. This record indicates to the recovery process that redo information for the block is not needed at this time. Although the BWR makes recovery more efficient, the instance does not force a flush of the log buffer after creating it because it is not essential for accurate recovery.

Each block PI has a system change number (SCN). Instances regularly synchronize their SCNs, and PI SCN is guaranteed to be later than the previous modification performed on this block and earlier than modifications performed by the next instance. When a write completes, the writer updates the GRD with write completion and the new SCN information. The GCS requests instances to flush all PIs having earlier SCNs than the one in the block written to disk.

Checkpoints are more involved on RAC instances and generate more work. When a checkpoint occurs:

  • The GCS notifies all nodes with PI blocks that the checkpoint occurred.
  • The node with the most current PI will write dirty blocks to disk.
  • Resources are updated accordingly (global resources are changed to local, etc).
  • The Global Resource Directory is modified to reflect the resource changes.

This is important to understand since the impact too many checkpoints have on a RAC system is greater than that of a standalone system. Checkpoints generate interconnect traffic and require each node with PI blocks to modify the GRD within the shared pool.

西宁旅记:管中窥豹

6月11日5点天刚蒙蒙亮,极不自然地醒过来,因为要赶飞机的缘故又是一夜没睡好;从家中打车到浦东机场需用一个多小时,抵达时已经6点半了,我要搭乘7点50分的飞机,自上海出发到西安转机到青海西宁。这次是公司在开拓青海省期间的一次技术交流,对象是西宁市的一家政府下属单位。

早晨的浦东机场已经十分忙碌,考虑到是世博期间人流更胜往常;这里是大魔都一处永不停歇的驿站,在这里人流喷涌,为这个国家带来所有未知机遇。

一行无话,11点40分抵达西安,西安的机场还是挺大的,至少在西部诸省仍是佼佼者;我一直很向往这个古老城市:秦,两汉,唐;这个城市记录了中华最辉煌的年代,在我心目中这种气度不是帝都所能比的,不过这次是无缘游访了。

1点钟不到终于到达目的地,青海省西宁市这个边陲省会;从机场向外看去一望无际的土山让生长在上海没有出过远门的我有些惊讶,西宁机场不大,虽然离地震已经过去一段时间了但仍到处插着”青海长青,玉树不倒”的标语,西宁机场目下正在扩建,灾后重建需要一个现代化的机场。

落地后赶忙和负责西北地区的张,胡两位取得了联系,他们下榻在中心广场的如家,据说是此地的市中心,而自机场打的到市区则似乎是统一的一百元;西宁市的的士规格参差不齐,但车顶上大大的广告牌似乎要比上海的还要先进,部分的士甚至是私家的车子,起步费统一的6元;上车前最好问清楚是否能给发票,不然可能会很麻烦。

载我的那位的哥十分健谈,据他介绍青海自古是移民地区,目前是以汉族为主,而部分口音类似于南京周边的方言。青海亦是三江源头为长江,黄河,澜沧江的源头汇水区,比较好玩的景区有青海湖和贵德,土产有牦牛干,人参果,裸鲤,毛毯等。谈话中时间过得飞快,期间途遇到2个收费站,让我对外地的路费有了新的认识。进入城区后先上到高架,大致可以一览市区全貌,据司机师傅介绍目前西宁市区的房价是在6千左右,要想在此处安家却也不容易啊!

在城区里穿行片刻就抵达了旅馆,与本地的2位从未谋面的同事寒暄过后,自然是要先祭祭五脏的;他们带我去旅馆旁边一个样式古老的美食城就餐,西北向来以面食出名,古来就说西北的水偏碱做面食十分相宜,其名声远胜南方,此外就多是牛羊肉,大盘鸡之属了。牛羊肉做主食,南方人多半受不了,我点了一叠面,口味尚佳,比起上海六七元的阳春面这里要便宜得多了。

吃完饭匆匆赶赴客户单位,路上竟堵了许久,想不到此处下午1点多也会有rush hour。

来到客户单位已是2点多,从客户的张科长,朱工程师口中大致了解了系统规模,这里主用的一套Linux上的RAC系统共有3个节点,硬件是三台Dell的3950 Pc Server以及一台IBM DS4700存储,三个实例接受不同地区的业务,客户目前对这套RAC系统的性能不顶满意,希望此次技术交流能够提出初步的方案,另外他们对与主机上较高的内存使用率表示疑问。

我首先抓取了RDA以及AWR报告,托福于这些工具Oracle的诊断调优信息收集已经十分集成了。这个系统居然还采用了ASM,Linux上RAC系统大致可以采取以下几种存储管理方式:裸设备(最容易也最麻烦),NFS(生产环境无法用),ASM(Oracle 推荐),OCFS2(11g被抛弃了),其他GPFS(譬如RedHat GPFS,Veritas Cluster Filesystem);10g下最时髦的管理方式还属ASM,ASM技术新颖,但普遍认为10gASM还不够成熟,至少客户应用上经验不足,一旦出现问题,除了向Oracle GCS求助外很少能找到其他解决方法。而且这套ASM系统直接使用形如/dev/sd*的块设备作为disk,不同于裸设备,使用块设备将会引起操作系统对块的进一步缓存(OS CACHE),这套系统除了跑数据库外没有其他应用,总内存为24g而每个实例使用的SGA不足5g,绝大多数内存都用作了缓存块设备。

[oracle@qhds2 ~]$ free
total       used       free     shared    buffers     cached
Mem:      16408324   16276472    131852          0      44184   13969312
-/+ buffers/cache:    2262976      14145348
Swap:      2031608      90224    1941384

如上所示空闲物理内存131852即128M,OS cache为13G。Oracle使用的高速缓存(Buffer_Cache)要较OS的cache更为有效,而目前三个实例设置的Buffer_cache为2-3G不等,建议客户调大Buffer_Cache至5-10G,同时为杜绝操作系统因物理内存不足,瞬间换页造成宕机事件发生的可能性,建议客户将Linux内核参数vm.min_free_kbytes 设置为512000(即总是保留500M的空闲物理内存)。

数据库中三个实例,整体负载(Load)处于较低的水平。相对而言1号实例最高,16小时快照内1号实例上Average Active Session(平均活动会话数,为10g中数据库负载的重要指标)为(AAS=DB_TIME/Elapsed=2327/960)2.42;2号实例上AAS为592/960=0.61;3号上AAS为1405/960=1.463。
分析AWR 报告可以发现:

1号实例:

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
enq: TM – contention 98,218 40,208 409 28.8 Application
gc cr multi block request 5,877,437 36,369 6 26.0 Cluster
CPU time 8,511 6.1
gc buffer busy 64,538 5,455 85 3.9 Cluster
db file parallel read 519,589 4,116 8 2.9 User I/O

2号实例:

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time 7,962 22.4
gc cr multi block request 2,903,270 6,027 2 17.0 Cluster
db file scattered read 843,958 3,070 4 8.6 User I/O
db file parallel read 421,792 3,010 7 8.5 User I/O
PX Deq Credit: send blkd 248,625 1,755 7 4.9 Other

3号实例:

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time 18,381 21.8
gc cr multi block request 6,534,528 18,204 3 21.6 Cluster
db file scattered read 1,562,020 5,575 4 6.6 User I/O
db file parallel read 563,113 4,281 8 5.1 User I/O
db file sequential read 709,698 2,502 4 3.0 User I/O

主要的几个等待是:

  • ENQ:TM- Contention
  • Gc cr multi block request
  • CPU-TIME

“Gc Multi Block request”为全局缓存多块请求事件,是一种集群类等待事件。当RAC中每个实例所涉及的数据严重交叉时该事件将成为数据库的主要性能瓶颈。在不能通过改善应用程序集群适应性或对表进行实例划分分区的情况下,建议客户将RAC中的多个实例划为主用和备用实例以获取更好的性能。举例来说可以将硬件性能较好的一号主机作为主用主机,而二三号主机作为备用主机平时不接受应用程序会话,在一号主机需要OFFLINE时做应急用。客户使用RAC系统的最主要目的是为了获取高可用性,而通过向客户解释RAC Global Cache大致的工作原理后,客户首肯了这个方案。

EnQ:TM-Contention即TM锁队列等待,一般是应用程序对表执行共享级以上的表锁时(包括:S,SSX,X锁)且锁定时间过长或表上有外键约束引起的,大多数OLTP类型应用中对整个表的S,SX,X锁是不必要的,只需要行级锁(ROW LEVEL LOCK)即可以满足需要。客户本身也在怀疑应用程序存在问题,但苦于没有有力证据,无法督促开发商修改程序,又因为目前这套应用数据量,业务量都较小,这种矛盾仍不尖锐,所以总是没有行之有效地整改。

讨论完这些已将界6点,匆匆和同事回旅馆吃了晚饭。趁着天还没彻底晚,打的找了当地一个专卖土特产的超市,因为不想回去是负重过高,随便买了几包牦牛干,人参果。这忙碌的一天也就算完了,回旅馆睡觉!

第二天早晨抓紧时间再赶到客户处,另看了一套系统,因为要赶下午2点的飞机,所以行色匆匆,客户平常是使用本地java客户端的EM和10g上web形式的EM监控数据库的,在机房中的笔记本上使用SYS用户可以成功登陆,但在科室里的台式机上登陆就报用户名/密码错误的信息,初步怀疑是设置了SQLNET.ORA的某些参数,由于时间无多,这个问题只能摆脱后续的工程师解决了,呵呵!

回程仍需到西安转机,不过由于机票订得晚,经济舱已经满座了,十分不巧地做了会头等舱,也算一种福利吧。

呵呵,西宁二十四小时,对这个城市管中窥豹了!

沪ICP备14014813号-2

沪公网安备 31010802001379号