Temporary tablespace exhaustion caused by HASH GROUP BY in 10g

This morning the application team reported that a CTAS script which used to complete without problems on 9i kept failing with "ORA-1652: unable to extend temp segment by 128 in tablespace TS_HQY1_TEMP" after being migrated to 10g. They said the data volume involved had changed little since the migration, and, puzzlingly, the temporary tablespace on the new 10g database was already over 40 GB, far larger than on the old 9i database. Clearly the problem was not an undersized temporary tablespace; the cause had to lie in Oracle behaving differently after the migration.
The script runs once a month to aggregate data; one of the tables involved has close to 400 million rows, so the GROUP BY operates on a very large data set. Let's look at the SQL:

create table gprs_bill.zou_201007_cell_id as
select /* g_all_cdr01,60 */
 calling_num mobile_number,
 lac,
 lpad(cell_id, 5, '0') cell_id,
 count(*) c,
 sum(call_duration) call_duration,
 sum(decode(record_type, '00', 1, 0) * call_duration) moc_call_duration,
 sum(decode(record_type, '01', 1, 0) * call_duration) mtc_call_duarion
  from gprs_bill.g_all_cdr01
 where substr(calling_num, 1, 7) in
       (select mobile_prefix from gprs_bill.zou_mobile_prefix)
 group by calling_num, lac, lpad(cell_id, 5, '0');

SQL> set autotrace traceonly exp
SQL> select /* g_all_cdr01,60 */
  2  calling_num mobile_number,
  3  lac,
  4  lpad(cell_id,5,'0') cell_id,
  5  count(*) c,
  6  sum(call_duration) call_duration,
  7  sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
  8  sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
  9  from  gprs_bill.g_all_cdr01
 10  where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
 11  group by
 12  calling_num ,
 13  lac,
 14  lpad(cell_id,5,'0');

Execution Plan
----------------------------------------------------------
Plan hash value: 212866585

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name              | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                   |   229K|  9880K|       | 1033K  (3)| 03:26:41 |
|   1 |  HASH GROUP BY        |                   |   229K|  9880K|    22M| 1033K  (3)| 03:26:41 |
|*  2 |   HASH JOIN RIGHT SEMI|                   |   229K|  9880K|       | 1030K  (3)| 03:26:10 |
|   3 |    TABLE ACCESS FULL  | ZOU_MOBILE_PREFIX |  1692 | 13536 |       |    11   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | G_ALL_CDR01       |   388M|    13G|       | 1026K  (2)| 03:25:21 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("MOBILE_PREFIX"=SUBSTR("CALLING_NUM",1,7))

We can see Oracle used the HASH GROUP BY algorithm to group the data; hash aggregation is a grouping algorithm newly introduced in 10g.
Let's look at the changes to grouping in 10g in detail:
In 10g a GROUP BY may still trigger a sort, but new algorithms were introduced, and none of them guarantee that the rows come back in order. If you want ordered output from a GROUP BY in 10g you must add an explicit ORDER BY clause, which is a sharp departure from 9i: without an ORDER BY, correct ordering of the result is not guaranteed.
In 10g the GROUP BY clause prefers a HASH-based algorithm over the old SORT-based one, because hash aggregation has a lower CPU cost. Neither algorithm guarantees ordered output in 10g; with the SORT algorithm the rows may merely happen to come back sorted.
MOS suggests that if a migration brings many adverse plan changes, you can fall back to the old algorithm by changing parameters. Note, however, that even with the settings below there is no guarantee that a GROUP BY in 10g returns rows ordered as it did in 9i; the only reliable fix is an explicit ORDER BY clause so that Oracle does the ordering for you.

alter session set "_gby_hash_aggregation_enabled" = false;
alter session set optimizer_features_enable="9.2.0";
or
alter session set optimizer_features_enable="8.1.7";

The hidden parameter _gby_hash_aggregation_enabled controls whether Oracle may use the new hash algorithm for grouping (it also applies to operations such as DISTINCT).
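Besides the session-level parameter, 10gR2 also recognizes a statement-level hint pair, USE_HASH_AGGREGATION / NO_USE_HASH_AGGREGATION, which can force one algorithm for a single query. A minimal sketch against the youyus test table used below (the hint is undocumented, so verify its behavior on your own version first):

select /*+ NO_USE_HASH_AGGREGATION */ t2, count(*)
  from youyus
 group by t2;

This reverts just this one statement to SORT GROUP BY without touching the rest of the session.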

Let's verify the above with an experiment:

On 11g:
SQL> select  * from v$version;

BANNER
----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> select  *  from youyus;

T1                 T2
---------- ----------
A                  10
B                  10
F                  30
G                  30
H                  40
I                  40
J                  40
L                  20
M                  20

9 rows selected.
SQL>  analyze table youyus compute statistics for all columns;

Table analyzed.

SQL> set autotrace on;

SQL>  select t2,count(*) from youyus group by t2;

        T2   COUNT(*)
---------- ----------
        30          2
        20          2
        40          3
        10          2


Execution Plan
----------------------------------------------------------
Plan hash value: 2940504347

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
|   1 |  HASH GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
/* Hash aggregation was used, and the result is not ordered by t2 */

SQL> select t2,count(*) from youyus group by t2 order by t2;

        T2   COUNT(*)
---------- ----------
        10          2
        20          2
        30          2
        40          3


Execution Plan
----------------------------------------------------------
Plan hash value: 1349668650

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
|   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
/* With an ORDER BY clause added, the plan reverts to the SORT algorithm and the rows come back sorted */
SQL> alter session set "_gby_hash_aggregation_enabled" = false;

Session altered.
SQL> alter session set optimizer_features_enable="9.2.0";

Session altered.
SQL> select t2,count(*) from youyus group by t2;

        T2   COUNT(*)
---------- ----------
        10          2
        20          2
        30          2
        40          3


Execution Plan
----------------------------------------------------------
Plan hash value: 1349668650

-------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |    11 |
|   1 |  SORT GROUP BY     |        |     4 |     8 |    11 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2 |
-------------------------------------------------------------

Note
-----
   - cpu costing is off (consider enabling it)
/* With optimizer_features_enable set to 9.2.0, CPU costing is off; the rows come back correctly sorted, but remember this is just "luck" */

SQL> alter session set optimizer_features_enable="10.2.0.5";

Session altered.
SQL> select t2,count(*) from youyus group by t2;

        T2   COUNT(*)
---------- ----------
        10          2
        20          2
        30          2
        40          3


Execution Plan
----------------------------------------------------------
Plan hash value: 1349668650

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
|   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
/* With optimizer_features_enable at 10.2.0.5 everything behaves as expected */
SQL> alter session set optimizer_features_enable="11.2.0.1";

Session altered.

SQL> select t2,count(*) from youyus group by t2;

        T2   COUNT(*)
---------- ----------
        10          2
        20          2
        30          2
        40          3


Execution Plan
----------------------------------------------------------
Plan hash value: 1349668650

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
|   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
/* No change on 11.2.0.1 */
SQL> alter session set optimizer_features_enable="8.1.7";

Session altered.

SQL> alter session set "_gby_hash_aggregation_enabled" =true;

Session altered.
/* Now see what happens in the "contradictory" case: optimizer_features_enable set to 8.1.7 while _gby_hash_aggregation_enabled is true */
SQL> select t2,count(*) from youyus group by t2;

        T2   COUNT(*)
---------- ----------
        30          2
        20          2
        40          3
        10          2


Execution Plan
----------------------------------------------------------
Plan hash value: 2940504347

-------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     4 |     8 |    10 |
|   1 |  HASH GROUP BY     |        |     4 |     8 |    10 |
|   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     1 |
-------------------------------------------------------------

Note
-----
   - cpu costing is off (consider enabling it)
/* HASH GROUP BY is still used, so hidden parameters like _gby_hash_aggregation_enabled appear to take precedence over optimizer_features_enable */

On 9i:
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for Linux: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production

SQL> analyze table youyus_9i compute statistics for all columns;

Table analyzed.

SQL> select * from youyus_9i;

T1         T2
-- ----------
A          10
B          10
F          30
G          30
H          40
I          40
J          40
L          20
M          20

9 rows selected.

SQL> alter session set optimizer_mode=ALL_ROWS;

Session altered.

SQL> select t2,count(*) from youyus_9i group by t2;

        T2   COUNT(*)
---------- ----------
        10          2
        20          2
        30          2
        40          3


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=4 Bytes=8)
   1    0   SORT (GROUP BY) (Cost=4 Card=4 Bytes=8)
   2    1     TABLE ACCESS (FULL) OF 'YOUYUS_9I' (Cost=2 Card=21 Bytes
          =42)
/* On 9i, even without an ORDER BY, we could safely rely on the rows coming back sorted */

SQL> alter session set "_gby_hash_aggregation_enabled" =true;
alter session set "_gby_hash_aggregation_enabled" =true
                  *
ERROR at line 1:
ORA-02248: invalid option for ALTER SESSION
/* The hidden parameter _gby_hash_aggregation_enabled does not exist in 9i */

That's great!

The application script has no hard requirement that the data be sorted, but with the HASH GROUP BY algorithm the temporary tablespace usage rose sharply, far beyond what it had been on 9i, until the statement could no longer complete. The obvious first step was to set _gby_hash_aggregation_enabled to false, reverting to the old SORT algorithm, and watch the temporary space usage:

SQL> alter session set "_gby_hash_aggregation_enabled"=false;
Session altered.

SQL> select /* g_all_cdr01,60 */
  2  calling_num mobile_number,
  3  lac,
  4  lpad(cell_id,5,'0') cell_id,
  5  count(*) c,
  6  sum(call_duration) call_duration,
  7  sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
  8  sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
  9  from  gprs_bill.g_all_cdr01
 10  where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
 11  group by
 12  calling_num ,
 13  lac,
 14  lpad(cell_id,5,'0');

Execution Plan
----------------------------------------------------------
Plan hash value: 4013005149

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name              | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                   |   229K|  9880K|       | 1033K  (3)| 03:26:41 |
|   1 |  SORT GROUP BY        |                   |   229K|  9880K|    22M| 1033K  (3)| 03:26:41 |
|*  2 |   HASH JOIN RIGHT SEMI|                   |   229K|  9880K|       | 1030K  (3)| 03:26:10 |
|   3 |    TABLE ACCESS FULL  | ZOU_MOBILE_PREFIX |  1692 | 13536 |       |    11   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | G_ALL_CDR01       |   388M|    13G|       | 1026K  (2)| 03:25:21 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("MOBILE_PREFIX"=SUBSTR("CALLING_NUM",1,7))

/* Re-run the problematic script */
create table gprs_bill.zou_201007_cell_id as
    select /* g_all_cdr01,60 */
    calling_num mobile_number,
    lac,
    lpad(cell_id,5,'0') cell_id,
    count(*) c,
    sum(call_duration) call_duration,
    sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
    sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
    from  gprs_bill.g_all_cdr01
    where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
    group by
    calling_num ,
    lac,
    lpad(cell_id,5,'0');

With _gby_hash_aggregation_enabled set to false at session level, Oracle no longer uses the 10g hash grouping algorithm. Since the CTAS script runs for a long time, we can watch its sort segment usage through the V$SORT_USAGE dynamic view while it runs:

SQL> set time   on;
14:30:59 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';

TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
------------------------------- --------- --------- -------------
TS_HQY1_TEMP                    TEMPORARY SORT               9349

14:35:59 SQL> /

TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
------------------------------- --------- --------- -------------
TS_HQY1_TEMP                    TEMPORARY SORT              10011

/* 10011-9349=662 MB of temporary space used over the 5 minutes */
15:02:46 SQL> select target ,totalwork,sofar,time_remaining,elapsed_seconds from v$session_longops where sofar!=totalwork;

TARGET                                                            TOTALWORK      SOFAR TIME_REMAINING ELAPSED_SECONDS
---------------------------------------------------------------- ---------- ---------- -------------- ---------------
GPRS_BILL.G_ALL_CDR01                                               5575890    5435796            143            5557

15:05:10 SQL> select target ,totalwork,sofar,time_remaining,elapsed_seconds from v$session_longops where sofar!=totalwork;

TARGET                                                            TOTALWORK      SOFAR TIME_REMAINING ELAPSED_SECONDS
---------------------------------------------------------------- ---------- ---------- -------------- ---------------
GPRS_BILL.G_ALL_CDR01                                               5575890    5562082             14            5692

15:05:13 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';

TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
------------------------------- --------- --------- -------------
TS_HQY1_TEMP                    TEMPORARY SORT              13835

15:12:22 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';

TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
------------------------------- --------- --------- -------------
TS_HQY1_TEMP                    TEMPORARY SORT              13922

/* The sort has completed; the sort segment is no longer growing */

The grouping ended up using 13922 MB of sort segment space, which was within the range the customer could accept. The newly introduced HASH algorithm may have a lower CPU cost than SORT, but it can consume a great deal of temporary space: a trade-off, not a free win.

Differences in dbms_stats gathering defaults between 9i and 10g

About two months ago, someone in the field asked me why migrating the 9i CBO to 10g produces so many performance problems caused by changed execution plans; he was, of course, testing me. Most environments I had worked with did not actually use the CBO on 8i/9i, and going from 8i/9i RBO to the far more mature CBO in 10g, plan regressions are only to be expected. But 9i CBO to 10g CBO also has its share of problems. My first thought was that statistics gathering differs between the two, though I could not say exactly how. He listened to my answer and just smiled without a word.

Oracle is vast, vast enough to be called a world of its own, and things you do not use for a long time get forgotten. Let's review some of the changes to statistics gathering between 9i and 10g.

In 9i the default METHOD_OPT for statistics gathering is 'FOR ALL COLUMNS SIZE 1'. With this setting Oracle collects only basic statistics on every column, including minimum/maximum values and the number of distinct values, but no histograms. For columns whose data is evenly distributed, or which never appear as conditions in a WHERE clause, that is entirely sufficient. But for columns with skewed data it can make the CBO's cost calculations inaccurate, and we then have to gather histograms on those columns manually.

10g changes the default METHOD_OPT in the dbms_stats package, which is clearly one important reason why execution plans change so readily when moving from the 9i CBO to the 10g CBO, and it was the point of that question.

The new default METHOD_OPT value is 'FOR ALL COLUMNS SIZE AUTO', meaning Oracle decides by an internal algorithm which columns need histograms and which do not. Whether a histogram is gathered depends on the column's data distribution and on the workload against the table, workload here meaning that the database has seen SQL statements whose cost calculations would benefit from detailed information about those columns.

This sounds ideal: Oracle quietly gathers every statistic we urgently need on our behalf.

The problem is that in many environments Oracle does not make the right call about which histograms to gather. In practice it may collect a great many unnecessary histograms while skipping many that are needed.

In lightly loaded environments the impact of these poorly chosen histograms mostly goes unnoticed; in performance-critical environments, or ones already at a bottleneck, they can be real trouble.
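If the AUTO behavior misfires, the 9i-style default can be restored either for a single gather or database-wide. A sketch using the youyus test table from above (dbms_stats.set_param is the 10g interface for changing gathering defaults):

-- Gather with the 9i-style default for one table: base column stats, no histograms
exec dbms_stats.gather_table_stats(user, 'YOUYUS', method_opt => 'FOR ALL COLUMNS SIZE 1');

-- Or change the database-wide default for future gathers
exec dbms_stats.set_param('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');

-- Check which columns ended up with histograms
select column_name, histogram from user_tab_col_statistics where table_name = 'YOUYUS';

The last query shows which columns carry histograms (HISTOGRAM = NONE means only base statistics were kept).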

Oracle also changed how column density is computed. Density is often used by Oracle to determine predicate selectivity, so the sudden appearance of extra, unneeded histograms can have a broad and noticeable performance impact (good effects are possible too, just less likely).

These uninvited guests also affect the shared pool: the latches protecting the library cache and row cache can become congested for stretches of time, and if your application tables have hundreds of columns, things can get even worse (which is why development should follow normal forms; ignore the rules and the usual outcome is an unusable application and a failed project. And don't tell me your application "just barely runs": that also counts as a failed project)!

The flow of SQL parsing in Oracle

The main flow of SQL parsing in Oracle:
(figure: sql_parse_diagram)

The notion of a cursor is fairly complex: it may be a cursor in the client program, a private cursor in the server process, or a shared cursor in the server-side shared pool. Once a cursor has been opened, its shared-cursor information (execution plan, optimizer tree, and so on) generally stays in the SQL AREA, so no further soft or hard parse is needed.

SESSION_CACHED_CURSORS is an Oracle initialization parameter (changing it requires an instance restart) that caps the number of cursors cached per session (kept in the PGA). The client's open-cursor request is still passed to the server process, which first scans its own cached cursor information; on a hit the soft parse is avoided, which some people call a "softer soft parse".
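How much the session cursor cache helps can be estimated from session statistics; a sketch against the current session (statistic names as in 10g):

select sn.name, ms.value
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic#
   and sn.name in ('session cursor cache hits', 'parse count (total)');

The closer 'session cursor cache hits' tracks 'parse count (total)', the more parse calls are being satisfied from the session's own cursor cache.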

HOLD_CURSOR is a precompiler option that controls whether private cursors should be cached; we will not expand on it here.

In the analysis tool tkprof, hard and soft parses are treated alike: both count as a parse, so soft parses still drive the parse total up.

A soft parse skips several steps of the hard parse, but still includes the initial syntactic and semantic analysis, computing the statement's hash value and comparing it with statements already in the SQL AREA; on a match, expensive operations such as query optimization are avoided.

Note also that the 10053 event fires only during a hard parse.
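So before tracing the optimizer with event 10053, make sure the statement will actually hard-parse (change its text slightly, or flush the shared pool on a test system); a minimal sketch:

alter session set events '10053 trace name context forever, level 1';
-- the statement run here must hard-parse; a soft-parsed statement produces no 10053 trace
alter session set events '10053 trace name context off';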

JailbreakMe.com: a new browser-based jailbreak for iPhones, iPads and iPod Touches

A hacker has set up JailbreakMe.com, a site that can jailbreak almost any iOS device through the browser, including the iPhone, iPad, and iPod Touch, removing Apple's software restrictions on these devices.

Users who want to try unauthorized apps, or use these devices across different countries, can visit http://jailbreakme.com to jailbreak them.

Coming right after last week's ruling that jailbreaking the iPhone and similar devices is not illegal, this is an open challenge to Apple's opposition to jailbreaking and its claim that jailbreaking harms device security and stability; one has to admire the hackers' skill and development speed.

Jailbreaking voids the device's warranty, Apple explains. In practice, though, a jailbreak is easily undone, for example by restoring the device to factory settings, or by installing a new version of Apple's mobile operating system, iOS.

"JailbreakMe" quickly became a trending topic on Twitter; it is quite possibly the first browser-based jailbreak. In the past, users typically downloaded a jailbreak tool to a personal computer and then connected the device to be jailbroken to it.

Early users report the jailbreak works almost flawlessly, except for problems with iPads running 3.2.1. From opening Safari and visiting JailbreakMe.com to a finished download takes about a minute; the tool then tells you it has added itself to the home screen and wishes you "have fun"!

DataGuard Managed recovery hang

Our team deleted some archivelogs by mistake. We rolled the database forward with an RMAN incremental recovery to an SCN, then did a manual recovery to sync it with the primary. Managed recovery is now failing:
alter database recover managed standby database disconnect

Alert log has :

Fri Jan 22 13:50:22 2010
Attempt to start background Managed Standby Recovery process
MRP0 started with pid=12
MRP0: Background Managed Standby Recovery process started
Media Recovery Waiting for thread 1 seq# 193389
Fetching gap sequence for thread 1, gap sequence 193389-193391
Trying FAL server: ITS
Fri Jan 22 13:50:28 2010
Completed: alter database recover managed standby database d
Fri Jan 22 13:53:25 2010
Failed to request gap sequence. Thread #: 1, gap sequence: 193389-193391
All FAL server has been attempted.

Managed recovery was working earlier today after the RMAN incremental and resolved two gaps automatically, but it now appears hung, with the standby falling behind the primary.

SQL> show parameter fal

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fal_client string ITS_STBY
fal_server string ITS

[v08k608:ITS:oracle]$ tnsping ITS_STBY

TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:01:17

Copyright (c) 1997 Oracle Corporation. All rights reserved.

Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= v08k608.am.mot.com)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (10 msec)
[v08k608:ITS:oracle]$ tnsping ITS

TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:01:27

Copyright (c) 1997 Oracle Corporation. All rights reserved.

Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= 187.10.68.75)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (320 msec)

Primary has :
SQL> show parameter log_archive_dest_2
log_archive_dest_2 string SERVICE=DRITS_V08K608 reopen=6
0 max_failure=10 net_timeout=1
80 LGWR ASYNC=20480 OPTIONAL
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_state_2 string ENABLE
[ITS]/its15/oradata/ITS/arch> tnsping DRITS_V08K608
TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:03:24
Copyright (c) 1997 Oracle Corporation. All rights reserved.
Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= 10.177.13.57)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (330 msec)

The ARCH processes on the primary database might hang indefinitely due to the bug below, so that they could not ship the missing archived log files to the standby database.

BUG 6113783 ARC PROCESSES CAN HANG INDEFINITELY ON NETWORK
[ Not published so not viewable in My Oracle Support ]
Fixed 11.2, 10.2.0.5 patchset

We can work around the issue by killing the ARCH processes on the primary site; they will be respawned automatically and immediately, without harming the primary database.

[maclean@rh2 ~]$ ps -ef|grep arc
maclean   8231     1  0 22:24 ?        00:00:00 ora_arc0_PROD
maclean   8233     1  0 22:24 ?        00:00:00 ora_arc1_PROD
maclean   8350  8167  0 22:24 pts/0    00:00:00 grep arc
[maclean@rh2 ~]$ kill -9 8231 8233
[maclean@rh2 ~]$ ps -ef|grep arc
maclean   8389     1  0 22:25 ?        00:00:00 ora_arc0_PROD
maclean   8391     1  1 22:25 ?        00:00:00 ora_arc1_PROD
maclean   8393  8167  0 22:25 pts/0    00:00:00 grep arc

and alert log will have:

Fri Jul 30 22:25:27 EDT 2010
ARCH: Detected ARCH process failure
ARCH: Detected ARCH process failure
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=26, OS id=8389
Fri Jul 30 22:25:27 EDT 2010
ARC0: Archival started
ARC1: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=27, OS id=8391
Fri Jul 30 22:25:27 EDT 2010
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri Jul 30 22:25:27 EDT 2010
ARC1: Becoming the heartbeat ARCH

Actually, as long as we don't kill any fatal background processes in 10g, Oracle will respawn all the non-fatal processes.
For example:

[maclean@rh2 ~]$ ps -ef|grep ora_|grep -v grep
maclean  14264     1  0 23:16 ?        00:00:00 ora_pmon_PROD
maclean  14266     1  0 23:16 ?        00:00:00 ora_psp0_PROD
maclean  14268     1  0 23:16 ?        00:00:00 ora_mman_PROD
maclean  14270     1  0 23:16 ?        00:00:00 ora_dbw0_PROD
maclean  14272     1  0 23:16 ?        00:00:00 ora_lgwr_PROD
maclean  14274     1  0 23:16 ?        00:00:00 ora_ckpt_PROD
maclean  14276     1  0 23:16 ?        00:00:00 ora_smon_PROD
maclean  14278     1  0 23:16 ?        00:00:00 ora_reco_PROD
maclean  14338     1  0 23:16 ?        00:00:00 ora_arc0_PROD
maclean  14340     1  0 23:16 ?        00:00:00 ora_arc1_PROD
maclean  14452     1  0 23:17 ?        00:00:00 ora_s000_PROD
maclean  14454     1  0 23:17 ?        00:00:00 ora_d000_PROD
maclean  14456     1  0 23:17 ?        00:00:00 ora_cjq0_PROD
maclean  14458     1  0 23:17 ?        00:00:00 ora_qmnc_PROD
maclean  14460     1  0 23:17 ?        00:00:00 ora_mmon_PROD
maclean  14462     1  0 23:17 ?        00:00:00 ora_mmnl_PROD
maclean  14467     1  0 23:17 ?        00:00:00 ora_q000_PROD
maclean  14568     1  0 23:18 ?        00:00:00 ora_q001_PROD

[maclean@rh2 ~]$ ps -ef|grep ora_|grep -v pmon|grep -v ckpt |grep -v lgwr|grep -v smon|grep -v grep|grep -v dbw|grep -v psp|grep -v mman |grep -v rec|awk '{print $2}'|xargs kill -9

and alert log will have:
Fri Jul 30 23:20:58 EDT 2010
ARCH: Detected ARCH process failure
ARCH: Detected ARCH process failure
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=20, OS id=14959
Fri Jul 30 23:20:58 EDT 2010
ARC0: Archival started
ARC1: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
Fri Jul 30 23:20:58 EDT 2010
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC1 started with pid=21, OS id=14961
ARC1: Becoming the heartbeat ARCH
Fri Jul 30 23:21:29 EDT 2010
found dead shared server 'S000', pid = (10, 3)
found dead dispatcher 'D000', pid = (11, 3)
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process CJQ0
Restarting dead background process QMNC
CJQ0 started with pid=12, OS id=15124
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process MMON
QMNC started with pid=13, OS id=15126
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process MMNL
MMON started with pid=14, OS id=15128
MMNL started with pid=16, OS id=15132

That's all right!

Renovation diary

Around July 4, 2010; a lot of walls were knocked down:

July 31: the suspended ceiling went in, most of the tiles were delivered, and today we also went and bought the mosaic.
The mosaic we bought:

The tiles for the living room arrived:

Laying the living room tiles:

Generating queries with Toad for Data Analysts

This afternoon I tried using Toad for Data Analysts to generate queries and ER diagrams, and it felt pretty good; it also draws graphical execution plans, and together with Toad's SQL Optimizer it makes a very decent database development combination.

The Query Builder interface for building queries:

The generated SQL:

The graphical execution plan:

Redo wastage

Oracle's online redo log uses a 512-byte standard block on most platforms.

(It is 1024 bytes on HP-UX and Tru64 Unix, 2048 bytes on SCO UNIX and Reliant UNIX, and 4096 bytes on MVS and MPE/ix; many of these UNIX variants are no longer common, but the actual value on any platform can be found with:

select max(l.lebsz) log_block_size_kccle
  from sys.x$kccle l
 where l.inst_id = userenv('Instance');
)

The LGWR background process does not necessarily fill the last redo block it writes. For example, if the redo buffer holds 1025 bytes to write out, then 1025 = 512 + 512 + 1 occupies three redo log standard blocks: the first two are full and the third holds a single byte. Before LGWR finishes the write it must release the "redo allocation" latch, and before that the SGA variables that index the redo buffer are set to point at the redo block after the partially filled one. In other words, 511 bytes of space are skipped by LGWR: this is what we call redo wastage. The cumulative wastage over the life of the instance can be read from the redo wastage statistic in the v$sysstat dynamic view.

SQL> col name for a25
SQL> select name,value from v$sysstat where name like '%wastage%';

NAME                           VALUE
------------------------- ----------
redo wastage                  132032

An illustration of redo wastage:

Why waste this space? The practice actually serves LGWR's serial I/O pattern well: redo wastage is not a problem or a bug but a deliberate Oracle design. Excessive redo wastage, however, is no good thing either; it is usually a symptom of LGWR writing too frequently. Since 9i it is rare for LGWR to be overloaded because the hidden parameter _log_io_size is too small; if your system shows substantial redo wastage, unrestrained use of COMMIT is usually the culprit, and removing unnecessary COMMIT statements, especially moving them out of loops, helps reduce it.

Let's walk through a demonstration of redo wastage:

SQL> select distinct bytes/1024/1024 from v$log;

BYTES/1024/1024
---------------
             50                          /* confirm the online redo log size is 50 MB */
SQL> archive log list;                 /* confirm the database is in archivelog mode */
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /s01/arch
SQL> set time on;
19:49:45 SQL> alter system switch logfile;           /* switch logs to start from a clean state */
System altered.
19:51:07 SQL> col name for a25
19:51:16 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                           VALUE
------------------------- ----------
redo size                 1418793324
redo wastage                88286544               /* baseline statistic values at the start of the demo */
19:51:19 SQL> begin
19:52:10   2  for i in 1..550000 loop
19:52:10   3  insert into tv values(1,'a');
19:52:10   4   commit;
19:52:10   5   end loop;
19:52:10   6   end;
19:52:11   7  /
/* the COMMIT in this anonymous block sits inside the loop, which will cause heavy redo wastage */
PL/SQL procedure successfully completed.

19:53:07 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                           VALUE
------------------------- ----------
redo size                 1689225404
redo wastage               112011352
/* the frequently committing block generated 1689225404-1418793324=257 MB of redo, including 112011352-88286544=22 MB of redo wastage */

19:53:14 SQL>  begin
19:53:33   2  for i in 1..550000 loop
19:53:33   3  insert into tv values(1,'a');
19:53:33   4  end loop;
19:53:33   5    commit;
19:53:33   6   end;
19:53:34   7  /
/* in this block the COMMIT has been moved out of the loop: the data is modified in bulk and committed once at the end */
PL/SQL procedure successfully completed.

19:53:59 SQL> select name,value from v$sysstat where name in ('redo size','redo wastage');

NAME                           VALUE
------------------------- ----------
redo size                 1828546240
redo wastage               112061296
/* the block committing only once generated 1828546240-1689225404=132 MB of redo, with only 112061296-112011352=48 KB of redo wastage */

You may wonder: should the first block not have produced only 22 MB more redo than the second? Why is the total difference so much larger?

Keep in mind that COMMIT itself generates redo, and quite a lot of it. In this demonstration, with frequent commits, the commits account for nearly half of the redo: (257-132-22)/257 = 40%, and each commit averages (257-132-22)*1024*1024/550000 = 196 bytes of redo.

COMMIT is one of the foundations of transactional ACID; used sensibly it helps us build robust, reliable applications, but abusing it invites disaster!

Apple: jailbreaking the iPhone may be legal, but it still voids the warranty

Heaven knows how this would play out in court! Apple has taken an actively defensive stance: jailbreaking an iPhone may be legal today, they point out, but it still voids the warranty. They add that the people who would actually jailbreak an iPhone are only a tiny minority, so there is really no reason to get too excited about it.

Yesterday's legal ruling may be the biggest tech story in years, and it also hands Apple an opening. Apple's legal team will probably need a while to sort out this jailbreaking business. The fact is, the ruling makes it one hundred percent legal. Apple is naturally under no obligation to ship you a dedicated jailbreaking application, jailbreaking.app; but neither can they flatly refuse you with: "Sorry, we forbid all jailbreaking!"

An Apple spokesperson told Mac enthusiasts:

Apple's goal is to ensure a great user experience on the iPhone, and we know that jailbreaking can severely degrade that experience. As we have said before, the vast majority of users do not intend to jailbreak, because doing so voids the warranty and makes the iPhone very unstable and unreliable.

On the whole there is no reason to doubt the statement's sincerity, but the closing claim that jailbreaking makes the iPhone wildly unstable is pure invention. If jailbreaking were so problematic, how could an entire community be devoted to it?

latch free: SQL performance problems caused by cache buffer handles

A few months ago a colleague in Xinjiang reported a performance problem on a customer's single-node database: one SQL statement ran for a long time without finishing, and while it ran a large number of sessions sat in latch free waits. Being thousands of kilometers apart, we communicated by email; he first sent me the explain log for the problem SQL. Its execution plan was as follows:

-----------------------------------------------------------------------------------------------
| Id  | Operation                               |  Name               | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                     |     1 |  1956 |    13 |
|*  1 |  FILTER                                 |                     |       |       |       |
|   2 |   NESTED LOOPS OUTER                    |                     |     1 |  1956 |    13 |
|   3 |    NESTED LOOPS OUTER                   |                     |     1 |  1942 |    12 |
|   4 |     NESTED LOOPS OUTER                  |                     |     1 |  1924 |    11 |
|   5 |      NESTED LOOPS                       |                     |     1 |  1906 |    10 |
|   6 |       NESTED LOOPS                      |                     |     1 |  1856 |     9 |
|   7 |        NESTED LOOPS                     |                     |     5 |  9055 |     8 |
|   8 |         NESTED LOOPS OUTER              |                     |     5 |  8575 |     7 |
|   9 |          NESTED LOOPS OUTER             |                     |     5 |  8280 |     6 |
|  10 |           NESTED LOOPS OUTER            |                     |     5 |  7785 |     5 |
|  11 |            NESTED LOOPS OUTER           |                     |     5 |  7290 |     4 |
|  12 |             NESTED LOOPS OUTER          |                     |     5 |  4130 |     3 |
|* 13 |              TABLE ACCESS BY INDEX ROWID| S_PROD_INT          |     5 |  2960 |     2 |
|* 14 |               INDEX RANGE SCAN          | S_PROD_INT_M53      |   326 |       |     5 |
|  15 |              TABLE ACCESS BY INDEX ROWID| S_PROD_LN           |     1 |   234 |     1 |
|* 16 |               INDEX UNIQUE SCAN         | S_PROD_LN_P1        |     1 |       |       |
|  17 |             TABLE ACCESS BY INDEX ROWID | S_PROD_INT_CRSE     |     1 |   632 |     1 |
|* 18 |              INDEX RANGE SCAN           | S_PROD_INT_CRSE_U1  |     1 |       |     1 |
|  19 |            TABLE ACCESS BY INDEX ROWID  | S_LIT               |     1 |    99 |     1 |
|* 20 |             INDEX UNIQUE SCAN           | S_LIT_P1            |     1 |       |       |
|  21 |           TABLE ACCESS BY INDEX ROWID   | S_LIT               |     1 |    99 |     1 |
|* 22 |            INDEX UNIQUE SCAN            | S_LIT_P1            |     1 |       |       |
|  23 |          TABLE ACCESS BY INDEX ROWID    | S_PROD_INT_TNTX     |     1 |    59 |     1 |
|* 24 |           INDEX RANGE SCAN              | S_PROD_INT_TNTX_U1  |     1 |       |     1 |
|  25 |         TABLE ACCESS BY INDEX ROWID     | S_VOD               |     1 |    96 |     1 |
|* 26 |          INDEX RANGE SCAN               | S_VOD_U2            |     1 |       |     2 |
|* 27 |        TABLE ACCESS BY INDEX ROWID      | S_VOD_VER           |     1 |    45 |     1 |
|* 28 |         INDEX RANGE SCAN                | S_VOD_VER_U1        |     2 |       |     4 |
|  29 |       TABLE ACCESS BY INDEX ROWID       | S_ISS_OBJ_DEF       |     1 |    50 |     1 |
|* 30 |        INDEX RANGE SCAN                 | S_ISS_OBJ_DEF_M3    |     1 |       |     2 |
|  31 |      TABLE ACCESS BY INDEX ROWID        | S_CTLG_CAT          |     1 |    18 |     1 |
|* 32 |       INDEX UNIQUE SCAN                 | S_CTLG_CAT_P1       |     1 |       |       |
|  33 |     TABLE ACCESS BY INDEX ROWID         | S_CTLG_CAT          |     1 |    18 |     1 |
|* 34 |      INDEX UNIQUE SCAN                  | S_CTLG_CAT_P1       |     1 |       |       |
|  35 |    TABLE ACCESS BY INDEX ROWID          | S_CTLG_CAT          |     1 |    14 |     1 |
|* 36 |     INDEX UNIQUE SCAN                   | S_CTLG_CAT_P1       |     1 |       |       |
|  37 |   NESTED LOOPS                          |                     |     1 |   137 |     7 |
|  38 |    NESTED LOOPS                         |                     |     4 |   456 |     6 |
|  39 |     NESTED LOOPS                        |                     |     4 |   364 |     5 |
|  40 |      NESTED LOOPS                       |                     |     2 |   142 |     4 |
|  41 |       NESTED LOOPS                      |                     |     1 |    53 |     3 |
|  42 |        NESTED LOOPS                     |                     |     3 |   120 |     2 |
|* 43 |         INDEX RANGE SCAN                | S_CTLG_CAT_PROD_U1  |     3 |    57 |     2 |
|* 44 |         TABLE ACCESS BY INDEX ROWID     | S_CTLG_CAT          |     1 |    21 |     1 |
|* 45 |          INDEX UNIQUE SCAN              | S_CTLG_CAT_P1       |     1 |       |       |
|* 46 |        TABLE ACCESS BY INDEX ROWID      | S_CTLG              |     1 |    13 |     1 |
|* 47 |         INDEX UNIQUE SCAN               | S_CTLG_P1           |     1 |       |       |
|  48 |       TABLE ACCESS BY INDEX ROWID       | S_PARTY_CTGRY       |     2 |    36 |     1 |
|* 49 |        INDEX RANGE SCAN                 | S_PARTY_CTGRY_F1    |     2 |       |     1 |
|  50 |      TABLE ACCESS BY INDEX ROWID        | S_PARTY_RPT_REL     |     2 |    40 |     1 |
|* 51 |       INDEX RANGE SCAN                  | S_PARTY_RPTREL_F50  |     2 |       |     1 |
|  52 |     TABLE ACCESS BY INDEX ROWID         | S_PARTY_PER         |     1 |    23 |     1 |
|* 53 |      INDEX RANGE SCAN                   | S_PARTY_PER_F1      |     1 |       |     2 |
|* 54 |    INDEX RANGE SCAN                     | S_PARTY_PER_U1      |     1 |    23 |     3 |
|  55 |   NESTED LOOPS                          |                     |     1 |   157 |     8 |
|  56 |    NESTED LOOPS                         |                     |     1 |   144 |     7 |
|  57 |     NESTED LOOPS                        |                     |     1 |   125 |     6 |
|  58 |      NESTED LOOPS                       |                     |     1 |   104 |     5 |
|  59 |       NESTED LOOPS                      |                     |     1 |    86 |     4 |
|  60 |        NESTED LOOPS                     |                     |     4 |   264 |     3 |
|  61 |         NESTED LOOPS                    |                     |     2 |    86 |     2 |
|* 62 |          INDEX RANGE SCAN               | S_PARTY_PER_U1      |     2 |    46 |     4 |
|  63 |          TABLE ACCESS BY INDEX ROWID    | S_POSTN             |     1 |    20 |     1 |
|* 64 |           INDEX UNIQUE SCAN             | S_POSTN_P1          |     1 |       |       |
|* 65 |         INDEX RANGE SCAN                | S_PARTY_PER_U1      |     2 |    46 |     3 |
|* 66 |        INDEX RANGE SCAN                 | S_PARTY_RPT_REL_U2  |     1 |    20 |     1 |
|* 67 |       INDEX RANGE SCAN                  | S_PARTY_CTGRY_U1    |     1 |    18 |     1 |
|* 68 |      TABLE ACCESS BY INDEX ROWID        | S_CTLG_CAT          |     1 |    21 |     1 |
|* 69 |       INDEX UNIQUE SCAN                 | S_CTLG_CAT_P1       |     1 |       |       |
|* 70 |     INDEX RANGE SCAN                    | S_CTLG_CAT_PROD_U1  |     1 |    19 |     1 |
|* 71 |    TABLE ACCESS BY INDEX ROWID          | S_CTLG              |     1 |    13 |     1 |
|* 72 |     INDEX UNIQUE SCAN                   | S_CTLG_P1           |     1 |       |       |
|  73 |   NESTED LOOPS                          |                     |     1 |    58 |     3 |
|  74 |    NESTED LOOPS                         |                     |     3 |   135 |     2 |
|* 75 |     INDEX RANGE SCAN                    | S_CTLG_CAT_PROD_U1  |     2 |    38 |     2 |
|* 76 |     TABLE ACCESS BY INDEX ROWID         | S_CTLG_CAT          |     1 |    26 |     1 |
|* 77 |      INDEX UNIQUE SCAN                  | S_CTLG_CAT_P1       |     1 |       |       |
|* 78 |    TABLE ACCESS BY INDEX ROWID          | S_CTLG              |     1 |    13 |     1 |
|* 79 |     INDEX UNIQUE SCAN                   | S_CTLG_P1           |     1 |       |       |
-----------------------------------------------------------------------------------------------

This SQL involves join operations on nearly forty tables and indexes, all of them nested loop joins — a typical Siebel application. Its execution plan is so complex that I had no desire to read it; since analyzing the plan any further would have offended my "laziness" too much, my first instinct was to start from the wait interface — after all, OWI is "useless". Because I could not work on the system remotely, and repeated rounds of information gathering would have made the whole process painfully slow, I asked the other side in one go for the 10046 and errorstack trace files, the latch dumps, and a complete RDA report of the database.

Since the symptom is latch free waits, we first need to identify the latch involved before any further analysis. Let's start with the 10046 trace:

cat siebdb1_ora_15465.trc | grep "nam='latch free'" | awk '{print $7}' | sort | uniq -c
1 p1=-4611685984823284696
169 p1=-4611685984823317640
1 p1=-4611685988591416672
1 p1=-4611685988592487696
2 p1=-4611685988692060472
1 p1=-4611685988694684144
1 p1=-4611685988705526816
1 p1=-4611685988733017136
1 p1=-4611685988779484112
1 p1=-4611685988784059264
1 p1=-4611685988787329592
1 p1=-4611685988788323152
1 p1=-4611685988800312448
2 p1=-4611685988805347840
24 p1=-4611685988806714976
2 p1=-4611685988837854328
7 p1=-4611685988841028520
1 p1=-4611685988869432648
1 p1=-4611685988871320920

The processed output above shows that various latch free waits occurred during execution, the most frequent being the latch at p1=-4611685984823317640. For the latch free wait event, p1 is the address of the latch the session is waiting for. -4611685984823317640 converts to hexadecimal as C0000007D2F56B78. Let's see what this latch actually is:

 c0000007d2f56b78 cache buffer handles level=3
        Location from where latch is held: kcbzfs:
        Context saved from call: 0
        state=busy
  gotten 2734921346 times wait, failed first 656272521 sleeps 44808579
  gotten 0 times nowait, failed: 0
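
The p1-to-address conversion used above (a signed 64-bit trace value reinterpreted as an unsigned hex address) can be checked with a quick sketch; this is just arithmetic for verification, not DBA tooling:

```python
# Map the signed 64-bit p1 value from the 10046 trace to the latch
# address reported in the latch dump by reinterpreting it as
# unsigned two's complement.
def p1_to_latch_addr(p1: int) -> str:
    return format(p1 & 0xFFFFFFFFFFFFFFFF, '016x')

print(p1_to_latch_addr(-4611685984823317640))  # c0000007d2f56b78
```

The result matches the `c0000007d2f56b78 cache buffer handles` line in the dump.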

When a session needs to pin a buffer header, it must first obtain a buffer handle, and obtaining a buffer handle requires acquiring the cache buffer handles latch. To avoid excessive contention on this latch, each session is allowed to cache a small set of buffer handles, also called the reserved set. The size of this reserved set is capped by the hidden parameter _db_handles_cached (default 5), so a session running SQL that is not especially complex does not have to acquire the latch over and over.
At the same time, some very complex statements need to pin large numbers of buffers simultaneously; in that case, if free handles exist, a session is allowed to request handles beyond the 5 cached buffer handles it holds. To cap the resources any single session can occupy, Oracle's internal algorithm tries to let each process pin only a "fair share" of buffers: the total number of handles a single session can hold at one time equals db_block_buffers/processes, a value exposed by the hidden parameter _cursor_db_buffers_pinned. Note also that cache buffer handles has no child latches, only the parent latch itself, so when a system runs many complex SQL statements and the handles sessions can cache fall far short of demand, this latch becomes a bottleneck for the whole system and can even give rise to potential deadlock/starvation problems.
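
The db_block_buffers/processes relationship explains why an inflated processes setting hurts: a minimal sketch with assumed round numbers (not figures from this system) shows how doubling processes halves each session's pin budget:

```python
# _cursor_db_buffers_pinned is derived roughly as db_block_buffers / processes,
# so the per-session pin budget shrinks as processes grows.
def pinned_budget(db_block_buffers: int, processes: int) -> int:
    return db_block_buffers // processes

buffers = 1_500_000                  # hypothetical buffer cache size in blocks
print(pinned_budget(buffers, 3000))  # 500 buffers pinnable per session
print(pinned_budget(buffers, 6000))  # 250 -- doubling processes halves the budget
```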
Combining this with the actual figures from the trace, the cache buffer handles hit ratio is 1 - 656272521/2734921346 = 76%. As a rule of thumb, when a latch's hit ratio drops below 95% we can consider contention on it severe; here a parent latch's hit ratio has fallen below 80%, so it is hard to imagine how badly performance must have degraded.
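
The hit ratio can be recomputed directly from the latch dump figures (gets = 2734921346, "failed first" misses = 656272521):

```python
# Hit ratio = 1 - misses/gets, using the willing-to-wait figures
# from the cache buffer handles latch dump above.
gets, misses = 2_734_921_346, 656_272_521
hit_ratio = 1 - misses / gets
print(f"{hit_ratio:.0%}")  # 76%
```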
Digging through the RDA my colleague sent me, the resource limit section shows that the instance's session high-water mark never exceeded 2500, yet processes was set to 6000. This is a mistake many junior DBAs and integrators make: they assume that a generous processes value wards off ORA-00020 errors and leaves nothing to worry about, not realizing that many of Oracle's hidden parameters are derived from the processes count. Total resources are finite, so the more processes there are, the fewer resources each individual session can theoretically command — a principle that holds almost universally, not just in Oracle.
Most likely the oversized processes parameter depressed _cursor_db_buffers_pinned, while the problem SQL, besides being extremely complex, used nested loop joins throughout, which requires pinning buffers of many different tables and indexes at the same moment. At the start of execution the session managed to pin part of the buffers, but on reaching the ceiling set by _cursor_db_buffers_pinned it hung while still holding the cache buffer handles latch, which in turn blocked a large number of other sessions on latch free waits.
My main recommendation to my colleague at the time was to lower the processes parameter; judging from the system's V$RESOURCE_LIMIT dynamic view, setting processes to 3000 would be more than enough. At the same time we could raise the hidden parameter _db_handles_cached to relieve contention on this latch.

A few days later my colleague wrote back to say the developers had worked around the problem by other means, but I still suggested that any future offline window include changing the processes and _db_handles_cached parameters.

A few months after this case, I happened to come across the following note on MOS:

Hdr: 9328032 10.2.0.3 RDBMS 10.2.0.3 BUFFER CACHE PRODID-5 PORTID-23
Abstract: INCREASE IN CACHE BUFFER CHAINS/HANDLES WAITS

*** 01/29/10 05:41 am ***
TAR:
----
3-1375394871

PROBLEM:
--------
Today, and despite no significant change in workload we have increased waits
on these two latches:

29Jan2010_1030_1045 : Top 5 Timed Events                                         Avg %Total
29Jan2010_1030_1045 : ~~~~~~~~~~~~~~~~~~                                        wait   Call
29Jan2010_1030_1045 : Event                                 Waits    Time (s)   (ms)   Time Wait Class
29Jan2010_1030_1045 : ------------------------------ ------------ ----------- ------ ------ ----------
29Jan2010_1030_1045 : db file sequential read           4,912,142      50,169     10   40.7   User I/O
29Jan2010_1030_1045 : CPU time                                         45,456          36.9
29Jan2010_1030_1045 : latch: cache buffers chains       1,347,911      12,255      9   10.0 Concurrenc
29Jan2010_1030_1045 : latch: cache buffer handles       1,364,489      10,662      8    8.7      Other
29Jan2010_1030_1045 : log file sync                       482,281       7,101     15    5.8     Commit
29Jan2010_1030_1045 :
-------------------------------------------------------------

DIAGNOSTIC ANALYSIS:
--------------------
See below

WORKAROUND:
-----------
None

RELATED BUGS:
-------------

REPRODUCIBILITY:
----------------

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------

24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------

DIAL-IN INFORMATION:
--------------------

IMPACT DATE:
------------ 
Notes so far:

There seems to be 5 - 12% increase across a number of statistics
between the OK sample and the problem sample.
eg:                          Total           per second  per TX
  26th> execute count       6,322,503        7,025.2      14.6
  29th> execute count       6,828,911        7,552.3      14.7
        8% increase in execute count

  26th> data blocks consistent reads - u 16,567,225  18,408.4  38.2
  29th> data blocks consistent reads - u 18,700,073  20,681.1  40.3
        12.6% increase in consistent read undo records applied

cache buffer handles
~~~~~~~~~~~~~~~~~~~~
 There is about an 8% increase in gets of this latch between
 the two reports:
                               Gets        miss%   wait time
  26th> cache buffer handles  80,320,092   21.0        56
  29th> cache buffer handles  87,158,750   55.9     10661

 And the latch was already showing 21% miss rate before so was
 showing signs of being close to becoming a problem.

 The latch misses stats show:
                            Location      Sleeps   Waiter Sleeps
   cache buffer handles     kcbzgs        726,520  832,967
   cache buffer handles     kcbzfs        638,106  531,875

 This latch is used just to get / release buffer state objects.

 Each process caches up to "_db_handles_cached" of these against
 its own process so once we are warmed up and have processes
 then we should only use this latch if we are after pinning more
 than 5 buffers in the process state.

 As the misses seem balanced across kcbzgs and kcbzfs then the
 gets do not appear to be due to warming up so this points at
 SQLs on the system needing to pin more than 5 buffers at a
 time. If we increase "_db_handles_cached" then we should be able
 to alleviate the load on this latch.

 eg: See PID 4174 from yesterday we see it has 9 buffers pinned
     and is running:
       SELECT   /*+ LEADING (PROD, ITM, ORDX)
          INDEX (ITM S_ORDER_ITEM_BT_W1_X )*/ ...
     So we can see there are SQLs needing many more than 5 buffers
     pinned at a time.
*** 01/29/10 10:06 am ***
cache buffers chains
~~~~~~~~~~~~~~~~~~~~            GETS           Miss%
 26th>  cache buffers chains    866,844,547    1.0    0.0     83
 29th>  cache buffers chains    900,769,954    1.4    0.1  12258
        4% increase in gets , large increase in wait time.

 Previous miss rate was 1%, now 1.4%.

 James already checked and mostly the spread looks reasonable
  across child latches.
                                           per second
     26th>     Transactions:                481.56
     29th>     Transactions:                513.79
               6.5% more transactions processed.

 This is looking like increased load is starting to see the
 latch contention. It may be sensible here to look at the
 higher "buffer gets" SQLs to see if they really are well
 tuned. eg: 8y9msp8p3gv23 in the 1130-1145 reports are
 showing increased executions but also increased buffers
 per execution. It would seem sensible to check out the
 plans and use of the top buffer gets SQLs.

 It would also be wise to confirm we are not hitting bug 6455161
 as that affects this release and I cannot see that patch in place.
 ie: Verify:
     select owner||'.'||Object_name||'.'||subobject_name
       from dba_objects
      where object_id<>data_object_id
        and object_type like 'INDEX%'
      order by owner,object_name,subobject_name;
 That would not be a difference from 26th but if this is hit
 you gets may be far higher than needed for certains types of
 query.
*** 01/29/10 02:14 pm *** (CHG: Sta->11)
*** 01/29/10 02:14 pm ***
*** 01/29/10 02:40 pm *** (CHG: Sta->16)
*** 01/29/10 03:03 pm ***
*** 01/29/10 11:33 pm ***
Many of the indexes are on SIEBEL objects so
bug 6455161 looks like it could occur here.
*** 01/30/10 12:02 am ***
*** 01/30/10 12:41 am ***
I checked a few top buffer gets SQLs from older AWR that we have
awrrpt_1_67673_67677 . It looks like the heaviest SQLs are
probably not using indexes which are exposed to 6455161 but a couple
may be:

 av5skvp0mkxm1
  SELECT /*+ ALL_ROWS */ T3.CONFLICT_ID, T3.LAST_UPD ...
  FROM SIEBEL.S_LST_OF_VAL T1, SIEBEL.S_SRM_ACTION T2, SIEBEL.S_SRM_ACTION T3

   ...

  Indexes on the q1.log list which might be used by that query in NL:
    SIEBEL.S_SRM_ACTION_M1    SIEBEL.S_SRM_ACTION_P1
    SIEBEL.S_SRM_ACTION_U1
    SIEBEL.S_SRM_ACT_PARAM_P1 SIEBEL.S_SRM_ACT_PARAM_U1

 7khg9mx4cv5c5
  Queries from SIEBEL.S_ORG_EXT with lots of other tables

  Might use SIEBEL.S_ORG_EXT_M1 from the list
*** 01/30/10 12:59 am ***
=====
>>> Side issue unrelated to the issue here:
      From patch list patches_28012010.lis
      customer has 6868080 installed. This was superceded by 8244734
=====
*** 01/30/10 07:06 am *** (CHG: Sta->10)
*** 01/30/10 07:06 am ***
Suggestions for this:

a> Set "_db_handles_cached" >= 10

b> Get and install a patch for bug 6455161
    This seems like this is minimal impact at present
    as problem indexes are not used in the top buffer
    gets SQLs BUT one index rebuild could lead to mass
    problems, and it may give some small reductions
    in buffer gets on certain SQLs, just not the top
    ones.
    It is a small focused fix with no clashes on current
    list.

c> Double check some of the top buffer gets SQL plans.
    They may be the best they could be but are worth
    checking as there is little in the AWRs to show
    any special reason for cache buffers chains increase
    other than the slightly higher workload.
*** 02/01/10 07:32 am ***

>> SQL "av5skvp0mkxm1" may be susceptible to 6455161.
*** 02/01/10 08:03 am ***
Reviewing level of "_db_handles_cached" as some of the top
SQLs seem very deep NL outer.

Testing this here.

*** 02/01/10 11:05 am ***
For repeatedly executed deep NL SQLs higher handle caching should
improve loading on the "cache buffer handles" latch, but it
has no impact on the chains latch as NLs and NL outers keep buffers
pinned anyway.

So suggestions are still>>
 Set "_db_handles_cached" >= 40 (given nested depth of some SQL)
   See if extra memory should be added to shared pool to accomodate that

 Install 6455161

 Drill more into SQL to see if any can be improved esp wrt buffer
  gets needed .
 Also
   Recheck V$latch_children spread for cache buffers chains
   and recheck X$BH blocks on warmest children
   See if there is any session skew on the latch waits
   (any sessions more prone to waits than others)

The case described in this note resembles ours: on investigation, the most "suspicious" SQL statements there also contained large numbers of nested loop outer joins. This appears to be a problem common to Siebel applications; the customer in the note likewise first identified cache buffer handles latch waits as the symptom, and Oracle MOS ultimately recommended:

1. Increase _db_handles_cached to 10 or more, later revised upward to 40 or more.

2. The customer's database in this note was 10.2.0.3 on SPARC (64-bit); on that basis MOS recommended applying the one-off patch for Bug 6455161.

3. Review the execution plans of SQL statements with high buffer gets.
