有这样一个case , 用户的10g产品数据库中有一张按照月份划分的RANGE分区表 , 其范围为Less than (maxvalue)的最后一个Partition分区总是为空Empty的, 用户每隔半年会对这个MaxValue Partition 执行Split Partition 操作, 以便存放后续月份的数据, 同时这个表上还有一个非分区索引Nonpartitioned indexes。
满以为Split 这样一个Empry Partition会是一个回车就结束的任务, 但是Performance issue就在这平淡无奇的分区维护工作中出现了, 实际在执行”Alter Table Split partition Pn at (value) into …” 的DDL语句时,发现需要花费超过十多分钟才能完成一次Split。问题就在于,如果是有大量数据的Partition分区 , Split 操作慢一些也是正常的(expected预期内的) , 但是实际这里的Max Partition总是为空的,没有任何一条数据, 为什么Split 一个空的分区也会是这种蜗牛的速度呢?
我们来模拟这个现象, 首先创建一张分区表,Maxvalue的Partition是Empty的,且有一个普通索引:
SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi PL/SQL Release 10.2.0.1.0 - Production CORE 10.2.0.1.0 Production TNS for Linux: Version 10.2.0.1.0 - Production NLSRTL Version 10.2.0.1.0 - Production SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.askmac.cn & www.askmac.cn SQL> CREATE TABLE Maclean 2 ( "OWNER" VARCHAR2(30), 3 "OBJECT_NAME" VARCHAR2(128), 4 "SUBOBJECT_NAME" VARCHAR2(30), 5 "OBJECT_ID" NUMBER, 6 "DATA_OBJECT_ID" NUMBER, 7 "OBJECT_TYPE" VARCHAR2(19), 8 "CREATED" DATE, 9 "LAST_DDL_TIME" DATE, 10 "TIMESTAMP" VARCHAR2(19), 11 "STATUS" VARCHAR2(7), 12 "TEMPORARY" VARCHAR2(1), 13 "GENERATED" VARCHAR2(1), 14 "SECONDARY" VARCHAR2(1) 15 ) nologging 16 partition by range(object_id) 17 (partition p1 values less than (99999) tablespace users, 18 partition p2 values less than (maxvalue) tablespace users); Table created. SQL> insert /*+ append */ into maclean select * from maclean1; 38736384 rows created. SQL> commit; Commit complete. SQL> create index ind_obj on maclean(DATA_OBJECT_ID,OBJECT_ID,LAST_DDL_TIME,TIMESTAMP,object_type,owner,status) nologging parallel 2 ; Index created. SQL> alter index ind_obj noparallel; Index altered. SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2); PL/SQL procedure successfully completed. SQL> select num_rows from dba_tables where table_name='MACLEAN'; NUM_ROWS ---------- 38818438 SQL> select * from maclean partition (p2); no rows selected /* Maclean表有大量的数据 ,但是都在p1分区中, p2分区没有任何数据 */
我们执行Split partition 的DDL 语句,并使用10046 level 12 event监控该过程:
oradebug setmypid; oradebug event 10046 trace name context forever,level 12; SQL> alter table maclean split partition p2 at (100001) 2 into (partition p3, partition p4); Table altered. [oracle@vrh8 ~]$ tkprof /s01/admin/G10R21/udump/g10r21_ora_4896.trc g10r21_ora_4896.tkf TKPROF: Release 10.2.0.1.0 - Production on Thu Nov 17 23:42:48 2011 Copyright (c) 1982, 2005, Oracle. All rights reserved. 从tkf 文件中可以找出以下内容: alter table maclean split partition p2 at (100001) into (partition p3, partition p4) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.13 0.30 20 1139 0 0 Execute 1 0.01 0.18 3 6 33 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.14 0.48 23 1145 33 0 select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P2") where ( ( ( ( "OBJECT_ID" < 100001 ) ) ) ) and rownum < 2 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 24.85 39.15 371276 385828 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 24.85 39.15 371276 385828 0 0 Misses in library cache during parse: 1 Optimizer mode: FIRST_ROWS Parsing user id: SYS (recursive depth: 1) Rows Row Source Operation ------- --------------------------------------------------- 0 COUNT STOPKEY (cr=385828 pr=371276 pw=0 time=39153836 us) 0 TABLE ACCESS BY GLOBAL INDEX ROWID MACLEAN PARTITION: 2 2 (cr=385828 pr=371276 pw=0 time=39153817 us) 38736384 INDEX FULL SCAN IND_OBJ (cr=385828 pr=371276 pw=0 time=309891137 us)(object id 52832) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 371276 0.08 19.46 latch: cache buffers lru chain 1 0.00 0.00 select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P2") where ( ( ( ( "OBJECT_ID" >= 100001 OR "OBJECT_ID" IS NULL ) ) ) ) and rownum < 2 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 0.00 0.00 0 3 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 0.00 0.00 0 3 0 0 Misses in library cache during parse: 1 Optimizer mode: FIRST_ROWS Parsing user id: SYS (recursive depth: 1) Rows Row Source Operation ------- --------------------------------------------------- 0 COUNT STOPKEY (cr=3 pr=0 pw=0 time=83 us) 0 PARTITION RANGE SINGLE PARTITION: 2 2 (cr=3 pr=0 pw=0 time=63 us) 0 TABLE ACCESS FULL MACLEAN PARTITION: 2 2 (cr=3 pr=0 pw=0 time=36 us)
可以看到在执行”Alter table Split partition”的时候该DDL 语句产生了另外2条递归SQL(recursive sql)被调用,即上例中红色标注的SQL语句,这2条递归SQL分别以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL 和 “OBJECT_ID” < 100001 作为条件查询P2分区中的数据, Split Partition的DDL需要使用这2条SQL来找出是否存在满足分隔条件过滤的数据(注意语句中有rownum<2 , 所以实际最多也只返回1条数据,Oracle这样来判定分隔条件的左端或右端是否有数据)。
但是这里可以看到,其中 以”OBJECT_ID” < 100001 作为条件的语句运行耗时39.15s, 产生了大量的逻辑和物理读, 究其原因是该SQL的执行计划采用了Index FULL SCAN ,而另外一条以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL 作为条件的递归SQL语句则采用了TABLE ACCESS FULL MACLEAN PARTITION, 因为实际P2分区中是没有任何数据的,所以后者运行时间是us级别的, 而前者所要扫描的是一整个没有分区的索引, 这产生了大量的”db file sequential read”等待事件, 我们再来看一下其执行计划:
SQL> explain plan for
2 select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
3 1
4 from "SYS"."MACLEAN" PARTITION("P2")
5 where (((("OBJECT_ID" < 100001))))
6 and rownum < 2;
SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
-------------------------------------------------
Plan hash value: 985826631
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 380K (1)| 01:16:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN | 1 | 13 | 380K (1)| 01:16:01 | 2 | 2 |
|* 3 | INDEX FULL SCAN | IND_OBJ | 38M| | 380K (1)| 01:16:01 | | |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<2)
3 - access("OBJECT_ID"<100001)
filter("OBJECT_ID"<100001)
注意以上递归SQL中的Hint “FIRST_ROWS PARALLEL(“MACLEAN”, 1)”是该Recursive SQL所固有的,换句话说是写死在Split Partition的Oracle代码层里的。
我们可以分析该Recursive SQL 采用INDEX FULL SCAN的原因可能有2种:
1. Split 所指定的分区的分区统计信息不准确,或者已经被清除。 注意一旦我们Split 某个分区后该分区原有统计信息将失效,且被清除。
如下面的例子:
SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2); PL/SQL procedure successfully completed. SQL> col high_value for a20 SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN'; PARTITION_NAME HIGH_VALUE NUM_ROWS BLOCKS ------------------------------ -------------------- ---------- ---------- P1 99999 38789142 533240 P3 100001 0 0 P4 MAXVALUE 0 0 SQL> alter table maclean split partition p4 at (100010) 2 into (partition p5, partition p4); 这里我们再次Split 当前的MAXVALUE分区p4 SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN'; PARTITION_NAME HIGH_VALUE NUM_ROWS BLOCKS ------------------------------ -------------------- ---------- ---------- P1 99999 38789142 533240 P3 100001 0 0 P4 MAXVALUE P5 100010
可以发现Split Partiton 会导致原Partiton的统计信息失效,即便使用dbms_stats.lock_table_stats锁住统计信息也无法避免这种失效。
且单个partiton的统计信息失效并不会导致动态采用的发生(dynamic sampling):
SQL> show parameter dyna
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2
SQL> select * from maclean partition (p4);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9699K| 860M| 29715 (2)| 00:05:57 | | |
| 1 | PARTITION RANGE SINGLE| | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
| 2 | TABLE ACCESS FULL | MACLEAN | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
--------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
124 recursive calls
0 db block gets
23 consistent gets
0 physical reads
0 redo size
1139 bytes sent via SQL*Net to client
458 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> alter system flush shared_pool;
System altered.
SQL> alter session set optimizer_dynamic_sampling=10;
Session altered.
SQL> select * from maclean partition (p4);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9699K| 860M| 29715 (2)| 00:05:57 | | |
| 1 | PARTITION RANGE SINGLE | | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
| 2 | TABLE ACCESS FULL | MACLEAN | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
--------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
5812 recursive calls
0 db block gets
1141 consistent gets
22 physical reads
0 redo size
1139 bytes sent via SQL*Net to client
458 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
139 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',partname=>'P4');
PL/SQL procedure successfully completed.
SQL> select * from maclean partition (p4);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 177 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 177 | 2 (0)| 00:00:01 | 4 | 4 |
| 2 | TABLE ACCESS FULL | MACLEAN | 1 | 177 | 2 (0)| 00:00:01 | 4 | 4 |
--------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
1139 bytes sent via SQL*Net to client
458 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
由于Split Partition 本身会导致分区的统计信息失效, 这叫造成由该Split DDL所生成的递归查询SQL语句在解析时CBO Optimizer优化器无法了解该分区的必要统计信息,所以优化器会根据整张Table的统计信息来估算(根据算法来估,而没有实际的统计操作), 实际统计信息中整张表有38818438 行 , 且共有4个分区, 所以P4分区就顺利成长的拥有38818438 /4 = 9704k Rows了 , 实际上例中红色标注的估算值时9699k rows, 因为CBO 优化器得到的统计信息是该分区中有大量的数据 ,这导致其最终选择了 FULL INDEX SCAN的执行计划 ,而不是去扫描其实是空空如也的分区。
这是我们在 10g 中执行Split Partition 操作时需要密切注意的一个问题, 解决方法是没执行一次Split Partition DDL语句之前都收集一遍MaxValue Partiton 的统计信息, 因为该分区是空的 ,所以这种统计是十分迅速的:
exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME');
另一种手段则是在每次Split 之前都 手动修改Maxvalue分区的统计信息, 这样做会更快一些:
exec dbms_stats.set_table_stats(ownname => '&OWNER',tabname => '&TABNAME',partname => '&PARNAME', numrows => 0,numblks => 0,force => TRUE);
2. 另一个原因是 相关的递归SQL语句 被嵌入了”First Rows”的hint , 该提示会让CBO更倾向于使用索引以便快速返回前几行的结果,注意因为这些递归SQL实际只要求返回一行结果,所以First Rows 可以说是恰当且明智的; 另外在分区表+本地分区的情景中, 即便这个分区是非空的且存有大量的数据, 那么使用索引都可以说是正确的选择。
但是在这里选择INDEX FULL SCAN 恰恰是不明智的, 很显然Oracle开发部门没有为分区表+ 非分区索引(Non-partitioned Indexes) 或 全局索引(global partitioned indexes)的场景考虑该First Rows提示可能带来的后果, 已知在版本10.2.0.2 和 10.2.0.3 上都存在该不恰当的递归SQL hint造成的Split Partiton性能问题,Bug Note<Bug 6435544: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL>说明了该问题:
Hdr: 6435544 10.2.0.2.0 RDBMS 10.2.0.2.0 QRY OPTIMIZER PRODID-5 PORTID-226 Abstract: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL PROBLEM: -------- Split partition operation took more than 45 minutes to complete. Almost all the time is taken up by the following SQL - select /*+ FIRST_ROWS PARALLEL("D_CUSTOMER_ORDER_ITEM_CHANNELS", 8) */ 1 from "BOOKER"."D_CUSTOMER_ORDER_ITEM_CHANNELS" PARTITION ("COIC101_MAX") where ( ( ( ( "LEGAL_ENTITY_ID" < 101 ) ) OR ( "LEGAL_ENTITY_ID" = 101 AND ( "ORDER_DAY" < TO_DATE(' 2007-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN') ) ) ) ) and rownum < 2 DIAGNOSTIC ANALYSIS: -------------------- a. Table is partitioned on - (LEGAL_ENTITY_ID, ORDER_DAY). Index PK_D_CUST_ORDER_ITEM_CHANNELS is on - (CUSTOMER_ORDER_ITEM_ID, MARKETPLACE_ID, LEGAL_ENTITY_ID, ORDER_DAY) b. Table has 555 partitions. Index is a global index. c. From the tkprof output - call count cpu elapsed disk query current rows ----- ------ -------- ---------- ---------- ---------- -------- ---- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 405.50 2796.28 1408207 1444973 0 0 ----- ------ -------- ---------- ---------- ---------- -------- ---- total 3 405.50 2796.28 1408207 1444973 0 0 Misses in library cache during parse: 1 Optimizer mode: FIRST_ROWS Parsing user id: 108 (recursive depth: 1) Rows Row Source Operation ------- --------------------------------------------------- time=6 us)(object id 1160811) d. Although we have partition_name in the query, we go for INDEX FULL scan because of hard coded hint FIRST_ROWS. A FTS of the concerned partition would be much faster in this case. WORKAROUND: ----------- Drop the global index which would force an FTS on a single partition. RELEASE NOTES: ]] Poor performance was possible for ALTER TABLE SPLIT PARTITION if there ]] was an index on the partition key. REDISCOVERY INFORMATION: If you get poor performance for an ALTER TABLE SPLIT PARTITION operation and there is an index on the partition key, you may be hitting this bug. WORKAROUND: None
Metalink给出的Workaround方案是将分区表上的global index 全局索引drop 掉, 这样可以让CBO只能选择对single partition的FULL TABLE SCAN。
实际上这个Solution并很不能让人满意,在产品环境中我们不可能随意drop掉某张关键表上的索引,所以这个solution的可操作性很低。
补充:我们来看一下First Rows Hint 在CBO计算成本时如何起作用的:
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10053 trace name context forever,level 1;
Statement processed.
SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
2 1
3 from "SYS"."MACLEAN" PARTITION("P4")
4 where (((("OBJECT_ID" < 100010))))
5 and rownum < 2; Explained. SQL> oradebug tracefile_name;
10053 trace content ================================================
*************************************
PARAMETERS WITH ALTERED VALUES
******************************
optimizer_mode_hinted = true
optimizer_mode = first_rows
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: MACLEAN Alias: MACLEAN Partition [3]
#Rows: 0 #Blks: 1 AvgRowLen: 0.00
#Rows: 0 #Blks: 1 AvgRowLen: 0.00
Index Stats::
Index: IND_OBJ Col#: 5 4 8 9 6 1 10
LVLS: 3 #LB: 380544 #DK: 50447 LB/K: 7.00 DB/K: 777.00 CLUF: 39208542.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#4): OBJECT_ID(NUMBER) Part#: 3
AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
Column (#4): OBJECT_ID(NUMBER)
AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
Table: MACLEAN Alias: MACLEAN
Card: Original: 0 Rounded: 1 Computed: 0.00 Non Adjusted: 0.00
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 7121
Resp_io: 2.00 Resp_cpu: 7121
kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2
Access Path: index (skip-scan)
SS sel: 0.0000e+00 ANDV (#skips): 4073
SS io: 32584.00 vs. table scan io: 2.00
Skip Scan rejected
Access Path: index (FullScan)
Index: IND_OBJ
resc_io: 380547.00 resc_cpu: 10551751028
ix_sel: 1 ix_sel_with_filters: 1
Cost: 382007.38 Resp: 382007.38 Degree: 1
Best:: AccessPath: IndexRange Index: IND_OBJ
Cost: 382007.38 Degree: 1 Resp: 382007.38 Card: 0.00 Bytes: 0
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]: MACLEAN[MACLEAN]#0
***********************
Best so far: Table#: 0 cost: 382007.3822 card: 0.0000 bytes: 13
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000
*********************************
Number of join permutations tried: 1
*********************************
prefetching is on for IND_OBJ
Final - First Rows Plan: Best join order: 1
Cost: 382007.3822 Degree: 1 Card: 1.0000 Bytes: 13
Resc: 382007.3822 Resc_io: 380547.0000 Resc_cpu: 12512178128
Resp: 382007.3822 Resp_io: 380547.0000 Resc_cpu: 12512178128
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ FIRST_ROWS NO_PARALLEL ("MACLEAN") */ 1 "1" FROM "SYS"."MACLEAN" PARTITION ("P4")
"MACLEAN" WHERE ROWNUM<2 AND "MACLEAN"."OBJECT_ID"<100010
kkoqbc-end
: call(in-use=46464, alloc=49080), compile(in-use=39288, alloc=40552)
apadrv-end: call(in-use=46464, alloc=49080), compile(in-use=40072, alloc=40552)
sql_id=ff1ft3uxsq105.
Current SQL statement for this session:
explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
1
from "SYS"."MACLEAN" PARTITION("P4")
where (((("OBJECT_ID" < 100010))))
and rownum < 2
============
Plan Table
============
-------------------------------------------------------+-----------------------------------+---------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop |
-------------------------------------------------------+-----------------------------------+---------------+
| 0 | SELECT STATEMENT | | | | 373K | | | |
| 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY GLOBAL INDEX ROWID | MACLEAN | 1 | 13 | 373K | 01:16:25 | 4 | 4 |
| 3 | INDEX FULL SCAN | IND_OBJ | 37M | | 373K | 01:16:25 | | |
-------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
1 - filter(ROWNUM<2)
3 - access("OBJECT_ID"<100010)
3 - filter("OBJECT_ID"<100010)
Content of other_xml column
===========================
db_version : 10.2.0.1
parse_schema : SYS
plan_hash : 985826631
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
FIRST_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX(@"SEL$1" "MACLEAN"@"SEL$1" ("MACLEAN"."DATA_OBJECT_ID" "MACLEAN"."OBJECT_ID"
"MACLEAN"."LAST_DDL_TIME" "MACLEAN"."TIMESTAMP" "MACLEAN"."OBJECT_TYPE" "MACLE
AN"."OWNER" "MACLEAN"."STATUS"))
END_OUTLINE_DATA
*/
可以看到虽然INDEX FULL SCAN的成本(cost:382007)大于 Access Path: TableScan (cost : 2.00) 很多 ,但是optimizer 最终仍然选择了Index Full Scan , 因为其是满足First Rows 要求的执行计划(红色标注部分)。
于是我开始自己寻找workaround的路径,目标是让优化器忽略”First Rows”的影响。我一开始寄望于能够通过设置一些影响CBO计算cost的优化器参数来让optimizer 迷途知返, 包括设置optimizer_index_cost_adj和”_db_file_optimizer_read_count”的值到一个很大水平,但发现并不起作用:
SQL> alter session set "_db_file_optimizer_read_count"=65535; Session altered. SQL> alter session set optimizer_index_cost_adj=10000; Session altered. SQL> alter system flush shared_pool; System altered. SQL> select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 2 1 3 from "SYS"."MACLEAN" PARTITION("P4") 4 where (((("OBJECT_ID" < 100010)))) 5 and rownum < 2; no rows selected Execution Plan ---------------------------------------------------------- Plan hash value: 985826631 --------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | --------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 38M (1)|127:20:09 | | | |* 1 | COUNT STOPKEY | | | | | | | | | 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN | 1 | 13 | 38M (1)|127:20:09 | 4 | 4 | |* 3 | INDEX FULL SCAN | IND_OBJ | 39M| | 38M (1)|127:20:09 | | | ---------------------------------------------------------------------------------------------------------------
得益于好奇心,我以’%optimizer%ignore’的Like语句去查了Oracle的隐式参数表,果然有志者事竟成,最终有所斩获:
SQL> col name for a40 SQL> col value for a20 SQL> col describ for a60 SQL> set linesize 200 pagesize 1400 SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ 2 FROM SYS.x$ksppi x, SYS.x$ksppcv y 3 WHERE x.inst_id = USERENV ('Instance') 4 AND y.inst_id = USERENV ('Instance') 5 AND x.indx = y.indx 6 AND x.ksppinm LIKE '%optimizer%ignore%'; NAME VALUE DESCRIB ---------------------------------------- -------------------- --------------------- _optimizer_ignore_hints TRUE enables the embedded hints to be ignored
在metalink上查了下没有关于该”_optimizer_ignore_hints” 隐式参数的任何有用信息, 就注释来看是可以启用是否忽略SQL中嵌入的HINT提示信息, 我们来具体看以下是否其作用:
SQL> alter system flush shared_pool;
System altered.
SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
2 1
3 from "SYS"."MACLEAN" PARTITION("P4")
4 where (((("OBJECT_ID" < 100010))))
5 and rownum < 2;
Explained.
SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
---------------------------------------------
Plan hash value: 985826631
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 38M (1)|127:20:09 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN | 1 | 13 | 38M (1)|127:20:09 | 4 | 4 |
|* 3 | INDEX FULL SCAN | IND_OBJ | 39M| | 38M (1)|127:20:09 | | |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<2)
3 - access("OBJECT_ID"<100010)
filter("OBJECT_ID"<100010)
SQL> alter session set "_optimizer_ignore_hints"=true;
Session altered.
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10053 trace name context forever , level 1;
Statement processed.
SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
2 1
3 from "SYS"."MACLEAN" PARTITION("P4")
4 where (((("OBJECT_ID" < 100010))))
5 and rownum < 2; Explained. SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
-----------------------------------------------
Plan hash value: 4280157877
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 1 | 13 | 2 (0)| 00:00:01 | 4 | 4 |
|* 3 | TABLE ACCESS FULL | MACLEAN | 1 | 13 | 2 (0)| 00:00:01 | 4 | 4 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<2)
3 - filter("OBJECT_ID"<100010)
惊喜! 该 “_optimizer_ignore_hints”参数确实起到无视HINT提示的作用, 且可以直接在session级别修改,十分方便, 我们透过10053 event来观察该参数是如何其作用的:
*************************************** PARAMETERS USED BY THE OPTIMIZER ******************************** ************************************* PARAMETERS WITH ALTERED VALUES ****************************** _optimizer_ignore_hints = true *************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: MACLEAN Alias: MACLEAN Partition [3] #Rows: 0 #Blks: 1 AvgRowLen: 0.00 #Rows: 0 #Blks: 1 AvgRowLen: 0.00 Index Stats:: Index: IND_OBJ Col#: 5 4 8 9 6 1 10 LVLS: 3 #LB: 380544 #DK: 50447 LB/K: 7.00 DB/K: 777.00 CLUF: 39208542.00 *************************************** SINGLE TABLE ACCESS PATH Column (#4): OBJECT_ID(NUMBER) Part#: 3 AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0 Column (#4): OBJECT_ID(NUMBER) AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0 Table: MACLEAN Alias: MACLEAN Card: Original: 0 Rounded: 1 Computed: 0.00 Non Adjusted: 0.00 Access Path: TableScan Cost: 2.00 Resp: 2.00 Degree: 0 Cost_io: 2.00 Cost_cpu: 7121 Resp_io: 2.00 Resp_cpu: 7121 kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2 Access Path: index (skip-scan) SS sel: 0.0000e+00 ANDV (#skips): 4073 SS io: 32584.00 vs. table scan io: 2.00 Skip Scan rejected Access Path: index (FullScan) Index: IND_OBJ resc_io: 380547.00 resc_cpu: 10551751028 ix_sel: 1 ix_sel_with_filters: 1 Cost: 382007.38 Resp: 382007.38 Degree: 1 Best:: AccessPath: TableScan *************************************** OPTIMIZER STATISTICS AND COMPUTATIONS *************************************** GENERAL PLANS *************************************** Considering cardinality-based initial join order. *********************** Join order[1]: MACLEAN[MACLEAN]#0 *********************** Best so far: Table#: 0 cost: 2.0008 card: 0.0000 bytes: 13 (newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000 ********************************* Number of join permutations tried: 1 ********************************* Final - All Rows Plan: Best join order: 1 Cost: 2.0008 Degree: 1 Card: 1.0000 Bytes: 13 Resc: 2.0008 Resc_io: 2.0000 Resc_cpu: 7121 Resp: 2.0008 Resp_io: 2.0000 Resc_cpu: 7121 Current SQL statement for this session: explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION("P4") where (((("OBJECT_ID" < 100010)))) and rownum < 2 ============ Plan Table ============ -------------------------------------------+-----------------------------------+---------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop | -------------------------------------------+-----------------------------------+---------------+ | 0 | SELECT STATEMENT | | | | 2 | | | | | 1 | COUNT STOPKEY | | | | | | | | | 2 | PARTITION RANGE SINGLE | | 1 | 13 | 2 | 00:00:01 | 4 | 4 | | 3 | TABLE ACCESS FULL | MACLEAN | 1 | 13 | 2 | 00:00:01 | 4 | 4 | -------------------------------------------+-----------------------------------+---------------+ Predicate Information: ---------------------- 1 - filter(ROWNUM<2) 3 - filter("OBJECT_ID"<100010) Content of other_xml column =========================== db_version : 10.2.0.1 parse_schema : SYS plan_hash : 4280157877 Outline Data: /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('10.2.0.1') ALL_ROWS OUTLINE_LEAF(@"SEL$1") FULL(@"SEL$1" "MACLEAN"@"SEL$1") END_OUTLINE_DATA */ Dumping Hints ============= atom_hint=(@=0x988453c0 err=20 resol=1 used=0 token=453 org=1 lvl=2 txt=FIRST_ROWS ) atom_hint=(@=0x988451f8 err=0 resol=1 used=1 token=177 org=1 lvl=3 txt=NO_PARALLEL ("MACLEAN") ) ********** WARNING: SOME HINTS HAVE ERRORS *********
实际 “_optimizer_ignore_hints”参数是起到的作用是使SQL 隐式地回归到默认的optimizer_mode=ALL_ROWS上来, 我们终于战胜了嵌入在SQL语句中的”First Rows” Hint 。
总结
在针对10g 多个版本上的Split Partition 可能因 其DDL所附带的递归SQL 使用了 固定的 而又不恰当的”First rows Hint” 提示而造成语句执行缓慢的问题, 我们可以通过 以下3个方案解决:
A. 通过每次执行Split之前都收集指定分区的统计,并设置”_optimizer_ignore_hints” =true 来规避 分区统计信息失效和 不恰当的”First Rows hint” 所可能带来的问题, 方法如下:
exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME'); alter session set "_optimizer_ignore_hints"=true;
推荐使用方案A
补充:
有同学反映:
”隐藏参数_optimizer_ignore_hints在分区表的split操作中并未启到作用。
从我的测试结果来看,虽然导致split操作慢的根源是FIRST_ROWS优化器模式下的分区表select语句:
select /*+ FIRST_ROWS PARALLEL(“MACLEAN”, 1) */ 1
from
MACLEAN PARTITION (“P4”) where ( ( ( (
“OBJECT_ID” < ‘1000010’ ) ) ) ) and rownum < 2;
不过以上的sql语句在_optimizer_ignore_hints参数调整后仍使用FIRST_ROWS的优化器模式,即没有生效。
我觉得这可能与oracle在执行自己内部命令时,会忽略该参数的设置。(直接执行select的sql语句确实会忽略hint)”
实际通过10053 事件追踪该SPLIT PARTITION DDL语句所生产的递归SQL语句,发现带有”FIRST_ROWS”提示的SELECT语句甚至没有解析的过程, 很有可能是该递归SQL语句直接使用了内部硬编码的存储大纲OUTLINES所导致的。
换句话说之前_optimizer_ignore_hints的隐式参数针对我们手动执行的带有FIRST_ROWS HINT的SELECT语句是有效的,而对于DDL所附带的递归SQL无效。
注:实际在对嵌入了HINT的非recursive SQL语句做调优时,若我们无法修改该SQL的HINT,则还是可以利用到”_optimizer_ignore_hints”的。
想了一下可以通过设置较旧的优化器特性了解决该问题(alter session set optimizer_features_enable=’8.0.0′;),该optimizer_features_enable参数同样可以在session级别设置, 如:
TEST A: SQL> set timing on; SQL> oradebug setmypid; Statement processed. SQL> oradebug event 10046 trace name context forever,level 12; Statement processed. SQL> oradebug tracefile_name; /s01/admin/G10R21/udump/g10r21_ora_13646.trc SQL> alter table maclean split partition p4 at (10000081) into (partition p14, partition p4); Table altered. Elapsed: 00:00:42.50 TEST B: SQL> set timing on; SQL> oradebug setmypid; Statement processed. SQL> oradebug tracefile_name; /s01/admin/G10R21/udump/g10r21_ora_13656.trc SQL> alter session set optimizer_features_enable='8.0.0'; Session altered. Elapsed: 00:00:00.01 SQL> alter table maclean split partition p4 at (10000091) into (partition p15, partition p4); Table altered. Elapsed: 00:00:00.05 PARSING IN CURSOR #2 len=152 dep=1 uid=0 oct=3 lid=0 tim=1291531645417871 hv=2124209225 ad='a74b41f0' select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4") where ( ( ( ( "OBJECT_ID" < 10000091 ) ) ) ) and rownum < 2 END OF STMT PARSE #2:c=1000,e=684,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645417864 BINDS #2: EXEC #2:c=0,e=71,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645417984 FETCH #2:c=0,e=92,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418094 STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=122 us)' STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=95 us)' STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=59 us)' ===================== PARSING IN CURSOR #2 len=177 dep=1 uid=0 oct=3 lid=0 tim=1291531645418799 hv=339345368 ad='a74cedd8' select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4") where ( ( ( ( "OBJECT_ID" >= 10000091 OR "OBJECT_ID" IS NULL ) ) ) ) and rownum < 2 END OF STMT PARSE #2:c=1000,e=589,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645418792 BINDS #2: EXEC #2:c=0,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418908 FETCH #2:c=0,e=66,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418992 STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=92 us)' STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=64 us)' STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=36 us)' WAIT #1: nam='control file sequential read' ela= 31 file#=0 block#=1 blocks=1 obj#=-1 tim=1291531645419383 WAIT #1: nam='control file sequential read' ela= 12 file#=1 block#=1 blocks=1 obj#=-1 tim=1291531645419432 WAIT #1: nam='control file sequential read' ela= 23 file#=0 block#=15 blocks=1 obj#=-1 tim=1291531645419494 WAIT #1: nam='control file sequential read' ela= 10 file#=0 block#=17 blocks=1 obj#=-1 tim=1291531645419556
B. 如果确实遇到了该问题 ,也可以将Index FULL SCAN 所使用的全局索引drop 掉来达到强制使用FULL single table partition SCAN的目的, 实际使用中不推荐
C. 避免使用Global Partitioned index 或 Non-partitioned Index , 而采用Local partitioned index , 这似乎更难做到
D. 也可以通过将原Maxvalue的分区drop掉之后(前提是该分区真的是空的),再添加新分区的做法来绕过该问题
good,有机会用用
补充:
有同学反映:
”隐藏参数_optimizer_ignore_hints在分区表的split操作中并未启到作用。
从我的测试结果来看,虽然导致split操作慢的根源是FIRST_ROWS优化器模式下的分区表select语句:
select /*+ FIRST_ROWS PARALLEL(“MACLEAN”, 1) */ 1
from
MACLEAN PARTITION (“P4”) where ( ( ( (
“OBJECT_ID” < '1000010' ) ) ) ) and rownum set timing on;
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.
SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13646.trc
SQL> alter table maclean split partition p4 at (10000081) into (partition p14, partition p4);
Table altered.
Elapsed: 00:00:42.50
TEST B:
SQL> set timing on;
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13656.trc
SQL> alter session set optimizer_features_enable=’8.0.0′;
Session altered.
Elapsed: 00:00:00.01
SQL> alter table maclean split partition p4 at (10000091) into (partition p15, partition p4);
Table altered.
Elapsed: 00:00:00.05
PARSING IN CURSOR #2 len=152 dep=1 uid=0 oct=3 lid=0 tim=1291531645417871 hv=2124209225 ad=’a74b41f0′
select /*+ FIRST_ROWS PARALLEL(“MACLEAN”, 1) */ 1 from “SYS”.”MACLEAN” PARTITION (“P4”) where ( ( ( ( “OBJECT_ID” < 10000091 ) ) ) ) and rownum = 10000091 OR “OBJECT_ID” IS NULL ) ) ) ) and
rownum < 2
END OF STMT
PARSE #2:c=1000,e=589,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645418792
BINDS #2:
EXEC #2:c=0,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418908
FETCH #2:c=0,e=66,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418992
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=92 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=64 us)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=36 us)'
WAIT #1: nam='control file sequential read' ela= 31 file#=0 block#=1 blocks=1 obj#=-1 tim=1291531645419383
WAIT #1: nam='control file sequential read' ela= 12 file#=1 block#=1 blocks=1 obj#=-1 tim=1291531645419432
WAIT #1: nam='control file sequential read' ela= 23 file#=0 block#=15 blocks=1 obj#=-1 tim=1291531645419494
WAIT #1: nam='control file sequential read' ela= 10 file#=0 block#=17 blocks=1 obj#=-1 tim=1291531645419556
big thanks to maclean.
it works
SQL> alter session set optimizer_features_enable=’8.0.0′;
小结的很好;