11g中引入一些功能强大的hint提示,其中就包括了IGNORE_ROW_ON_DUPKEY_INDEX。其官方定义为:
“The IGNORE_ROW_ON_DUPKEY_INDEX hint applies only to single-table INSERT operations. It is not supported for UPDATE, DELETE, MERGE, or multitable insert operations. IGNORE_ROW_ON_DUPKEY_INDEX causes the statement to ignore a unique key violation for a specified set of columns or for a specified index. When a unique key violation is encountered, a row-level rollback occurs and execution resumes with the next input row. If you specify this hint when inserting data with DML error logging enabled, then the unique key violation is not logged and does not cause statement termination.”
针对具有唯一性约束的键,若程序设计时没有考虑到插入具有重复键值的行会引发ORA-00001 unique constraint violated错误,进而可能导致程序过程终止的问题的话;直接修改程序将会是十分复杂的工程。所幸我们在11g中有了”IGNORE_ROW_ON_DUPKEY_INDEX”提示,在INSERT单表的语句中加入该hint可以让Oracle静默地(silently)忽略那些具有重复键值的插入行,而不触发ORA-00001错误,允许程序继续运行下去,这可以说是一种十分简便的折中方案。要在生产环境中使用该特性,我们有必要对比一下其同使用Exception处理违反唯一约束间的性能差别。
SQL> drop table youyus ; Table dropped. SQL> create table youyus (t1 int ,t2 varchar2(20),t3 varchar2(30)) tablespace users; Table created SQL> create unique index youyus_uk on youyus(t1) tablespace users; Index created /*清理现场,添加唯一约束索引*/ SQL> alter system set optimizer_dynamic_sampling=1; System altered. SQL> alter system flush shared_pool; System altered. SQL> alter system flush buffer_cache; System altered. /* 以下过程在0-30000的整数内随机取200000次值,可以确保尝试INSERT大量重复t1键值的行, 以便测试使用DUP_VAL_ON_INDEX Exception时的各项性能参数;此处以及之后我们都将commit置于loop循环外, 从而避免大量commit影响我们的实验结果*/ declare rnd int; begin /* 使用exception处理重复键值插入违反约束的问题*/ for i in 1 .. 200000 loop BEGIN select round(dbms_random.value * 30000) into rnd from dual; insert into youyus (t1, t2, t3) values (rnd, 'DUPLICATE', 'INSERT TEST'); exception when DUP_VAL_ON_INDEX then continue; end; end loop; commit; end; SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time from v$sql where sql_text like 'declare%exception%'; PLSQL_EXEC_TIME CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME --------------- ---------- ------------ ----------------- 4392268 88296566 92345066 621020 /* 逝去时间92s,CPU时间为88s,PLSQL执行时间4s*/ SQL> select count(*) from youyus; COUNT(*) ---------- 29958 SQL> truncate table youyus; Table truncated. SQL> alter system flush shared_pool; System altered. SQL> alter system flush buffer_cache; System altered. /* 在插入前判断插入值是否违反唯一约束应当是一种不错的想法,不过写起来多少有些"麻烦"*/ declare dup_count int; rnd int; begin /* 使用插入前判断(check before insert)是否违反唯一约束的方式*/ for i in 1 .. 200000 loop BEGIN select round(dbms_random.value * 30000) into rnd from dual; select count(*) into dup_count from youyus where t1 = rnd; IF (dup_count = 0) then insert into youyus (t1, t2, t3) values (rnd, 'DUPLICATE', 'INSERT TEST'); END IF; END; END LOOP; commit; end; SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time from v$sql where sql_text like 'declare%check%'; PLSQL_EXEC_TIME CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME --------------- ---------- ------------ ----------------- 2153769 15709301 18265730 679813 /* PLSQL执行时间缩短到2s,整个过程的CPU时间大幅减少到15s*/ /*** 以上对比可以得出Exception处理是一种CPU敏感操作的结论 ***/ SQL> select count(*) from youyus; COUNT(*) ---------- 29968 SQL> truncate table youyus; Table truncated. SQL> alter system flush shared_pool; System altered. SQL> alter system flush buffer_cache; System altered. declare rnd int; begin /* 使用IGNORE_ROW_ON_DUPKEY_INDEX hint方式*/ for i in 1 .. 200000 loop select round(dbms_random.value * 30000) into rnd from dual; insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(YOUYUS,YOUYUS_UK) */ into youyus (t1, t2, t3) values (rnd, 'DUPLICATE', 'INSERT TEST'); end loop; commit; end; SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time from v$sql where sql_text like 'declare%IGNORE%'; PLSQL_EXEC_TIME CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME --------------- ---------- ------------ ----------------- 2377262 78452903 84209306 623539 SQL> select count(*) from youyus; COUNT(*) ---------- 29959 /*** IGNORE_ROW_ON_DUPKEY_INDEX hint模式下, CPU_TIME对比Exception模式时减少11%,但仍远高于插入前预检查模式; 就修改程序的复杂度而言IGNORE_ROW_ON_DUPKEY_INDEX模式要低于使用Exception模式, 而Exception模式又要低于CHECK_BEFORE_INSERT模式;CHECK_BEFORE_INSERT模式的CPU成本最低,但修改程序时的成本时间最高 ***/ /*需要注意的是IGNORE_ROW_ON_DUPKEY_INDEX提示与我们以往使用的hint略有不同,不正确使用它将导致报错*/ declare rnd int; begin /* 使用IGNORE_ROW_ON_DUPKEY_INDEX hint方式*/ for i in 1 .. 200000 loop select round(dbms_random.value * 30000) into rnd from dual; insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(YOUYUS,I_AM_ERROR) */ into youyus (t1, t2, t3) values (rnd, 'DUPLICATE', 'INSERT TEST'); end loop; commit; end; / ERROR at line 1: ORA-38913: Index specified in the index hint is invalid ORA-06512: at line 7
上述三者各环节耗时图示:
总结一句,IGNORE_ROW_ON_DUPKEY_INDEX为lazy developer专备。
Hdr: 5103126 10.2.0.2.0 RDBMS 10.2.0.2.0 QRY OPTIMIZER PRODID-5 PORTID-212
Abstract: INSERTS TAKEING 2 TO 3X LONGER IN 10.2.0.2 WITH PRESENCE OF UNIQUE INDEX
PROBLEM:
——–
after upgrade from 9.2 to 10.2 inserts
to tables with unique constraints and duplicate rows have doubled and
sometimes tripled in elapsed time.
no changes to the application code, OS, hardware environment etc…
only upgraded from 9.2.0.6 to 10.2.0.1 and now 10.2.0.2
problem can be easily reproduced.
alter session set max_dump_file_size=unlimited;
alter session set timed_statistics=true;
alter session set events ‘10046 trace name context forever, level 12’;
drop table testins cascade constraints;
drop table testins cascade constraints purge;
create table testins (a number);
alter table testins add constraint pk_a primary key (a);
set timing on
create or replace procedure xyz is
i number;
begin
for i in 1 .. 20000 loop
begin
insert into testins values (1);
exception
when dup_val_on_index then null;
end;
end loop;
end xyz;
/
exec xyz
alter session set events ‘10046 trace name context off’ ;
exit;
DIAGNOSTIC ANALYSIS:
——————–
key here is “inserting duplicates when unique index is present” Non unique
inserts are not a problem for us.
WORKAROUND:
———–
use the append hint
RELATED BUGS:
————-
This seems to be duplicate of Bug 5083397
which is closed as duplicate of Bug 4961613
REPRODUCIBILITY:
—————-
100 % .
reproduced inhouse
TEST CASE:
———-
STACK TRACE:
————
10.2.0.2.0 tkprof output
===============
call count cpu elapsed disk query current
rows
——- —— ——– ———- ———- ———- ———-
———-
Parse 20163 0.37 0.52 0 0 0
0
Execute 40178 11.61 12.92 0 260 180301
97
Fetch 20074 79.32 82.77 0 1020232 0
20027
——- —— ——– ———- ———- ———- ———-
———-
total 80415 91.30 96.21 0 1020492 180301
20124
9.2.0.6.0 tkprof output
—————————
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current
rows
——- —— ——– ———- ———- ———- ———-
———-
Parse 20184 0.33 1.07 0 0 0
0
Execute 40191 7.35 17.07 0 331 180297
100
Fetch 20061 0.51 1.05 0 160130 0
20022
——- —— ——– ———- ———- ———- ———-
———-
total 80436 8.19 19.20 0 160461 180297
20122
SUPPORTING INFORMATION:
———————–
24 HOUR CONTACT INFORMATION FOR P1 BUGS:
—————————————-
DIAL-IN INFORMATION:
——————–
IMPACT DATE:
————
will impact
numerous government based contracts. This bug alone determines if we’ll have
to go back to 9i for a couple more years or move forward to 10.2 now. We
have
a very small window to upgrade this year but our go/no-go decision will be
made
by COB tomorrow!!! These slow inserts cause unacceptable performance and
exceed all performance requirements
BDE Screening
~~~~~~~~~~~~~
Testcase
~~~~~~~~~~~~~~~~~
Files: BDETC.tar.Z (containing setup.sql, tc.sql)
Steps:
sqlplus /nolog @setup
Creates a user TC with a table and unique index and a
PLSQL procedure INSLOOP to insert into the table
N times each an hitting ORA-1 error.
sqlplus /nolog @tc
Calls INSLOOP(20000) without any trace to show the elapsed time
and CPU, then calls INSLOOP(10) with sql trace to show the
recursive SQL issued from kauerr()
Reproduced
~~~~~~~~~~
Reproduced in 10.2.0.1
Reproduced in RDBMS_MAIN_LINUX_060313
Workaround/s
~~~~~~~~~~~~
Do not attempt to insert duplicates
Diagnostic Notes
~~~~~~~~~~~~~~~~
The main CPU overhead seems to come underneath kauerr().
This routine is called for ORA-1 and ORA-1520 during inserts.
It issues recursive SQL to get the constraint owner and name to complete
the error message and this is where the main change occurs.
In 9.2 the kauerr1 SQL from kauerr() executes using an index lookup
against cdef$ then nested loops outwards to get the constraint name and
owner.
In 10g the kauerr1 SQL executes using a HASH join and incurs a large
overhead.
In both 9.2 and 10g the recursive SQL is executed for every individual
ORA-1 error and so this plan difference is very costly in overall CPU use
and elapsed time.
Changing (hinting) kauerr1 to force the execution plan to match 9.2
gives the following timings in RDBMS_MAIN_LINUX_060313:
20000 loop default Elapsed: 00:01:19.72
20000 loop hinted NL Elapsed: 00:00:49.00
and the following on Solaris 64 bit:
20000 default SQL-9.2.0.7 Elapsed: 00:01:25.23
20000 default SQL-10.2.0.1 Elapsed: 00:04:30.87
20000 hinted SQL-10.2.0.1 Elapsed: 00:00:44.65