EnterpriseDB Replication,复制Oracle数据测试(1)

EntepriseDB 复制软件目前支持多种数据库到postgre的复制,其基本结构由发布者(Publication)与订阅者(Subscriptions)组成,Replication软件可针对来自不同类型数据库的多个发布者,将其数据复制到多个订阅者(Subscriptions)数据库中。
其可能的几种拓扑结构,如以下图:



同Oracle中普通的物化视图一样,不支持对订阅者(Subscriptions)数据的修改–Changes must not be made to the data or the definitions of the subscription tables.

EnterpriseDB Replication软件的具体工作模式分成2种:即快照模式(snapshot)与同步模式(synchronization);在第一次启用同步前,需要进行一次快照操作,之后便可以进行较为轻量级同步操作了。若要使用同步模式(synchronization)则要求发布者所包含的表必须具有主键,而在仅使用快照模式的情景中则不需要。(Each table used in a publication must have a primary key with the exception of tables in snapshot-only publications, which do not require a primary key.) 以上模式均支持过滤器(fliter),即可以指定需要复制的具体数据子集。

EnterpriseDB Replication软件其同步(synchronization)模式复制的基本原理是基于trigger的,而非如Quest公司的shareplex或golden gate般抽取重做日志生成SQL的方式。trigger方式会在数据库源端产生一定的性能影响,若在mission critical的生产数据库中实施EDB replication 复制则需要考虑到这一点(这种情况下推荐使用Snapshot模式)。这可能是EDB复制软件比较不成熟的一点,就目前仅对Oracle日志文件的研究认识,挖掘重做日志进而实现数据复制的途径已经没有技术上的难点了。

以下发布者所包含的数据对象或表属性,将在订阅者成功建立时被复制到订阅者所在的数据库:

  • Tables
  • Views (for snapshot-only publications) – created as a table in the subscription database
  • Primary keys
  • Not null constraints
  • Unique constraints
  • Check constraints
  • Indexes

注意:外键约束将不被复制

同时目前复制软件存在一定的限制,Oracle中的hash分区将不被复制,同时Oracle中包含以下数据类型列的表将无法复制:

  • BFILE
  • BINARY_DOUBLE
  • BINARY_FLOAT
  • MLSLABEL
  • XMLTYPE

Oracle中包含以下数据类型列的表,将不能使用同步模式(synchronization replications):

  • BLOB
  • CLOB
  • LONG
  • LONG RAW
  • NCLOB
  • RAW

快照模式情况下,订阅者中复制目标表将首先被truncate截断,之后若订阅者数据库类型是Oracle则将使用JDBC驱动批量的将源端的数据INSERT进来,若数据库类型是EnterpriseDB advanced Sever则将使用Postgre中的Copy命令。

同步模式下复制软件通过在源端配置的触发器记录表,获知源端该时段内所经历的DML操作,进而在目标端生成对应修改的SQL语句(显然同源端的原始SQL不同)。

EnterpriseDB公司推荐在以下情景中使用快照模式以获得更好的性能:

  1. 表相对而言较小
  2. 在复制间隔中绝大多数数据行会被修改

而同步模式则更适宜于以下情景:

  1. 数据表非常巨大
  2. 在复制间隔中仅少数数据会被修改

EnterpriseDB Migration 迁移工具使用测试(2)

下面我们来测试EnterpriseDB Migration 工具对于Oracle 大对象(LOB)的迁移情况;
首先在在Oracle实例Scott模式下创建具有LOB对象的表,如:

SQL> create table tlob (t1 int primary key,t2 clob,t3 blob);
Table created.
-- 并填充数据
SQL> begin
  2  for i in 1..100 loop
  3  insert into tlob values(i,rpad('A',9999,'Z'),hextoraw(i) );
  4  end loop;
  5  commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

打开EnterpriseDB Migration 工具界面,从树形图中找到需要迁移的表TLOB,选择进行在线迁移:

导出日志:

[Starting Migration]
源数据库连接信息…
连接 =jdbc:oracle:thin:@rh2.home:1521:G10R21
用户 =system
密码=******
目标数据库连接信息…
连接 =jdbc:edb://rh2.home:5444/subuser
用户 =maclean
密码=******
正在导入 Redwood 架构 SCOTT…
表列表: ‘TLOB’
正在创建表…
正在创建表: TLOB
已创建 1 个表。
正在以 8 MB 批次大小加载表数据…
正在将大型对象加载到表: TLOB…
表数据加载摘要: 时间总计 (秒): 1.122 行数总计: 100 大小总计 (MB): 0.380859375
数据加载摘要: 时间总计 (秒): 1.122 行数总计: 100 大小总计 (MB): 0.39
正在创建约束: SYS_C005182

已成功导入架构 SCOTT。

迁移过程已成功完成。

迁移日志已保存到 C:\Users\windesk\.enterprisedb\migrationstudio\build60

******************** 迁移摘要 ********************
Tables: 1 来自 1
Constraints: 1 来自 1

全部对象: 2
成功计数: 2
失败计数: 0

*************************************************************
—————-FINISHED———

下面我们到EnterpriseDB中去验证导入数据:

[enterprisedb@rh2 ~]$ psql
Password:
Welcome to psql 8.3.0.112, the EnterpriseDB interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with edb-psql commands
       \g or terminate with semicolon to execute query
       \q to quit

edb=# \c subuser
You are now connected to database "subuser".
subuser=# desc scott.tlob;
      Table "scott.tlob"
 Column |  Type   | Modifiers
--------+---------+-----------
 t1     | numeric | not null
 t2     | text    |
 t3     | bytea   |
Indexes:
    "sys_c005182" PRIMARY KEY, btree (t1)

subuser=# select count(*) from scott.tlob;
 count
-------
   100

(1 row)

可以看到装换过程中将clob类型转制为text,而blob类型则转制为bytea。postgre中text类型为可变无限长文本类型(variable unlimited length)。

正想去EnterpriseDB网站去查一下官方定义,却发现了以下留言:

We are in the process of updating our website. The site will not be available for the next few minutes. Sorry for the inconvenience.
The EnterpriseDB Web team.

另外bytea类型为一种变长的二进制字串,postgre组织的文档对这2种类型的存储数据上限没有非常明确的叙述,就目前找到的文献可以肯定的是postgre V7中这两种类型大小限制为1G

那么如果Oracle 中Blob/Clob类型大小超过了1G,就可能导致迁移无法正常进行。

EnterpriseDB Migration 迁移工具使用测试(1)

EntperpriseDB 目前作为Postgre开源数据库的企业发行版,在原开源社区的基础上对postgre进行了扩展(contribute),值得关注的技术有infiniteCache,以及其强大的迁移工具Migration tools;下面我们来简单测试该迁移工具.

通过安装postgreplus-advanced-server软件包,其中将默认包括DBA management Server,DBA monitor Console,Migrate Studio,Replication Tools等一系列管理工具。我们需要用到的是Migrate Studio.

打开Migrate Studio首先定义迁移目标端的Enterprisedb实例,事先已经在主机rh2.home上安装了EnterpriseDB 8.3R2版本,同时创建了名为maclean的数据库(database),使用默认端口5444,新建EDB服务器:

该工具同大多数Oracle管理工具一样使用java驱动,但速度较快显得十分轻量级。

接着我们需要创建源端的Oracle服务器,同样在远程主机rh2.home上建立了Version 10.2.0.1的Oracle EE版数据库实例名为g10r21,Listener监听在端口1521上,尝试创建该服务器时将被要求Oracle JDBC包,该包可以到oracle官方网站下载,之后将该包放置到$EDBHOME/jre/lib/ext目录下;新建Oracle服务器:

之后我们可以尝试将Oracle实例中Scott模式下的对象迁移到enterprisedb中,在此之前我们在该模式下建立简单的存储过程,协同测试。

SQL> create or replace procedure scott.count_emp as
2  c int;
3  begin
4  select count(*) into c from scott.emp;
5  dbms_output.put_line('emp count is '||c);
6  end;
7  /

Procedure created.

SQL> set serveroutput on ;
SQL> exec scott.count_emp;
emp count is 14

PL/SQL procedure successfully completed.

接下来开始进行正式迁移,打开源库Oracle实例的树形图,并找到Scott模式,右键进入在线迁移模式, 目标端服务器为target(rh2.home:5444),指定Maclean数据库,架构(模式)仍为Scott,其他均使用默认设置,点击运行开始迁移:

迁移日志如下:

[Starting Migration]

源数据库连接信息...

连接 =jdbc:oracle:thin:@rh2.home:1521:g10r21

用户 =system

密码=******

目标数据库连接信息...

连接 =jdbc:edb://rh2.home:5444/maclean

用户 =maclean

密码=******

正在导入 Redwood 架构 SCOTT...

正在创建架构...scott

正在创建表...

正在创建表: BONUS

正在创建表: DEPT

正在创建表: EMP

正在创建表: SALGRADE

已创建 4 个表。

正在以 8 MB 批次大小加载表数据...

正在加载表: BONUS ...

表数据加载摘要: 时间总计 (秒): 0.128 行数总计: 0

正在加载表: DEPT ...

已迁移 4 行。

表数据加载摘要: 时间总计 (秒): 0.118 行数总计: 4

正在加载表: EMP ...

已迁移 14 行。

表数据加载摘要: 时间总计 (秒): 0.121 行数总计: 14

正在加载表: SALGRADE ...

已迁移 5 行。

表数据加载摘要: 时间总计 (秒): 0.145 行数总计: 5

数据加载摘要: 时间总计 (秒): 0.512 行数总计: 23 大小总计 (MB): 0.0

正在创建约束: PK_DEPT

正在创建约束: PK_EMP

正在创建约束: FK_DEPTNO

正在创建过程: COUNT_EMP

已成功导入架构 SCOTT。

正在创建用户: SCOTT

迁移过程已成功完成。

迁移日志已保存到 C:\Users\windesk\.enterprisedb\migrationstudio\build60

******************** 迁移摘要 ********************

Tables: 4 来自 4

Constraints: 3 来自 3

Procedures: 1 来自 1

Users: 1 来自 1

全部对象: 9

成功计数: 9

失败计数: 0

*************************************************************

----------------FINISHED---------

迁移过程中没有出现错误,下面我们来测试下之前建立的存储过程:

maclean=# exec count_emp;
emp count is 14

EDB-SPL Procedure successfully complete

windows平台上的11g release 2终于发布了

下午无意中打开了oracle主页上11g下载的页面,赫然发现windows平台的安装介质已经发布了。

介质分成2个zip包,1.5g和600m; 11g的安装介质较10g大了许多,因为默认附加了apex与sql developer.

下载地址:

Oracle Database 11g Release 2 (11.2.0.1.0) for Microsoft Windows (32-bit)
Download win32_11gR2_database_1of2.zip (1,625,721,289 bytes)
Download win32_11gR2_database_2of2.zip (631,934,821 bytes)
Oracle Database 11g Release 2 (11.2.0.1.0) for Microsoft Windows (x64)
Download win64_11gR2_database_1of2.zip (1,213,501,989 bytes) (cksum – 3906682109)
Download win64_11gR2_database_2of2.zip (1,007,988,954 bytes) (cksum – 1232608515)

目前32位未提供grid infrastructure 介质,想要体验windows上的11g rac只能使用64bit .

Oracle Database 11g Release 2 Grid Infrastructure (11.2.0.1.0) for Microsoft Windows (x64)

Download win64_11gR2_grid.zip (715,166,425 bytes) (cksum – 3127109177)

根据metalink文档867040.1所述,Oracle database 11g release 2 将默认支持windows 7 以及 windows 2008 r2 .

单机的安装过程十分简单:

一般来说windows 发行版在正式生产环境中不多见,不过安装到笔记本上方便了今后对11gr2新特性的测试。

附release note:

Oracle Database 11g Release 2 Now Available on Windows 32 and 64-bit [ID 1081390.1]

Modified 07-APR-2010     Type ANNOUNCEMENT     Status PUBLISHED

In this Document
What is being announced?
References


Applies to:

Oracle Server – Standard Edition – Version: 11.2.0.1 – Release: 11.2
Oracle Server – Personal Edition – Version: 11.2.0.1 – Release: 11.2
Oracle Server – Enterprise Edition – Version: 11.2.0.1 – Release: 11.2
Microsoft Windows (32-bit)
Microsoft Windows x64 (64-bit)
Microsoft Windows x64 (64-bit) – OS Version: 7
Microsoft Windows (32-bit) – OS Version: 7
Microsoft Windows x64 (64-bit) – Version: 2008 R2

What is being announced?

Oracle Database 11g Release 2 for Windows x86 (32-bit) and x86-64 is now available for download from OTN or the Oracle Store.

New with 11.2 is certification with Windows 2008 R2 and Windows 7.   You can find out more in Note  1065024.1 “Oracle Database 11g Release 2 Certification Highlights”.

隐藏参数_high_priority_processes与oradism

运行在操作系统上的进程存在2种系统时序优先级模式:即 实时模式 Real Time(RT) mode, 与分时模式 Time Sharing(TS) mode.
绝大多数Oracle进程运行在TS模式下:

[oracle@rh1 ~]$ ps -efc|grep ora_|grep -v grep
oracle    8510     1 TS   23 Mar27 ?        00:00:02 ora_pmon_PROD
oracle    8512     1 TS   23 Mar27 ?        00:00:00 ora_psp0_PROD
oracle    8514     1 TS   23 Mar27 ?        00:00:00 ora_mman_PROD
oracle    8516     1 TS   23 Mar27 ?        00:00:02 ora_dbw0_PROD
oracle    8518     1 TS   23 Mar27 ?        00:00:04 ora_lgwr_PROD
oracle    8520     1 TS   23 Mar27 ?        00:00:04 ora_ckpt_PROD
oracle    8522     1 TS   23 Mar27 ?        00:00:08 ora_smon_PROD
oracle    8524     1 TS   23 Mar27 ?        00:00:00 ora_reco_PROD
oracle    8526     1 TS   23 Mar27 ?        00:00:34 ora_cjq0_PROD
oracle    8528     1 TS   23 Mar27 ?        00:00:06 ora_mmon_PROD
oracle    8530     1 TS   24 Mar27 ?        00:00:07 ora_mmnl_PROD
oracle    8538     1 TS   23 Mar27 ?        00:00:00 ora_arc0_PROD
oracle    8540     1 TS   23 Mar27 ?        00:00:00 ora_arc1_PROD
oracle    8548     1 TS   23 Mar27 ?        00:00:00 ora_qmnc_PROD
oracle    8555     1 TS   23 Mar27 ?        00:00:00 ora_q000_PROD
oracle    8559     1 TS   23 Mar27 ?        00:00:00 ora_q001_PROD
oracle   30500     1 TS   23 22:10 ?        00:00:00 ora_j000_PROD

如上所示所有进程均运行在TS模式下且priority均为23|24.
Oracle一般不推荐使用RT模式,因为虽然个别进程可以通过这种方式获得更多的CPU资源,但往往系统的瓶颈并非CPU,即尽管CPU使用率高了,但实际系统TPS并未得到提升。
在10gr2版本后RAC中的LMS进程成为唯一一个使用RT模式的Oracle进程,我们可以通过查询参数_high_priority_processes了解相关信息:

SQL> col name format a40
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                     VALUE
---------------------------------------- ----------
_high_priority_processes                 LMS*
_os_sched_high_priority                  1

_high_priority_processes通过进程功能名进行匹配,下面我们将提高LGWR及PMON进程的优先级:

SQL> alter system set "_high_priority_processes"='LMS*|LGWR|PMON' scope=spfile;

System altered.

SQL> startup force;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31441     1 RR   41 22:50 ?        00:00:00 ora_pmon_PROD
oracle   31445     1 TS   23 22:50 ?        00:00:00 ora_psp0_PROD
oracle   31447     1 TS   23 22:50 ?        00:00:00 ora_mman_PROD
oracle   31449     1 TS   23 22:50 ?        00:00:00 ora_dbw0_PROD
oracle   31451     1 RR   41 22:50 ?        00:00:00 ora_lgwr_PROD
oracle   31455     1 TS   23 22:50 ?        00:00:00 ora_ckpt_PROD
oracle   31457     1 TS   23 22:50 ?        00:00:00 ora_smon_PROD
oracle   31459     1 TS   22 22:50 ?        00:00:00 ora_reco_PROD
oracle   31461     1 TS   23 22:50 ?        00:00:01 ora_cjq0_PROD
oracle   31463     1 TS   23 22:50 ?        00:00:01 ora_mmon_PROD
oracle   31465     1 TS   24 22:50 ?        00:00:00 ora_mmnl_PROD
oracle   31471     1 TS   24 22:50 ?        00:00:00 ora_p000_PROD
oracle   31473     1 TS   24 22:50 ?        00:00:00 ora_p001_PROD
oracle   31475     1 TS   24 22:50 ?        00:00:00 ora_arc0_PROD
oracle   31477     1 TS   22 22:50 ?        00:00:00 ora_arc1_PROD
oracle   31481     1 TS   23 22:50 ?        00:00:00 ora_qmnc_PROD
oracle   31488     1 TS   23 22:50 ?        00:00:00 ora_q000_PROD
oracle   31490     1 TS   23 22:50 ?        00:00:00 ora_q001_PROD
oracle   31500     1 TS   23 22:50 ?        00:00:00 ora_j000_PROD

好了lgwr和pmon进程也进入实时模式了,同时priority值上升到了41.
注意:
Oracle默认仅允许LMS进程(11g中多了VKTM进程)使用RT模式是有它的原因的,所以如果不是Oracle support 推荐,您没有任何修改隐式参数的理由。
其次根据Oracle文档[ID 602419.1]的描述,oradism文件(该文件位于$ORACLE_HOME/bin目录下)不正确的权限将导致RT模式无法被正确使用,该文件默认属于root用户并具有s权限。如下测试:

[oracle@rh1 bin]$ ls -la oradism
-r-sr-s---  1 root oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ su - root
Password:
[root@rh1 ~]# chown oracle:oinstall /s01/oracle/product/10.2.0/db_1/bin/oradism
[root@rh1 ~]# exit
logout
[oracle@rh1 bin]$ ls -la oradism
-r-xr-x---  1 oracle oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Sun Mar 28 23:07:03 2010

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> col name format a35;
SQL> col value format a10;
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                VALUE
----------------------------------- ----------
_high_priority_processes            LMS*|LGWR|PMON
_os_sched_high_priority             1
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31994     1 TS   23 23:07 ?        00:00:00 ora_pmon_PROD
oracle   31998     1 TS   23 23:07 ?        00:00:00 ora_psp0_PROD
oracle   32000     1 TS   23 23:07 ?        00:00:00 ora_mman_PROD
oracle   32002     1 TS   23 23:07 ?        00:00:00 ora_dbw0_PROD
oracle   32004     1 TS   24 23:07 ?        00:00:00 ora_lgwr_PROD
oracle   32008     1 TS   22 23:07 ?        00:00:00 ora_ckpt_PROD
oracle   32010     1 TS   23 23:07 ?        00:00:00 ora_smon_PROD
oracle   32012     1 TS   22 23:07 ?        00:00:00 ora_reco_PROD
oracle   32014     1 TS   23 23:07 ?        00:00:01 ora_cjq0_PROD
oracle   32016     1 TS   23 23:07 ?        00:00:01 ora_mmon_PROD
oracle   32018     1 TS   24 23:07 ?        00:00:00 ora_mmnl_PROD
oracle   32026     1 TS   24 23:07 ?        00:00:00 ora_arc0_PROD
oracle   32028     1 TS   23 23:07 ?        00:00:00 ora_arc1_PROD
oracle   32032     1 TS   23 23:07 ?        00:00:00 ora_qmnc_PROD
oracle   32045     1 TS   23 23:07 ?        00:00:00 ora_q000_PROD
oracle   32065     1 TS   23 23:08 ?        00:00:00 ora_q001_PROD
oracle   32072     1 TS   23 23:08 ?        00:00:00 ora_j000_PROD

that’s great, 显然oradism不仅为Oracle实例提供了内存资源控制功能,还包括了进程优先级分配的权限。
我们应当再次声明hidden parameter不应“滥用”于production environment.

ora-7445 [kghalp+0500] [SIGSEGV]错误

今天没有外出(似乎人不到现场就特别容易出问题),早上10点左右接到电话被告知crm11实例上出现了7445错误,准备用web vpn拨上去查看一下,赫然发觉windows 7 不支持这种vpn(准确说ie8和firefox都不支持);无奈无奈只好用拨号。
发现alert log中出现大量 7445错误记录:

Fri Mar 26 09:24:53 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_6754320.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Fri Mar 26 09:24:55 2010
Trace dumping is performing id=[cdmp_20100326092455]
Fri Mar 26 09:31:16 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_2994552.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []

看到kghalp函数第一印象 ,是Oracle中堆管理使用的函数;
让我们猜猜字面意思? k -> kernel g -> generic h-> heap a-> allocation p-> point
再让我们来看一下当时的call stack:

Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped object), addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
Registers:
iar: 00000001000973e0, msr: a00000000000d0b2
 lr: 00000001013a6df8,  cr: 0000000022292484
r00: 0000000000000010, r01: 0ffffffffffcb160, r02: 000000011022a9c0,
r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
r09: 0000000000000000, r10: 00000000101b60d8, r11: 0000000000000004,
r12: 0000000024592484, r13: 000000011026bfe0, r14: 0000000000000000,
r15: 0000000000009000, r16: 0000000110195b2c, r17: 0000000000000000,
r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
r27: 0000000104c7fd44, r28: 0000000000000000, r29: 0000000000000100,
r30: 0000000000000000, r31: 0000000110195a58,
*** 2010-03-26 09:57:28.679
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Current SQL statement for this session:
INSERT INTO AUDIT_DDL_LOG (DDL_TIME, SESSION_ID, OS_USER, IP_ADDRESS, TERMINAL, HOST, USER_NAME, DDL_TYPE, OBJECT_TYPE, OWNER, OBJECT_NAME, SQL_TEXT) VALUES (SYSDATE, SYS_CONTEXT('USERENV','SESSIONID'), SYS_CONTEXT('USERENV','OS_USER'), SYS_CONTEXT('USERENV','IP_ADDRESS'), SYS_CONTEXT('USERENV','TERMINAL'), SYS_CONTEXT('USERENV','HOST'), ORA_LOGIN_USER, ORA_SYSEVENT, ORA_DICT_OBJ_TYPE, ORA_DICT_OBJ_OWNER, ORA_DICT_OBJ_NAME, :B1 )
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000043da500d0        10  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 104A54EED ?
ksedmp+0290          bl       ksedst               104A54870 ?
ssexhd+03e0          bl       ksedmp               300001D15 ?
000044C0             ?        00000000
parchk+01f4          bl       kghalp               000000000 ?
                                                   2842288200000001 ?
                                                   000000000 ? 000000000 ?
                                                   000001040 ? 110195B2C ?
ptmak+0168           bl       parchk               FFFFFFFFFFCB560 ?
                                                   FFFFFFFFFFCB430 ?
                                                   FFFFFFFFFFCB430 ?
pdybF00_Init+0244    bl       ptmak                10008049C ? 000000000 ?
                                                   FFFFFFFFFFCB4F0 ? 07FFFFFFF ?
pdy1F79_Init+00c8    bl       pdybF00_Init         110BEB1D0 ?
pdy1F01_Driver+0048  bl       pdy1F79_Init         FFFFFFFFFFCBC40 ?
pdli_new_cog+00f0    bl       pdy1F01_Driver       FFFFFFFFFFCBCE0 ? 000000000 ?
pdlifu+0264          bl       pdli_new_cog         1013885F4 ? FFFFFFFFFFCCB00 ?
                                                   7000004383E7680 ?
phpcog+0010          bl       pdlifu               FFFFFFFFFFCD958 ?
                                                   7000004383E7680 ? 104C95048 ?
phpcmp+0f80          bl       phpcog               FFFFFFFFFFCC4F0 ? 000000000 ?
pcicms2+02d4         bl       phpcmp               FFFFFFFFFFCD958 ?


发生错误的最上层 kghalp 函数由 parchk 调用, 这似乎是一个package check函数(猜测,呵呵). 我们来整理一下思路, parchk 函数调用了 kghalp函数以帮其分配内存,但却得到了一个非法的低地址[[0x00000003B],正常情况下正文段使用的空间; 这看起来显然是一个bug。
让我们来查查support.oracle.com , 键入7445 kghalp 和sigsegv 关键字 (很多时候不需要使用ora 600/7445 lookup tools).
bug 8244533 赫然显目:

Bug 8244533: ORA-07445 [KGHALP] ERRORS COMPILING PACKAGE WITH DEBUG
    STACK TRACE:
    ------------
       ksedst <- ksedmp <- ssexhd <- 000044BC <- parchk        <- ptmak <-
    pdybF00_Init <- pdy1F79_Init <- pdy1F01_Driver <- pdli_new_cog         <-
    pdlifu <- phpcog <- phpcmp <- pcicms2 <- pcicms          <- kkxcms <- kkxswcm
    <- kkxmpbms <- kkxmesu <- xtypls           <- qctopls <- qctcopn <- qctcopn

    Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped
    object),
    addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
    Registers:
    iar: 00000001000973e0, msr: a00000000000d0b2
    lr: 000000010139ffb8,  cr: 00000000222a2484
    r00: 0000000000000010, r01: 0ffffffffffe2980, r02: 00000001101e5ab8,
    r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
    r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
    r09: 0000000000000000, r10: 0000000010171200, r11: 0000000000000004,
    r12: 00000000245a2484, r13: 000000011021fbc0, r14: 0000000000000000,
    r15: 0000000000009000, r16: 0000000110150c54, r17: 0000000000000000,
    r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
    r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
    r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
    r27: 0000000104c5983c, r28: 0000000000000000, r29: 0000000000000100,
    r30: 0000000000000000, r31: 0000000110150b80,
    *** 16:37:14.603
    ksedmp: internal or fatal error
    ORA-7445: exception encountered: core dump [kghalp+0500] [SIGSEGV]
    [Invalid permissions for mapped object] [0x00000003B] [] []
    Current SQL statement for this session:
    select dummy from dual where  ora_dict_obj_type = 'TABLE'
----- Call Stack Trace -----ptmak pdybF00_Init pdy1F79_Init pdy1F01_Driver pdli_new_cog pdlifuphpcog phpcmp pcicms2 pcicms kkxcms kkxswcm kkxmpbms kkxmesu xtyplsTo Filer.Based on this call stack this would appear a likely match forbug 6951953 Abstract: ORA-7445 [PTMAK] IMPORTING PACKAGE COMPILED DEBUG.This bug is fixed on 10.2.0.5 and there is a 10.2.0.4 patch available for IBM AIX Based Systems (64-bit).It maybe worth while to have the customer apply the patch to seeif it resolves the issue.Also the uploaded files included test.sql is this a reproducable testcase?

这个bug 似乎仅在 IBM AIX on POWER Systems (64-bit) 发生,当以DEBUG 模式编译包时有一定几率出现。
好了,既然已经了解了可能发生的诱因,我们可以进一步分析了,接下来看看 errorstack trace信息中 的SO 记录。

      SO: 70000043d217668, type: 53, owner: 70000048cee2238, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=70000043d217668 handle=700000446261588 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=70000043d2176e8[70000042b52b368,70000042bb9a808] htb=70000044929b460 ssga=70000044929ad68
      user=70000048cee2238 session=70000048eb33010 count=1 flags=[0000] savepoint=0x4bac1488
      LIBRARY OBJECT HANDLE: handle=700000446261588 mtx=7000004462616b8(1) cdp=1
      name=ALTER TRIGGER "SHUCRM3O"."TRI_PRODUCT_INSTANCE_RELATED" COMPILE DEBUG REUSE SETTINGS
      hash=164e6a8942406cee159f8943a1a3c85e timestamp=03-26-2010 09:52:12
      namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]
      kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=16 hpc=0002 hlc=0002
      lwt=700000446261630[700000446261630,700000446261630] ltm=700000446261640[700000446261640,700000446261640]
      pwt=7000004462615f8[7000004462615f8,7000004462615f8] ptm=700000446261608[700000446261608,700000446261608]
      ref=700000446261660[700000446261660,700000446261660] lnd=700000446261678[700000446261678,700000446261678]
        LIBRARY OBJECT: object=70000045adbc1e8
        type=CRSR flags=EXS[0001] pflags=[0000] status=VALD load=0
        CHILDREN: size=16
        child#    table reference   handle
             5 70000041776f5c0 70000045ae44720 70000042bfa3a20
        DATA BLOCKS:
        data#     heap  pointer    status pins change whr
            0 70000043d9fed20 70000045adbc300 I/P/A/-/-    0 NONE   00

的确有以debug 模式编译对象的语句,不过对象不是包而是trigger ; 看起来只要是可以以debug 模式compile 的对象都有可能引发该问题。
好了,问题到这里已经比较明确了: 应用端以DEBUG模式重新编译包引发了 Oracle bug 8244533,从而导致了对应服务进程的崩溃;总算是虚惊一场,之后通过trace内的machine和user信息找到了实施变更的应用方人员并教育之。

对Oracle中索引叶块分裂而引起延迟情况的测试和分析

在版本10.2.0.4未打上相关one-off补丁的情况下,分别对ASSM和MSSM管理模式表空间进行索引分裂测试,经过测试的结论如下:

l  在10gr2版本中MSSM方式是不能避免索引分裂引起交易超时问题;

l  10.2.0.4上的one-off补丁因为目前仅存在Linux版本,可以考虑声请补丁后具体测试(因目前没有补丁所以处于未知状态)。

l  合并索引是目前最具可行性的解决方案(alter index coalesce)。

l  最新的11gr2中经测试仍存在该问题。

具体测试过程如下:

1.    自动段管理模式下的索引块分裂

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile ‘?/dbs/idx1.dbf’ size 500M

2  segment space management AUTO

3  extent management local uniform size 10M;

创建自动段管理的表空间

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

Table created.

create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

Index created.         创建实验对象表及索引

SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;           插入25万条记录

250000 rows created.

SQL> commit;

Commit complete.

SQL>create table idx2 tablespace idx1 as select * from idx1 where 1=2;

Table created.

insert into idx2

select * from idx1 where rowid in

(select rid from

(select rid, rownum rn from

(select rowid rid from idx1 where a between 10127 and 243625 order by a)                    取出后端部分记录,即每250条取一条

)

where mod(rn, 250) = 0

)

/

933 rows created.

SQL> commit;

Commit complete.

SQL> analyze index idx1_idx validate structure; 分析原索引

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499           0               未删除情况下499个叶块

SQL> delete from idx1 where a between 10127 and 243625;                             大量删除

commit;

233499 rows deleted.

SQL> SQL>

Commit complete.

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499      233499            删除后叶块数量不变

SQL> insert into idx1 select * from idx2;                   令那些empty 不再empty,但每个块中只有一到二条记录,空闲率仍为75-100%

commit;

933 rows created.

Commit complete.

SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126;          造成leaf块分裂前提

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

997 leaf node splits

997 leaf node 90-10 splits

0 branch node splits

0 queue splits                 找出当前会话目前的叶块分裂次数

SQL>insert into idx1 values (251000);                                        此处确实叶块分裂

1 row created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

998 leaf node splits

998 leaf node 90-10 splits

0 branch node splits

0 queue splits         可以看到对比之前的查询多了一个叶块分裂

SQL> set linesize 200 pagesize 1500;

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1603          0     271601       271601            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         156          0      82803        82803            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1         177 0       3728         3728              1

insert into idx1 values (251000)     读了那些实际不空的块,较多buffer_get

1        1409          0      40293        40293            933

insert into idx1 select * from idx2

1      240842          0    3478341      3478341         250000

SQL> insert into idx1 values (251001);                                  不分裂的插入

1 row created.

SQL> commit;

Commit complete.

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1603          0     271601       271601            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         156          0      82803        82803            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1           9          0       1640         1640              1

insert into idx1 values (251001) 不分裂的插入,少量buffer_gets

1         177          0       3728         3728              1

insert into idx1 values (251000)

1        1409          0      40293        40293            933

insert into idx1 select * from idx2

1      240842          0    3478341      3478341         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

如演示1所示,在自动段管理模式下大量删除后插入造成许多块为75%-100%空闲率且不完全为空,此后叶块分裂时将引起插入操作的相关前台进程扫描大量“空块“,若这些块不在内存中(引发物理读)且可能需要延迟块清除等原因时,减缓了该扫描操作的速度,造成叶块分裂缓慢,最终导致了其他insert操作被split操作所阻塞,出现enq:tx index contention等待事件。

2.  手动段管理模式下的索引块分裂

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile ‘?/dbs/idx1.dbf’ size 500M

2  segment space management MANUAL                                      — MSSM的情况

3  extent management local uniform size 10M;

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

Table created.

SQL> SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250

Index created.

SQL> SQL> 000;

commit;

create table idx2 tablespace idx1 as select * from idx1 where 1=2;

insert into idx2

select * from idx1 where rowid in

(select rid from

(select rid, rownum rn from

(select rowid rid from idx1 where a between 10127 and 243625 order by a)

)

where mod(rn, 250) = 0

)

/

commit;

250000 rows created.

SQL> SQL>

Commit complete.

SQL> SQL>

Table created.

SQL> SQL>   2    3    4    5    6    7    8    9

933 rows created.

SQL> SQL>

Commit complete.

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499           0

SQL> delete from idx1 where a between 10127 and 243625;

233499 rows deleted.

SQL> commit;

Commit complete.

SQL> insert into idx1 select * from idx2;

commit;

933 rows created.

SQL> SQL>

Commit complete.

SQL> SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126;

commit;

126 rows created.

SQL> SQL>

Commit complete.

SQL>

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

1496 leaf node splits

1496 leaf node 90-10 splits

0 branch node splits

0 queue splits

SQL> insert into idx1 values (251000);                                  确实分裂

1 row created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

1497 leaf node splits

1497 leaf node 90-10 splits

0 branch node splits

0 queue splits

以上与ASSM时完全一致

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1553          0     283301       283301            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         153          0      78465        78465            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1        963 0      10422        10422              1              ASSM模式下更大量的空块

insert into idx1 values (251000)

1         984          0      35615        35615            933

insert into idx1 select * from idx2

1      238579          0    3468326      3469984         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

SQL> insert into idx1 values (251001);

1 row created.

SQL> commit;

Commit complete.

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1553          0     283301       283301            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         153          0      78465        78465            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1           7 0       1476         1476              1

insert into idx1 values (251001)    —不分裂的情况与ASSM时一致

1         963 0      10422        10422              1

insert into idx1 values (251000)

1         984          0      35615        35615            933

insert into idx1 select * from idx2

1      238579          0    3468326      3469984         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

6 rows selected.

如演示2所示,MSSM情况下叶块分裂读取了比ASSM模式下更多的“空块“;MSSM并不能解决大量删除后叶块分裂需要扫描大量非空块的问题,实际上可能更糟糕。从理论上讲MSSM的freelist只能指出那些未达到pctfree和曾经到达pctfree后来删除记录后使用空间下降到pctused的块(doc:A free list is a list of free data blocks that usually includes blocks existing in a number of different extents within the segment. Free lists are composed of blocks in which free space has not yet reached PCTFREE or used space has shrunk below PCTUSED.),换而言之MSSM模式下”空块“会更多。

3.  自动段管理模式下coalesce后的索引块分裂

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile ‘?/dbs/idx1.dbf’ size 500M

2  segment space management AUTO                                       — ASSM coalesce情况

3  extent management local uniform size 10M;

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

Table created.

SQL> SQL>

Index created.

SQL> SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;

commit;

create table idx2 tablespace idx1 as select * from idx1 where 1=2;

insert into idx2

select * from idx1 where rowid in

(select rid from

(select rid, rownum rn from

(select rowid rid from idx1 where a between 10127 and 243625 order by a)

)

where mod(rn, 250) = 0

)

/

commit;

250000 rows created.

SQL> SQL>

Commit complete.

SQL> SQL>

Table created.

SQL> SQL>   2    3    4    5    6    7    8    9

933 rows created.

SQL> SQL>

Commit complete.

SQL> SQL> SQL>

SQL>

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499           0

SQL> delete from idx1 where a between 10127 and 243625;

commit;

233499 rows deleted.

SQL> SQL>

Commit complete.

SQL> alter index idx1_idx coalesce;

Index altered.

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280         33           0 — coalesc lf块合并了

SQL> insert into idx1 select * from idx2;

933 rows created.

SQL> SQL> commit;

Commit complete.

SQL>

SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126;

commit;

126 rows created.

SQL> SQL>

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

1999 leaf node splits

1995 leaf node 90-10 splits

0 branch node splits

0 queue splits

SQL> insert into idx1 values (251000);                                       确实分裂

1 row created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

2000 leaf node splits

1996 leaf node 90-10 splits

0 branch node splits

0 queue splits

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1603          0     268924       268924            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         156          0      78349        78349            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1          23 0       2218         2218              1                             少量buffer gets

insert into idx1 values (251000)

1         191          0      15596        15596            933

insert into idx1 select * from idx2

1      240852          0    3206130      3206130         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

SQL> insert into idx1 values (251001);

1 row created.

SQL> commit;

Commit complete.

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1603          0     268924       268924            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         156          0      78349        78349            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1           9 0       1574         1574              1

insert into idx1 values (251001)

1          23 0       2218         2218              1

insert into idx1 values (251000)

1         191          0      15596        15596            933

insert into idx1 select * from idx2

1      240852          0    3206130      3206130         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

6 rows selected.

如演示三所示在删除后进行coalesce操作,合并操作将大量空块分离出了索引结构(move empty out of index structure),之后的叶块分裂仅读取了少量必要的块。

4.  手动段管理模式下coalesce后的索引块分裂

SQL> drop tablespace idx1 including contents and datafiles;

Tablespace dropped.

SQL> create tablespace idx1 datafile ‘?/dbs/idx1.dbf’ size 500M

2  segment space management MANUAL                               — mssm情况下 coalesce

3  extent management local uniform size 10M;

Tablespace created.

SQL> create table idx1(a number) tablespace idx1;

create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

Table created.

SQL> SQL> insert into idx1 select rownum from all_objects, all_objects where rownum <= 250

Index created.

SQL> SQL> 000;

commit;

create table idx2 tablespace idx1 as select * from idx1 where 1=2;

insert into idx2

select * from idx1 where rowid in

(select rid from

(select rid, rownum rn from

(select rowid rid from idx1 where a between 10127 and 243625 order by a)

)

where mod(rn, 250) = 0

)

/

commit;

250000 rows created.

SQL> SQL>

Commit complete.

SQL> SQL>

Table created.

SQL> SQL>   2    3    4    5    6    7    8    9

933 rows created.

SQL> SQL>

Commit complete.

SQL> SQL> SQL>

SQL>

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499           0

SQL> delete from idx1 where a between 10127 and 243625;

commit;

233499 rows deleted.

SQL> SQL>

Commit complete.

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280        499      233499

SQL> alter index idx1_idx coalesce;

Index altered.

SQL> analyze index idx1_idx validate structure;

select blocks,lf_blks,del_lf_rows from index_stats;

Index analyzed.

SQL>

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

1280         33           0

SQL> insert into idx1 select * from idx2;

933 rows created.

SQL> SQL> commit;

Commit complete.

SQL>

SQL> insert into idx1 select 250000+rownum from all_objects where rownum <= 126;

commit;

126 rows created.

SQL> SQL>

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

2502 leaf node splits

2494 leaf node 90-10 splits

0 branch node splits

0 queue splits

SQL> insert into idx1 values (251000);                       确实分裂

1 row created.

SQL> commit;

Commit complete.

SQL> select ss.value,sy.name from v$sesstat ss ,v$sysstat sy where ss.statistic#=sy.statistic# and name like ‘%split%’  and sid=(select distinct sid from v$mystat);

VALUE NAME

———- —————————————————————-

2503 leaf node splits

2495 leaf node 90-10 splits

0 branch node splits

0 queue splits

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1553          0     281059       281059            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         153          0      77817        77817            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1          19          0       2010         2010              1                       少量buffer get

insert into idx1 values (251000)

1         126          0      15364        15364            933

insert into idx1 select * from idx2

1      238644          0    3229737      3230569         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

SQL> insert into idx1 values (251001);

1 row created.

SQL> commit;

Commit complete.

SQL> select  executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql

2  where sql_text like ‘%insert%idx1%’ and sql_text not like ‘%v$sql%’;

EXECUTIONS BUFFER_GETS DISK_READS   CPU_TIME ELAPSED_TIME ROWS_PROCESSED

———- ———– ———- ———- ———— ————–

SQL_TEXT

——————————————————————————————————————————————————————————————————–

1        1553          0     281059       281059            933

insert into idx2 select * from idx1 where rowid in (select rid from (select rid, rownum rn from (select rowid rid from idx1 where a between 10127 and 243625 order by a) ) where mod(rn, 250) = 0 )

1         153          0      77817        77817            126

insert into idx1 select 250000+rownum from all_objects where rownum <= 126

1          7 0       1460         1460              1

insert into idx1 values (251001)

1          19 0       2010         2010              1

insert into idx1 values (251000)

1         126          0      15364        15364            933

insert into idx1 select * from idx2

1      238644          0    3229737      3230569         250000

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000

6 rows selected.

如演示4所示,MSSM模式下合并操作与ASSM情况下大致一样,合并操作可以有效解决该问题。

5.  Coalesce合并操作的锁影响

SQL> create table coal (t1 int);

Table created.

SQL> create index pk_t1 on coal(t1);

Index created.

SQL> begin

2    for i in 1..3000 loop

3      insert into coal values(i);

4      commit;

5      end loop;

6      end;

7  /

PL/SQL procedure successfully completed.

SQL> delete coal where t1>500;

2500 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze index pk_t1 validate structure;

Index analyzed.    注意analyze validate操作会block一切dml操作

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

8          6        2500          删除后的状态

此时另开一个会话,开始dml操作:

SQL> update coal set t1=t1+1;

500 rows updated.

回到原会话

SQL> alter index pk_T1 coalesce;             — coalesce 未被阻塞

Index altered.

在另一个会话中commit,以便执行validate structure

SQL> analyze index pk_t1 validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

8          3         500

显然coalesce的操作没有涉及有dml操作的块

在没有dml操作的情况下:

SQL> truncate table coal;

Table truncated.

SQL> begin

2    for i in 1..3000 loop

3      insert into coal values(i);

4      commit;

5      end loop;

6      end;

7  /

PL/SQL procedure successfully completed.

SQL> analyze index pk_t1 validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

8          6           0

SQL> delete coal where t1>500;

2500 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze index pk_t1 validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

8          6        2500

SQL> alter index pk_t1 coalesce;

Index altered.

SQL> analyze index pk_t1 validate structure;

Index analyzed.

SQL> select blocks,lf_blks,del_lf_rows from index_stats;

BLOCKS    LF_BLKS DEL_LF_ROWS

———- ———- ———–

8          1           0

没有dml时,coalesce 操作涉及了所有块

如演示5所示coalesce会避开dml操作涉及的块,但在coalesec的短暂间歇出现在索引上有事务的块不会太多。且coalesce操作不会降低索引高度。

附件是关于rebuild及coalesce索引操作的详细描述:

6.  Coalesce操作总结

优点:

l  是一种快速的操作,对整体性能影响最小(not performance sensitive)。

l  不会锁表,绕过有事务的索引块。

l  可以有效解决现有的问题。

l  不会降低索引高度,引起再次的root split

缺点:

l  需要针对个别对象,定期执行合并操作;无法一劳永逸地全局地解决该问题。

7.  Linux 10.2.0.4上相关补丁的技术交流

Metalink bug 8286901 note中叙述了一位用户遇到相同的问题并提交了SR,当时oracle support给出了one-off补丁,但该用户在apply了该补丁后仍未解决问题。

以下为note 原文:

It is similar to bug8286901, but after applied patch8286901, still see enq tx
contentiona with high “failed probes on index block reclamation”

Issue encountered by customer and Oracle developer (Stefan Pommerenk).


He describes is thus:


"Space search performed by the index splitter can't find space in neighboring


blocks, and then instead of allocating new space, we go and continue to


search for space elsewhere, which manifests itself in block reads from disk,


block cleanouts, and subsequent blocks written due to aggressive MTTR


setting."




"To clarify: the cleanouts are not the problem per se. The culprit seems to


be that the space search performed by the index splitter can't find space in


neighboring blocks, and then instead of allocating new space, we go and


continue to search for space elsewhere, which manifests itself in block reads


from disk, block cleanouts, and subsequent blocks written due to aggressive


MTTR setting. This action has caused other sessions to get blocked on TX


enqueue contention, blocked on the splitting session. Advice was to set 10224


trace event for the splitter for a short time only in order to get


diagnostics as to why the space search rejected most blocks.


> A secondary symptom are the bitmap level 1 block updates, which may or may


not be related to the space search; I've not seen them before, maybe because


I didn't really pay attention :P , but the symptoms seen in the ASH trace


indicate it's the same problem. Someone in space mgmt has to look at it to


confirm it is the same problem."

与该用户进行了mail私下交流,他的回复:

I still have a case open with Oracle. I believe that this is a bug in the Oracle code. The problem is that it has been difficult to create a reproducible test case for Oracle support. My specific issue was basically put on hold pending the results of another customer’s service request that appeared to have had the same issue, (9034788). Unfortunately they couldn’t reproduce the issue in that case either.

I believe that there is a correlation between the enq TX – index contention wait event and a spike in the number of ‘failed probes on index block reclamation. I have specifically asked Oracle to explain why there is a spike in the ‘failed probes on index block reclamation’ during the same time frame as the enq TX index contention wait event, but they have not answered my question.

I was hoping that some investigation by Oracle Support into the failed probes metric might get someone on the right track to discovering the bug. That hasn’t happened though.

Hi ,

Thanks for your sharing .  The bug (or specific ktsp behave) is fatal in response time sensitive  OLTP env.

I would like to ask my customer to coalesce those index where massive deleted regularly.

Thanks for your help again!

Yes, I saw that. I have applied patch 8286901 and set the event for version 10.2.0.4, but the problem still occurs periodically. And as I mentioned before, we see a correlation between enq TX waits and the failed probes on index block reclamation. Which is why I still think that it is a bug. I agree that trying to rebuild or coalesce the indexes are simply attempts to workaround the issue and not solve the root cause.

Early on when I started on this issue I did do some index dumps and could clearly see that we had lots of blocks with only 1 or 2 records after our mass delete jobs. I have provided Oracle Support with this information as well as oradump files while the problem is occurring, but they don’t seem to be able to find anything wrong so far.

If you are interested in seeing if you are experiencing a high ‘failed probes on index block reclamation’ event run the query below.

select SS.snap_id,
SS.stat_name,
TO_CHAR(S.BEGIN_INTERVAL_TIME, ‘DAY’) DAY,
S.BEGIN_INTERVAL_TIME,
S.END_INTERVAL_TIME,
SS.value,
SS.value – LAG(SS.VALUE, 1, ss.value) OVER (ORDER BY SS.SNAP_ID) AS DIFF
from DBA_HIST_SYSSTAT SS,
DBA_HIST_SNAPSHOT S
where S.SNAP_ID = SS.SNAP_ID
AND SS.stat_NAME = ‘failed probes on index block reclamation’
ORDER BY SS.SNAP_ID ;

  1. 在11gr2上的测试

在最新的11gr2中进行了测试,仍可以重现该问题(如图单条insert引起了6675buffer_gets,这是在更大量数据的情况下)。

我们可以猜测Oracle提供的one-off补丁中可能是为叶块分裂所会扫描的“空块”附加了一个上限,在未达到上限的情况下扫描仍会发生。而在主流的公开的发行版本中Oracle不会引入该补丁的内容。尝试在没有缓存的情况下引起分裂问题,分裂引起了大约4000个块的物理读,但该操作仍在0.12秒(有缓存是0.02秒,如图)内完成了(该测试使用普通ata硬盘,读取速度在100MB/S: Timing buffered disk reads:  306 MB in  3.00 seconds = 101.93 MB/sec);从1月21日的ash视图中可以看到引起split的260会话处于单块读等待(db file sequential read)中,且已等待了43950us约等于44ms;这与良好io的经验值10ms左右有较大出入;我们可以确信io性能问题也是引发此叶块分裂延迟如此显性的一个重要因素。

具体结论

综上所述,在之前讨论的几个方案中,MSSM方式是不能避免索引分裂引起交易超时问题的;不删除数据的方案在许多对象上不可行;10.2.0.4上的one-off补丁因为目前仅存在Linux版本,可以考虑声请补丁后具体测试(因目前没有补丁所以处于未知状态)。Coalesce合并索引是目前既有的最具可操作性且无副作用的解决方案。

简易高负载进程记录脚本

Oracle 10g 中引入了v$osstat 视图方便了dba了解主机负载情况,同时也可以通过oem网页观察到一段时间内主机上负载较高的进程;但如果db未开启oem管理界面,则无法了解过去时段内高负载服务进程的相关信息。以下脚本可以给予一定的帮助。

CREATE TABLE "SYS"."HIGHLOAD_HISTORY"
(
"SAMPLE_TIME" DATE,
"SPID"     NUMBER(10,0),
"LOAD"     VARCHAR2(7 BYTE),
"SID"      VARCHAR2(30 BYTE),
"USERNAME" VARCHAR2(40 BYTE),
"MACHINE"  VARCHAR2(64 BYTE),
"PROGRAM"  VARCHAR2(48 BYTE),
"SQL_ID"   VARCHAR2(13 BYTE),
"SQL_FULLTEXT" CLOB,
"INST_ID" NUMBER(2,0),
"STATUS"  VARCHAR2(8 BYTE)
)    --建立记录高负载进程信息的表,内容包括了cpu使用率,及sql(并不十分准确,因为获取spid后需要进行查询)
ps aux|grep $ORACLE_SID|awk '{ if($3>=0.3) print "insert into highload_history select sysdate rec_time,"$2,","$3"%",", ss.sid,ss.username,ss.machine,ss.program,ss.sql_id,(select sql_fulltext from v$sqlarea sq where sq.sql_id=ss.sql_id),(select instance_number from v$instance),ss.status from v$session ss,v$process pr where  pr.addr=ss.paddr and pr.spid=",$2";"}'  | sqlplus / as sysdba  --直接运行即可

优化模式区别(all_rows & first_rows_n)

FIRST_ROWS优化模式以最快速度地检索出结果 集中的一行为其指导目标。当系统用户正在使用OLTP系统检索单条记录时,该 优化模式最为有效。但是该模式对于批处理密集型(batch)作业环境来说并不是最理想 的选择,在这种环境中一个查询通常需要检索许多行。FIRST_ROWS提示 一般会强制使用某些索引,而在默认环境(ALL_ROWS)中可能不采用这些索引。在使 用UPDATE和DELETE语句时FIRST_ROWS模式会被忽略,因这些DML操 作中所查询到的所有记录都会被更新或删除。另当使用以下分组语句(如GROUP BY,DISTINCT,INTERSECT,MINUS和UNION)时FIRST_ROWS模式均被ALL_ROWS模式取代,因为这些语句进行分组时必须检索所有行。当语句中有ORDER BY子句时,如果索引扫描可以进行实际的排序工作,则优化器将避免额外的排 序。当索引扫描可用并且索引处于内部表(inner table)时,优化器将更倾向于NESTED LOOPS即嵌套循环而非SORT MERGE排 序连接。

另10g中现有的FIRST_ROWS模式的变体FIRST_ROWS_N来 指定以多少行数最快返回。这个值介于10~1000之间,这个使用FIRST_ROWS_N的新方法是完全基于成本的方法,它对于N的取值较敏感,若N甚小,优化器就会产生包 括嵌套循环以及索引查找的计划。如果N值较大,优化器也可能生成由散列连接和全表扫描组 成的计划(类似于ALL_ROWS)。 又FIRST_ROW与FIRST_ROWS_N存 在不同,FIRST_ROW模式中保量了部分基于规则的代码,而FIRST_ROWS_N模式则是完完全全基于统计信息计算相应成本,如Oracle文档所述:

ALL_ROWS优化模式指导查询以最快速度检索出所 有行(最佳吞吐量)。当系统用户 处于需要大量批处理报告的环境中,该模式较理想。

在实际的SQL硬解析过程中,FIRST_ROWS_N模式将首先以ALL_ROWS模 式的方式计算一次各执行计划的具体代价,之后将我们需要的N条记录代入成本计算中代替实 际全部的候选行(CARD)以得出FIRST_ROWS_N中 的计划成本。

create table test as select  * from dba_objects;

create table testa as select * from test;

alter session set events’10053 trace name context forever,level 1′;    –使用10053事 件获取成本计算过程trace

alter session set optimizer_mode=all_rows;

select test.owner from test,testa where test.object_id=testa.object_id

alter session set events’10053 trace name context off’;

下为ALL_ROWS模式中,最佳连接方式的选 取:

NL Join

Outer table: Card: 9622.00  Cost: 35.37  Resp: 35.37  Degree: 1  Bytes: 7

Inner table: TESTA  Alias: TESTA

Access Path: TableScan

NL Join:  Cost: 318924.52  Resp: 318924.52  Degree: 0

Cost_io: 315358.00  Cost_cpu: 27736509932

Resp_io: 315358.00  Resp_cpu: 27736509932

Access Path: index (index (FFS))

Index: INDA_ID

resc_io: 5.69  resc_cpu: 1304190

ix_sel: 0.0000e+00  ix_sel_with_filters: 1

Inner table: TESTA  Alias: TESTA

Access Path: index (FFS)

NL Join:  Cost: 56375.98  Resp: 56375.98  Degree: 0

Cost_io: 54762.00  Cost_cpu: 12551800804

Resp_io: 54762.00  Resp_cpu: 12551800804

Access Path: index (AllEqJoinGuess)

Index: INDA_ID

resc_io: 1.00  resc_cpu: 8171

ix_sel: 1.0393e-04  ix_sel_with_filters: 1.0393e-04

NL Join: Cost: 9667.48  Resp: 9667.48  Degree: 1

Cost_io: 9657.00  Cost_cpu: 81507910

Resp_io: 9657.00  Resp_cpu: 81507910

Best NL cost: 9667.48

resc: 9667.48 resc_io: 9657.00 resc_cpu: 81507910

resp: 9667.48 resp_io: 9657.00 resp_cpu: 81507910

Join Card:  9622.00 = outer (9622.00) * inner (9622.00) * sel (1.0393e-04)

Join Card – Rounded: 9622 Computed: 9622.00

SM Join

Outer table:

resc: 35.37  card 9622.00  bytes: 7  deg: 1  resp: 35.37

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      17 Row size:           14 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         10

Total IO sort cost: 27      Total CPU sort cost: 13931876

Total Temp space used: 254000

SM join: Resc: 203.62  Resp: 203.62  [multiMatchCost=0.00]

HA Join

Outer table:

resc: 35.37  card 9622.00  bytes: 7  deg: 1  resp: 35.37

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

Cost per ptn: 0.81  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 43.35  Resp: 43.35  [multiMatchCost=0.00]

HA Join (swap)

Outer table:

resc: 7.17  card 9622.00  bytes: 3  deg: 1  resp: 7.17

Inner table: TEST  Alias: TEST

resc: 35.37  card: 9622.00  bytes: 7  deg: 1  resp: 35.37

using dmeth: 2  #groups: 1

Cost per ptn: 0.81  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 43.35  Resp: 43.35  [multiMatchCost=0.00]

HA cost: 43.35

resc: 43.35 resc_io: 42.00 resc_cpu: 10480460

resp: 43.35 resp_io: 42.00 resp_cpu: 10480460

Best:: JoinMethod: Hash

Cost: 43.35  Degree: 1  Resp: 43.35  Card: 9622.00  Bytes: 10

***********************

Best so far: Table#: 0  cost: 35.3706  card: 9622.0000  bytes: 67354

Table#: 1  cost: 43.3476  card: 9622.0000  bytes: 96220

可以看到连接中二表上的候选行都是9622条,实际结果集也是9622条。

我们来看FIRST_ROWS_10情况下的trace:

alter session set events’10053 trace name context forever,level 1′;

alter session set optimizer_mode=first_rows_10;

select test.owner from test,testa where test.object_id=testa.object_id;

alter session set events’10053 trace name context off’;

Now joining: TEST[TEST]#0

***************

NL Join

Outer table: Card: 11.00  Cost: 2.00  Resp: 2.00  Degree: 1  Bytes: 3

Inner table: TEST  Alias: TEST

Access Path: TableScan

NL Join:  Cost: 368.08  Resp: 368.08  Degree: 0

Cost_io: 364.00  Cost_cpu: 31713898

Resp_io: 364.00  Resp_cpu: 31713898

Access Path: index (AllEqJoinGuess)

Index: IND_ID

resc_io: 2.00  resc_cpu: 15503

ix_sel: 1.0393e-04  ix_sel_with_filters: 1.0393e-04

NL Join (ordered): Cost: 24.02  Resp: 24.02  Degree: 1

Cost_io: 24.00  Cost_cpu: 178973

Resp_io: 24.00  Resp_cpu: 178973

Best NL cost: 24.02

resc: 24.02 resc_io: 24.00 resc_cpu: 178973

resp: 24.02 resp_io: 24.00 resp_cpu: 178973

Join Card:  11.00 = outer (11.00) * inner (9622.00) * sel (1.0393e-04)

Join Card – Rounded: 11 Computed: 11.00

SM Join

Outer table:

resc: 7.17  card 9622.00  bytes: 3  deg: 1  resp: 7.17

Inner table: TEST  Alias: TEST

resc: 35.37  card: 9622.00  bytes: 7  deg: 1  resp: 35.37

using dmeth: 2  #groups: 1

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      22 Row size:           18 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         14

Total IO sort cost: 36      Total CPU sort cost: 14055006

Total Temp space used: 320000

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      17 Row size:           14 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         10

Total IO sort cost: 27      Total CPU sort cost: 13931876

Total Temp space used: 254000

SM join: Resc: 109.14  Resp: 109.14  [multiMatchCost=0.00]

SM cost: 109.14

resc: 109.14 resc_io: 105.00 resc_cpu: 32173386

resp: 109.14 resp_io: 105.00 resp_cpu: 32173386

SM Join (with index on outer)

Access Path: index (FullScan)

Index: IND_ID

resc_io: 167.00  resc_cpu: 5134300

ix_sel: 1  ix_sel_with_filters: 1

Cost: 167.66  Resp: 167.66  Degree: 1

Outer table:

resc: 167.66  card 11.00  bytes: 7  deg: 1  resp: 167.66

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      17 Row size:           14 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         10

Total IO sort cost: 27      Total CPU sort cost: 13931876

Total Temp space used: 254000

SM join: Resc: 203.62  Resp: 203.62  [multiMatchCost=0.00]

HA Join

Outer table:

resc: 35.37  card 9622.00  bytes: 7  deg: 1  resp: 35.37

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

Cost per ptn: 0.81  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 43.35  Resp: 43.35  [multiMatchCost=0.00]

HA Join (swap)

Outer table:

resc: 7.17  card 9622.00  bytes: 3  deg: 1  resp: 7.17

Inner table: TEST  Alias: TEST

resc: 2.00  card: 11.00  bytes: 7  deg: 1  resp: 2.00

using dmeth: 2  #groups: 1

Cost per ptn: 0.69  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 9.85  Resp: 9.85  [multiMatchCost=0.00]

HA cost: 9.85

resc: 9.85 resc_io: 9.00 resc_cpu: 6646477

resp: 9.85 resp_io: 9.00 resp_cpu: 6646477

First K Rows: copy A one plan, tab=TESTA

Best:: JoinMethod: Hash

Cost: 9.85  Degree: 1  Resp: 9.85  Card: 9622.00  Bytes: 17

***********************

Best so far: Table#: 0  cost: 2.0012  card: 11.0000  bytes: 77

Table#: 1  cost: 9.8546  card: 9622.0000  bytes: 163574

可以看到此次计算中代入了用户希望最先返回的结果 条数11(为10+1),通过设 置连接对象的候选结果集(Card)以到达相关优化目的,相应的COST均有所下降。

下为FIRST_ROWS_1000的情况:

alter session set events’10053 trace name context forever,level 1′;

alter session set optimizer_mode=first_rows_1000;

select test.owner from test,testa where test.object_id=testa.object_id;

alter session set events’10053 trace name context off’;

NL Join

Outer table: Card: 1000.00  Cost: 5.04  Resp: 5.04  Degree: 1  Bytes: 7

Inner table: TESTA  Alias: TESTA

Access Path: TableScan

NL Join:  Cost: 33147.66  Resp: 33147.66  Degree: 0

Cost_io: 32777.00  Cost_cpu: 2882616819

Resp_io: 32777.00  Resp_cpu: 2882616819

Access Path: index (index (FFS))

Index: INDA_ID

resc_io: 5.69  resc_cpu: 1304190

ix_sel: 0.0000e+00  ix_sel_with_filters: 1

Inner table: TESTA  Alias: TESTA

Access Path: index (FFS)

NL Join:  Cost: 5861.74  Resp: 5861.74  Degree: 0

Cost_io: 5694.00  Cost_cpu: 1304492819

Resp_io: 5694.00  Resp_cpu: 1304492819

Access Path: index (AllEqJoinGuess)

Index: INDA_ID

resc_io: 1.00  resc_cpu: 8171

ix_sel: 1.0393e-04  ix_sel_with_filters: 1.0393e-04

NL Join: Cost: 1006.09  Resp: 1006.09  Degree: 1

Cost_io: 1005.00  Cost_cpu: 8474019

Resp_io: 1005.00  Resp_cpu: 8474019

Best NL cost: 1006.09

resc: 1006.09 resc_io: 1005.00 resc_cpu: 8474019

resp: 1006.09 resp_io: 1005.00 resp_cpu: 8474019

Join Card:  1000.00 = outer (1000.00) * inner (9622.00) * sel (1.0393e-04)

Join Card – Rounded: 1000 Computed: 1000.00

SM Join

Outer table:

resc: 35.37  card 9622.00  bytes: 7  deg: 1  resp: 35.37

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      22 Row size:           18 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         14

Total IO sort cost: 36      Total CPU sort cost: 14055006

Total Temp space used: 320000

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      17 Row size:           14 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         10

Total IO sort cost: 27      Total CPU sort cost: 13931876

Total Temp space used: 254000

SM join: Resc: 109.14  Resp: 109.14  [multiMatchCost=0.00]

SM cost: 109.14

resc: 109.14 resc_io: 105.00 resc_cpu: 32173386

resp: 109.14 resp_io: 105.00 resp_cpu: 32173386

SM Join (with index on outer)

Access Path: index (FullScan)

Index: IND_ID

resc_io: 167.00  resc_cpu: 5134300

ix_sel: 1  ix_sel_with_filters: 1

Cost: 167.66  Resp: 167.66  Degree: 1

Outer table:

resc: 167.66  card 1000.00  bytes: 7  deg: 1  resp: 167.66

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

SORT resource      Sort statistics

Sort width:          70 Area size:      131072 Max Area size:    12582912

Degree:               1

Blocks to Sort:      17 Row size:           14 Total Rows:           9622

Initial runs:         2 Merge passes:        1 IO Cost / pass:         10

Total IO sort cost: 27      Total CPU sort cost: 13931876

Total Temp space used: 254000

SM join: Resc: 203.62  Resp: 203.62  [multiMatchCost=0.00]

HA Join

Outer table:

resc: 35.37  card 9622.00  bytes: 7  deg: 1  resp: 35.37

Inner table: TESTA  Alias: TESTA

resc: 7.17  card: 9622.00  bytes: 3  deg: 1  resp: 7.17

using dmeth: 2  #groups: 1

Cost per ptn: 0.81  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 43.35  Resp: 43.35  [multiMatchCost=0.00]

HA Join (swap)

Outer table:

resc: 7.17  card 9622.00  bytes: 3  deg: 1  resp: 7.17

Inner table: TEST  Alias: TEST

resc: 5.04  card: 1000.00  bytes: 7  deg: 1  resp: 5.04

using dmeth: 2  #groups: 1

Cost per ptn: 0.70  #ptns: 1

hash_area: 124 (max=3072)   Hash join: Resc: 12.91  Resp: 12.91  [multiMatchCost=0.00]

HA cost: 12.91

resc: 12.91 resc_io: 12.00 resc_cpu: 7038524

resp: 12.91 resp_io: 12.00 resp_cpu: 7038524

First K Rows: copy A one plan, tab=TESTA

Best:: JoinMethod: Hash

Cost: 12.91  Degree: 1  Resp: 12.91  Card: 9622.00  Bytes: 17

***********************

Best so far: Table#: 0  cost: 5.0389  card: 1000.0000  bytes: 7000

Table#: 1  cost: 12.9051  card: 9622.0000  bytes: 163574

可以看到此处代入了1000为某一连接对象的候选行数。

MOS上有一个著名的《MIGRATING TO THE COST-BASED OPTIMIZER》教材,详细介绍了RBO和CBO的区别:
[gview file=”http://askmac.cn/wp-content/uploads/resource/40178_rbo_rip.doc”]

如何找出Oracle中需要或值得重建的索引

This script determines whether an index is a good candidate for a rebuild or for
a bitmap index.  All indexes for a given schema or for a subset of schema’s are
analyzed (except indexes under SYS and SYSTEM)

Instructions

Execution Environment:
<SQL, SQL*Plus, iSQL*Plus>

Access Privileges:
Requires DBA privileges in order to be executed.

Usage:
sqlplus <user>/<pw> @rebuild.index.sql

Instructions:
Copy the script into the file ind_an.sql. Execute the script from SQL*Plus connected
with a user with DBA privileges.  The script requires to parameters:

1. Name of the output file where the report while be generated
2. Name of the SCHEMA to be analyzed.

PROOFREAD THIS SCRIPT BEFORE USING IT! Due to differences in the way text
editors, e-mail packages, and operating systems handle text formatting (spaces,
tabs, and carriage returns), this script may not be in an executable state
when you first receive it. Check over the script to ensure that errors of
this type are corrected.

Description

This script determines whether an index is a good candidate for a rebuild or for
a bitmap index.  All indexes for a given schema or for a subset of schema’s are
analyzed (except indexes under SYS and SYSTEM).

REM =============================================================
REM
REM                         rebuild_indx.sql
REM
REM  Copyright (c) Oracle Software, 1998 - 2000
REM
REM  Author  : Jurgen Schelfhout
REM
REM  The sample program in this article is provided for educational
REM  purposes only and is NOT supported by Oracle Support Services.
REM  It has been tested internally, however, and works as documented.
REM  We do not guarantee that it will work for you, so be sure to test
REM  it in your environment before relying on it.
REM
REM  This script will analyze all the indexes for a given schema
REM  or for a subset of schema's. After this the dynamic view
REM  index_stats is consulted to see if an index is a good
REM  candidate for a rebuild or for a bitmap index.
REM
REM  Database Version : 7.3.X and above.
REM
REM  NOTE:  If running this on 10g, you must exclude the
REM  objects in the Recycle Bin
REM        cursor c_indx is
REM          select owner, table_name, index_name
REM            from dba_indexes
REM           where owner like upper('&schema')
REM             and table_name not like 'BIN$%'
REM             and owner not in ('SYS','SYSTEM');
REM
REM  Additional References for Recycle Bin functionality:
REM  Note.265254.1 Flashback Table feature in Oracle Database 10g
REM  Note.265253.1 10g Recyclebin Features And How To Disable it(_recyclebin)
REM
REM =============================================================

prompt
ACCEPT spoolfile CHAR prompt 'Output-file : ';
ACCEPT schema CHAR prompt 'Schema name (% allowed) : ';
prompt
prompt
prompt Rebuild the index when :
prompt   - deleted entries represent 20% or more of the current entries
prompt   - the index depth is more then 4 levels.
prompt Possible candidate for bitmap index :
prompt   - when distinctiveness is more than 99%
prompt

spool &spoolfile;
set serveroutput on;
set verify off;
set linesize 140;
declare
  c_name        INTEGER;
  ignore        INTEGER;
  height        index_stats.height%TYPE := 0;
  lf_rows       index_stats.lf_rows%TYPE := 0;
  del_lf_rows   index_stats.del_lf_rows%TYPE := 0;
  distinct_keys index_stats.distinct_keys%TYPE := 0;
  cursor c_indx is
    select owner, table_name, index_name
      from dba_indexes
     where owner like upper('&schema')
       and owner not in ('SYS', 'SYSTEM');
begin
  dbms_output.enable(1000000);
  dbms_output.put_line('Owner           Index Name                              % Deleted Entries Blevel Distinctiveness');
  dbms_output.put_line('--------------  ---------------------------------            ------------  -----           -----');

  c_name := DBMS_SQL.OPEN_CURSOR;
  for r_indx in c_indx loop
    DBMS_SQL.PARSE(c_name,
                   'analyze index ' || r_indx.owner || '.' ||
                   r_indx.index_name || ' validate structure',
                   DBMS_SQL.NATIVE);
    ignore := DBMS_SQL.EXECUTE(c_name);

    select HEIGHT,
           decode(LF_ROWS, 0, 1, LF_ROWS),
           DEL_LF_ROWS,
           decode(DISTINCT_KEYS, 0, 1, DISTINCT_KEYS)
      into height, lf_rows, del_lf_rows, distinct_keys
      from index_stats;
    /*
    - Index is considered as candidate for rebuild when :
    -   - when deleted entries represent 20% or more of the current entries
    -   - when the index depth is more then 4 levels.(height starts counting from 1 so > 5)
    - Index is (possible) candidate for a bitmap index when :
    -   - distinctiveness is more than 99%
    */
    if (height > 5) OR ((del_lf_rows / lf_rows) > 0.2) then
      dbms_output.put_line(rpad(r_indx.owner, 16, ' ') ||
                           rpad(r_indx.index_name, 40, ' ') ||
                           lpad(round((del_lf_rows / lf_rows) * 100, 3),
                                17,
                                ' ') || lpad(height - 1, 7, ' ') ||
                           lpad(round((lf_rows - distinct_keys) * 100 /
                                      lf_rows,
                                      3),
                                16,
                                ' '));
    end if;

  end loop;
  DBMS_SQL.CLOSE_CURSOR(c_name);
end;
/
spool off;
set verify on;

Sample Output:

SQL> @rebuild_index

Output-file : index_rebuild
Schema name (% allowed) : maclean

Rebuild the index when :
- deleted entries represent 20% or more of the current entries
- the index depth is more then 4 levels.
Possible candidate for bitmap index :
- when distinctiveness is more than 99%

Owner           Index Name                              % Deleted Entries Blevel Distinctiveness
--------------  ---------------------------------            ------------  -----           -----
MACLEAN         SYS_MTABLE_00000CFD4_IND_2                             25      0              25
MACLEAN         SYS_MTABLE_00000D3F3_IND_2                         33.333      0          33.333
PL/SQL procedure successfully completed.

沪ICP备14014813号-2

沪公网安备 31010802001379号