在OEL5上安装配置Oracle Gird Control 10.2.0.5

早期的Grid Control问题实在太多了,以至于把10.2.0.1的Grid Control升级到10.2.0.5几乎是不可能完成的任务;此外10.2.0.5以前的gc不支持11g作为repository database仓库数据库,不仅于此10.2.0.1版本是不支持rhel5或OEL5的,如果想安装的话rhel/oel 4是仅有理想的平台。这就这为我们制造了许多局限。如果是Fresh Installation的话似乎只安装软件(software only),而不在安装10.2.0.1阶段配置oms,在升级到10.2.0.5后再进行oms的config会是一种比较理想的安装方法。不过我们仍需要应付一个复杂的配置过程,写这个文档的目的是帮助我们应付(包括已安装过的人,因为很容易忘记)这种窘况。

1.我们需要一个已经存在的Oracle数据库,当然它应当是纯净的(没有相关的em repository),其版本最好是10.2.0.5或者11.2.0.2;我们假设你的数据库已经满足了一切安装grid control的前提要求,这包括设置几个初始化参数和装有dbms_shared_pool包等等。

2.其次你需要下载对应的软件,这包括了完全版本的10.2.0.1安装介质和10.2.0.5 gc patchset,并且最好有10.2.0.5版的agent(个人经验agent不太稳定,有时候需要重配,所以有介质的话会方便些):

  • Linux_Grid_Control_full_102010_disk1.zip
  • Linux_Grid_Control_full_102010_disk2.zip
  • Linux_Grid_Control_full_102010_disk3.zip
  • gc_x86_64_10205_part1of2.zip
  • gc_x86_64_10205_part2of2.zip
  • Linux_x86_64_Grid_Control_agent_download_10_2_0_5_0.zip

要下载这一大堆介质可能是我们安装过程中最麻烦的一件事情,特别是当你的网络状况欠佳的季节。
因为都是zip包,所以你只需要使用unzip命令将以上介质一一解压到合适目录就可以了。

3.配置OMS所在主机的内核参数和rpm包,下面给出了相关配置文件的示例值:

/etc/sysctl.conf:
kernel.shmall = 2097152
kernel.shmmax = 536870912
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
# semaphores: semmsl, semmns, semopm, semmni
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144

/* 注意以上参数并不一定适合你的主机,具体如何配置请参见Metalink文档 */

同时修改/etc/security/limits.conf参数文件:
*               soft    nproc   2047
*               hard    nproc   16384
*               soft    nofile  1024
*               hard    nofile  65536

/* 星号换上你的安装用户名,如oracle或者其他dba组成员 */

安装适当的rpm包,安装database时需要的包一律也都需要,此外请特别留意安装一下几个包:
compat-libstdc++-296-2.96-138.i386
libstdc++-devel-4.1.2-48.el5.x86_64
libstdc++-devel-4.1.2-48.el5.i386
glibc-devel-2.5-49.x86_64
glibc-devel-2.5-49.i386

并建立下列符号连接:
ln -s /usr/lib/libgdbm.so.2.0.0 /usr/lib/libdb.so.2

4.上述工作完成后我们需要修改response文件以满足安装的需要,在10.2.0.1安装介质的解压目录下操作:

[root@nas media]# ls
dcommon  doc  index.htm  install  libskgxn  oms  rdbms  response  runInstaller  stage

[root@nas media]# vi response/em_using_existing_db.rsp 

/* 修改em_using_existing_db.rsp响应文件 */

包括以下参数需要从默认值修改为指定值:
UNIX_GROUP_NAME="dba"
#dba应当是有效的安装用户所在组

FROM_LOCATION="/s01/media/oms/Disk1/stage/products.xml"
#FROM LOCATION指向安装介质stage目录下的products.xml文件

BASEDIR="/s01/app/gc"
#BASEDIR指向grid control安装的基础目录

INSTALLATION_NAME="oms10g"
#安装名

s_reposHost="rh3.oracle.com"
#repository数据库的主机名或ip地址

s_reposPort="1521"
#repository数据库的监听端口

s_reposSID="EMREP"
#repository数据库的sid

s_reposDBAPwd="maclean"
#repository数据库的sys用户密码

s_mgmtTbsName="/s01/orabase/oradata/EMREP/mgmt.dbf"
#repository数据库今后的mgmt表空间的数据文件名

s_ecmTbsName=s_mgmtTbsName="/s01/orabase/oradata/EMREP/mgmt_ecm.dbf"
#repository数据库今后的ecm表空间的数据文件名

s_securePassword="maclean"
#agent将来使用的secure密码

s_securePasswordConfirm="maclean"
#确认上一步的密码

b_lockedSelected=false
#确定agent交互是否被锁

s_reposPwd="maclean"
#确定仓库数据库中模式拥有者(sysman)的密码

s_reposPwdConfirm="maclean"
#确认上一步的密码

5.以静默方式安装grid control 10.2.0.1,但不配置oms:

[maclean@nas ~]$ export  TMP=/tmp

[maclean@nas ~]$  /s01/media/install/runInstaller -noconfig -silent -ignoreSysPrereqs -responseFile \
/s01/media/response/em_using_existing_db.rsp  use_prereq_checker=false b_skipDBValidation=true -force

以上安装完成后,运行相关脚本:
[maclean@nas ~]$ su - root -c "/home/maclean/oraInventory/orainstRoot.sh"

[maclean@nas ~]$ su - root -c "/s01/app/gc/oms10g/allroot.sh"

并使用opmonctl命令关闭http等服务:
[maclean@nas ~]$ /s01/app/gc/oms10g/opmn/bin/opmnctl stopall
opmnctl: stopping opmn and all managed processes...

5.接下来我们需要安装grid control 10.2.0.5补丁集,同样的需要修改一个response响应文件:

[maclean@nas 10205]$ unzip /tmp/gc_x86_64_10205_part2of2.zip
Archive:  /tmp/gc_x86_64_10205_part2of2.zip
extracting: p3731593_10205_Linux-x86-64.zip

[maclean@nas 10205]$ unzip p3731593_10205_Linux-x86-64.zip
..............

[maclean@nas ~]$ cp /s01/10205/3731593/Disk1/response/patchset.rsp /s01/10205/3731593/Disk1/response/oms_patchset.rsp

/* 对oms_patchset.rsp修改已有的参数为指定值 */

[maclean@nas ~]$ vi /s01/10205/3731593/Disk1/response/oms_patchset.rsp

ORACLE_HOME="/s01/app/gc/oms10g"
b_softwareonly=true
s_sysPassword="maclean"
sl_pwdInfo={ "maclean" }
oracle.iappserver.st_midtier:szl_InstanceInformation={ "maclean" }

ORACLE_HOME_NAME="oms10g"
#另外增加以上条目

[maclean@nas ~]$ /s01/10205/3731593/Disk1/runInstaller -noconfig -silent \
-responseFile /s01/10205/3731593/Disk1/response/oms_patchset.rsp

/* 以上10.2.0.5补丁安装完成后,同样需要执行root.sh */

[maclean@nas ~]$ su - root -c "/s01/app/gc/oms10g/root.sh"

6.完成上述安装后可以开始配置OMS了:

[maclean@nas ~]$ export PERL5LIB=/s01/app/gc/oms10g/perl/lib/5.6.1

[maclean@nas ~]$ /s01/app/gc/oms10g/perl/bin/perl /s01/app/gc/oms10g/sysman/install/ConfigureGC.pl \
/s01/app/gc
Base Directory: /s01/app/gc

 Starting ito execute Configuration Assistants: 

Running the configuration assistants using the following command:
/s01/app/gc/oms10g/oui/bin/runConfig.sh INV_PTR_LOC=/s01/app/gc/oms10g/oraInst.loc
ORACLE_HOME=/s01/app/gc/oms10g ACTION=configure MODE=perform
COMPONENT_XML={encap_oms.1_0_0_0_0.xml}
perform - mode is starting for action: configure

以上命令的具体形式为:
<OMS ORACLE_HOME>/perl/bin/perl <OMS ORACLE_HOME>/sysman/install/ConfigureGC.pl <Parent Directory filepath> 

配置会消耗大量的时间,建议通过.../oms10g/cfgtoollogs/oui/configActions<>.log.日志文件监控配置过程:

[root@nas oui]# cd /s01/app/gc/oms10g/cfgtoollogs/oui

[root@nas oui]# tail -f configActions2011-01-23_08-57-20-AM.log
... return status = 0 (success)
Oracle JAAS [Sun Jan 23 08:57:43 CST 2011]  $ORACLE_HOME/j2ee/home/config/jazn-data.xml is synchronized successfully to dcm repository.
Please check the log file [/s01/app/gc/oms10g/cfgtoollogs/jaznca.log] for details.

The plug-in Java Security Configuration Assistant has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in Web Cache Configuration Assistant is running

2
Start traversing...
got process-manager node
got ias-instance node
attrValue=IASPT

attrValue=DSA

attrValue=HTTP_Server

attrValue=LogLoader

attrValue=dcm-daemon

attrValue=OC4J

attrValue=WebCache

Entity found.

got ias-instance node
Current status is : enabled
Changing the value of port to enabled
 Modified ...
Before After WaitForComplete
After WaitForComplete
Completed smiSetStatus
Checking status ... enableconfiguration
In ... getWebcachePort
WebCache Default Port :7777
In ... updateApacheConf
Apache Port Value : 7777
Apache Before WaitForComplete
Apache After WaitForComplete
Checking for Apache updation status
Apache httpd.conf updated
smiTearDown
Will be checking the status ...
Webcache Configuration finished successfully

The plug-in Web Cache Configuration Assistant has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in Oracle Application Server Instance Configuration Assistant is running

The plug-in Oracle Application Server Instance Configuration Assistant has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in OC4J Instance Configuration Assistant is running

Reading ini file - '/s01/app/gc/oms10g/j2ee/deploy.ini'
Adding web-app 'IsWebCacheWorkingWeb.war' for app 'IsWebCacheWorking'.
Adding web-app 'wsrp-samples.war' for app 'portletapp'.
Initializing DCM...done.
OC4J instance 'home' already exists.
Starting OC4J instance 'home'...done.
Deploying application 'portletapp' to OC4J instance 'home'.
Notification ==> Application Deployer for portletapp STARTS [ 2011-01-23T08:58:00.972CST ]
Notification ==> Undeploy previous deployment
Notification ==> Removing files for app file:/s01/app/gc/oms10g/j2ee/home/applications/portletapp.ear
Notification ==> Copy the archive to /s01/app/gc/oms10g/j2ee/home/applications/portletapp.ear
Notification ==> Unpack portletapp.ear begins...
Notification ==> Unpack portletapp.ear ends...
Notification ==> Initialize portletapp.ear begins...
Notification ==> Initialize portletapp.ear ends...
Notification ==> Initialize wsrp-samples begins...
Notification ==> Initialize wsrp-samples ends...
Notification ==> deleting:  /s01/app/gc/oms10g/j2ee/home/applications/portletapp.ear
Notification ==> deleting:  /s01/app/gc/oms10g/j2ee/home/applications/portletapp/wsrp-samples.war
Notification ==> Application Deployer for portletapp COMPLETES [ 2011-01-23T08:58:01.319CST ] 

Deploying application 'IsWebCacheWorking' to OC4J instance 'home'.
Notification ==> Application Deployer for IsWebCacheWorking STARTS [ 2011-01-23T08:58:01.328CST ]
Notification ==> Undeploy previous deployment
Notification ==> Removing files for app file:/s01/app/gc/oms10g/j2ee/home/applications/IsWebCacheWorking.ear
Notification ==> Copy the archive to /s01/app/gc/oms10g/j2ee/home/applications/IsWebCacheWorking.ear
Notification ==> Unpack IsWebCacheWorking.ear begins...
Notification ==> Unpack IsWebCacheWorking.ear ends...
Notification ==> Initialize IsWebCacheWorking.ear begins...
Notification ==> Initialize IsWebCacheWorking.ear ends...
Notification ==> Initialize IsWebCacheWorkingWeb begins...
Notification ==> Initialize IsWebCacheWorkingWeb ends...
Notification ==> deleting:  /s01/app/gc/oms10g/j2ee/home/applications/IsWebCacheWorking.ear
Notification ==> deleting:  /s01/app/gc/oms10g/j2ee/home/applications/IsWebCacheWorking/IsWebCacheWorkingWeb.war
Notification ==> Application Deployer for IsWebCacheWorking COMPLETES [ 2011-01-23T08:58:01.362CST ] 

Calling updateConfig to notify DCM of new deployments...done.
Adding dependent libraries for application 'portletapp'...done.
Adding OC4J mount points for application 'portletapp'...done.
Adding OC4J mount points for application 'IsWebCacheWorking'...done.
Calling SMI to save changes.
SMISession.saveChanges succeeded.
Binding web app 'wsrp-samples' to default-web-site for application 'portletapp' in OC4J instance 'home'
Web app 'wsrp-samples' bound successfully.
Binding web app 'IsWebCacheWorkingWeb' to default-web-site for application 'IsWebCacheWorking' in OC4J instance 'home'
Web app 'IsWebCacheWorkingWeb' bound successfully.
Calling updateConfig to notify DCM of new web-bindings...done.
Adding application 'portletapp' to the DCM repository...done.
Application 'portletapp' deployed successfully.
Adding application 'IsWebCacheWorking' to the DCM repository...done.
Application 'IsWebCacheWorking' deployed successfully.
Stopping OC4J instance 'home'...done.
Calling SMI to retry init of failed plugins...done.
Terminating DCM...done.
Copying /s01/app/gc/oms10g/j2ee/deploy.ini to /s01/app/gc/oms10g/j2ee/deploy.ini.1295744298019.bak.
Writing any undeployed entries back to /s01/app/gc/oms10g/j2ee/deploy.ini.

Oc4jDeploy tool completed successfully!

The plug-in OC4J Instance Configuration Assistant has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in Register DCM Plug-Ins With EM is running

Operation successful.

The plug-in Register DCM Plug-Ins With EM has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in DCM Repository Backup Assistant is running

backup created: InstalledImage_EnterpriseManager0.nas

The plug-in DCM Repository Backup Assistant has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in EM Technology Stack Upgrade is running

The plug-in EM Technology Stack Upgrade has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in OMS Configuration is running

Operation Stopping OPMN Processes is in progress.
Operation EM Deploying is in progress.
Operation Creating OMS Respository is in progress.
Operation Configuring OMS is in progress.
OMS is being Secured and Lock is set to false.
Precompiling JSPs.
Performing installation of CLI services for client.
Operation Restarting OPMN Processes is in progress.

The plug-in OMS Configuration has successfully been performed
------------------------------------------------------
------------------------------------------------------
The plug-in Agent Configuration Assistant is running

Performing free port detection on host=nas
Securing the agent
Performing targets discovery and agent configuration

The plug-in Agent Configuration Assistant has failed its perform method
------------------------------------------------------
The action configuration has failed its perform method
###################################################

7.如果以上oms configuration顺利完成那么Grid Control的网页界面已经可以登录了,但我们在本地服务器上的agent仍是10.2.0.1版本的(所以Agent Configuration Assistant失败了),这里我推荐将原agent目录删除后通过10.2.0.5的agent介质(Linux_x86_64_Grid_Control_agent_download_10_2_0_5_0.zip)重新安装并部署,这样可以很大程度上避免出现问题。

索引使用空间异常增长一例

客户的某套系统上有一个表空间近日使用率异常增长,该表空间用以存储索引段,经过定位发现一个原本只有200M左右的索引使用将近30+G的空间,而且表现为绝大多数是未格式化的数据块。以下为通过 show_space脚本收集的段信息:

Unformatted Blocks = 1772568
Blocks with 00-25% free space = 0
Blocks with 26-50% free space = 2173
Blocks with 51-75% free space = 0
Blocks with 76-100% free space = 0
Full Blocks = 60762

Unformatted Blocks总数1772568,共使用1772568*16k=27G. 一个索引占用如此多的未格式化块似乎不可思议,首先想到的可能是一个威力惊人的Bug。

探索MOS,可以发现以下这个有趣的note:

Applies to:

Oracle Server – Enterprise Edition – Version: 10.2.0.3
Information in this document applies to any platform.

Symptoms

The problem as observed is that the space allocated for a table/index(partition) is far too high and keeps growing.

Using the AVG_ROW_SIZE * NUM_ROWS shows that the space as needed would be far less that the allocated space (in some cases allocated xxGb while xxMb would be expected).

Cause

Bug 5987262 TABLESPACE IS ABNORMALLY INCREASED BY UNFORMATTED BLOCKS.   This bug was closed as duplicate of unpublished Bug 5890312 HANG OBSERVED WHILE CREATING CTXCAT INDEX

The following PL/SQL block will show the space usage in the (partitioned) table

set serveroutput on
exec dbms_output.enable(1000000);

declare

unf number;
unfb number;
fs1 number;
fs1b number;
fs2 number;
fs2b number;
fs3 number;
fs3b number;
fs4 number;
fs4b number;
full number;
fullb number;

own dba_tables.owner%type;
tab dba_tables.table_name%type;

yesno varchar2(3);

type parts is table of dba_tab_partitions%rowtype;
partlist parts;

type cursor_ref is ref cursor;
c_cur cursor_ref;

begin

own:=upper(‘&owner’);
tab:=upper(‘&table_name’);
dbms_output.put_line(‘——————————————————————————–‘);

open c_cur for select partitioned from dba_tables
where owner=own and table_name=tab;
fetch c_cur into yesno;
close c_cur;

dbms_output.put_line(‘Owner:     ‘||own);
dbms_output.put_line(‘Table:     ‘||tab);

dbms_output.put_line(‘————————————————‘);

if yesno=’NO’
then
dbms_space.space_usage(own,tab,’TABLE’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);

else
open c_cur for select * from dba_tab_partitions
where table_owner=own and table_name=tab;
fetch c_cur bulk collect into partlist;
close c_cur;

for i in partlist.first .. partlist.last
loop
dbms_space.space_usage(partlist(i).table_owner,partlist(i).table_name,’TABLE PARTITION’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,partlist(i).partition_name);
dbms_output.put_line(‘Partition: ‘||partlist(i).partition_name);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);
end loop;

end if;
dbms_output.put_line(‘——————————————————————————–‘);

end;
/

In case there is a need to check a single table partition:

set serveroutput on
exec dbms_output.enable(1000000);

declare

unf number;
unfb number;
fs1 number;
fs1b number;
fs2 number;
fs2b number;
fs3 number;
fs3b number;
fs4 number;
fs4b number;
full number;
fullb number;

own dba_tables.owner%type;
tab dba_tables.table_name%type;
par dba_tab_partitions.partition_name%type;

begin

own:=upper(‘&owner’);
tab:=upper(‘&table_name’);
par:=upper(‘&partition_name’);

dbms_output.put_line(‘——————————————————————————–‘);

dbms_output.put_line(‘Owner:     ‘||own);
dbms_output.put_line(‘Table:     ‘||tab);
dbms_output.put_line(‘Partition: ‘||par);
dbms_output.put_line(‘————————————————‘);

dbms_space.space_usage(own,tab,’TABLE PARTITION’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,par);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);

dbms_output.put_line(‘——————————————————————————–‘);

end;
/

The following PL/SQL block will show the space usage in the (partitioned) index:

set serveroutput on
exec dbms_output.enable(1000000);

declare

unf number;
unfb number;
fs1 number;
fs1b number;
fs2 number;
fs2b number;
fs3 number;
fs3b number;
fs4 number;
fs4b number;
full number;
fullb number;

own dba_indexes.owner%type;
ind dba_indexes.index_name%type;

yesno varchar2(3);

type parts is table of dba_ind_partitions%rowtype;
partlist parts;

type cursor_ref is ref cursor;
c_cur cursor_ref;

begin

own:=upper(‘&owner’);
ind:=upper(‘&index_name’);
dbms_output.put_line(‘——————————————————————————–‘);

open c_cur for select partitioned from dba_indexes
where owner=own and index_name=ind;
fetch c_cur into yesno;
close c_cur;

dbms_output.put_line(‘Owner: ‘||own);
dbms_output.put_line(‘Index: ‘||ind);

dbms_output.put_line(‘————————————————‘);

if yesno=’NO’
then
dbms_space.space_usage(own,ind,’INDEX’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);

else
open c_cur for select * from dba_ind_partitions
where index_owner=own and index_name=ind;
fetch c_cur bulk collect into partlist;
close c_cur;

for i in partlist.first .. partlist.last
loop
dbms_space.space_usage(partlist(i).index_owner,partlist(i).index_name,’INDEX PARTITION’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,partlist(i).partition_name);
dbms_output.put_line(‘Partition: ‘||partlist(i).partition_name);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);
end loop;

end if;
dbms_output.put_line(‘——————————————————————————–‘);

end;
/

In case there is a need to check a single index partition:

set serveroutput on
exec dbms_output.enable(1000000);

declare

unf number;
unfb number;
fs1 number;
fs1b number;
fs2 number;
fs2b number;
fs3 number;
fs3b number;
fs4 number;
fs4b number;
full number;
fullb number;

own dba_indexes.owner%type;
ind dba_indexes.index_name%type;
par dba_ind_partitions.partition_name%type;

begin

own:=upper(‘&owner’);
ind:=upper(‘&index_name’);
par:=upper(‘&partition_name’);

dbms_output.put_line(‘——————————————————————————–‘);

dbms_output.put_line(‘Owner: ‘||own);
dbms_output.put_line(‘Index: ‘||ind);
dbms_output.put_line(‘Partition: ‘||par);
dbms_output.put_line(‘————————————————‘);

dbms_space.space_usage(own,ind,’INDEX PARTITION’,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,par);
dbms_output.put_line(‘unf: ‘||unf||’ fs1: ‘||fs1||’ fs2: ‘||fs2||’ fs3: ‘||fs3||’ fs4: ‘||fs4||’ full: ‘||full);

dbms_output.put_line(‘——————————————————————————–‘);

end;
/

If the value of UNF makes up the major part of the space as allocated,  unpublished bug 5890312 is hit.
In a specific case a table had a value of 4GB unformatted blocks out of 4.3GB allocated blocks.

Solution

The actions as needed to resolve this issue are:

1) Install the patch for unpublished Bug 5890312 in order to prevent future occurrences

2) Resolve the current space allocation, options:
– shrink the table (partition)
– recreate the table (partition)
– recreate the index (partition)

以上关于表空间异常增长且表(或表分区)和索引(或索引分区)中分配了大量未格式化的块的描述与我们的问题十分相似,但该文档的指出可能发生该问题的版本是10.2.0.3,而我们的版本已经是10.2.0.4了,是否又一次是Oracle的”伪修复”呢?怀着疑虑我们提交了SR,希望MOS帮助我们确认是否”hit Bug 729149.1″。

来自MOS的回复:

“The bug is fixed in 10.2.0.4.3 as well as 11.1.0.6.
You might as well apply the lastest PSU available for your platform: 10.2.0.4.4 Patch:9352164

I’m afraid there is no one off (interim) patch available for 10.2.0.4 on Solaris SPARC.
It was only provided for 9.2.0.8 and 10.2.0.3.
The one off patch 5890312 has been made available in Recommended Patch Sets (PSU): 10.2.0.4.3 and 10.2.0.4.4.

More information about Oracle Recommended Patches and their availability can be found in:
NOTE.756671.1 Oracle Recommended Patches — Oracle Database

If you have trouble applying any of the recommended PSUs, please let me know and i will log a one off backport request on your behalf  to ask development is it would be possible to provide you with the one off patch on 10.2.0.4.”

看起来我们并不走运,在10.2.0.4的基础上甚至没有one-off补丁,如果要彻底解决只能打PSU。

这个Bug最早出现是在9.2.0.8上,有一个相关的说明:

Hdr: 5987262 9.2.0.8.0 RDBMS 9.2.0.8.0 SPACE PRODID-5 PORTID-197 5890312
Abstract: TABLESPACE IS ABNORMALLY INCREASED  BY UNFORMATTED BLOCKS

*** 04/12/07 12:32 am ***
TAR:
----
6265975.992

PROBLEM:
--------
SQL>  select count(*) from pb.PBTAJUMT;

  COUNT(*)
----------
    897807

SQL> select sum(bytes)/1024/1024 from dba_extents where
segment_name='PBTAJUMT';

SUM(BYTES)/1024/1024
--------------------
               22008

SQL> select count(*) from dba_extents where segment_name='PBTAJUMT';

  COUNT(*)
----------
      5502

SQL>
       UNF
----------
   1404407

SQL>
       FS4
----------
        87

SQL>
       FS3
----------
         8

SQL>
       FS2
----------
         1

SQL>
       FS1
----------
        99

SQL>
       FULL
----------
      2291

AFTER 18 rows inserted >>

SQL> select count(*) from pb.PBTAJUMT;

  COUNT(*)
----------
    897825

SQL> select sum(bytes)/1024/1024 from dba_extents where
segment_name='PBTAJUMT';

SUM(BYTES)/1024/1024
--------------------
               23004

SQL> select count(*) from dba_extents where segment_name='PBTAJUMT';

  COUNT(*)
----------
      5751

SQL>
       UNF
----------
   1468072

SQL>
       FS4
----------
       103

SQL>
       FS3
----------
         7

SQL>
       FS2
----------
         2

SQL>
       FS1
----------
        56

SQL>
      FULL
----------
      2334

After 18 rows inserted, extents increased about 1G btyes.

DIAGNOSTIC ANALYSIS:
--------------------
ALERT LOG : node2 >

Thu Apr 12 09:40:15 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
Thu Apr 12 09:42:48 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
..
Thu Apr 12 09:44:37 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
Thu Apr 12 09:48:06 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
Thu Apr 12 09:49:49 2007
alter database datafile '/dev/oravg05/roravg05_04' resize 1000m
Thu Apr 12 09:50:21 2007
Completed: alter database datafile '/dev/oravg05/roravg05_04'
Thu Apr 12 09:50:39 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
...
Thu Apr 12 11:43:07 2007
 alter database datafile '/dev/oravg05/roravg05_04' resize 3300m
Completed:  alter database datafile '/dev/oravg05/roravg05_04
Thu Apr 12 11:43:51 2007
ORA-1653: unable to extend table PB.PBTAJUMT by 256 in
tablespace PB
Thu Apr 12 11:44:35 2007
alter database datafile '/dev/oravg05/roravg05_04' resize 3500
Thu Apr 12 11:44:37 2007
ORA-3214 signalled during: alter database datafile
'/dev/oravg05/roravg05_04'...
Thu Apr 12 11:44:54 2007
alter database datafile '/dev/oravg05/roravg05_04' resize 3500m
Thu Apr 12 11:45:05 2007
Completed: alter database datafile '/dev/oravg05/roravg05_04'

kjbdb1_ora_23026.trc :: segment dump >>

After a specific  extents , we can find owner is only inst 2 .
...
 0x05073607  Free: 2 Inst: 2
   0x05073608  Free: 1 Inst: 2
   0x05482905  Free: 2 Inst: 2
   0x05482906  Free: 1 Inst: 1
   0x05482907  Free: 5 Inst: 1
   0x05482908  Free: 5 Inst: 1
   0x11400405  Free: 5 Inst: 2 <======
   0x11400406  Free: 5 Inst: 2
   0x11400407  Free: 5 Inst: 2
   0x11400408  Free: 5 Inst: 2
   0x11400505  Free: 5 Inst: 2
   0x11400506  Free: 5 Inst: 2
   0x11400507  Free: 5 Inst: 2
   0x11400508  Free: 5 Inst: 2
   0x11400605  Free: 5 Inst: 2
   0x11400606  Free: 5 Inst: 2
   0x11400607  Free: 5 Inst: 2
   0x11400608  Free: 5 Inst: 2
   0x11400705  Free: 5 Inst: 2
   0x11400706  Free: 5 Inst: 2
   0x11400707  Free: 5 Inst: 2
   0x11400708  Free: 5 Inst: 2
   0x11400805  Free: 5 Inst: 2
   0x11400806  Free: 5 Inst: 2
   0x11400807  Free: 5 Inst: 2
   0x11400808  Free: 5 Inst: 2
   0x11400905  Free: 5 Inst: 2
   0x11400a05  Free: 5 Inst: 2
   0x11400b05  Free: 5 Inst: 2
   0x11400c05  Free: 5 Inst: 2
   0x11400d05  Free: 5 Inst: 2
   0x11400e05  Free: 5 Inst: 2
   0x11400f05  Free: 5 Inst: 2
   0x11401005  Free: 5 Inst: 2
   0x11401105  Free: 5 Inst: 2
   0x11401205  Free: 5 Inst: 2
   0x11401305  Free: 5 Inst: 2
   0x11401405  Free: 5 Inst: 2
   0x11401505  Free: 5 Inst: 2
   0x11401605  Free: 5 Inst: 2
   0x11401705  Free: 5 Inst: 2
   0x11401805  Free: 5 Inst: 2
   0x11401905  Free: 5 Inst: 2
   0x11401a05  Free: 5 Inst: 2
   0x11401b05  Free: 5 Inst: 2
   0x11401c05  Free: 5 Inst: 2
   0x11401d05  Free: 5 Inst: 2
   0x11401e05  Free: 5 Inst: 2
   0x11401f05  Free: 5 Inst: 2
   0x11402005  Free: 5 Inst: 2
   0x11402105  Free: 5 Inst: 2
   0x11402205  Free: 5 Inst: 2
   0x11402305  Free: 5 Inst: 2
   0x11402405  Free: 5 Inst: 2
   0x11402505  Free: 5 Inst: 2
   0x11402605  Free: 5 Inst: 2
   0x11402705  Free: 5 Inst: 2
   0x11402805  Free: 5 Inst: 2
   0x11402905  Free: 5 Inst: 2
   0x11402a05  Free: 5 Inst: 2
   0x11402b05  Free: 5 Inst: 2
   0x11402c05  Free: 5 Inst: 2
   0x11403605  Free: 5 Inst: 2
   0x11403705  Free: 5 Inst: 2

WORKAROUND:
-----------
none

RELATED BUGS:
-------------
bug.5945288, bug.4887955, bug.2447046
REPRODUCIBILITY:
----------------
No, cannot reproduce intentionally. But happened 4 times on product machine.

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------
This system migrated on 19th of Mar. It happened 4 times from 20th of Mar.
24 HOUR CONTACT INFORMATION FOR P1 BUGS:

MOS无法重现该Bug,但可以确认该Bug的存在,具体触发的条件也不得而知。

如果您的环境无法在短期内应用补丁的话,可以通过 shrink/recreate 表(或表分区)和索引(或索引分区)来workaround这个问题,事实上我感觉这个Bug的触发条件并不简单,也就是说出现概率并不高;但如果您觉得自己不是那么幸运,那么还尽快将您的产品数据库升级到10.2.0.4.4以上或者10.2.0.5(我们是激进派)吧!

沪ICP备14014813号-2

沪公网安备 31010802001379号