如何使用gdb工具对Oracle系统状态(systemstate)做trace

当Oracle系统hang住 ,无法使用一切方法登录时 (包括 sqlplus -prelim / as sysdba),我们可以使用gdb调试工具来对 Oracle做系统 dump ,通过 系统 dump信息 判断 具体hang的原因 。 若直接 将 进程 kill 掉,则将失去现场 无法帮助今后避免 这样的hang情况。

要使用gdb 外部工具, 就需要知道目前实例中后台进程的进程号。

我们一般通过 以下命令列出 Oracle 进程:ps -ef|grep <SID>

[oracle@rh2 ~]$ ps -ef|grep oraclewebmoney
oracle   16996 16995  0 21:55 ?        00:00:00 oraclewebmoney (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

然后启动gdb ,指定Oracle软件中二进制文件 oracle的位置和 进程id

[oracle@rh2 udump]$ gdb $ORACLE_HOME/bin/oracle  16996
GNU gdb Red Hat Linux (6.3.0.0-1.159.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB.  Type “show warranty” for details.
This GDB was configured as “x86_64-redhat-linux-gnu”…
(no debugging symbols found)
Using host libthread_db library “/lib64/tls/libthread_db.so.1″.

Attaching to program: /u01/oracle/product/10.2.0/db_1/bin/oracle, process 14594
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libskgxp10.so…(no debugging symbols found)…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libskgxp10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libhasgen10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libhasgen10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libskgxn2.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libskgxn2.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocr10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocr10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocrb10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocrb10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocrutl10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocrutl10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libjox10.so…

在gdb 提示行中 输入 print ksudss(10),即

(gdb) print ksudss(10)

之后将在udump目录中产生相关<SID>_ora_<pid>的trace文件,我们通过分析trace可以发现hang的主要原因。

trace文件示例如下:

System name:    Linux
Node name:      rh2
Release:        2.6.9-78.ELsmp
Version:        #1 SMP Wed Jul 9 15:46:26 EDT 2008
Machine:        x86_64
Instance name: webmoney
Redo thread mounted by this instance: 1
Oracle process number: 15
Unix process pid: 16996, image: oracle@rh2 (TNS V1-V3)

*** 2009-09-07 21:57:14.100
*** SERVICE NAME:(SYS$USERS) 2009-09-07 21:57:14.100
*** SESSION ID:(528.2041) 2009-09-07 21:57:14.100
===================================================
SYSTEM STATE
————
System global information:
processes: base 0x91637c30, size 500, cleanup 0x9167a2e0
allocation: free sessions 0x91779840, free calls (nil)
control alloc errors: 0 (process), 0 (session), 0 (call)
PMON latch cleanup depth: 0
seconds since PMON’s last scan for dead processes: 45
system statistics:

[gview file=”http://askmac.cn/wp-content/uploads/resource/gdb.pdf”]

Oracle database 11g release2发布

万众期待的11g r2终于掀开了盖头来,作为先行军,linux平台无疑可以让最多专家和用户更好最广泛的测试新版本。

oracle甚至为其使用了独立的域名:www.oracledatabase11g.com,针对新产品使用独立域名的作为并不多见,而该域名目前的pr 值 及 alex排名均甚低。究其根本 可能是一种潮流,类似暴雪的游戏产品均拥有自己独立的域名, 我们可以猜想今后会出现 oracledatabase12[X].com 和 oracledatabase13[X].com。下为11g主页的横幅flash。

在《Oracle announces availability of Oracle Database 11g Release 2》中,Oracle展开了一贯的忽悠伎俩,提出了以下优势:更成熟的网格计算帮助企业减少成本,节约存储成本,更少的无用冗余,自动调优与扩展性(更强大的awr和memeory auto tuning )。

以上优势当然是我们希望的,但现实往往是残酷的。 看到11g 逐渐成熟 走向 更广泛的运用,  则DBA本身需要掌握的 新特性 ,以及 新版本中不同的 “古怪脾性”  ,相信有机会一一体验。。。。

书籍推荐:《Secrets of the Oracle Database》

Oracle数据库的小秘密,作者是Norbert Debes;

就内容而言并非最internal,换句话说对日常管理还是有用的,当然是对expert而言。

我的博客中翻译了他对AUDIT_SYSLOG_LEVEL解释的若干章节,今后会继续努力。

该书出版已经较长时间,但国内并无任何出版迹象;实际DBA专题类书的销量也不大,始终属于小众类的。

书的内容还是很有价值的,特别是对于几个参数的深入研究和使用perl 尝试相关实验的内容。

附上封面:

secrets of oracle database database

附上下载地址:Oracle Secrets.

我们以学习为目的…….

如何确定Oracle软件中是否安装了RAC选项

1. 首先切换到 $ORACLE_HOME/rdbms/lib目录

2. 运行以下命令:
    nm -r libknlopt.a | grep -c kcsm.o

若以上命令返回0,则rac选项未安装,也就是有一部分2进制文件未链接。若返回值大于0,则安装了rac选项。

若希望在不重新安装的前提下启用rac选项,可以如下操作:

在登陆所有节点,并关闭所有实例,以NORMAL或IMMEDIATE方式均可。

停止其他一切资源(包括监听器,ASM实例等)

切换到$ORACLE_HOME/lib目录:

运行命令: make -f ins_rdbms.mk rac_on

附  ins_rdbms.mk 文件

RAC动态资源(DRM)管理介绍

以下文本摘自: metalink doc 390483.1

Subject:  DRM – Dynamic Resource management
Doc ID:  390483.1  Type:  BULLETIN
Modified Date :  13-JAN-2009  Status:  PUBLISHED

In this Document
Purpose
Scope and Application
DRM – Dynamic Resource management
DRM – Dynamic Resource Mastering
References

Applies to:
Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.1.0
Oracle Server – Standard Edition – Version: 10.1.0.2 to 11.1.0
Information in this document applies to any platform.
Oracle Real application Clusters
Purpose

To describe the concept of DRM (Dynamic Resource Mastering)
Scope and Application

This note in intended for experienced Real application cluster DBA’s
DRM – Dynamic Resource management
DRM – Dynamic Resource Mastering

When using Real application Clusters (RAC), Each instance has its own SGA and buffer cache. RAC will ensure that these block changes are co -ordinated to maximize performance and to ensure data intergrity. Each copy of the buffer also called as a cache resource has a master which is one of the nodes of the cluster.

In database releases before 10g (10.1.0.2) once a cache resource is mastered on an instance, a re-mastering or a change in the master would take place only during a reconfiguration that would happen automatically during both normal operations like instance startup or instance shutdown or abnormal events like Node eviction by Cluster Manager. So if Node B is the master of a cache resource, this resource will remain mastered on Node B until reconfiguration.

10g  introduces a concept of resource remastering via DRM. With DRM a resource can be re-mastered on another node say from Node B to Node A if it is found that the cache resource is accessed more frequently from Node A. A reconfiguration is no longer the only reason for a resource to be re-mastered.

In 10gR1 DRM is driven by affinity of files and in 10gR2 it is based on objects.

Sample LMD trace file during a DRM operation

Begin DRM(202) - transfer pkey 4294951314 to 0 oscan 1.1
*** 2006-08-01 17:34:54.645
Begin DRM(202) - transfer pkey 4294951315 to 0 oscan 1.1
*** 2006-08-01 17:34:54.646
Begin DRM(202) - transfer pkey 4294951316 to 0 oscan 1.1
*** 2006-08-01 17:34:54.646
Begin DRM(202) - transfer pkey 4294951317 to 0 oscan 1.1

DRM attributes are intentionally undocumented since they may change depending on the version. These attributes should not be changed without discussing with Support.
@DRM is driven by the following
@ 1.) _gc_affinity_time = Time in minutes at which statistics will be evaluated (default = 10 mins)
@ 2.) _gc_affinity_limit = # of times a node accesses a file/object (default = 50)
@ 3.) _gc_affinity_minimum = minimum # of times per minute a file/object is accessed before affinity kicks in
@ (default = 600 per minute per cpu )

It is important to note that

  1. Two instance will not start a DRM operation at the same time however lmd,lms,lmon processes from all instances collectively take part in the DRM operation.
  2. Normal activity on the database is not affected due to DRM. This means users continue insert/update/delete operations without any interruptions. Also DRM operations complete very quickly.

@

Disable DRM
Generally DRM should not be disabled unless Oracle Support/Development has suggested turning it off due to some known issues.
@To disable DRM, set
@To disable DRM, set
@_gc_affinity_time=0                                 # Only if DB version is 10.1 or 10.2
@_gc_undo_affinity=FALSE                       # Only if Db version is 10.2
@_gc_policy_time=FALSE                         # Only if DB version is 11.1 or higher
@_gc_affinity_time has been renamed to _gc_policy_time in 11g

参数cluster_interconnect详细介绍

以下文本摘自metalink doc:

This note attempts to clarify the cluster_interconnects parameter and the
platforms on which the implementation has been made. A brief explanation on
the workings of the parameter has also been presented in this note.
This is also one of the most frequently questions related to cluster and RAC
installations on most sites and forms a part of the prerequisite as well.

ORACLE 9I RAC – Parameter CLUSTER_INTERCONNECTS
———————————————–

FREQUENTLY ASKED QUESTIONS
————————–
November 2002

CONTENTS
——–
1.  What is the parameter CLUSTER_INTERCONNECTS for ?
2.  Is the parameter CLUSTER_INTERCONNECTS available for all platforms ?
3.  How is the Interconnect recognized on Linux ?
4.  Where could I find more information on this parameter ?
5.  How to detect which interconnect is used ?
6.  Cluster_Interconnects is mentioned in the 9i RAC administration
    guide as a Solaris specific parameter, is this the only platform
    where this parameter is available ?
7.  Are there any side effects for this parameter, namely affecting normal
    operations ?
8.  Is the parameter OPS_INTERCONNECTS which was available in 8i similar
    to this parameter ?
9.  Does Cluster_interconnect allow failover from one Interconnect to another
    Interconnect ?
10. Is the size of messages limited on the Interconnect ?
11. How can you see which protocoll is being used by the instances ?
12. Can the parameter CLUSTER_INTERCONNECTS be changed dynamically during runtime ?

 
QUESTIONS & ANSWERS
——————-
1. What is the parameter CLUSTER_INTERCONNECTS for ?

Answer
——
This parameter is used to influence the selection of the network interface
for Global Cache Service (GCS) and Global Enqueue Service (GES) processing.

This note does not compare the other elements of 8i OPS with 9i RAC
because of substantial differences in the behaviour of both architectures.
Oracle 9i RAC has certain optimizations which attempt to transfer most of
the information required via the interconnects so that the number of disk
reads are minimized. This behaviour known as Cache fusion phase 2 is summarised
in Note 139436.1
The definition of the interconnnect is a private network which
will be used to transfer the cluster traffic and Oracle Resource directory
information and blocks to satisfy queries. The technical term for that is
cache fusion.

The CLUSTER_INTERCONNECTS should be used when
– you want to override the default network selection
– bandwith of a single interconnect does not meet the bandwith requirements of
  a Real Application Cluster database

The syntax of the parameter is:

CLUSTER_INTERCONNECTS = if1:if2:…:ifn
Where if<n> is an IP address in standard dotted-decimal format, for example,
144.25.16.214. Subsequent platform implementations may specify interconnects
with different syntaxes.
2. Is the parameter CLUSTER_INTERCONNECTS available for all platforms ?

Answer
——

This parameter is configurable on most platforms.
This parameter can not be used on Linux.

The following Matrix shows when the parameter was introduced on which platform:

Operating System    Available since
AIX                   9.2.0
HP/UX                 9.0.1
HP Tru64              9.0.1
HP OPenVMS            9.0.1
Sun Solaris           9.0.1

References
———-
Bug <2119403> ORACLE9I RAC ADMINISTRATION SAYS CLUSTER_INTERCONNECTS IS SOLARIS ONLY.
Bug <2359300> ENHANCE CLUSTER_INTERCONNECTS TO WORK WITH 9I RAC ON IBM
3.  How is the Interconnect recognized on Linux ?

Answer
——
Since Oracle9i 9.2.0.8 CLUSTER_INTECONNETCS can be used to change the interconnect.
A patch is also available for 9.2.0.7 under Patch 4751660.
Before 9.2.0.8 the Oracle implementation for the interface selection reads the ‘private hostname’
in the cmcfg.ora file and uses the corresponding ip-address for the interconnect.
If no private hostname is available the public hostname will be used.
4.  Where could I find information on this parameter ?

Answer
——

The parameter is documented in the following books:
Oracle9i Database Reference Release 2 (9.2)
Oracle9i Release 1 (9.0.1) New Features in Oracle9i Database Reference –
                   What’s New in Oracle9i Database Reference?
Oracle9i Real Application Clusters Administration Release 2 (9.2)
Oracle9i Real Application Clusters Deployment and Performance Release 2 (9.2)

Also port specific documentation may contain information about the usage of
the cluster_interconnects parameter.

Documentation can be viewed on
    http://tahiti.oracle.com
    http://otn.oracle.com/documentation/content.html
References:
———–
Note 162725.1: OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS
               and 9i RAC on OpenVMS

Note 151051.1: Init.ora Parameter “CLUSTER_INTERCONNECTS” Reference Note

5. How to detect which interconnect is used ?
    The following commands show which interconnect is used for UDP or TCP:
    sqlplus> connect / as sysdba
             oradebug setmypid
             oradebug ipc
             exit

    The corresponding trace can be found in the user_dump_dest directory and for
    example contains the following information in the last couple of lines:

           SKGXPCTX: 0x32911a8 ctx
           admno 0x12f7150d admport:
           SSKGXPT 0x3291db8 flags SSKGXPT_READPENDING     info for network 0
                 socket no 9     IP 172.16.193.1         UDP 43307
                 sflags SSKGXPT_WRITESSKGXPT_UP
                 info for network 1
                 socket no 0     IP 0.0.0.0      UDP 0
                 sflags SSKGXPT_DOWN
           context timestamp 0x1ca5
                 no ports
   Please note that on some platforms and versions (Oracle9i 9.2.0.1 on Windows)
   you might see an ORA-70 when the command oradebug ipc has not been
   implemented.

   When  other protocols such as LLT, HMP or RDG are used, then the trace file will not
   reveal an IP address.
6.  Cluster_Interconnects is mentioned in the 9i RAC administration
    guide as a Solaris specific parameter, is this the only platform
    where this parameter is available ?

Answer
—– 

This information that this parameter works on Solaris only is incorrect. Please
check the answer for question number 2 for the complete list of platforms for the same.

References:
———–
bug <2119403> ORACLE9I RAC ADMINISTRATION SAYS CLUSTER_INTERCONNECTS IS SOLARIS ONLY.
7.  Are there any side effects for this parameter, namely affecting normal
    operations ?

Answer
—–
When you set CLUSTER_INTERCONNECTS in cluster configurations, the
interconnect high availability features are not available. In other words,
an interconnect failure that is normally unnoticeable would instead cause
an Oracle cluster failure as Oracle still attempts to access the network
interface which has gone down. Using this parameter you are explicitly
specifying the interface or list of interfaces to be used.
 

8.  Is the parameter OPS_INTERCONNECTS which was available in 8i similar
    to this parameter ?

Answer
——
Yes, the parameter OPS_INTERCONNECTS was used to influence the network selection
for the Oracle 8i Parallel Server.

Reference
———
Note <120650.1> Init.ora Parameter “OPS_INTERCONNECTS” Reference Note
9.  Does Cluster_interconnect allow failover from one Interconnect to another
    Interconnect ?

Answer
——
Failover capability is not implemented at the Oracle level. In general this
functionality is delivered by hardware and/or Software of the operating system.
For platform details please see Oracle platform specific documentation
and the operating system documentation.
10. Is the size of messages limited on the Interconnect ?

Answer
——
The message size depends on the protocoll and platform.
UDP: In Oracle9i Release 2 (9.2.0.1) message size for UDP was limited to 32K.
     Oracle9i 9.2.0.2 allows to use bigger UDP message sizes depending on the
     platform. To increase throughput on an interconnect you have to adjust
     udp kernel parameters.
TCP: There is no need to set the message size for TCP.
RDG: The recommendations for RDG are documented in
        Oracle9i Administrator’s Reference – Part No. A97297-01
References
———-
Bug <2475236> RAC multiblock read performance issue using UDP IPC
11. How can you see which protocoll is being used by the instances ?

Answer
——
Please see the alert-file(s) of your RAC instances. During startup you’ll
   find a message in the alert-file that shows the protocoll being used.

      Wed Oct 30 05:28:55 2002
      cluster interconnect IPC version:Oracle UDP/IP with Sun RSM disabled
      IPC Vendor 1 proto 2 Version 1.0
12. Can the parameter CLUSTER_INTERCONNECT be changed dynamically during runtime ?

Answer
——
    No. Cluster_interconnects is a static parameter and can only be set in the
    spfile or pfile (init.ora)

如何跟踪dbca工具,Tracing the Database Configuration Assistant (DBCA)

以下文档摘自metalink doc : 188134.1

Subject: Tracing the Database Configuration Assistant (DBCA)
  Doc ID: 188134.1 Type:  BULLETIN
  Modified Date:  09-OCT-2008 Status:  PUBLISHED
Purpose
——-

The Purpose of this document is to assist in debugging DBCA problems.

 
Scope & Application
——————-

This document is for DBAs and support analysts to troubleshoot DBCA issues.
Tracing the Database Configuration Assistant (DBCA)
—————————————————
 
To provide verbose output for DBCA, tracing can be enabled to provide additional
output.  To turn on tracing from 9i to 10g:

1. Do a copy of the original dbca file in $ORACLE_HOME/bin. For example:

   cp $ORACLE_HOME/bin/dbca $ORACLE_HOME/bin/dbca.ori

2. vi the dbca file in the $ORACLE_HOME/bin directory.

3. At the end of the file, look for the following line:

   # Run DBCA
   $JRE_DIR/bin/jre -DORACLE_HOME=$OH -DJDBC_PROTOCOL=thin -mx64m -classpath
   $CLASSPATH oracle.sysman.assistants.dbca.Dbca $ARGUMENTS

4. Add the following just before the -classpath in the ‘$JRE_DIR’ line:
   -DTRACING.ENABLED=true -DTRACING.LEVEL=2
5. At the end of the dbca file, the string should now look like this:

   # Run DBCA
   $JRE_DIR/bin/jre -DORACLE_HOME=$OH -DJDBC_PROTOCOL=thin -mx64m
   -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath $CLASSPATH
   oracle.sysman.assistants.dbca.Dbca $ARGUMENTS

The -DTRACING.LEVEL was missing a space after the 2 so that it looks
like 2-classpath.

6. To trace run:

   [opcbsol1]/u01/home/usupport> dbca > dbca.out &

The output will be written to the dbca.out file.
In 10.2  you can add the -DDEBUG flag  so that you get the debug information
given to you interactively

“C:\oracle\product\10.2.0\racdb\jdk\jre\BIN\JAVA” -DDEBUG -DORACLE_HOME=”%OH%” –
DJDBC_PROTOCOL=thin -mx128m oracle.sysman.assistants.dbca.Dbca %*
BUT in 10.2 you should already have the trace information written automatically
to the following location

$ORACLE_HOME/cfgtoollogs/dbca/trace.log

In 11g, the location of trace.log has changed to:

$ORACLE_BASE/cfgtoollogs/dbca/<DBNAME>/trace.log

.

RMAN内部原理介绍

RMAN实用程序由两部分组成:可执行文件和recover.bsq文件。recover.bsq文件实质上是库文件,可执行文件从recover.bsq文件中析取代码来创建在目标数据库上执行的PL/SQL调用。recover.bsq文件是整个操作的中枢。这两个文件始终保持链接,并从逻辑上构成RMAN客户端实用程序。需要指出的是,recover.bsq文件和可执行文件的版本必须相同,否则就不能正常工作。

RMAN实用程序有一种独特的,有序的,可预测的用法:解释在目标数据库中远程执行的PL/SQL调用中的命令。RMAN的命令语言比较独特,需要具备一定经验才能掌握。实质上,RMAN实用程序完成了我们需要做的所有备份,还原,恢复,和处理备份工作。由可执行解释程序负责解释这些程序命令,并且使这些命令与recover.bsq文件中的PL/SQL块匹配。RMAN随后将这些RPC传递给数据库,并基于用户的请求收集信息。如果命令要求I/O操作(可能是备份或还原操作),RMAN会在返回信息时准备另一个过程块,并将这个过程块传递回目标数据块。这些过程块负责执行OS的系统调用以进行指定的读或写操作。

10.2.0.1版本中的recover.bsq文件

在32位的linux平台上为Oracle配置>1.7GB的SGA

在没有任何附加改动的情况下,32位linux上Oracle对SGA的默认配置最大为1.7GB 。在微小改动的情况下,可以分配最大占用2.7GB RAM空间的SGA。如果大小要超过4GB就要做其他改动。

注意{运行hugemem kernel模式时,相应的限制分别是2.7GB和3.7GB。由于hugemem内核存在额外开销,因此只有在系统RAM不低于16GB的情况下才运行hugemem内核。}

为了分配大于1.7GB小于2.8GB的SGA,在Oracle启动时必须减小基地址,称之为映射基地址(mapped base)。可以采取下面的步骤减少基地址的映射:

(1)  cd $ORACLE_HOME/rdbms/lib

(2)利用genksms命令创建一个汇编文件:

genksms  -s 0x15000000 > ksms.s

(3)生成对象文件

make -f    ins_rdbms.mk  ksms.o

(4)生成Oracle可执行文件:

make -f ins_rdbms.mk ioracle

接下来需要做的事把共享内存参数SHMMAX增加到1.7~2.7GB,并用新编译生成的代码启动Oracle。

为了使用大于2.7GB的SGA,采用以下方法。

创建RMAN磁盘,RAM磁盘创建完毕后,使用间接数据缓冲区(indirect data buffer)配置Oracle。它本质是Oracle用于数据库块缓冲区的RAM磁盘文件。使用RAM磁盘可以避开所有的32位限制。

安装了ramfs类型的文件系统可以创建RMAN磁盘。Red Hat ES 3.0之前使用shmfs,在3.0后则用ramfs创建,使用一下语法:

umount /dev/shm

mount -t ramfs ramfs /dev/shm

chown oracle:dba  /dev/shm

将下面内容加入文件/etc/security/limits.conf中:

oracle soft  memlock 3145728

oracle soft memlock 3145728

如果用ssh登陆系统,将下面一行加入到/etc/init.d/ssh中:

ulimit -l 3145728

在Oracle初始化参数中加入”use_indirect_data_buffer=true”。

为了使用大型内存模型,接下来需要把Linux共享内存设置到足够大。用下面的方法把共享内存最大值设置为8GB:

echo 8589934592 > /proc/sys/kernel/shmmax

或者把下面一行加入到/etc/sysctl.conf文中:

sys.kernel.shmmax= 8589934592

并执行 sysctl -w

注意: 间接数据缓冲区只能用作Oracle块缓冲区。低端内存仍用于其它内存共享项目,例如共享池或日志缓冲区。

10.2.0.1 db console启动失败问题一例

帮网友诊断10.2.0.1数据库安装后配置EM的问题一例,主要是在EM启动过程中遇到了错误,日志如下:

2009-7-7 20:11:39 oracle.sysman.emcp.EMConfig perform
严重: 启动 Database Control 时出错
有关详细资料, 请参阅 oracle\product.2.0\db_1\cfgtoollogs\dbca\orcl\emConfig.log 中的日志文件。
2009-7-7 20:11:39 oracle.sysman.emcp.EMConfig perform
配置: Stack Trace:
oracle.sysman.emcp.exception.EMConfigException: 启动 Database Control 时出错
	at oracle.sysman.emcp.EMDBPostConfig.performConfiguration(EMDBPostConfig.java:569)
	at oracle.sysman.emcp.EMDBPostConfig.invoke(EMDBPostConfig.java:181)
	at oracle.sysman.emcp.EMDBPostConfig.invoke(EMDBPostConfig.java:150)
	at oracle.sysman.emcp.EMConfig.perform(EMConfig.java:155)
	at oracle.sysman.assistants.util.em.EMConfiguration.run(EMConfiguration.java:430)
	at java.lang.Thread.run(Thread.java:534)

出现错误的java stack call是EMDBPostConfig说明em db repository 已经完成配置在尝试启动EM,但此时出现了问题。

实际造成该问题的有多种可能性,这里列举最常见的几种:

1. 杀毒软件或者防火墙造成的,这里建议在安装windows期间彻底关闭杀毒类软件和Windows自带的防火墙
2. EM使用的端口已经被其他软件所占用,建议使用netstat命令诊断,或另择网络端口
3. Windows主机名以”U”字母开头

针对第三种可能性,可以在MOS上找到Note <Problem: Startup: Db Control 10.2.0.1 Fails To Start when the hostname begins with the letter “u”>:

Problem: Startup: Db Control 10.2.0.1 Fails To Start when the hostname begins with the letter "u" 

Applies to:
Enterprise Manager for RDBMS - Version: 10.2.0.1 and later   [Release: 10.2 and later ]

Symptoms

The dbconsole 10.2.0.1 is created with emca.The repository creation completes successfully and the
dbconsole is created and configured without any error.
The last phase of emca is to start the dbconsole.
The standalone agent is started successfully but the dbconsole fails to start.

The file $ORACLE_HOME/cfgtoollogs/emca/sid/emca*.log shows:
===================================================
05/11/2005 23:07:49 oracle.sysman.emcp.util.PlatformInterface serviceCommand
CONFIG: Waiting for service 'OracleDBConsoleAMITAL' to fully start
05/11/2005 23:07:59 oracle.sysman.emcp.EMConfig perform
SEVERE: Error starting Database Control
Refer to the log file at
d:\oracle\product.2.0\db_1\cfgtoollogs\emca\amital\emca_2005-11-05_10-55-42-PM.log for more
details.
05/11/2005 23:07:59 oracle.sysman.emcp.EMConfig perform
CONFIG: Stack Trace:
oracle.sysman.emcp.exception.EMConfigException: Error starting Database Control
  at oracle.sysman.emcp.EMDBPostConfig.performConfiguration(EMDBPostConfig.java:569)
  at oracle.sysman.emcp.EMDBPostConfig.invoke(EMDBPostConfig.java:181)
Cause
This issue can be reproduced on all windows platforms which name begins with the letter u.
This issue has been logged in the following bug:
Bug:4714774 DBCONSOLE DOES NOT WORK HAVING A HOSTNAME STARTING WITH "U"
This bug will be fixed in the patchset 10.2.0.2
Solution

Download and apply the RDBMS patchset 10.2.0.2 once/if available or

To enable the dbconsole to start successfully, follow the steps listed below:

1. Save the file $ORACLE_HOME\host_SID\sysman\config\emd.properties to emd.properties.orig
2. Update the file $ORACLE_HOME\host_SID\sysman\config\emd.properties, replacing \ with / in the
   following line:
   For example change:
   omsRecvDir=d:\oracle\product.2.0\db_1\ukp001_db0\sysman\recv
   to
   omsRecvDir=d:/oracle/product/10.2.0/db_1/ukp001_db0l/sysman/recv
3. Stop and restart the DB Control 

Hdr: 4714774 10.2.0.1 EMCP 10.2.0.1 PRODID-1370 PORTID-912
Abstract: DBCONSOLE DOES NOT WORK HAVING A HOSTNAME STARTING WITH "U"

Problem Description
-------------------
this seems to be releated to bug 3610052.
Having a hostname like u1234, install rdbms10.2.
start dbconsole will take 4-5 min. Even if the service is shown as started,
dbconsole java processes stops and restarts.

Environment Information
-----------------------

Test Case Step-by-Step Instructions
-----------------------------------
1. rename the hostanme to uXYZ....
2. install 10.2 rdbms into its own oraclehome
3. strt listener, create a database using dbca.
4. access the dbconsole page shows: "page cannot be displayed"

Check the dbconsole java process in taksmanager , you will easily see, that
it is coninously restarting. dbconsole service is shown as "started".

Test Case Location
------------------
not needed easy to reproduce

Diagnostic Analysis
-------------------
run the java command from file emd.nohup file, this will show something like:

java...Exception...
Malformed \uxxxx encoding.

and a java stack trace.

I assume that this problem will always happen, having a "\u" in the path for
the dbconsole

Performance
-----------

NLS Information
---------------

Patches
-------

Log Files Location
------------------

Reproducibility
---------------
everytime, only on windows platforms:
w2k w2003, windowsXP

URL
---
not needed

Did you test with the latest version?
-------------------------------------
yes

Available Workarounds
---------------------
Change the hostname from uXYZ to UXYZ,
do
set ORACLE_SID=
emca -config dbcontrol db -reposrecreate 

Related Bugs
------------
3610052

Hdr: 6313490 10.2.0.1.0 CONSOLE 10.2.0.1.0 PRODID-1370 PORTID-215 4714774
Abstract: 10.2.0.1DBCONSOLE IN A DB CREATED FROM IAS INSTALL FAILS TO START

Problem Description
-------------------
The dbconsole 10.2.0.1.0 creating is failing while starting the dbconsole
dbconsole in a db created from iAS install.

as per the logs:-

SEVERE: Error starting Database Control
    Refer to the log file at

D:\oracle\product.2.0\db_1\cfgtoollogs\emca\mmds\emca_2007-06-19_09-47-70-A
M.log for more details.
    Jun 19, 2007 9:58:17 AM oracle.sysman.emcp.EMConfig perform
    CONFIG: Stack Trace:
    oracle.sysman.emcp.exception.EMConfigException: Error starting Database
Control at
oracle.sysman.emcp.EMDBPostConfig.performConfiguration(EMDBPostConfig.java:569
) at oracle.sysman.emcp.EMDBPostConfig.invoke

Environment Information
-----------------------
dbconsole in a db created from iAS install.
db console 10.2.0.1.0 on Microsoft Windows Server 2003
The 10g application server version  is 10.1.2.0.2. to include January 2007
database and application server CPUs.

Applied the  database CPU patch and after that reconfigured the Dbconsole .
Now it doesn't get configured .Db console generating core dump

Test Case Step-by-Step Instructions
-----------------------------------

Test Case Location
------------------

Diagnostic Analysis
-------------------
Stopped the application server service and then started the dbcosnole .But
still getting the error

D:\oracle\product.2.0\db_1\BIN>emctl status dbconsole
    Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
    Oracle Enterprise Manager 10g is not running.
    ------------------------------------------------------------------
    Logs are generated in directory 

Asked him to move the file from Oracle_Home>/_/emctl.pid file
to another location . Now the dbconsole is started:-

>emctl start dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Starting Oracle Enterprise Manager 10g Database Control ...The
OracleDBConsoleMM
DS service is starting...................The OracleDBConsoleMMDS service was
started successfully.

D:\oracle\product.2.0\db_1\BIN>emctl upload
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet..

Asked him to Edit the file
ORACLE_HOME>\oc4j\j2ee\home\applications\dms\WEB-INF\web.xml

Still the dbconsole is not getting started .

可能是因为Windows上主机名以”U”字母开头后导致相关路径发生了转义。

可以通过在cmd中输入hostname来确认Windows上的主机名。

给出了2种解决方案:

1.升级到10.2.0.2以上版本,目前推荐的是10.2.0.5 Patch set; 版本10.2.0.1到10.2.0.3上的bug太多了..

2.修改$ORACLE_HOME\host_SID\sysman\config\emd.properties中的内容,将omsRecvDir变更改成使用斜杠而非反斜杠的路径形式,并重启EM.

沪ICP备14014813号-2

沪公网安备 31010802001379号