【SQL优化】Oracle中的Top-N与分页匹配查询

以下罗列一些Oracle中的Top-N与分页匹配查询优化写法:



CREATE INDEX i_pop ON cities(population);


TOP-N


SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5;


SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5;



分页

SELECT * FROM (
  SELECT * FROM (
    SELECT name, population,
      rownum AS rn
    FROM cities
    WHERE state='Florida'
    ORDER BY population DESC
  ) WHERE rownum <= 20
) WHERE rn > 10;


Top-N with joins


SELECT * FROM (
  SELECT c.name as city, c.population, s.capital
  FROM cities c, states s
  WHERE c.state_id = s.id
    AND c.state='Florida'
  ORDER BY c.population DESC
) WHERE rownum <= 5
/




windows平台上的11g release 2终于发布了

下午无意中打开了oracle主页上11g下载的页面,赫然发现windows平台的安装介质已经发布了。

介质分成2个zip包,1.5g和600m; 11g的安装介质较10g大了许多,因为默认附加了apex与sql developer.

下载地址:

Oracle Database 11g Release 2 (11.2.0.1.0) for Microsoft Windows (32-bit)
Download win32_11gR2_database_1of2.zip (1,625,721,289 bytes)
Download win32_11gR2_database_2of2.zip (631,934,821 bytes)
Oracle Database 11g Release 2 (11.2.0.1.0) for Microsoft Windows (x64)
Download win64_11gR2_database_1of2.zip (1,213,501,989 bytes) (cksum – 3906682109)
Download win64_11gR2_database_2of2.zip (1,007,988,954 bytes) (cksum – 1232608515)

目前32位未提供grid infrastructure 介质,想要体验windows上的11g rac只能使用64bit .

Oracle Database 11g Release 2 Grid Infrastructure (11.2.0.1.0) for Microsoft Windows (x64)

Download win64_11gR2_grid.zip (715,166,425 bytes) (cksum – 3127109177)

根据metalink文档867040.1所述,Oracle database 11g release 2 将默认支持windows 7 以及 windows 2008 r2 .

单机的安装过程十分简单:

一般来说windows 发行版在正式生产环境中不多见,不过安装到笔记本上方便了今后对11gr2新特性的测试。

附release note:

Oracle Database 11g Release 2 Now Available on Windows 32 and 64-bit [ID 1081390.1]

Modified 07-APR-2010     Type ANNOUNCEMENT     Status PUBLISHED

In this Document
What is being announced?
References


Applies to:

Oracle Server – Standard Edition – Version: 11.2.0.1 – Release: 11.2
Oracle Server – Personal Edition – Version: 11.2.0.1 – Release: 11.2
Oracle Server – Enterprise Edition – Version: 11.2.0.1 – Release: 11.2
Microsoft Windows (32-bit)
Microsoft Windows x64 (64-bit)
Microsoft Windows x64 (64-bit) – OS Version: 7
Microsoft Windows (32-bit) – OS Version: 7
Microsoft Windows x64 (64-bit) – Version: 2008 R2

What is being announced?

Oracle Database 11g Release 2 for Windows x86 (32-bit) and x86-64 is now available for download from OTN or the Oracle Store.

New with 11.2 is certification with Windows 2008 R2 and Windows 7.   You can find out more in Note  1065024.1 “Oracle Database 11g Release 2 Certification Highlights”.

explore my oracle support using firefox 3.6

升级到FF3.6的朋友肯定发现了现在使用火狐浏览器访问my oracle support 时许多页面打开为空白页,譬如这篇介绍FF3.6与目前oracle support site兼容性的文章:

Firefox 3.6 Not Officially Supported by My Oracle Support

Although much of the functionality of My Oracle Support is accessible via Firefox 3.6, please be aware that Firefox 3.6 is not officially supported by My Oracle Support at this time. Please review the My Oracle Support FAQ (Knowledge Document #747252.5) for details on the current browser requirements for accessing My Oracle Support.

We are aware of a current issue when viewing Knowledge documents using Firefox 3.6. Using Firefox 3.6, when clicking on a link for a Knowledge document the document appears blank.  The workaround is to instead open the documents in a new tab or new window.

We are working on a fix for this issue and hope to have this resolved in the near future. We apologize for any inconvenience.

FF升级到3.6的时间已经超过一个月了;oracle support 之前曾发表将解决该兼容问题的声明,原本以为这只是一个小case,因该在数周内彻底解决。但目前又发布了暂时不官方支持的声明(大部分文档可以通过在新窗口内打开方式阅读)。

看起来大型网站的建设维护确实不是Oracle的特长,apex或者说htmldb又真的适合超大型web site吗?

Oracle X$ View:X$KJBL

The status of buffer locks can be checked through the table X$KJBL; it is not necessary to dump the locks with
the ORADEBUG command, in order to get to this information. The table contains the client and shadow locks.

 

 

Column name            Data type            Description
KJBLLOCKP              RAW,kjbl*            Lock pointer. Can be joined with X$LE.LE_KJBL.
KJBLGRANT              VARCHAR2 (9)         Current grant level, valid values are:
                                            KJUSERNL (0), KJUSERCR (1), KJUSERCW (2), KJUSERPR
                                            (3), KJUSERPW (4), KJUSEREX (5)
KJBLREQUEST            VARCHAR2 (9)         Request Level, valid values are:
                                            (See KJBLGRANT)
                                            KJBLROLE NUMBER Granted Role:
                                            KJBL_GRANT_N (0x00), KJBL_GRANT_S (0x01)
                                            KJBL_GRANT_X (0x02), KJBL_REQ_S (0x04),
                                            KJBL_REQ_X (0x08), KJBL_REQ_STL (0x10),
                                            KJBL_REQ_CR (0x10), KJBL_OACK (0x20),
                                            KJBL_ROLE_G (0x40), KJBL_HAS_PI (0x80)
                                            
KJBLRESP               RAW, kjbr*          Current Resource Info (Resource pointer)
KJBLNAME               VARCHAR2 (30)       Resource name in Hex notation ([id1], [id2], [Type])
KJBLNAME2              VARCHAR2 (30)       Resource name in Decimal notation (id1, id2, Type)
KJBLQUEUE              NUMBER              Grant / Convert queue:
                                           Grant Queue (0x00), Convert Queue 0x80
KJBLLOCKST             VARCHAR2 (64)       Lock state:
                                           KJUSERGRANTED (0x00), KJUSEROPENING (0x01),
                                           KJUSERCLOSING (0x02), KJUSERCANCELLING (0x04),
                                           KJUSERCVTING (0x08), KJUSERSTANDALONE (0x10),
                                           KJUSERASTDELAYED (0x20), KJUSERMSGPENDING(0x40),
                                           KJUSERFROZEN (0x80)
KJBLWRITING            NUMBER              Writing
                                           ·0, not writing; 1, writing
KJBLREQWRITE                               Write requested
                                           ·  0, no request; 1, write requested
KJBLOWNER              NUMBER              Owning instance – node id
KJBLMASTER             NUMBER              Master instance – node id
KJBLBLOCKED            NUMBER              Blocked lock? – Checks if there is a converting lock
KJBLBLOCKER            NUMBER              Blocking lock? - Checks if there is a lock held in an incompatible mode

 

 

The columns KJBLBLOCKER and KJBLBLOCKED in particular let you determine if you are waiting on
blocked lock or if you are blocking other sessions. With this table, it is also possible to determine where the
lock is mastered and the role of the lock.

Oracle X$ View:X$KJMSDP

X$KJMSDP

The table X$KJMSDP has entries for each LMS process that is configured for the instance. The default
number of LMS processes is the number of CPU’s divided by 4, but 2 as a minimum. The LMS processes are
responsible for the delivery of the GCS messages and blocks to the other instances.

 

 

Column name          Data Type               Description
ADDR                 RAW(4)                  Identifier / Address
INDX                 NUMBER                  LMS id (starts with 0)
INST_ID              NUMBER                  Instance id
PID                  NUMBER                  LMS process id (starts with 0)
FLAG                 NUMBER                  Flags:
ACTUAL_RCV           NUMBER                  Actual Global Cache Service messages received
LOGICAL_RCV          NUMBER                  Logical Global Cache Service messages received
LOGICAL_PTIME        NUMBER                  Logical Global Cache Service messages process time (ms)
SBUF_TIME            NUMBER                  Time for flushing send buffer (ms)
FC_SENT              NUMBER                  Number of flow control messages sent
NULL_REQ             NUMBER                  Number of null requests send by this process
WAIT_TICKET          NUMBER                  Number of times waited for tickets
CRB_SENT             NUMBER                  Number of CR blocks sent
CRB_STIME            NUMBER                  CR block sent time (ms)
RCVQ_TIME            NUMBER                  Receive message queue time
ERRCHK_TIME          NUMBER                  Error check time
FMSGBUFS_TIME        NUMBER                  Flush message buffer time
RCFGFRZ_TIME         NUMBER                  Reconfiguration freeze time
RCFGSYNC_TIME        NUMBER                  Reconfiguration sync time
PBATFLUSH_TIME       NUMBER                  Process batch flush time
SQFLUSH_TIME         NUMBER                  Send queue flush time
DRMSYNC_TIME         NUMBER                  DRM sync time
RCVMSG_TIME          NUMBER                  Time to receive messages
BPMSG_TIME           NUMBER                  Batch message process time
PMSG_TIME            NUMBER                  Time to process message from receiver
SCANQ_TIME           NUMBER                  Scan queue time
PDCQ_TIME            NUMBER                  Time to check down-convert queue
PTOQ_TIME            NUMBER                  Time to check deferred ping queue
FSCH_TIME            NUMBER                  Time to flush the side-channel messages
IPBAT_TIME           NUMBER                  Time for embedded batch message processing
RETRYQ_TIME          NUMBER                  Retry queue time

 

 

X$KJMSDP
As with the LMD process, it is important to check if there are waits for tickets. If this occurs, look at the
corresponding section at the description of the table X$KJMDDP for further explanation.

What’s preconnect.svc in 11g RAC?

有网友反映在11.2的RAC中有一个名如*_preconnect.svc的资源一直处于OFFLINE状态,而其TARGET则为ONLINE状态,无法通过重启CRS或者手动start该资源来使之ONLINE。我们来具体看一下什么情况下回产生这种资源?:

[oracle@rh2 ~]$ srvctl add service -h

Adds a service configuration to the Oracle Clusterware.

Usage: srvctl add service -d  -s  {-r "" 
[-a ""] [-P {BASIC | NONE | PRECONNECT}] | -g  
[-c {UNIFORM | SINGLETON}] } [-k   ] [-l [PRIMARY][,PHYSICAL_STANDBY]
[,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y {AUTOMATIC | MANUAL}] [-q {TRUE|FALSE}] 
[-x {TRUE|FALSE}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] 
[-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z ] 
[-w ] [-t ] [-f]
    -d       Unique name for the database
    -s              Service name
    -r ""    Comma separated list of preferred instances
    -a ""    Comma separated list of available instances
    -g            Server pool name
    -c {UNIFORM | SINGLETON} Service runs on every active server in the 
server pool hosting this service (UNIFORM) or just one server (SINGLETON)
    -k              network number (default number is 1)
    -P {NONE | BASIC | PRECONNECT}        TAF policy specification
    -l                 Role of the service (primary, physical_standby, logical_standby, snapshot_standby)
    -y               Management policy for the service (AUTOMATIC or MANUAL)
    -e        Failover type (NONE, SESSION, or SELECT)
    -m      Failover method (NONE or BASIC)
    -w              Failover delay
    -z              Failover retries
    -t              Edition (or "" for empty edition value)
    -j   Connection Load Balancing Goal (SHORT or LONG). Default is LONG.
    -B      Runtime Load Balancing Goal (SERVICE_TIME, THROUGHPUT, or NONE)
    -x   Distributed Transaction Processing (TRUE or FALSE)
    -q  AQ HA notifications (TRUE or FALSE)
Usage: srvctl add service -d  -s  -u {-r "" | -a ""} [-f]
    -d       Unique name for the database
    -s              Service name
    -u                       Add a new instance to service configuration
    -r        Name of new preferred instance
    -a       Name of new available instance
    -f                       Force the add operation even though a listener is not configured for a network
    -h                       Print usage

[oracle@rh2 ~]$ srvctl add service -d PROD -s maclean -r "PROD1,PROD2" -P BASIC

[oracle@rh2 ~]$ crs_stat|grep maclean
NAME=ora.prod.maclean.svc
NAME=ora.prod.maclean_taf.svc

[oracle@rh2 ~]$ srvctl add service -d PROD -s maclean_pre -r "PROD1,PROD2" -P PRECONNECT
[oracle@rh2 ~]$ crs_stat|grep maclean_pre            
NAME=ora.prod.maclean_pre.svc
NAME=ora.prod.maclean_pre_preconnect.svc

/* 可以看到仅当创建的service使用preconnect的TAF policy时会附带创建{service_name}_preconnect.svc的服务 */

[oracle@rh2 ~]$ srvctl start service -d PROD -s maclean    
[oracle@rh2 ~]$ srvctl start service -d PROD -s maclean_pre

[oracle@rh2 ~]$ crs_stat |grep -A3 maclean
NAME=ora.prod.maclean.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=ONLINE on rh2
--
NAME=ora.prod.maclean_pre.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=ONLINE on rh2
--
NAME=ora.prod.maclean_pre_preconnect.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=OFFLINE
--
NAME=ora.prod.maclean_taf.svc
TYPE=ora.service.type
TARGET=OFFLINE
STATE=OFFLINE

[oracle@rh2 ~]$ srvctl start service -d prod -s maclean_pre_preconnect
PRCD-1084 : Failed to start service maclean_pre_preconnect
PRCR-1079 : Failed to start resource ora.prod.maclean_pre_preconnect.svc
CRS-2674: Start of 'ora.prod.maclean_pre_preconnect.svc' on 'rh2' failed
CRS-2674: Start of 'ora.prod.maclean_pre_preconnect.svc' on 'rh3' failed
CRS-2632: There are no more servers to try to place resource 
'ora.prod.maclean_pre_preconnect.svc' on that would satisfy its placement policy

/* 无法手动启动该preconnect.svc服务 */

[oracle@rh2 ~]$ crs_stat |grep preconnect
NAME=ora.prod.maclean_pre_preconnect.svc

[oracle@rh2 ~]$ srvctl remove service -d PROD -s maclean_pre_preconnect
PRCD-1107 : Removed service maclean_pre_preconnect but failed to remove its 
underlying server pool PROD_maclean_pre
PRCS-1012 : Failed to remove server pool PROD_maclean_pre
PRCR-1072 : Failed to unregister server pool ora.PROD_maclean_pre
CRS-2554: Server pool 'ora.PROD_maclean_pre' cannot be unregistered as 
it is referenced by resource 'ora.prod.maclean_pre.svc'

[oracle@rh2 ~]$ crs_stat |grep preconnec

/* 以上remove命令报错但该preconnect.svc服务还是被删除了,虽然我们并不推荐这样做 */

[oracle@rh2 admin]$ srvctl start service -d PROD -s maclean_pre

[oracle@rh2 admin]$ lsnrctl service

LSNRCTL for Linux: Version 11.2.0.2.0 - Production on 30-MAR-2011 18:55:43

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         LOCAL SERVER
Service "PROD" has 1 instance(s).
  Instance "PROD1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         LOCAL SERVER
Service "PRODXDB" has 1 instance(s).
  Instance "PROD1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "D000" established:0 refused:0 current:0 max:1022 state:ready
         DISPATCHER 
         (ADDRESS=(PROTOCOL=tcp)(HOST=rh2)(PORT=36196))
Service "maclean" has 1 instance(s).
  Instance "PROD1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         LOCAL SERVER
Service "maclean_pre" has 1 instance(s).
  Instance "PROD1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         LOCAL SERVER
The command completed successfully
                                                                                                                           
[oracle@rh2 admin]$ tnsping PROD_TAF

TNS Ping Utility for Linux: Version 11.2.0.2.0 - Production on 30-MAR-2011 18:54:38

Copyright (c) 1997, 2010, Oracle.  All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)
(HOST = rh-cluster-scan)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) 
(SERVICE_NAME = maclean_pre)))
OK (10 msec)


[oracle@rh2 admin]$ sqlplus  maclean/maclean@PROD_TAF

SQL> select instance_name,host_name from v$instance;

INSTANCE_NAME
----------------
HOST_NAME
----------------------------------------------------------------
PROD2
rh3.oracle.com

shutdown PROD2 instance:SQL> shutdown abort

SQL> select instance_name,host_name from v$instance;

INSTANCE_NAME
----------------
HOST_NAME
----------------------------------------------------------------
PROD1
rh2.oracle.com


/* 可以看到即使删除preconnect.svc服务也不会影响到TAF的正常使用,
    在生产环境中无需刻意去调试该服务,最好的应对方式是放任不管
*/

所以如果某天你看到你的crs_stat -t输出里有一条莫名的类似”ora….ect.svc ora….ce.type ONLINE OFFLINE”的记录,那么大可以忽略该资源的OFFLINE状态;这是正常现象,用不着紧张!

Diff Hidden Parameter between 10g and 11g

 

[gview file=”https://www.askmac.cn/wp-content/uploads/2010/03/Diff-Hidden-Parameter-between-10g-and-11g.pdf”]

隐藏参数_high_priority_processes与oradism

运行在操作系统上的进程存在2种系统时序优先级模式:即 实时模式 Real Time(RT) mode, 与分时模式 Time Sharing(TS) mode.
绝大多数Oracle进程运行在TS模式下:

[oracle@rh1 ~]$ ps -efc|grep ora_|grep -v grep
oracle    8510     1 TS   23 Mar27 ?        00:00:02 ora_pmon_PROD
oracle    8512     1 TS   23 Mar27 ?        00:00:00 ora_psp0_PROD
oracle    8514     1 TS   23 Mar27 ?        00:00:00 ora_mman_PROD
oracle    8516     1 TS   23 Mar27 ?        00:00:02 ora_dbw0_PROD
oracle    8518     1 TS   23 Mar27 ?        00:00:04 ora_lgwr_PROD
oracle    8520     1 TS   23 Mar27 ?        00:00:04 ora_ckpt_PROD
oracle    8522     1 TS   23 Mar27 ?        00:00:08 ora_smon_PROD
oracle    8524     1 TS   23 Mar27 ?        00:00:00 ora_reco_PROD
oracle    8526     1 TS   23 Mar27 ?        00:00:34 ora_cjq0_PROD
oracle    8528     1 TS   23 Mar27 ?        00:00:06 ora_mmon_PROD
oracle    8530     1 TS   24 Mar27 ?        00:00:07 ora_mmnl_PROD
oracle    8538     1 TS   23 Mar27 ?        00:00:00 ora_arc0_PROD
oracle    8540     1 TS   23 Mar27 ?        00:00:00 ora_arc1_PROD
oracle    8548     1 TS   23 Mar27 ?        00:00:00 ora_qmnc_PROD
oracle    8555     1 TS   23 Mar27 ?        00:00:00 ora_q000_PROD
oracle    8559     1 TS   23 Mar27 ?        00:00:00 ora_q001_PROD
oracle   30500     1 TS   23 22:10 ?        00:00:00 ora_j000_PROD

如上所示所有进程均运行在TS模式下且priority均为23|24.
Oracle一般不推荐使用RT模式,因为虽然个别进程可以通过这种方式获得更多的CPU资源,但往往系统的瓶颈并非CPU,即尽管CPU使用率高了,但实际系统TPS并未得到提升。
在10gr2版本后RAC中的LMS进程成为唯一一个使用RT模式的Oracle进程,我们可以通过查询参数_high_priority_processes了解相关信息:

SQL> col name format a40
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                     VALUE
---------------------------------------- ----------
_high_priority_processes                 LMS*
_os_sched_high_priority                  1

_high_priority_processes通过进程功能名进行匹配,下面我们将提高LGWR及PMON进程的优先级:

SQL> alter system set "_high_priority_processes"='LMS*|LGWR|PMON' scope=spfile;

System altered.

SQL> startup force;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31441     1 RR   41 22:50 ?        00:00:00 ora_pmon_PROD
oracle   31445     1 TS   23 22:50 ?        00:00:00 ora_psp0_PROD
oracle   31447     1 TS   23 22:50 ?        00:00:00 ora_mman_PROD
oracle   31449     1 TS   23 22:50 ?        00:00:00 ora_dbw0_PROD
oracle   31451     1 RR   41 22:50 ?        00:00:00 ora_lgwr_PROD
oracle   31455     1 TS   23 22:50 ?        00:00:00 ora_ckpt_PROD
oracle   31457     1 TS   23 22:50 ?        00:00:00 ora_smon_PROD
oracle   31459     1 TS   22 22:50 ?        00:00:00 ora_reco_PROD
oracle   31461     1 TS   23 22:50 ?        00:00:01 ora_cjq0_PROD
oracle   31463     1 TS   23 22:50 ?        00:00:01 ora_mmon_PROD
oracle   31465     1 TS   24 22:50 ?        00:00:00 ora_mmnl_PROD
oracle   31471     1 TS   24 22:50 ?        00:00:00 ora_p000_PROD
oracle   31473     1 TS   24 22:50 ?        00:00:00 ora_p001_PROD
oracle   31475     1 TS   24 22:50 ?        00:00:00 ora_arc0_PROD
oracle   31477     1 TS   22 22:50 ?        00:00:00 ora_arc1_PROD
oracle   31481     1 TS   23 22:50 ?        00:00:00 ora_qmnc_PROD
oracle   31488     1 TS   23 22:50 ?        00:00:00 ora_q000_PROD
oracle   31490     1 TS   23 22:50 ?        00:00:00 ora_q001_PROD
oracle   31500     1 TS   23 22:50 ?        00:00:00 ora_j000_PROD

好了lgwr和pmon进程也进入实时模式了,同时priority值上升到了41.
注意:
Oracle默认仅允许LMS进程(11g中多了VKTM进程)使用RT模式是有它的原因的,所以如果不是Oracle support 推荐,您没有任何修改隐式参数的理由。
其次根据Oracle文档[ID 602419.1]的描述,oradism文件(该文件位于$ORACLE_HOME/bin目录下)不正确的权限将导致RT模式无法被正确使用,该文件默认属于root用户并具有s权限。如下测试:

[oracle@rh1 bin]$ ls -la oradism
-r-sr-s---  1 root oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ su - root
Password:
[root@rh1 ~]# chown oracle:oinstall /s01/oracle/product/10.2.0/db_1/bin/oradism
[root@rh1 ~]# exit
logout
[oracle@rh1 bin]$ ls -la oradism
-r-xr-x---  1 oracle oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Sun Mar 28 23:07:03 2010

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> col name format a35;
SQL> col value format a10;
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                VALUE
----------------------------------- ----------
_high_priority_processes            LMS*|LGWR|PMON
_os_sched_high_priority             1
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31994     1 TS   23 23:07 ?        00:00:00 ora_pmon_PROD
oracle   31998     1 TS   23 23:07 ?        00:00:00 ora_psp0_PROD
oracle   32000     1 TS   23 23:07 ?        00:00:00 ora_mman_PROD
oracle   32002     1 TS   23 23:07 ?        00:00:00 ora_dbw0_PROD
oracle   32004     1 TS   24 23:07 ?        00:00:00 ora_lgwr_PROD
oracle   32008     1 TS   22 23:07 ?        00:00:00 ora_ckpt_PROD
oracle   32010     1 TS   23 23:07 ?        00:00:00 ora_smon_PROD
oracle   32012     1 TS   22 23:07 ?        00:00:00 ora_reco_PROD
oracle   32014     1 TS   23 23:07 ?        00:00:01 ora_cjq0_PROD
oracle   32016     1 TS   23 23:07 ?        00:00:01 ora_mmon_PROD
oracle   32018     1 TS   24 23:07 ?        00:00:00 ora_mmnl_PROD
oracle   32026     1 TS   24 23:07 ?        00:00:00 ora_arc0_PROD
oracle   32028     1 TS   23 23:07 ?        00:00:00 ora_arc1_PROD
oracle   32032     1 TS   23 23:07 ?        00:00:00 ora_qmnc_PROD
oracle   32045     1 TS   23 23:07 ?        00:00:00 ora_q000_PROD
oracle   32065     1 TS   23 23:08 ?        00:00:00 ora_q001_PROD
oracle   32072     1 TS   23 23:08 ?        00:00:00 ora_j000_PROD

that’s great, 显然oradism不仅为Oracle实例提供了内存资源控制功能,还包括了进程优先级分配的权限。
我们应当再次声明hidden parameter不应“滥用”于production environment.

发一个jd:System Engineer 博君一笑

   1.  Goal and mission:

A System Engineer has excellent and in-depth knowledge of a family of products (e.g. NagraVision CAS, iDTV & SMS, MediaGuard CAS). He is the recognised expert on that range of systems and is able to analyse and resolve complex problems under severe pressure. He recognises the importance of excellent customer relationships and is a good team player. He is highly motivated to develop his skills.

   2. Responsibility:

    * Is assigned to one or more accounts. Is responsible for the technical solution (installation, customer support, maintenance, etc) and works colsely with the program manager in charge of the account.
    * Creatively develops workaround solutions using broad knowledge of NagraVision products when no other solution is viable
    * Maintains a can-do attitude
    * Is fascinated by new technology and products
    * Handles most crisis situations confidently, is aware of customer consequences and knows when a call should be escalated
    * Recognises possible product design faults or potential enhancements and readily communicates this feedback to senior System Engineers for validation
    * Takes pride in solving customer problems effectively and is intolerant of poor quality work
    * Lives up to customer expectations and updates them on changes
    * Is respectful of and helpful to other engineers and enjoys gaining and sharing knowledge
    * Attends training on products, services and behavioural skills related to the position
    * Provides tuition on NagraVision products to customers and to NagraVision Engineers, under the managerial responsibility of the Head of System Engineering.
    * Participates in Hotline turns, under the managerial responsibility of the Head of System Engineering.
    * Execute security measures on the systems.
    * Provide quick response time to meet customer satisfaction.

   3. Reporting line

Reporting to: Head of System Engineering

Direct Report: Igor Ferigutti

   4. Interfaces

    * Daily contact with Program/Deployment Manager to inform on current status and plan the next steps of a project
    * Information to and from Customers using live communication, conference calls and e-mail reports
    * Collaboration with Central Services’ staff for problem solving.
    * Collaboration with Product Development for bug reporting and problem solving
    * Regular communication with Head of System Engineering to keep the resource planning up-to-date.

   5. Skills

    * Is recognised to have excellent knowledge of a set of NagraVision products
    * Mostly self-reliant, but requires direction under some situations.
    * Has a good telephone manner and remote diagnosis skills.
    * Is fluent in English, any other Asian language is an asset.
    * Recognised expertise in the UNIX platforms and Windows.
    * Ability to perform complex queries on Oracle or Sybase databases.
    * Recognised ability to analyse TCP/IP networking problems.
    * Ability to demonstrate certain NagraVision products or to provide training on them.
    * Demonstrates good communication skills and reasonable inter-personal skills.
    * Working knowledge of the Ticketing and Remote access tools.
    * Ability to complete problem analysis and do troubleshooting of simple multi-cluster systems.
    * Demonstrates high customer service awareness.
    * Works effectively under pressure.

这就是D&D中所谓的万精油职业吗?

ora-7445 [kghalp+0500] [SIGSEGV]错误

今天没有外出(似乎人不到现场就特别容易出问题),早上10点左右接到电话被告知crm11实例上出现了7445错误,准备用web vpn拨上去查看一下,赫然发觉windows 7 不支持这种vpn(准确说ie8和firefox都不支持);无奈无奈只好用拨号。
发现alert log中出现大量 7445错误记录:

Fri Mar 26 09:24:53 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_6754320.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Fri Mar 26 09:24:55 2010
Trace dumping is performing id=[cdmp_20100326092455]
Fri Mar 26 09:31:16 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_2994552.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []

看到kghalp函数第一印象 ,是Oracle中堆管理使用的函数;
让我们猜猜字面意思? k -> kernel g -> generic h-> heap a-> allocation p-> point
再让我们来看一下当时的call stack:

Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped object), addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
Registers:
iar: 00000001000973e0, msr: a00000000000d0b2
 lr: 00000001013a6df8,  cr: 0000000022292484
r00: 0000000000000010, r01: 0ffffffffffcb160, r02: 000000011022a9c0,
r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
r09: 0000000000000000, r10: 00000000101b60d8, r11: 0000000000000004,
r12: 0000000024592484, r13: 000000011026bfe0, r14: 0000000000000000,
r15: 0000000000009000, r16: 0000000110195b2c, r17: 0000000000000000,
r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
r27: 0000000104c7fd44, r28: 0000000000000000, r29: 0000000000000100,
r30: 0000000000000000, r31: 0000000110195a58,
*** 2010-03-26 09:57:28.679
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Current SQL statement for this session:
INSERT INTO AUDIT_DDL_LOG (DDL_TIME, SESSION_ID, OS_USER, IP_ADDRESS, TERMINAL, HOST, USER_NAME, DDL_TYPE, OBJECT_TYPE, OWNER, OBJECT_NAME, SQL_TEXT) VALUES (SYSDATE, SYS_CONTEXT('USERENV','SESSIONID'), SYS_CONTEXT('USERENV','OS_USER'), SYS_CONTEXT('USERENV','IP_ADDRESS'), SYS_CONTEXT('USERENV','TERMINAL'), SYS_CONTEXT('USERENV','HOST'), ORA_LOGIN_USER, ORA_SYSEVENT, ORA_DICT_OBJ_TYPE, ORA_DICT_OBJ_OWNER, ORA_DICT_OBJ_NAME, :B1 )
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000043da500d0        10  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 104A54EED ?
ksedmp+0290          bl       ksedst               104A54870 ?
ssexhd+03e0          bl       ksedmp               300001D15 ?
000044C0             ?        00000000
parchk+01f4          bl       kghalp               000000000 ?
                                                   2842288200000001 ?
                                                   000000000 ? 000000000 ?
                                                   000001040 ? 110195B2C ?
ptmak+0168           bl       parchk               FFFFFFFFFFCB560 ?
                                                   FFFFFFFFFFCB430 ?
                                                   FFFFFFFFFFCB430 ?
pdybF00_Init+0244    bl       ptmak                10008049C ? 000000000 ?
                                                   FFFFFFFFFFCB4F0 ? 07FFFFFFF ?
pdy1F79_Init+00c8    bl       pdybF00_Init         110BEB1D0 ?
pdy1F01_Driver+0048  bl       pdy1F79_Init         FFFFFFFFFFCBC40 ?
pdli_new_cog+00f0    bl       pdy1F01_Driver       FFFFFFFFFFCBCE0 ? 000000000 ?
pdlifu+0264          bl       pdli_new_cog         1013885F4 ? FFFFFFFFFFCCB00 ?
                                                   7000004383E7680 ?
phpcog+0010          bl       pdlifu               FFFFFFFFFFCD958 ?
                                                   7000004383E7680 ? 104C95048 ?
phpcmp+0f80          bl       phpcog               FFFFFFFFFFCC4F0 ? 000000000 ?
pcicms2+02d4         bl       phpcmp               FFFFFFFFFFCD958 ?


发生错误的最上层 kghalp 函数由 parchk 调用, 这似乎是一个package check函数(猜测,呵呵). 我们来整理一下思路, parchk 函数调用了 kghalp函数以帮其分配内存,但却得到了一个非法的低地址[[0x00000003B],正常情况下正文段使用的空间; 这看起来显然是一个bug。
让我们来查查support.oracle.com , 键入7445 kghalp 和sigsegv 关键字 (很多时候不需要使用ora 600/7445 lookup tools).
bug 8244533 赫然显目:

Bug 8244533: ORA-07445 [KGHALP] ERRORS COMPILING PACKAGE WITH DEBUG
    STACK TRACE:
    ------------
       ksedst <- ksedmp <- ssexhd <- 000044BC <- parchk        <- ptmak <-
    pdybF00_Init <- pdy1F79_Init <- pdy1F01_Driver <- pdli_new_cog         <-
    pdlifu <- phpcog <- phpcmp <- pcicms2 <- pcicms          <- kkxcms <- kkxswcm
    <- kkxmpbms <- kkxmesu <- xtypls           <- qctopls <- qctcopn <- qctcopn

    Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped
    object),
    addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
    Registers:
    iar: 00000001000973e0, msr: a00000000000d0b2
    lr: 000000010139ffb8,  cr: 00000000222a2484
    r00: 0000000000000010, r01: 0ffffffffffe2980, r02: 00000001101e5ab8,
    r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
    r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
    r09: 0000000000000000, r10: 0000000010171200, r11: 0000000000000004,
    r12: 00000000245a2484, r13: 000000011021fbc0, r14: 0000000000000000,
    r15: 0000000000009000, r16: 0000000110150c54, r17: 0000000000000000,
    r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
    r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
    r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
    r27: 0000000104c5983c, r28: 0000000000000000, r29: 0000000000000100,
    r30: 0000000000000000, r31: 0000000110150b80,
    *** 16:37:14.603
    ksedmp: internal or fatal error
    ORA-7445: exception encountered: core dump [kghalp+0500] [SIGSEGV]
    [Invalid permissions for mapped object] [0x00000003B] [] []
    Current SQL statement for this session:
    select dummy from dual where  ora_dict_obj_type = 'TABLE'
----- Call Stack Trace -----ptmak pdybF00_Init pdy1F79_Init pdy1F01_Driver pdli_new_cog pdlifuphpcog phpcmp pcicms2 pcicms kkxcms kkxswcm kkxmpbms kkxmesu xtyplsTo Filer.Based on this call stack this would appear a likely match forbug 6951953 Abstract: ORA-7445 [PTMAK] IMPORTING PACKAGE COMPILED DEBUG.This bug is fixed on 10.2.0.5 and there is a 10.2.0.4 patch available for IBM AIX Based Systems (64-bit).It maybe worth while to have the customer apply the patch to seeif it resolves the issue.Also the uploaded files included test.sql is this a reproducable testcase?

这个bug 似乎仅在 IBM AIX on POWER Systems (64-bit) 发生,当以DEBUG 模式编译包时有一定几率出现。
好了,既然已经了解了可能发生的诱因,我们可以进一步分析了,接下来看看 errorstack trace信息中 的SO 记录。

      SO: 70000043d217668, type: 53, owner: 70000048cee2238, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=70000043d217668 handle=700000446261588 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=70000043d2176e8[70000042b52b368,70000042bb9a808] htb=70000044929b460 ssga=70000044929ad68
      user=70000048cee2238 session=70000048eb33010 count=1 flags=[0000] savepoint=0x4bac1488
      LIBRARY OBJECT HANDLE: handle=700000446261588 mtx=7000004462616b8(1) cdp=1
      name=ALTER TRIGGER "SHUCRM3O"."TRI_PRODUCT_INSTANCE_RELATED" COMPILE DEBUG REUSE SETTINGS
      hash=164e6a8942406cee159f8943a1a3c85e timestamp=03-26-2010 09:52:12
      namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]
      kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=16 hpc=0002 hlc=0002
      lwt=700000446261630[700000446261630,700000446261630] ltm=700000446261640[700000446261640,700000446261640]
      pwt=7000004462615f8[7000004462615f8,7000004462615f8] ptm=700000446261608[700000446261608,700000446261608]
      ref=700000446261660[700000446261660,700000446261660] lnd=700000446261678[700000446261678,700000446261678]
        LIBRARY OBJECT: object=70000045adbc1e8
        type=CRSR flags=EXS[0001] pflags=[0000] status=VALD load=0
        CHILDREN: size=16
        child#    table reference   handle
             5 70000041776f5c0 70000045ae44720 70000042bfa3a20
        DATA BLOCKS:
        data#     heap  pointer    status pins change whr
            0 70000043d9fed20 70000045adbc300 I/P/A/-/-    0 NONE   00

的确有以debug 模式编译对象的语句,不过对象不是包而是trigger ; 看起来只要是可以以debug 模式compile 的对象都有可能引发该问题。
好了,问题到这里已经比较明确了: 应用端以DEBUG模式重新编译包引发了 Oracle bug 8244533,从而导致了对应服务进程的崩溃;总算是虚惊一场,之后通过trace内的machine和user信息找到了实施变更的应用方人员并教育之。

沪ICP备14014813号-2

沪公网安备 31010802001379号