The Hidden Parameter _high_priority_processes and oradism

Processes running on the operating system use one of two scheduling priority modes: Real Time (RT) mode or Time Sharing (TS) mode.
The vast majority of Oracle processes run in TS mode:

[oracle@rh1 ~]$ ps -efc|grep ora_|grep -v grep
oracle    8510     1 TS   23 Mar27 ?        00:00:02 ora_pmon_PROD
oracle    8512     1 TS   23 Mar27 ?        00:00:00 ora_psp0_PROD
oracle    8514     1 TS   23 Mar27 ?        00:00:00 ora_mman_PROD
oracle    8516     1 TS   23 Mar27 ?        00:00:02 ora_dbw0_PROD
oracle    8518     1 TS   23 Mar27 ?        00:00:04 ora_lgwr_PROD
oracle    8520     1 TS   23 Mar27 ?        00:00:04 ora_ckpt_PROD
oracle    8522     1 TS   23 Mar27 ?        00:00:08 ora_smon_PROD
oracle    8524     1 TS   23 Mar27 ?        00:00:00 ora_reco_PROD
oracle    8526     1 TS   23 Mar27 ?        00:00:34 ora_cjq0_PROD
oracle    8528     1 TS   23 Mar27 ?        00:00:06 ora_mmon_PROD
oracle    8530     1 TS   24 Mar27 ?        00:00:07 ora_mmnl_PROD
oracle    8538     1 TS   23 Mar27 ?        00:00:00 ora_arc0_PROD
oracle    8540     1 TS   23 Mar27 ?        00:00:00 ora_arc1_PROD
oracle    8548     1 TS   23 Mar27 ?        00:00:00 ora_qmnc_PROD
oracle    8555     1 TS   23 Mar27 ?        00:00:00 ora_q000_PROD
oracle    8559     1 TS   23 Mar27 ?        00:00:00 ora_q001_PROD
oracle   30500     1 TS   23 22:10 ?        00:00:00 ora_j000_PROD

As shown above, all of these processes run in TS mode with a priority of 23 or 24.
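The per-process check above can be condensed into a count per scheduling class. The following is a minimal sketch, assuming the Linux `ps -efc` column layout shown in the listing (UID PID PPID CLS PRI STIME TTY TIME CMD); the function name `sched_summary` is made up for illustration.

```shell
# Count Oracle background processes per scheduling class.
# Reads `ps -efc`-style lines on stdin; field 4 is assumed to be CLS
# and the last field is assumed to be the process name (ora_<name>_<SID>).
sched_summary() {
    awk '$NF ~ /^ora_[a-z0-9]+_/ { count[$4]++ }
         END { for (cls in count) print cls, count[cls] }' | sort
}

# Typical use on a live system:
#   ps -efc | sched_summary
```

On the instance shown above this would report a single TS line; any RR or FF line indicates processes running under a real-time policy.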
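Oracle generally does not recommend RT mode: although individual processes can obtain more CPU time this way, the system bottleneck is often not the CPU, so CPU utilization rises while the actual system TPS does not improve.
Starting with 10gR2, the LMS process in RAC became the only Oracle process that uses RT mode. The relevant settings can be seen by querying the _high_priority_processes parameter: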

SQL> col name format a40
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                     VALUE
---------------------------------------- ----------
_high_priority_processes                 LMS*
_os_sched_high_priority                  1

_high_priority_processes matches processes by their function name. Below we raise the priority of the LGWR and PMON processes:

SQL> alter system set "_high_priority_processes"='LMS*|LGWR|PMON' scope=spfile;

System altered.

SQL> startup force;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31441     1 RR   41 22:50 ?        00:00:00 ora_pmon_PROD
oracle   31445     1 TS   23 22:50 ?        00:00:00 ora_psp0_PROD
oracle   31447     1 TS   23 22:50 ?        00:00:00 ora_mman_PROD
oracle   31449     1 TS   23 22:50 ?        00:00:00 ora_dbw0_PROD
oracle   31451     1 RR   41 22:50 ?        00:00:00 ora_lgwr_PROD
oracle   31455     1 TS   23 22:50 ?        00:00:00 ora_ckpt_PROD
oracle   31457     1 TS   23 22:50 ?        00:00:00 ora_smon_PROD
oracle   31459     1 TS   22 22:50 ?        00:00:00 ora_reco_PROD
oracle   31461     1 TS   23 22:50 ?        00:00:01 ora_cjq0_PROD
oracle   31463     1 TS   23 22:50 ?        00:00:01 ora_mmon_PROD
oracle   31465     1 TS   24 22:50 ?        00:00:00 ora_mmnl_PROD
oracle   31471     1 TS   24 22:50 ?        00:00:00 ora_p000_PROD
oracle   31473     1 TS   24 22:50 ?        00:00:00 ora_p001_PROD
oracle   31475     1 TS   24 22:50 ?        00:00:00 ora_arc0_PROD
oracle   31477     1 TS   22 22:50 ?        00:00:00 ora_arc1_PROD
oracle   31481     1 TS   23 22:50 ?        00:00:00 ora_qmnc_PROD
oracle   31488     1 TS   23 22:50 ?        00:00:00 ora_q000_PROD
oracle   31490     1 TS   23 22:50 ?        00:00:00 ora_q001_PROD
oracle   31500     1 TS   23 22:50 ?        00:00:00 ora_j000_PROD

The lgwr and pmon processes now run in real-time mode as well (scheduling class RR), and their priority has risen to 41.
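To reason about which processes a value such as 'LMS*|LGWR|PMON' is expected to cover, a small matcher can help. This is only a sketch: it assumes that '|' separates alternatives and that '*' behaves like a shell glob, which is consistent with the result above but is not confirmed by Oracle documentation. The function name `in_priority_list` is hypothetical.

```shell
# Return 0 if process function name $1 (e.g. lgwr, lms0) matches any
# alternative in the '|'-separated list $2, such as 'LMS*|LGWR|PMON'.
# ASSUMPTION: '*' is treated as a shell glob; Oracle's real rules may differ.
in_priority_list() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    old_ifs=$IFS
    IFS='|'
    set -f                      # stop '*' from expanding against file names
    for pat in $2; do
        case $name in
            $pat) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Example: report which names would be expected in real-time mode.
for p in pmon lgwr dbw0 lms0; do
    if in_priority_list "$p" 'LMS*|LGWR|PMON'; then
        echo "$p: expected RT"
    else
        echo "$p: expected TS"
    fi
done
```

Comparing this expectation with the CLS column of `ps -efc` makes it easier to spot a process that should be in real-time mode but is not.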
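Note:
Oracle has its reasons for allowing only the LMS process (plus the VKTM process in 11g) to use RT mode by default, so unless Oracle Support recommends it, you have no reason to modify this hidden parameter.
In addition, according to Oracle document [ID 602419.1], incorrect permissions on the oradism file (located in $ORACLE_HOME/bin) prevent RT mode from being used correctly. By default this file is owned by root and has the s (setuid) permission bit set. The following test demonstrates this: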

[oracle@rh1 bin]$ ls -la oradism
-r-sr-s---  1 root oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ su - root
Password:
[root@rh1 ~]# chown oracle:oinstall /s01/oracle/product/10.2.0/db_1/bin/oradism
[root@rh1 ~]# exit
logout
[oracle@rh1 bin]$ ls -la oradism
-r-xr-x---  1 oracle oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Sun Mar 28 23:07:03 2010

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> col name format a35;
SQL> col value format a10;
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                VALUE
----------------------------------- ----------
_high_priority_processes            LMS*|LGWR|PMON
_os_sched_high_priority             1
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31994     1 TS   23 23:07 ?        00:00:00 ora_pmon_PROD
oracle   31998     1 TS   23 23:07 ?        00:00:00 ora_psp0_PROD
oracle   32000     1 TS   23 23:07 ?        00:00:00 ora_mman_PROD
oracle   32002     1 TS   23 23:07 ?        00:00:00 ora_dbw0_PROD
oracle   32004     1 TS   24 23:07 ?        00:00:00 ora_lgwr_PROD
oracle   32008     1 TS   22 23:07 ?        00:00:00 ora_ckpt_PROD
oracle   32010     1 TS   23 23:07 ?        00:00:00 ora_smon_PROD
oracle   32012     1 TS   22 23:07 ?        00:00:00 ora_reco_PROD
oracle   32014     1 TS   23 23:07 ?        00:00:01 ora_cjq0_PROD
oracle   32016     1 TS   23 23:07 ?        00:00:01 ora_mmon_PROD
oracle   32018     1 TS   24 23:07 ?        00:00:00 ora_mmnl_PROD
oracle   32026     1 TS   24 23:07 ?        00:00:00 ora_arc0_PROD
oracle   32028     1 TS   23 23:07 ?        00:00:00 ora_arc1_PROD
oracle   32032     1 TS   23 23:07 ?        00:00:00 ora_qmnc_PROD
oracle   32045     1 TS   23 23:07 ?        00:00:00 ora_q000_PROD
oracle   32065     1 TS   23 23:08 ?        00:00:00 ora_q001_PROD
oracle   32072     1 TS   23 23:08 ?        00:00:00 ora_j000_PROD

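With oradism owned by oracle and its setuid bit gone, every process falls back to TS mode even though _high_priority_processes is unchanged. Clearly, oradism not only provides memory resource control for the Oracle instance but also carries the privilege needed to assign process priorities.
We should stress once more that hidden parameters must not be abused in a production environment.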
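The oradism test above suggests a quick sanity check before looking any further into why a process is not in real-time mode. The sketch below inspects a single `ls -l` line and accepts it only when the owner is root and the setuid bit is set. The function name `oradism_line_ok` and its messages are illustrative and not part of any Oracle tool.

```shell
# Check an `ls -l` line for oradism.
# Prints "ok" and returns 0 when the owner (field 3) is root and the
# owner-execute position of the mode string (4th character) is 's'.
# Otherwise prints the offending owner and mode and returns 1.
oradism_line_ok() {
    printf '%s\n' "$1" | awk '{
        owner_ok  = ($3 == "root")
        setuid_ok = (substr($1, 4, 1) == "s")
        if (owner_ok && setuid_ok) { print "ok"; exit 0 }
        print "bad: owner=" $3 " mode=" $1
        exit 1
    }'
}

# Typical use on a live system:
#   oradism_line_ok "$(ls -l "$ORACLE_HOME/bin/oradism")"
```

Working on the text of the `ls -l` line rather than calling `stat` keeps the check portable across the platforms that appear in this post, at the cost of depending on the usual `ls -l` column order.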

Comments

  1. admin says

    Hdr: 7488159 10.2.0.3 RDBMS 10.2.0.3 RAC PRODID-5 PORTID-87
    Abstract: LMS PROCESSES DON’T RUN IN REAL TIME PRIORITY

    *** 10/16/08 12:33 am ***
    TAR:
    ----
    7199119.992

    PROBLEM:
    --------
    LMS processes should be running in RT by default starting with 10.2.
    But LMS remains in low priority (time-sharing) despite following settings:

    ls -la oradism
    -rwsr-sr-x 1 root dba-21451792 Apr 28 03:06 oradism

    _high_priority_processes LMS*
    _os_sched_high_priority 1

    [GCO270]:oraenv01:/u002/oracle/env01db/10.2.0/bin>ps -efl | grep lms
    80008001 R 400 764317 524289 0.1 44 0 15M –
    09:44:07 0:00.28 ora_lms0_GCO270
    80008001 R 400 764416 524289 0.1 44 0 15M –
    09:44:08 0:00.29 ora_lms1_GCO270
    80008001 R 400 764432 524289 0.0 44 0 15M –
    09:44:08 0:00.29 ora_lms2_GCO270
    80008001 S 400 764461 524289 0.0 44 0 15M event
    09:44:08 0:00.28 ora_lms3_GCO270

    DIAGNOSTIC ANALYSIS:
    --------------------
    Tru64 UNIX process priority

    A number between 0 – 63.
    44 – 63: represent lowest scheduling priority.
    32 – 43: used by system jobs
    0 -31: reserved for real-time jobs.

    GPRD70]:cgifs01:/usr/users/cgifs01/stat>ps -O SCHED 1045604…
    PID USER %CPU PRI UPR NI PPR PSR POL PSET S TTY
    TIME COMMAND
    849453 cgifs01 0.0 44 44 0 19 6 TS 0 S pts/3
    0:00.13 ksh
    849453 cgifs01 0.0 44 44 0 19 6 TS 0 S pts/3
    0:00.13 ksh
    849453 cgifs01 0.0 44 44 0 19 6 TS 0 S pts/3
    0:00.13 ksh
    849453 cgifs01 0.2 44 44 0 19 6 TS 0 S pts/3
    0:00.13 ksh
    849453 cgifs01 0.2 44 44 0 19 6 TS 0 S pts/3
    0:00.14 ksh
    849453 cgifs01 0.2 44 44 0 19 4 TS 0 S pts/3
    0:00.14 ksh
    849453 cgifs01 0.2 44 44 0 19 7 TS 0 S pts/3
    0:00.14 ksh
    849453 cgifs01 0.2 44 44 0 19 7 TS 0 S pts/3
    0:00.14 ksh
    849453 cgifs01 0.6 44 44 0 19 7 TS 0 S pts/3
    0:00.15 ksh
    849453 cgifs01 0.6 44 44 0 19 7 TS 0 S pts/3
    0:00.15 ksh
    849453 cgifs01 0.6 44 44 0 19 5 TS 0 S pts/3
    0:00.15 ksh
    849453 cgifs01 0.6 44 44 0 19 4 TS 0 S pts/3
    0:00.16 ksh
    849453 cgifs01 0.6 44 44 0 19 6 TS 0 S pts/3
    0:00.16 ksh
    849453 cgifs01 1.0 44 44 0 19 5 TS 0 S pts/3
    0:00.17 ksh
    849453 cgifs01 1.0 44 44 0 19 4 TS 0 S pts/3
    0:00.17 ksh
    849453 cgifs01 1.0 44 44 0 19 6 TS 0 S pts/3
    0:00.17 ksh
    849453 cgifs01 1.0 44 44 0 19 7 TS 0 S pts/3
    0:00.18 ksh

    WORKAROUND:
    -----------
    A script to renice the LMS and LMD processes

  2. admin says

    Hdr: 9245122 10.2.0.4 RDBMS 10.2.0.4 RAC PRODID-5 PORTID-23
    Abstract: HIGH WAITTIME FOR GC_REMASTER

    *** 12/28/09 06:57 am ***

    BUG TYPE CHOSEN
    ===============
    Performance

    SubComponent: Real Application Clusters
    =======================================
    DETAILED PROBLEM DESCRIPTION
    ============================
    application runs very slow, the database shows high waits for gc_remaster
    (it was not the first time of this issue)

    DIAGNOSTIC ANALYSIS
    ===================
    col ksppinm for a30
    col ksppstvl for a30 tru
    select x.ksppinm, y.ksppstvl
    from x$ksppi x , x$ksppcv y
    where x.indx = y.indx
    and x.ksppinm like '%_Parameter_name%' ==> keep the name inside '% %'
    order by x.ksppinm;

    KSPPINM KSPPSTVL
    ------------------------------ ------------------------------
    _os_sched_high_priority 1

    KSPPINM KSPPSTVL
    ------------------------------ ------------------------------
    _high_priority_processes LMS*

    defthw99030srv_oracle_DI0DB1> ls -rtl $ORACLE_HOME/bin/oradism
    -r-sr-s--- 1 root dba-1249392 Apr 4 2008
    /oracle/system/dbms1020/bin/oradism

    WORKAROUND?
    ===========
    No

    TECHNICAL IMPACT
    ================
    Bad performance

    RELATED ISSUES (bugs, forums, RFAs)
    ===================================
    N/A

    HOW OFTEN DOES THE ISSUE REPRODUCE AT CUSTOMER SITE?
    ====================================================
    Always

    DOES THE ISSUE REPRODUCE INTERNALLY?
    ====================================
    Not attempted

  3. admin says

    Hdr: 9477972 10.2.0.4.0 RDBMS 10.2.0.4.0 VOS PRODID-5 PORTID-23
    Abstract: LMS IS NOT RUNNING IN REAL TIME MODE

    *** 03/16/10 12:22 am ***
    TAR:
    ----

    PROBLEM:
    --------
    LMS processes are not running in real time mode.

    $ ps -efc | grep lms
    oraperf 17813 15326 FSS 59 14:18:03 pts/9 0:00 grep lms
    oraperf 23530 1 FSS 1 22:02:43 ? 8:46 ora_lms2_PERF1
    oraperf 23542 1 FSS 1 22:02:43 ? 8:46 ora_lms3_PERF1
    oraperf 23522 1 FSS 1 22:02:42 ? 8:00 ora_lms0_PERF1
    oraperf 23526 1 FSS 59 22:02:42 ? 8:28 ora_lms1_PERF1
    oraperf 23546 1 FSS 1 22:02:43 ? 8:36 ora_lms4_PERF1

    They are running in FSS.

    DIAGNOSTIC ANALYSIS:
    --------------------
    Current setting is,
    _os_sched_high_priority = 1
    _high_priority_processes = LMS*

    I found note.602419.1, but oradism already set correctly.
    [oraperf@wsqfinc1a] /opt/app/dtperf $ ls -al $ORACLE_HOME/bin/oradism
    -r-sr-s--- 1 root perf 1249392 Apr 4 2008
    /opt/app/dtperf/perfdb/10.2/bin/oradism

  4. admin says

    Hdr: 5635098 10.2.0.2.0 KERNEL 2.0.0.1 PRODID-1309 PORTID-46
    Abstract: INSTANCE HANGS NO APARENT REASON

    From kernel perspective, nothing stands out to point to a kernel issue.
    However, I do see around 181 processes in ‘R’ queue state at the time the
    dump is taken on node itrac19.cern.ch. I do see LMS process running in
    Realtime as well. Many processes are in call sys_gettimeofday() as well,
    especially crsd.bin, racgimon and oracle shadows – lms as well.

    The hang is the culmination of the following factors.
    1. lms0, lms1 running in realtime mode – default starting 10.2 may starve
    other processes to get the cpu. Recommend to use _os_sched_high_priority=0
    in the init.ora, so that lms does not run in realtime priority.
    2. gettimeofday() syscall is very expensive on x86 platform and hence any
    kind of tracing or collecting statistics will impact this.
    gettimeofday() bug 5132861 is fixed in 11g but cannot be backported to your
    db version. I think you can turn off the statistics collection by changing
    the parameters:
    timed_statistics=false , statistics_level=basic

    Hope this helps.

    1. Reviewed the sysrq dumps and the box is simply running to capacity limit
    here. There are at least 401 processes in ‘R’ state which means the box is
    overloaded and heavily cpu bound.
    2. For the parameter change, that parameter would have worked in 10.2.0.2.
    However due to bug 5848782 that started in 10.2.0.3, setting
    _os_sched_high_priority will not be effective and yes, one needs to use
    _high_priority_processes='' to mitigate that.

    3. ok about the statistics level.

  5. admin says

    Referring to LMS and Real Time Priority in Oracle RAC 10g Release 2 [ID 558185.1], we would like to confirm whether this behaviour also occurs on 11gR2. If so, please provide the commands to check and configure the DB parameters to avoid this issue.

    LMS and VKTM should run in realtime.

    Please review the following documents:
    http://www.oracle.com/technology/products/database/clusterware/pdf/rac_aix_system_stability.pdf
    http://public.dhe.ibm.com/partnerworld/pub/whitepaper/162b6.pdf
    http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/7785cdf7e84ea2a6862573b90050e6a6/$FILE/11gR2-tips_August%2025%202010.pdf

    LMS priority should be 39, better to follow the above links for configuring AIX along with Oracle RAC database.

    SELECT x.ksppinm parm
    ,y.ksppstvl value
    ,x.ksppdesc descr
    FROM x$ksppi x
    ,x$ksppcv y
    WHERE x.indx = y.indx
    AND x.ksppdesc like '%&parm_pattern%'
    AND x.ksppinm like '\_%' escape '\'
    ORDER BY x.ksppinm
    /
    Enter value for parm_pattern: priority

    ========

    also ps -elf | grep lms

    _os_sched_high_priority 1 OS high priority level

    which is the correct setting

    what is the output of ps -elf | grep lms

    if you see the pattern: 60 — then lms process is running in realtime
