11g新特性之IO校准(IO Calibration)

IO校准(IO Calibration)特性可以帮助我们了解存储系统的真实性能,以进一步判断I/O性能问题是由数据库还是存储系统自身引起的。I/O校准特性(IO Calibration)通过对Oracle数据文件的随机I/O访问存储介质,其结论值更符合数据库IO性能的真实情况。
在使用该特性前,我们要确保满足以下条件:

  • 调用该存储过程需要用到SYSDBA权限
  • TIME_STATISTICS 参数为true:

SQL> show parameter timed_statistics;

NAME TYPE VALUE
———————————— ———– ——————————
timed_statistics boolean TRUE

  • 必须打开异步IO;注意在使用文件系统时,FILESYSTEMIO_OPTIONS需设为SETALL或ASYNC,否则Oracle不会启用异步IO。
  • 可以通过以下查询检验是否启用了异步IO:
SQL> SELECT NAME, ASYNCH_IO
2    FROM V$DATAFILE F, V$IOSTAT_FILE I
3   WHERE F.FILE# = I.FILE_NO
4     AND FILETYPE_NAME = 'Data File';

NAME                                                                             ASYNCH_IO
-------------------------------------------------------------------------------- ---------
D:\TOOLS\ADMINSTRATORG\ORADATA\PROD\DATAFILE\O1_MF_SYSTEM_65DN8HXT_.DBF       ASYNC_ON
D:\TOOLS\ADMINSTRATORG\ORADATA\PROD\DATAFILE\O1_MF_SYSAUX_65DN8J18_.DBF       ASYNC_ON
D:\TOOLS\ADMINSTRATORG\ORADATA\PROD\DATAFILE\O1_MF_UNDOTBS1_65DN8J1S_.DBF     ASYNC_ON
D:\TOOLS\ADMINSTRATORG\ORADATA\PROD\DATAFILE\O1_MF_USERS_65DN8J2X_.DBF        ASYNC_ON

IO Calibration特性可以通过DBMS_RESOURCE_MANAGER.CALIBRATE_IO存储过程调用;该过程会对Oracle数据文件引发一系列IO敏感的只读工作负载(由1MB大小的随机IO组成),从而判断存储系统所能持续的最大IOPS(每秒最大IO请求数)和MBPS(每秒IO传输速率)。为了使结果更具代表性,应当保持IO Calibration测试过程中整个数据库是空闲的,没有其他IO负载损耗。

让我们具体来看一下CALIBRATE_IO存储过程的调用方法:

 -- Initiate an I/O calibration
 --
 -- Input arguments:
 --  num_physical_disks   - Approximate number of physical disks in
 --                         the database storage
 --  max_latency          - Maximum tolerable latency in milliseconds for
 --                         database-block-sized IO requests
 --
 -- Output arguments:
 --  max_iops       - Maximum number of I/O requests per second that can be
 --                   sustained.  The I/O requests are randomly-distributed,
 --                   database-block-sized reads.
 --  max_mbps       - Maximum throughput of I/O that can be sustained,
 --                   expressed in megabytes per second. The I/O requests
 --                   are randomly-distributed, 1 megabyte reads.
 --  actual_latency - Average latency of database-block-sized I/O
 --                   requests at "max_iops" rate, expressed in milliseconds.
 --
  PROCEDURE calibrate_io (
   num_physical_disks  IN  PLS_INTEGER DEFAULT 1,
   max_latency         IN  PLS_INTEGER DEFAULT 20,
   max_iops            OUT PLS_INTEGER,
   max_mbps            OUT PLS_INTEGER,
   actual_latency      OUT PLS_INTEGER
   );

其中输入参数num_physical_disks用以指定数据库存储所用物理磁盘的近似数;max_latency指定了客户所能容忍的最大IO 延迟时间,单位为ms;一般我们认为IO平均等待在10ms左右是一种较好的表现,远大于这个数字往往说明IO负载过高了,这里我们假定15ms是我们的deadline。

SQL> set serveroutput on;
SQL> DECLARE
  2    lat  INTEGER;
  3    iops INTEGER;
  4    mbps INTEGER;
  5  BEGIN
  6  -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (disk_count,max_latency , iops, mbps, lat);
  7     DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 15, iops, mbps, lat);
  8
  9    DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
 10    DBMS_OUTPUT.PUT_LINE ('latency  = ' || lat);
 11    dbms_output.put_line('max_mbps = ' || mbps);
 12  end;
 13  /

max_iops = 127
latency  = 15
max_mbps = 38

PL/SQL procedure successfully completed
/*平均延迟为15ms时,最大iops为127,mbps为38*/

/*执行过程中也可以通过V$IO_CALIBRATION_STATUS视图了解进度*/
SQL> select * from V$IO_CALIBRATION_STATUS;

STATUS        CALIBRATION_TIME
------------- --------------------------------------------------------------------------------
IN PROGRESS   11-8月 -10 12.02.35.121 下午

SQL> alter session set nls_timestamp_format='HH24:MI';

Session altered
SQL> col start_time for a10;
SQL> col end_time for a10;
SQL> select * from DBA_RSRC_IO_CALIBRATE;

START_TIME END_TIME     MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY NUM_PHYSICAL_DISKS
---------- ---------- ---------- ---------- ---------- ---------- ------------------
12:55      12:59             127         38         14         15                  2
/*DBA_RSRC_IO_CALIBRATE视图记录了IO CALIBRATION的历史信息*/

[gview file=”http://askmac.cn/wp-content/uploads/resource/calibrate_io.pdf”]

Comments

  1. admin says

    Hdr: 7047806 11.1.0.6 RDBMS 11.1.0.6 VOS PRODID-5 PORTID-226 ORA-600
    Abstract: ORA-600 [12240] AFTER INTERRUPTING DBMS_RESOURCE_MANAGER.CALIBRATE_IO
    PROBLEM:
    ——–
    To reproduce, simply run and interrupt (Ctrl-C)
    dbms_resource_manager.calibrate_io.
    Then you are going to see a few cs (Calibration Slave?) background processes
    that are waiting on “KSV master wait”.
    A foreground is trying to kill these processes and its trace file contains
    the following:
    rdel: Failed to kill slaves

    After a few tries, the process gives up with an oeri(12240) error:

    rdel: Failed to kill slaves
    Incident 105769 created, dump file:
    /ade/orabench_lin/oracle/log/diag/rdbms/dwmain/MAIN/incident/incdir_105769/MAI
    N_ora_13684_i105769.trc
    ORA-600: internal error code, arguments: [12240], [], [], [], [], [], [],
    [], [], [], [], []

    At this point pmon takes the lead and tries to cleanup this now dead process
    but cannot and generate the following in its trace file:

    *** 15:14:01.372
    found process 0x4c71495c pid=21 serial=123 ospid = 13684 dead
    rdel: Failed to kill slaves

    DIAGNOSTIC ANALYSIS:
    ——————–
    I have reproduced this in-house on label RDBMS_MAIN_LINUX_080507 with ASM.

    WORKAROUND:
    ———–

    RELATED BUGS:
    ————-

    REPRODUCIBILITY:
    —————-

    TEST CASE:
    ———-

    STACK TRACE:
    ————

    SUPPORTING INFORMATION:
    ———————–

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    —————————————-

    DIAL-IN INFORMATION:
    ——————–

    IMPACT DATE:
    ————

    When shutting down the database, these cs slaves generate trace files with
    this sort of content:

    ORA-448: normal completion of background process
    ORA-448: normal completion of background process
    —– SQL Statement (None) —–
    Current SQL information unavailable – no cursor.

    —– Call Stack Trace —–
    calling call entry argument values in hex
    location type point (? means dubious value)
    ——————– ——– ——————–
    —————————-
    skdstdst()+38 call kgdsdst() BFFFA088 ? 2 ?
    ksedst1()+81 call skdstdst()+42949672 BFFFA088 ? 0 ? 1 ? 88787CE
    ?
    58 E5A0C78 ? 88787CE ?
    ksedst()+30 call ksedst1()+308268239 B79C1C88 ? B7BF058C ?
    5 2050033 ? 0 ? FF ? 2004 ?
    dbkedDefDump()+960 call ksedst()+4294967266 0 ? 4BA29BBC ? 4BA29BBC ?
    70A068CE ? 0 ? 4CC507A4 ?
    ksedmp()+35 call dbkedDefDump()+4294 1 ? 0 ?
    966336
    ksfdcse()+7883 call ksedmp()+4294967261 1 ? 13B ? BFFFA4F8 ? E ?
    4C716978 ? 20008000 ?
    kcfcse()+550 call ksfdcse()+429495941 BFFFAB94 ? 0 ? B79D8024 ?
    3 182715 ? 3B039D ? 15 ?
    ksvrdp()+12669 call 00000000 0 ? 0 ? 0 ?
    opirip()+662 call ksvrdp()+539424192 6F746365 ? 64207972 ?
    2073656F ? 20746F6E ?
    73697865 ? 6F662074 ?
    opidrv()+500 call opirip()+4294966634 32 ? 4 ? BFFFC628 ?
    sou2o()+71 call opidrv()+4294966796 32 ? 4 ? BFFFC628 ?

    RELEASE NOTES:
    ]]A state transition was not occuring on CTRL-C
    ]]causing the ORA-600 in this bug.
    ]]
    REDISCOVERY INFORMATION:
    If you perform a CTRL-C while doing IO calibration and
    you see this trace in the alert log, you may have hit this
    bug.
    WORKAROUND:
    None

Comment

*

沪ICP备14014813号-2

沪公网安备 31010802001379号