Differences between dbms_stats gathering modes on 9i and 10g

About two months ago, someone in the industry asked me why migrating from the 9i CBO to 10g gives rise to so many performance problems caused by changed execution plans; he was, of course, testing me. In fact, most of the environments I had worked with did not use the CBO on 8i/9i at all, and when you jump from 8i/9i RBO straight to the much more mature CBO of 10g, plan regressions are only to be expected. But there are plenty of problems going from the 9i CBO to the 10g CBO as well. My first thought was that statistics gathering differs between the two releases, though I could not say exactly how. Hearing my answer, he just smiled and said nothing.

Oracle is vast, vast enough to be called a world of its own, and things you have not used for a long time fade from memory. Let us review a few of the changes to statistics gathering between 9i and 10g.

When gathering statistics on 9i, the default METHOD_OPT is 'FOR ALL COLUMNS SIZE 1'. In this mode Oracle collects only the most basic statistics on every column, such as the minimum/maximum and the number of distinct values, but it collects no histograms. For columns whose data is evenly distributed, or which never appear as conditions in a WHERE clause, such statistics are entirely sufficient. If a column's data is skewed, however, the CBO's cost calculations can go wrong, and we then need to gather a histogram on that column manually.
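
A minimal sketch of such a manual histogram gather on 9i (SCOTT.EMP and its skewed DEPTNO column are hypothetical):

BEGIN
  dbms_stats.gather_table_stats(
    ownname    => 'SCOTT',
    tabname    => 'EMP',
    -- request a histogram with up to 254 buckets on the skewed column only
    method_opt => 'FOR COLUMNS DEPTNO SIZE 254',
    cascade    => TRUE);
END;
/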

10g revised the default METHOD_OPT of the dbms_stats package. This is clearly one of the main reasons execution plans so readily change after migrating from the 9i CBO to the 10g CBO, and it was exactly the point my questioner was after.

The new default METHOD_OPT is 'FOR ALL COLUMNS SIZE AUTO', which means Oracle uses an internal algorithm to decide automatically which columns need histograms and which do not. Whether a histogram is collected depends on the column's data distribution and on the workload against the table, 'workload' here meaning that the database has seen SQL statements whose cost calculation would benefit from detailed information about those columns.
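
Incidentally, the workload half of that decision is tracked in an undocumented dictionary table, sys.col_usage$, so treat the following purely as an illustration. A query along these lines, run with DBA privileges against a hypothetical table EMP, shows which columns Oracle has seen used in predicates:

select o.name table_name, c.name column_name,
       u.equality_preds, u.range_preds, u.like_preds
  from sys.col_usage$ u, sys.obj$ o, sys.col$ c
 where u.obj# = o.obj#
   and c.obj# = u.obj#
   and c.intcol# = u.intcol#
   and o.name = 'EMP';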

That sounds ideal: Oracle quietly gathering exactly the statistics we urgently need on our behalf.

The trouble is that in many environments Oracle does not make the right call on whether a column needs a histogram. Practice shows it may collect a great many unnecessary histograms while at the same time skipping many that are genuinely needed.

In lightly loaded environments the effects of these badly chosen histograms mostly go unnoticed; in performance-critical environments, or ones that have already hit a bottleneck, they can turn into a sizeable headache.

In addition, Oracle changed how column density is computed. The CBO uses density to determine predicate selectivity, so the sudden appearance of extra, unneeded histograms can have a broad and noticeable performance impact (good effects are possible too, just less probable...).
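
To audit what SIZE AUTO has actually produced, the dictionary exposes both the bucket count and the density per column (the HISTOGRAM column exists from 10g onward; SCOTT.EMP is hypothetical):

select column_name, num_distinct, density, num_buckets, histogram
  from dba_tab_col_statistics
 where owner = 'SCOTT'
   and table_name = 'EMP'
   and histogram <> 'NONE';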

Obviously these uninvited guests also weigh on the shared pool: the latches protecting the library cache and row cache can become congested for a while, and if your tables have hundreds of columns things get even worse (which is why development should follow the normal forms; ignoring the rules usually ends with an unusable application and a failed project. And don't tell me your application is limping along; that counts as a failed project too)!

Comments

  1. admin says

    Applies to:
    Oracle Server – Enterprise Edition – Version: 9.0.1.0 to 10.2.0.4
    Information in this document applies to any platform.
    Goal
    This document outlines how to determine the default parameter settings used when gathering statistics on tables on 9i and 10g.

    Solution

    On 9i, the gather procedures have a number of hard-coded default values.
    On 10g, the procedures that gather optimizer statistics no longer have hard-coded default values.

    The defaults can be viewed using:

    select dbms_stats.get_param('cascade') from dual;
    select dbms_stats.get_param('degree') from dual;
    select dbms_stats.get_param('estimate_percent') from dual;
    select dbms_stats.get_param('method_opt') from dual;
    select dbms_stats.get_param('no_invalidate') from dual;
    select dbms_stats.get_param('granularity') from dual;

    Parameters can be set using DBMS_STATS.SET_PARAM.

    EXAMPLES

    Get Parameter Example

    SQL> select dbms_stats.get_param('method_opt') from dual;

    DBMS_STATS.GET_PARAM('METHOD_OPT')
    ----------------------------------
    FOR ALL COLUMNS SIZE AUTO

    Set Parameter Example

    SQL> exec dbms_stats.set_param('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1')
    PL/SQL procedure successfully completed.

    SQL> select dbms_stats.get_param('method_opt') from dual;

    DBMS_STATS.GET_PARAM('METHOD_OPT')
    ----------------------------------
    FOR ALL COLUMNS SIZE 1

    The default values on 9i are hard coded and are as follows:

    DBMS_STATS.GATHER_TABLE_STATS (

    ownname VARCHAR2,
    tabname VARCHAR2,
    partname VARCHAR2 DEFAULT NULL, --> ALL partitions
    estimate_percent NUMBER DEFAULT NULL, --> 100% sample
    block_sample BOOLEAN DEFAULT FALSE,
    method_opt VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
    degree NUMBER DEFAULT NULL, --> parallel degree 1
    granularity VARCHAR2 DEFAULT 'DEFAULT', --> level (PARTITION + GLOBAL)
    cascade BOOLEAN DEFAULT FALSE, --> does not cascade to indexes by default
    no_invalidate BOOLEAN DEFAULT FALSE);

  2. admin says

    Show Optimizer Statistics for CBO
    set echo off
    set scan on
    set lines 150
    set pages 66
    set verify off
    set feedback off
    set termout off

    column uservar new_value Table_Owner noprint
    select user uservar from dual;
    set termout on
    column TABLE_NAME heading "Tables owned by &Table_Owner" format a30
    select table_name from dba_tables where owner=upper('&Table_Owner') order by 1
    /
    undefine table_name
    undefine owner
    prompt
    accept owner prompt 'Please enter Name of Table Owner (Null = &Table_Owner): '
    accept table_name prompt 'Please enter Table Name to show Statistics for: '
    column TABLE_NAME heading "Table|Name" format a15
    column PARTITION_NAME heading "Partition|Name" format a15
    column SUBPARTITION_NAME heading "SubPartition|Name" format a15
    column NUM_ROWS heading "Number|of Rows" format 9,999,999,999,990
    column BLOCKS heading "Blocks" format 999,990
    column EMPTY_BLOCKS heading "Empty|Blocks" format 999,999,990

    column AVG_SPACE heading "Average|Space" format 9,990
    column CHAIN_CNT heading "Chain|Count" format 999,990
    column AVG_ROW_LEN heading "Average|Row Len" format 990
    column COLUMN_NAME heading "Column|Name" format a25
    column NULLABLE heading "Null|able" format a4
    column NUM_DISTINCT heading "Distinct|Values" format 999,999,990
    column NUM_NULLS heading "Number|Nulls" format 9,999,990
    column NUM_BUCKETS heading "Number|Buckets" format 990
    column DENSITY heading "Density" format 990
    column INDEX_NAME heading "Index|Name" format a15
    column UNIQUENESS heading "Unique" format a9
    column BLEV heading "B|Tree|Level" format 90
    column LEAF_BLOCKS heading "Leaf|Blks" format 990
    column DISTINCT_KEYS heading "Distinct|Keys" format 9,999,999,990
    column AVG_LEAF_BLOCKS_PER_KEY heading "Average|Leaf Blocks|Per Key" format 99,990
    column AVG_DATA_BLOCKS_PER_KEY heading "Average|Data Blocks|Per Key" format 99,990
    column CLUSTERING_FACTOR heading "Cluster|Factor" format 999,999,990
    column COLUMN_POSITION heading "Col|Pos" format 990
    column col heading "Column|Details" format a24
    column COLUMN_LENGTH heading "Col|Len" format 9,990
    column GLOBAL_STATS heading "Global|Stats" format a6
    column USER_STATS heading "User|Stats" format a6
    column SAMPLE_SIZE heading "Sample|Size" format 9,999,999,999,990
    column to_char(t.last_analyzed,'MM-DD-YYYY') heading "Date|MM-DD-YYYY" format a10

    prompt
    prompt ***********
    prompt Table Level
    prompt ***********
    prompt
    select
    TABLE_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from dba_tables t
    where
    owner = upper(nvl('&&Owner',user))
    and table_name = upper('&&Table_name')
    /
    select
    COLUMN_NAME,
    decode(t.DATA_TYPE,
    'NUMBER',t.DATA_TYPE||'('||
    decode(t.DATA_PRECISION,
    null,t.DATA_LENGTH||')',
    t.DATA_PRECISION||','||t.DATA_SCALE||')'),
    'DATE',t.DATA_TYPE,
    'LONG',t.DATA_TYPE,
    'LONG RAW',t.DATA_TYPE,
    'ROWID',t.DATA_TYPE,
    'MLSLABEL',t.DATA_TYPE,
    t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
    decode(t.nullable,
    'N','NOT NULL',
    'n','NOT NULL',
    NULL) col,
    NUM_DISTINCT,
    DENSITY,
    NUM_BUCKETS,
    NUM_NULLS,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from dba_tab_columns t
    where
    table_name = upper('&Table_name')
    and owner = upper(nvl('&Owner',user))
    /

    select
    INDEX_NAME,
    UNIQUENESS,
    BLEVEL BLev,
    LEAF_BLOCKS,
    DISTINCT_KEYS,
    NUM_ROWS,
    AVG_LEAF_BLOCKS_PER_KEY,
    AVG_DATA_BLOCKS_PER_KEY,
    CLUSTERING_FACTOR,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_indexes t
    where
    table_name = upper('&Table_name')
    and table_owner = upper(nvl('&Owner',user))
    /
    break on index_name
    select
    i.INDEX_NAME,
    i.COLUMN_NAME,
    i.COLUMN_POSITION,
    decode(t.DATA_TYPE,
    'NUMBER',t.DATA_TYPE||'('||
    decode(t.DATA_PRECISION,
    null,t.DATA_LENGTH||')',
    t.DATA_PRECISION||','||t.DATA_SCALE||')'),
    'DATE',t.DATA_TYPE,
    'LONG',t.DATA_TYPE,
    'LONG RAW',t.DATA_TYPE,
    'ROWID',t.DATA_TYPE,
    'MLSLABEL',t.DATA_TYPE,
    t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
    decode(t.nullable,
    'N','NOT NULL',
    'n','NOT NULL',
    NULL) col
    from
    dba_ind_columns i,
    dba_tab_columns t
    where
    i.table_name = upper('&Table_name')
    and t.owner = upper(nvl('&Owner',user))
    and i.table_name = t.table_name
    and i.column_name = t.column_name
    order by index_name,column_position
    /

    prompt
    prompt ***************
    prompt Partition Level
    prompt ***************

    select
    PARTITION_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_tab_partitions t
    where
    table_owner = upper(nvl('&&Owner',user))
    and table_name = upper('&&Table_name')
    order by partition_position
    /

    break on partition_name
    select
    PARTITION_NAME,
    COLUMN_NAME,
    NUM_DISTINCT,
    DENSITY,
    NUM_BUCKETS,
    NUM_NULLS,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_PART_COL_STATISTICS t
    where
    table_name = upper('&Table_name')
    and owner = upper(nvl('&Owner',user))
    /

    break on partition_name
    select
    t.INDEX_NAME,
    t.PARTITION_NAME,
    t.BLEVEL BLev,
    t.LEAF_BLOCKS,
    t.DISTINCT_KEYS,
    t.NUM_ROWS,
    t.AVG_LEAF_BLOCKS_PER_KEY,
    t.AVG_DATA_BLOCKS_PER_KEY,
    t.CLUSTERING_FACTOR,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_ind_partitions t,
    dba_indexes i
    where
    i.table_name = upper('&Table_name')
    and i.table_owner = upper(nvl('&Owner',user))
    and i.owner = t.index_owner
    and i.index_name=t.index_name
    /

    prompt
    prompt ***************
    prompt SubPartition Level
    prompt ***************

    select
    PARTITION_NAME,
    SUBPARTITION_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_tab_subpartitions t
    where
    table_owner = upper(nvl('&&Owner',user))
    and table_name = upper('&&Table_name')
    order by SUBPARTITION_POSITION
    /
    break on partition_name
    select
    p.PARTITION_NAME,
    t.SUBPARTITION_NAME,
    t.COLUMN_NAME,
    t.NUM_DISTINCT,
    t.DENSITY,
    t.NUM_BUCKETS,
    t.NUM_NULLS,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_SUBPART_COL_STATISTICS t,
    dba_tab_subpartitions p
    where
    t.table_name = upper('&Table_name')
    and t.owner = upper(nvl('&Owner',user))
    and t.subpartition_name = p.subpartition_name
    and t.owner = p.table_owner
    and t.table_name=p.table_name
    /

    break on partition_name
    select
    t.INDEX_NAME,
    t.PARTITION_NAME,
    t.SUBPARTITION_NAME,
    t.BLEVEL BLev,
    t.LEAF_BLOCKS,
    t.DISTINCT_KEYS,
    t.NUM_ROWS,
    t.AVG_LEAF_BLOCKS_PER_KEY,
    t.AVG_DATA_BLOCKS_PER_KEY,
    t.CLUSTERING_FACTOR,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
    to_char(t.last_analyzed,'MM-DD-YYYY')
    from
    dba_ind_subpartitions t,
    dba_indexes i
    where
    i.table_name = upper('&Table_name')
    and i.table_owner = upper(nvl('&Owner',user))
    and i.owner = t.index_owner
    and i.index_name=t.index_name
    /

    clear breaks
    set echo on

  3. antiper says

    This blog is pretty good!
    I got something out of it!

  4. Managing CBO Stats during an upgrade to 10g or 11g
    Applies to:
    Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.1.0.6
    Information in this document applies to any platform.
    This note applies to Oracle Applications, Siebel, PSFT as well as databases with custom applications.
    Goal

    The Cost-Based Optimizer (CBO) uses statistics when generating execution plans. CBO statistics can be classified into four types: schema objects, data dictionary, fixed objects, and system performance.

    CBO statistics for schema objects have been used for several releases now. CBO statistics for data dictionary were introduced on 9i and considered optional. CBO system statistics were also introduced on 9i but rarely implemented. CBO statistics for fixed objects were introduced on 10g.

    This note provides guidance on managing existing and new CBO stats during an upgrade from 9i to 10g or 11g, or from 10g to 11g. The terms pre-upgrade release and post-upgrade release are used accordingly: pre-upgrade refers to 9i or 10g, and post-upgrade to 10g or 11g.

    The core idea presented by this note is to continue gathering CBO statistics for application schemas as usual, but to create a clean baseline for non-application objects (data dictionary, fixed objects and system performance).

    Strategy:

    1. For application schema objects, use the specific procedure required by the application vendor if they provide one. For example, if using Oracle eBiz, continue using FND_STATS. If there is no procedure provided by the application vendor, or this is a home-grown application or customization, initially use the same defaults or settings you were using in your pre-upgrade release. Be aware that some of the defaults in the DBMS_STATS package have changed from 9i to 10g/11g, so you may need to use the SET_PARAM API to set them back to the pre-upgrade release levels if your upgrade is from 9i.

    2. For data dictionary objects, gather full statistics once, without histograms. Do not re-gather until a new major upgrade requires it, or the workload changes. Re-gather if you make massive schema changes to the environment (e.g., add hundreds of new database users, drop and create a large number of objects, plug new tablespaces into the database, etc.).

    3. For fixed objects, gather once during normal system load. Do not re-gather until a new major upgrade requires it, or the workload changes.

    4. For system performance stats, gather once with normal system load. Repeat only if system configuration or load changes significantly.
    Solution

    Before upgrading from 9i to 10g/11g; or from 10g to 11g:

    1. Continue gathering CBO statistics as per your current procedures.

    2. Make a full backup of your CBO statistics. Use scripts coe_create_user_coecbostats.sql and coe_backup_cbo_stats.sql connected as SYS.

    3. If the pre-upgrade instance will be destroyed soon after the upgrade, export the schema owner COECBOSTATS using the EXP utility.
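
    The coe_*.sql scripts are attachments to the original note and are not reproduced here; a minimal native equivalent of the backup step, using only documented DBMS_STATS calls, might look as follows (APPOWNER is a hypothetical application schema; the COECBOSTATS/COE$_STATTAB names are taken from the note):

    BEGIN
      -- one-time repository for persistent copies of CBO statistics
      dbms_stats.create_stat_table('COECBOSTATS', 'COE$_STATTAB');
      -- back up the application schema statistics under a named stat id
      dbms_stats.export_schema_stats(
        ownname => 'APPOWNER',
        stattab => 'COE$_STATTAB',
        statid  => 'PRE_UPGRADE',
        statown => 'COECBOSTATS');
    END;
    /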

    After upgrading from 9i to 10g/11g; or from 10g to 11g:

    1. If your application provides its own procedure to gather CBO statistics, immediately disable the job that performs automatic gathering of CBO statistics:

    * On 10g, connect as SYS and EXEC DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
    * On 11g, connect as SYS and EXEC DBMS_AUTO_TASK_ADMIN.DISABLE('auto optimizer stats collection', NULL, NULL);

    2. Create a baseline of Data Dictionary CBO statistics by using script coe_gather_dictionary_stats.sql connected as SYS.

    3. Create during normal system load a baseline of Fixed Objects CBO statistics by using script coe_gather_fixed_objects_stats.sql connected as SYS.

    4. Create a baseline of System statistics for the CBO by using scripts coe_gather_system_stats_nw.sql, coe_gather_system_stats_start.sql and coe_gather_system_stats_stop.sql connected as SYS.

    4.1 Execute coe_gather_system_stats_nw.sql once in order to generate system statistics that are independent of the system workload (no workload, "nw").

    4.2 During normal system load, first execute coe_gather_system_stats_start.sql, then wait for at least two hours and execute coe_gather_system_stats_stop.sql. This pair generates system statistics that depend on the system workload between the start and stop times. If your workload or hardware configuration changes over time, execute the pair again in the same manner, after the change has been implemented and during normal system utilization.
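
    If the coe_*.sql scripts are not at hand, the underlying documented DBMS_STATS calls for steps 2 through 4 look roughly as follows (a sketch only; the scripts additionally back up the statistics before and after each gather):

    BEGIN
      -- step 2: data dictionary statistics (compute, no histograms, cascade to indexes)
      dbms_stats.gather_dictionary_stats(
        estimate_percent => 100,
        method_opt       => 'FOR ALL COLUMNS SIZE 1',
        cascade          => TRUE);
      -- step 3: fixed object (X$) statistics, gathered during normal load
      dbms_stats.gather_fixed_objects_stats;
      -- step 4.1: no-workload system statistics
      dbms_stats.gather_system_stats('NOWORKLOAD');
    END;
    /
    -- step 4.2: workload system statistics, bracketing at least two hours of normal load
    exec dbms_stats.gather_system_stats('START');
    -- ... wait two hours or more ...
    exec dbms_stats.gather_system_stats('STOP');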

    Scripts: (attached to this note)

    These scripts are provided as a mechanism to perform the actions described in this note. Each takes a backup of the CBO statistics before refreshing them.

    1. coe_create_user_coecbostats.sql creates a new schema owner COECBOSTATS with one object (COE$_STATTAB). Table COE$_STATTAB is a repository to store persistent versions of CBO statistics for any of the four types: schema objects, data dictionary, fixed objects, and system performance. Execute connected as SYS.

    2. coe_backup_cbo_stats.sql can be used on 9i, 10g or 11g. It creates a backup of all four types of CBO statistics when executed on 10g/11g, and all three valid types on 9i (skipping fixed objects). Execute connected as SYS.

    3. coe_gather_dictionary_stats.sql generates corresponding CBO statistics using an estimated percentage of 100% (compute), no histograms, and cascades into all related indexes (gathers statistics on indexes). It makes a backup of these statistics before and after gathering. Execute connected as SYS. It can be used on 9i, 10g or 11g.

    4. coe_gather_fixed_objects_stats.sql generates corresponding CBO statistics. It makes a backup of these statistics before and after gathering. Execute connected as SYS. It can be used on 10g or 11g.

    5. coe_gather_system_stats_nw.sql gathers system statistics that are independent of the workload. Execute connected as SYS. It can be used on 10g or 11g.

    6. coe_gather_system_stats_start.sql is used in combination with coe_gather_system_stats_stop.sql. They start and stop the gathering of system statistics that depend on the system workload and configuration. They must be executed during a normal system utilization window. Execute connected as SYS. It can be used on 10g or 11g.

    7. coe_gather_system_stats_stop.sql is used in combination with coe_gather_system_stats_start.sql. They start and stop the gathering of system statistics that depend on the system workload and configuration. They must be executed during a normal system utilization window. Execute connected as SYS. It can be used on 10g or 11g.

    Words of Caution:

    1. If your application does not provide specific instructions regarding CBO statistics gathering and you were using the default functionality of DBMS_STATS on your pre-upgrade release (9i/10g), you may want to take the conservative approach of preserving the pre-upgrade functionality initially, and gradually incorporate the new DBMS_STATS defaults of the post-upgrade release (10g/11g). This is especially important when going from 9i to 10g or 11g.

    2. ESTIMATE_PERCENT had a default of 100% for sample size on 9i, while 10g defaults this parameter to DBMS_STATS.AUTO_SAMPLE_SIZE, which derives a very small estimate percentage (sample size). Small sample sizes are known to produce poor number-of-distinct-values (NDV) figures on columns with skewed data (which are common), and thus generate sub-optimal plans. Therefore use an estimate sample size of 100% on 10g if your maintenance window can afford it, even if that means gathering statistics less often. If 100% is not feasible, try using an estimate of at least 30%. On 11g the default DBMS_STATS.AUTO_SAMPLE_SIZE gathers stats with a large sample size, so using the default value is the better approach.

    3. METHOD_OPT has a default of 'FOR ALL COLUMNS SIZE 1' on 9i, which basically meant NO HISTOGRAMS. 10g and 11g default to AUTO, which means DBMS_STATS decides for which columns a histogram may help to produce a better plan. It is known that in some cases the effect of a histogram is adverse to the generation of a better plan. Again, you may want to initially set this parameter to the value used in your pre-upgrade release, and later adjust it to the post-upgrade default.

    4. In summary, to avoid causing changes to the execution plan, try to keep the statistics gathering process as close as possible to your pre-upgrade release, at least until after the upgrade is complete and stable, then adjust gradually to the features provided by the new release.
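
    A minimal sketch of pinning 9i-like defaults on a freshly upgraded 10g instance, per the cautions above (the quoted 'NULL' sets estimate_percent to NULL, i.e. compute; see the SET_PARAM notes in a later comment):

    exec dbms_stats.set_param('ESTIMATE_PERCENT','NULL');              -- quoted NULL = compute (100%)
    exec dbms_stats.set_param('METHOD_OPT','FOR ALL COLUMNS SIZE 1'); -- no histograms, as on 9i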

  5. SIZE Clause in METHOD_OPT Parameter of DBMS_STATS Package
    Applies to:
    Oracle Server – Enterprise Edition – Version: 9.0.1.0 to 11.2.0.2
    Information in this document applies to any platform.
    Purpose
    This note clarifies use of the SIZE clause in the METHOD_OPT parameter of the DBMS_STATS package and its default value.
    Scope and Application
    For DBAs of all levels.
    SIZE Clause in METHOD_OPT Parameter of DBMS_STATS Package

    The SIZE clause is an optional clause of the METHOD_OPT parameter, which itself is an optional parameter of the DBMS_STATS procedures GATHER_DATABASE_STATS, GATHER_DICTIONARY_STATS, GATHER_SCHEMA_STATS and GATHER_TABLE_STATS. This parameter determines whether to collect histograms on the columns of the tables, and if so, on which columns and how.

    Description of the METHOD_OPT parameter in the Oracle Documentation ‘Database PL/SQL Packages and Types Reference’ (Chapter DBMS_STATS) is as follows:

    METHOD_OPT

    accepts:

    * FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
    * FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]

    size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}

    – integer : Number of histogram buckets. Must be in the range [1,254].
    – REPEAT : Collects histograms only on the columns that already have histograms.
    – AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
    – SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.

    The [size_clause] in the METHOD_OPT parameter determines the number of histogram buckets for a column, which must be in the range [1,254]. 'SIZE 1' means no histograms.

    In Oracle 9i METHOD_OPT, if not specified, defaults to 'FOR ALL COLUMNS SIZE 1', which means no histograms are collected on any column of the table.

    In Oracle 10g and 11g the default value of METHOD_OPT was changed to 'FOR ALL COLUMNS SIZE AUTO', which means Oracle automatically determines on which columns to collect histograms and how. This often leads to many more columns with histograms than there would have been in Oracle 9i, which in turn may negatively affect execution plans of some SQLs after an upgrade from 9i. See Note 465787.1 for more details.

    If you do specify METHOD_OPT but do not specify the [size_clause], then in all Oracle versions 9i, 10g and 11g the [size_clause] defaults to 'SIZE 75'. In cases where the number of distinct values (NDV) for a column is less than 75, the number of histogram buckets for this column becomes equal to the NDV. The current default value of METHOD_OPT can be checked with:

    select DBMS_STATS.GET_PARAM('METHOD_OPT') from dual;

    In Oracle 11g you can also use the new function GET_PREFS for this purpose.
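
    For example (a sketch; on 11g GET_PREFS also accepts optional owner and table name arguments for per-table preferences):

    SQL> select dbms_stats.get_prefs('METHOD_OPT') from dual;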

    The default value of the METHOD_OPT parameter in Oracle 10g and 11g can be changed using the SET_PARAM procedure. For example:
    EXEC DBMS_STATS.SET_PARAM('method_opt','for all columns size 1');

    In Oracle 11g you can also use the new DBMS_STATS procedures SET_GLOBAL_PREFS, SET_DATABASE_PREFS, SET_SCHEMA_PREFS and SET_TABLE_PREFS, which add more flexibility when setting default values at different database levels.
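
    For instance, to override the default for a single table on 11g (SCOTT.EMP is a hypothetical example):

    exec dbms_stats.set_table_prefs('SCOTT', 'EMP', 'METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');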

  6. How to Change Default Parameters for Gathering Statistics
    Applies to:
    Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.1.0.7
    Information in this document applies to any platform.
    Goal
    This document outlines how to change the default parameters in use for gathering optimizer statistics in Oracle 10g and Oracle 11g and provides an outline of the values possible for these parameters. This is useful when using automatic statistics gathering within a maintenance window, as these default parameter settings define how these statistics will be collected.
    Solution
    Syntax for Changing Default Parameters for Gathering Statistics

    In 10g and 11g the default values for parameters used to gather statistics may be changed
    by using the DBMS_STATS.SET_PARAM procedure.

    The procedure needs to be run individually for each parameter that one wishes to change.

    The syntax for this is as follows:

    DBMS_STATS.SET_PARAM ( pname IN VARCHAR2, pval IN VARCHAR2);

    pname is the parameter name.
    pval is the value to set for the associated pname parameter.

    Which Parameters can be Set?

    The DBMS_STATS.SET_PARAM can be used to set the following parameters:

    CASCADE
    Controls whether indexes are analyzed at the same time
    Default: TRUE
    Possible Values:

    * TRUE
    * FALSE

    Note: The default value for CASCADE set by SET_PARAM is not used by export/import procedures. It is used only by gather procedures.

    Example: exec DBMS_STATS.SET_PARAM('CASCADE','FALSE');

    DEGREE
    Degree of parallelism.
    Default: NULL
    Possible Values:

    * NULL – Use the table default value specified by the DEGREE clause in the CREATE TABLE or ALTER TABLE statement.
    * integer – the integer will be used as the degree for all objects

    Example: exec DBMS_STATS.SET_PARAM('DEGREE','5');

    ESTIMATE_PERCENT
    Percentage of rows to estimate
    Default: DBMS_STATS.AUTO_SAMPLE_SIZE
    Possible Values:

    * Valid range is [0.000001,100]
    * NULL – compute will be used (100%)
    * DBMS_STATS.AUTO_SAMPLE_SIZE – sample sizes may vary for different versions, for example, this tends to default to a smaller sample size in 10g than in 11g.

    Example: exec DBMS_STATS.SET_PARAM('ESTIMATE_PERCENT','NULL');
    Note: When NULL is unquoted, this sets the parameter to the value Oracle recommends.
    In the case of the quoted 'NULL', this sets the value of the parameter itself to NULL,
    so the above example means estimate_percent = NULL (i.e. compute), as opposed to
    unquoted NULL, which would imply using the default for this parameter for the
    specific Oracle version.

    METHOD_OPT
    Used to gather column statistics
    Default: FOR ALL COLUMNS SIZE AUTO.
    Possible Values:

    * FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
    * FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]
    size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
    o integer : Number of histogram buckets. Must be in the range [1,254].
    o REPEAT : Collects histograms only on the columns that already have histograms.
    o AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
    o SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.

    Example: exec DBMS_STATS.SET_PARAM('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');

    NO_INVALIDATE
    Determines whether to invalidate dependent cursors or not
    Default: DBMS_STATS.AUTO_INVALIDATE
    Possible Values:

    * DBMS_STATS.AUTO_INVALIDATE – Oracle decides when to invalidate dependent cursors.
    * TRUE – Does not invalidate the dependent cursors
    * FALSE – Invalidates dependent cursors

    Example: exec DBMS_STATS.SET_PARAM('NO_INVALIDATE','FALSE');

    GRANULARITY
    Determines granularity of statistics to collect (only pertinent if the table is partitioned).
    Default: 'AUTO'
    Possible Values:

    * 'AUTO' – determines the granularity based on the partitioning type
    * 'ALL' – gathers all (subpartition, partition, and global) statistics
    * 'GLOBAL' – gathers global statistics
    * 'GLOBAL AND PARTITION' – gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object.
    * 'PARTITION' – gathers partition-level statistics
    * 'SUBPARTITION' – gathers subpartition-level statistics.

    Note: 'DEFAULT' is obsolete. This option gathers global and partition-level statistics.
    It is currently supported, but included in the documentation for legacy reasons only.
    Use 'GLOBAL AND PARTITION' for this functionality.

    Example: exec DBMS_STATS.SET_PARAM('GRANULARITY','GLOBAL AND PARTITION');

    AUTOSTATS_TARGET
    This parameter is applicable only for auto statistics collection.
    The value of this parameter controls the objects considered for statistics collection
    Default: 'AUTO'
    Possible Values:

    * 'AUTO' – Oracle decides for which objects to collect statistics
    * 'ALL' – Statistics are collected for all objects in the system
    * 'ORACLE' – Statistics are collected for all Oracle-owned objects.
    This option restricts the automatic stats gathering job to the Oracle
    component schemas, e.g. SYS, SYSMAN, WMSYS and EXFSYS in a sample database.

    Usage Notes

    * To run this procedure, the user must have the SYSDBA or both the ANALYZE ANY DICTIONARY and ANALYZE ANY system privileges.
    * Note that both arguments are of type VARCHAR2 and the values need to be enclosed in quotes even when they represent numbers.
    * Note also the difference between NULL and 'NULL':
    o When NULL is unquoted, this sets the parameter to the value Oracle recommends.
    o In the case of the quoted 'NULL', this sets the value of the parameter to NULL.

    How to Check Present Values for a Parameter

    In order to check the present value for a certain parameter do:
    select dbms_stats.get_param(pname) from dual;

  7. Statistics Gathering: Frequency and Strategy Guidelines

    Purpose
    ——-
    Provide guidelines for Gathering CBO statistics.

    Audience
    ——–
    DBAs

    Recommendations for Gathering CBO Statistics
    ——————————————–

    Summary
    ——-

    o Use individual statistic gathering commands for more control
    o Gather statistics on tables with a 5% sample
    o Gather statistics on indexes with compute
    o Add histograms where column data is known to be skewed
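
    A hedged sketch of these recommendations as plain DBMS_STATS calls (SCOTT.EMP, its EMP_PK index and the skewed DEPTNO column are hypothetical; passing estimate_percent => NULL means compute):

    BEGIN
      -- tables: 5% sample, base column statistics only, no cascade into indexes
      dbms_stats.gather_table_stats(
        ownname => 'SCOTT', tabname => 'EMP',
        estimate_percent => 5,
        method_opt => 'FOR ALL COLUMNS SIZE 1',
        cascade => FALSE);
      -- indexes: compute, since the index data is already sorted
      dbms_stats.gather_index_stats(
        ownname => 'SCOTT', indname => 'EMP_PK',
        estimate_percent => NULL);
      -- histograms: add one only where the column data is known to be skewed
      dbms_stats.gather_table_stats(
        ownname => 'SCOTT', tabname => 'EMP',
        estimate_percent => 5,
        method_opt => 'FOR COLUMNS DEPTNO SIZE 254');
    END;
    /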

    Explanation of summary:
    ———————–
    The level to which object statistics should be collected is very much data dependent.
    The goal is to read as little data as possible while still achieving an accurate
    sample. Different sample sizes may be required to generate figures accurate enough
    to produce acceptable plans.

    Research has indicated that a 5% sample is generally sufficient for most tables.
    Gathering statistics on tables requires sorting to be done. Gathering statistics
    on indexes does not because the data is already sorted.
    Often this means that a compute on an index will perform acceptably whereas
    a compute on a table will not.
    Column statistics in the form of histograms are only appropriate for columns
    whose distribution deviates from the expected uniform distribution.

    Gathering statistics detail
    —————————

    The following article is a collection of opinions. Different systems need
    different levels of statistical analysis due to differences in data. However,
    if these recommendations are used sensibly, they give good basic guidelines
    for gathering statistics on objects.

    – The reason for gathering statistics is to provide the CBO with the
    best information possible to help it choose a 'good' execution plan.

    – The accuracy of the stats depends on the sample size.

    – Even given COMPUTED stats, it is possible that CBO will not
    arrive at the BEST plan for a given SQL statement. This is because
    the optimizer inherently makes assumptions and has only limited
    information available.

    – Given a production system with predictable, known queries,
    the 'best' execution plan for each statement is not likely to
    vary over time – unless the application is unusual and
    uses data with wildly different characteristics from day to day.

    – Given that the 'best' plan is unlikely to change, frequent statistics
    gathering has no benefit. It does incur costs, though.

    – To determine the best sample size it is best to gather statistics
    using different sample sizes and look at the results.
    The statistics should be fairly consistent once a reasonable
    sample size has been used. Increasing the sample size beyond
    a given size is unlikely to improve the accuracy. You can see this
    easily by analyzing 10 rows, 100 rows, 1000 rows etc.. At some
    point the results should start to look consistent.

    – To determine the best statistic gathering interval one should
    keep a history of the statistics from BEFORE and AFTER each collect.
    By keeping a history the user can check for varying statistics and
    adjust the sampling accordingly.
    If the statistics remain reasonably constant then the statistic
    gathering activity may not be adding any value. Unfortunately,
    it is not possible to determine that there will be a problem without
    collecting the statistics.

    – If the before/after stats do vary often, then either the
    data profile is not predictable, or the sample size is too small
    to accurately reflect the true nature of the data,
    or (unlikely) the data is too random, in which case the stats
    are of limited use anyway and one should think of ways of ensuring
    queries using the data will get at least a reasonable response.

    – As the CBO uses the stats as the basis of its cost calculations, gathering
    new statistics on a table MAY result in a different execution plan
    for some statements. This is expected behaviour and allows the CBO to
    adjust access paths if the data profile changes.

    – It is possible that a different execution plan may be worse than
    the original plan. The difference may be small or quite large.
    This is a non-negotiable fact. Given a set of base information
    CBO will choose a plan. Given slightly different base information
    it may choose a different plan. It is unlikely the change of plan
    will coincide exactly with the real point at which a change of
    plan would be beneficial.

    – For most production systems predictability is more important than
    absolute best performance. Hopefully from the above it is clear
    that gathering statistics can have a destabilizing effect.
    This is not to say it should not be used, but to be aware of what may happen.

    – Most applications have several queries that form the heart of
    most transactions. These queries are critical in that any
    adverse change in execution plan could incur a high cost due to
    the number and frequency of usage of these statements.

    It is a good idea to isolate such statements into a test-suite
    where sample queries can be used to gauge whether performance of the
    key statements has deteriorated badly due to a statistics collect,
    i.e. after gathering statistics check that these statements
    still respond in acceptable times.

    – It is recommended that users store critical statistics to allow
    them to revert to a working configuration in the event of a statistics
    change that significantly affects application performance. The following
    article explains how to store statistics: Note:117203.1

    – If gathering statistics causes critical statements to perform badly you
    can revert to the pre-analyze stats by:

    – importing previously exported statistics (see the sketch at the end of this comment)

    – using database point-in-time recovery.

    – re-analyze with a larger sample size in an attempt to
    generate more accurate statistics

    – Look at the bad plan/plans to see where the ‘bad’ chunk
    of cost has been introduced.

    – Use any available tuning options to correct the problem. For example:
    add hints to statements or views to correct the performance
    of problem queries, or selectively delete statistics.

    – For any statement that does suffer a change between good and very
    bad plans there is usually some element of the cost which is
    finely balanced and the re-analyze tips you between the plans.
    Any such statements are risky in a production system but there is no
    easy way to identify them. Generally, it is best to hint
    the SQL wherever there could be a fine balance between 2 options.

    – In a DSS / warehouse environment, queries are generally NOT predictable
    so there is no stable environment / query set to upset by gathering statistics.

    – Although the above may sound quite dramatic, it is actually unusual for
    a plan to change wildly or for stats to be finely balanced.
    Unusual does NOT mean it never happens; it can happen, but the
    likelihood is very small.
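
    As a concrete illustration of the store-and-revert advice above, a minimal sketch using the documented export/import calls (SCOTT.EMP and the MYSTATS table are hypothetical):

    -- before re-gathering, keep a named copy of the current statistics
    exec dbms_stats.create_stat_table('SCOTT','MYSTATS');
    exec dbms_stats.export_table_stats('SCOTT','EMP', stattab => 'MYSTATS', statid => 'GOOD');

    -- if the new statistics hurt critical statements, put the old ones back
    exec dbms_stats.import_table_stats('SCOTT','EMP', stattab => 'MYSTATS', statid => 'GOOD');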

  8. wen4270407 says

    So if we move a 10g MV log over to 9i, will that cause any problems?
