When does the db file sequential read wait event occur?

Many readers have questions about the frequent "db file sequential read" wait events they see in their systems. So in exactly which scenarios is this single-block read wait event triggered?

In an earlier post, "SQL Tuning: A Case Where Clustering Factor Affects Delete Speed", I summarized the scenarios in which the db file sequential read wait event may occur. I will share them again here:

The "db file sequential read" single-block read wait is one of the most common physical I/O wait events. "Sequential" here means that the block is read into a contiguous area of memory (contiguous memory space), not that the blocks being read are contiguous on disk. This wait event can occur in the following scenarios:

  1. Most commonly, the execution plan contains an INDEX FULL SCAN or INDEX UNIQUE SCAN. "db file sequential read" waits are expected here and generally need no special attention.
  2. When the execution plan contains an INDEX RANGE SCAN followed by "TABLE ACCESS BY INDEX ROWID"/"DELETE"/"UPDATE", the server process accesses the index and the table in the order "probe the index -> obtain the rowid -> visit the table block identified by the rowid and perform the required operation". Every physical read enters a "db file sequential read" wait, and each read fetches exactly one block. In this case the clustering_factor comes into play and deserves special attention; the fix described in that earlier post also applies here.
  3. Extent boundaries. Suppose an extent contains 33 blocks and a "db file scattered read" multiblock read fetches 8 blocks at a time. After four multiblock reads of that extent, one block remains; remember that a scattered read cannot span an extent, so that last block is fetched with a single-block read, producing "db file sequential read". This is normal and usually needs no attention.
  4. Suppose an extent holds 8 blocks, say a, b, c, d, e, f, g, h, and every block except d happens to be cached in the buffer cache. When the data in that extent is accessed, block d is fetched with a single-block read, and a "db file sequential read" wait appears. Note that this applies not only to tables but also to indexes. This is normal and usually needs no attention.
  5. Chained/migrated rows. Without going into how row chaining arises, chained/migrated rows force the server process to perform extra single-block reads while fetching a row, producing "db file sequential read". This deserves special attention, because large numbers of chained/migrated rows can severely degrade operations such as FULL SCANs (in my experience, a full table scan that used to take 30 minutes can take several hours once the table accumulates many chained rows), and they also impose less obvious overhead on other operations. You can watch the "table fetch continued row" statistic in v$sysstat to gauge chained/migrated-row access across the system, and check CHAIN_CNT in DBA_TABLES for individual tables; the latter requires regularly gathered statistics. If statistics are not gathered regularly, you can use the @?/rdbms/admin/utlchain script together with the ANALYZE TABLE ... LIST CHAINED ROWS command to obtain the necessary chained-row information.
  6. Index entry creation. When rows are INSERTed into a table, the execution plan does not show much detail, but Oracle must use the indexes to quickly validate certain constraints on the table and must insert entries into the index leaf blocks. "db file sequential read" waits can occur here too, depending on how the insert is performed. This is normal and usually needs no attention.
  7. UPDATE/DELETE by rowid. Unlike the "INDEX RANGE SCAN-UPDATE/DELETE" case above, when we update or delete rows by rowid the server process first reads the row from the table block that the rowid points to (note: the table block is visited first), and then uses the column values from that row to visit the index leaf blocks (Oracle does not know where those leaf blocks are, so it must walk the index branch blocks just as a range scan or unique scan would). All of these accesses are single-block reads and show up as 'db file sequential read'; only after these reads does the actual EXEC of the update or delete take place, as in the following example:
In the trace below, obj#=1307547 is the SAMPLE table, and obj#=1307549 is the only index on it:

PARSING IN CURSOR #10 len=58 dep=0 uid=64 oct=6 lid=64 tim=1275805024007795 hv=505118268 ad='d387e470'
update sample set t2=t2+1 where rowid='AAE/OzAAEAAANUEAAQ'
END OF STMT
PARSE #10:c=1999,e=3016,p=1,cr=1,cu=0,mis=1,r=0,dep=0,og=1,tim=1275805024007787
WAIT #10: nam='db file sequential read' ela= 314 file#=4 block#=54532 blocks=1 obj#=1307547 tim=1275805024008308
WAIT #10: nam='db file sequential read' ela= 206 file#=6 block#=20 blocks=1 obj#=1307549 tim=1275805024009235
WAIT #10: nam='db file sequential read' ela= 206 file#=6 block#=742 blocks=1 obj#=1307549 tim=1275805024009496
WAIT #10: nam='db file sequential read' ela= 207 file#=6 block#=24 blocks=1 obj#=1307549 tim=1275805024009750
EXEC #10:c=2000,e=2297,p=6,cr=2,cu=8,mis=0,r=1,dep=0,og=1,tim=1275805024010210   -- the actual UPDATE happens here

When such UPDATE/DELETE statements run in bulk, the table and its indexes must be visited alternately at high frequency; if some index on the table happens to have a high clustering_factor, you end up with exactly the performance problem described in that post. In fact, when a table has many indexes, batch update/delete by rowid is not recommended; it is only really efficient when the table has no indexes. If you insist on doing it this way, refer to the advice given above.
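The chained/migrated-row check mentioned in scenario 5 can be sketched as follows. This is a minimal sketch; the table name ORDERS is hypothetical, and the script path is resolved by SQL*Plus:

```sql
-- Create the CHAINED_ROWS table using the standard Oracle script.
@?/rdbms/admin/utlchain

-- List chained/migrated rows of the (hypothetical) ORDERS table into it.
ANALYZE TABLE orders LIST CHAINED ROWS INTO chained_rows;

-- How many rows are chained or migrated:
SELECT COUNT(*) FROM chained_rows WHERE table_name = 'ORDERS';

-- System-wide indicator: continued-row fetches since instance startup.
SELECT name, value FROM v$sysstat WHERE name = 'table fetch continued row';
```

A steadily growing "table fetch continued row" value, combined with a non-trivial CHAINED_ROWS count, suggests the table is worth rebuilding or moving.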

 

8. Bugs! With 9i RAC, and with ASM on 10g, there are known bugs that cause single-block "db file sequential read" to be used in situations where multiblock "db file scattered read" would be appropriate. If none of the scenarios above matches your situation but you still see large numbers of "db file sequential read" waits, you may have hit a bug. Some known bugs are listed here:

Bug# Version Affected
Bug 7243560 - High "db file sequential read" IO times when using ASM 10.2.0.4/11.1.0.7
Bug 7243560: RAPID INCREASE IN DB FILE SEQUENTIAL READ AFTER MOVING TO ASM 10.2.0.3
Bug 9711810: EXCESSIVE DB FILE SEQUENTIAL READS WITH NON COMPLIANT BUFFER CACHE ON RAC 9.2.0.8
Bug 9276739: INSERT STATEMENT HAS SLOW PERFORMANCE WITH DB FILE SEQUENTIAL READ 10.2.0.4
Bug 8625100: EXCESSIVE DB FILE SEQUENTIAL READ ON UNDO 10.2.0.4
Bug 8669544: HIGH DB FILE SEQUENTIAL READ AND GC CR DISK READ WAIT EVENTS DURING FULL SCAN 10.2.0.4
Bug 7427133: AN INSERT CAUSES LOTS OF 'DB FILE SEQUENTIAL READ' WAITS FOR THE INDEX BLOCKS 9.2.0.8
Bug 8493139: INCREASE IN DB FILE SEQUENTIAL READ WAITEVENT AFTER MIGRATING TO 10 RAC/ASM 10.2.0.4
Bug 5882268: PERFORMANCE ISSUE WITH 'DB FILE SEQUENTIAL READ' 10.2.0.2
Bug 7415702: LOTS OF 'DB FILE SEQUENTIAL READ' ON UNDO 10.2.0.3
Bug 5607724: 10202 DB FILE SEQUENTIAL READ THRICE AFTER UPGRADE FROM 9I 10.2.0.2

Number of distinct values (NDV) & synopsis & Histogram gather speed

Number of distinct values (NDV) & synopsis

Pre 11g – Sampling based
Sample the data, get NDV from the sample and scale it up
Computation involves a sort and may spill to disk -> expensive
Cannot scale the NDV well for skewed columns -> inaccurate NDV

11g – Approximate NDV
Create an auxiliary structure, the synopsis, by scanning the data
A synopsis can be viewed as a sample of the distinct values
The structure uses a bounded amount of memory
Synopses on different segments of a table can be aggregated into a single synopsis
NDV can be derived accurately and efficiently from the synopsis

 

A synopsis is a collection of hash values of the distinct values
The hash values have the following properties:
Bits are independent of each other
Each bit has the same probability of being 0 or 1
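On 11g, the synopsis-based Approximate NDV described above is used when DBMS_STATS is allowed to choose the sample size, and per-partition synopses are maintained when incremental statistics are enabled. A minimal sketch, assuming a hypothetical partitioned table SALES_FACT owned by the current user:

```sql
BEGIN
  -- Maintain per-partition synopses so global NDV can be aggregated
  -- without re-scanning unchanged partitions.
  DBMS_STATS.SET_TABLE_PREFS(USER, 'SALES_FACT', 'INCREMENTAL', 'TRUE');

  -- AUTO_SAMPLE_SIZE triggers the 11g Approximate NDV algorithm
  -- (a full scan that builds synopses, with no sort to spill to disk).
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'SALES_FACT',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);
END;
/
```

With a fixed ESTIMATE_PERCENT (as in the 10.2.0.5 transcript below), the old sampling-and-sort path is used instead, which is why those gathers can exhaust TEMP.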

 

 

 

SQL> select bytes/1024/1024 from dba_segments where segment_name='LARGE_HISTOGRAM';

BYTES/1024/1024
---------------
           2333

SQL> 
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

SQL> set timing on;
SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 1', estimate_percent=>100);

PL/SQL procedure successfully completed.

Elapsed: 00:02:01.14
SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.05
SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 1', estimate_percent=>100);

PL/SQL procedure successfully completed.

Elapsed: 00:02:01.16
SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.03
SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100);
BEGIN dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100); END;

*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
ORA-06512: at "SYS.DBMS_STATS", line 15027
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1

Elapsed: 00:00:28.87

SQL> alter system set pga_aggregate_target=1000M;

System altered.

Elapsed: 00:00:00.01
SQL> show parameter work

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fileio_network_adapters              string
workarea_size_policy                 string      AUTO

SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100);
BEGIN dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100); END;

*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
ORA-06512: at "SYS.DBMS_STATS", line 15027
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1

Elapsed: 00:00:28.65

SQL> alter tablespace temp add tempfile size 10g;

Tablespace altered.

Elapsed: 00:00:00.08

SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.04
SQL> select * from v$sort_usage;

no rows selected

Elapsed: 00:00:00.03
SQL>  exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100);

SQL>  exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100);
BEGIN dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100); END;

*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcfrbd_1], [0], [131068], [], [],
[], [], []
ORA-06512: at "SYS.DBMS_STATS", line 15027
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1

Elapsed: 00:00:30.63

SQL> create temporary tablespace temp1 tempfile size 10g;

Tablespace created.

Elapsed: 00:00:00.11
SQL> alter database default temporary tablespace temp1;

Database altered.

Elapsed: 00:00:00.03

SQL> alter tablespace temp1 add tempfile size 10g;

Tablespace altered.

Elapsed: 00:00:00.02

SQL>  exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100);
BEGIN dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>100); END;

*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [673281], [1],
[673280], [673280], [], []
ORA-06512: at "SYS.DBMS_STATS", line 15027
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1

Elapsed: 00:00:31.63

SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>1);
BEGIN dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE 100',estimate_percent=>1); END;

*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [673281], [1],
[673280], [673280], [], []
ORA-06512: at "SYS.DBMS_STATS", line 15027
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1

Elapsed: 00:00:01.72

SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE AUTO',estimate_percent=>1);

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.57

SQL> exec dbms_stats.gather_table_stats(USER,'LARGE_HISTOGRAM',method_opt=>'FOR ALL COLUMNS SIZE AUTO',estimate_percent=>100);

PL/SQL procedure successfully completed.

Elapsed: 00:02:02.13

11.2.0.3

 

Related bugs:

Bug 8663644  Slow Histograms gathering / Histograms are gathered serially

Bug 13583722 – slow incremental stats gather from global histogram gathers (Doc ID 13583722.8), among others

 

[Repost] BUFFER SORT is BUFFER but not SORT

People who use AUTOTRACE to view execution plans often ask what the BUFFER SORT in the plan means, and why a sort would be needed there.

BUFFER SORT is not a sort; it is a way of creating a temporary table.

BUFFER is what the plan is really trying to express; it is the operation: hold a temporary table in memory.

SORT modifies BUFFER and says where in memory the temporary table is held: in the sort area of the SQL work area in the PGA.

There is at least one way to convince the skeptics: query the V$SQL_PLAN_STATISTICS_ALL.PROJECTION column.

First set STATISTICS_LEVEL to ALL, then run a genuine sort command, for example: select hire_date,salary from hr.employees order by hire_date

Then look at its V$SQL_PLAN_STATISTICS_ALL.PROJECTION column:

SYS@br//scripts> select projection from v$sql_plan_statistics_all where 
sql_id=(select sql_id from v$sql where sql_text='select hire_date,salary from hr.employees order by hire_date') 
and operation='SORT' and options='ORDER BY';

PROJECTION
----------------------------------------------------------------
(#keys=1) "HIRE_DATE"[DATE,7], "SALARY"[NUMBER,22]

1 row selected.

The #keys at the beginning is the number of columns the returned result is sorted on.

Run another genuine sort command: select hire_date,salary from hr.employees order by salary,hire_date

Then check its V$SQL_PLAN_STATISTICS_ALL.PROJECTION column; #keys should be 2:

SYS@br//scripts> select projection from v$sql_plan_statistics_all where 
sql_id=(select sql_id from v$sql where sql_text='select hire_date,salary from 
hr.employees order by salary,hire_date') and operation='SORT' and options='ORDER BY';

PROJECTION
------------------------------------------------------------------------------------
(#keys=2) "SALARY"[NUMBER,22], "HIRE_DATE"[DATE,7]

1 row selected.

Now let's see how our adorable BUFFER SORT behaves.

Run the following query, which uses the so-called BUFFER SORT:

select ch.channel_class,c.cust_city,sum(s.amount_sold) sales_amount
from sh.sales s,sh.customers c,sh.channels ch
where s.cust_id=c.cust_id and s.channel_id=ch.channel_id and
c.cust_state_province='CA' and
ch.channel_desc='Internet'
group by ch.channel_class,c.cust_city

Its execution plan is attached below; the operation with Id 5 is BUFFER SORT:
Execution Plan
----------------------------------------------------------
Plan hash value: 3047021169

----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |   133 |  7980 |   902   (2)| 00:00:11 |       |       |
|   1 |  HASH GROUP BY         |           |   133 |  7980 |   902   (2)| 00:00:11 |       |       |
|*  2 |   HASH JOIN            |           | 12456 |   729K|   901   (2)| 00:00:11 |       |       |
|   3 |    MERGE JOIN CARTESIAN|           |   383 | 18001 |   408   (1)| 00:00:05 |       |       |
|*  4 |     TABLE ACCESS FULL  | CHANNELS  |     1 |    21 |     3   (0)| 00:00:01 |       |       |
|   5 |     BUFFER SORT        |           |   383 |  9958 |   405   (1)| 00:00:05 |       |       |
|*  6 |      TABLE ACCESS FULL | CUSTOMERS |   383 |  9958 |   405   (1)| 00:00:05 |       |       |
|   7 |    PARTITION RANGE ALL |           |   918K|    11M|   489   (2)| 00:00:06 |     1 |    28 |
|   8 |     TABLE ACCESS FULL  | SALES     |   918K|    11M|   489   (2)| 00:00:06 |     1 |    28 |
----------------------------------------------------------------------------------------------------

Check its V$SQL_PLAN_STATISTICS_ALL.PROJECTION column:

SYS@br//scripts> select distinct projection from v$sql_plan_statistics_all where sql_id in 
(select distinct sql_id from v$sql where sql_text like 
'%where s.cust_id=c.cust_id and s.channel_id=ch.channel_id and%') 
and operation='BUFFER' and options='SORT';

PROJECTION
-------------------------------------------------------------------------
(#keys=0) "C"."CUST_ID"[NUMBER,22], "C"."CUST_CITY"[VARCHAR2,30]

1 row selected.

The result: #keys equals 0. Zero! It means the operation sorts on zero columns, that is, no sorting at all.

Another operation that famously says SORT but does not sort, signaling left while turning right, is SORT AGGREGATE.

All I can say is that the naming of plan operations in AUTOTRACE can be quite mischievous at times.

Reposted from Todd Bao's blog: http://blogs.oracle.com/toddbao/entry/buffer_sort%E6%98%AFbuffer%E5%8D%B4%E4%B8%8D%E6%98%AFsort

 

Seconds in wait

The term “seconds in wait” means the number of seconds that a SQL Statement, Database User, etc. was waiting on an event. It is possible that the total number of “seconds in wait” exceeds the total amount of time in the period. The following example will illustrate how this occurs.

 

Example: Assume that one session locks a row that is needed by ten other sessions. All ten sessions will wait for the lock. Also assume that the session holds the lock for one hour. Since each of the ten sessions waited for one hour, Ignite will show total wait time of ten hours during the one hour period.

 

The maximum theoretical wait time during a period is the number of sessions multiplied by the length of the period.

Round-Trip Time

Round-trip time when running “select 1 from dual” (includes network time but not connect time) on this database.

 

 

 

Solution

If this is high, contact your network administrator to understand network latency.

Reduce the Number of SQL Statements

Shareable SQL uses bind variables rather than literal values. If an application makes use of literal (unshared) SQL then this can severely limit scalability and throughput. The cost of parsing a new SQL statement is expensive both in terms of CPU and the number of times the library cache and shared pool latches may need to be acquired and released. Even parsing a simple SQL statement may need to acquire a library cache latch twenty or thirty times.

 

By looking at the V$SQLAREA view it is possible to see which literal statements are good candidates for converting to use bind variables. The following query shows SQL in the SGA where there are a large number of similar statements:

 

SELECT substr(sql_text,1,50) "SQL",
       count(*),
       sum(executions) "TotExecs"
  FROM v$sqlarea
 WHERE executions < 5
 GROUP BY substr(sql_text,1,50)
HAVING count(*) > 30
 ORDER BY 2;

 

This query finds statements whose first 50 characters are the same and which have only been executed a few times each and have at least 30 different copies of this SQL in the shared pool. The query may need to be modified if the literals are in the first 50 characters.
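As a sketch of the conversion the query above is hunting for, a literal statement and its shareable bind-variable form might look like this (table and column names are hypothetical):

```sql
-- Literal SQL: each distinct value creates a new statement to parse.
SELECT order_total FROM orders WHERE order_id = 1001;
SELECT order_total FROM orders WHERE order_id = 1002;

-- Shareable form: one cursor, parsed once and reused for every value.
VARIABLE oid NUMBER
EXEC :oid := 1001
SELECT order_total FROM orders WHERE order_id = :oid;
```

Only the text of the statement determines sharability, so the bind-variable version appears once in V$SQLAREA with a high execution count instead of thousands of single-execution copies.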

 

There are numerous parameters in the INIT.ORA that can directly impact the efficiency of shared pool usage. For a full accounting of these, refer to MetaLink Note: 62143.1.

Library Cache Hit Ratio

The library cache (a component of the shared pool) stores the executable (parsed or compiled) form of recently referenced SQL and PL/SQL code. Oracle tries to reuse this code.

  • If the code has been executed previously and can be shared, Oracle will report a library cache hit.
  • If Oracle is unable to use existing code, then a new executable version of the code must be built, which is known as a library cache miss.
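One common way to compute the ratio, sketched against V$LIBRARYCACHE (the threshold that counts as "good" varies by workload):

```sql
-- Cumulative library cache hit ratio since instance startup.
-- PINHITS/PINS close to 1 means most executions found shareable code.
SELECT SUM(pinhits) / SUM(pins) AS library_cache_hit_ratio
  FROM v$librarycache;
```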

Improve DBWR Throughput

To improve DBWR’s throughput, consider the following:

  • Disk capabilities and setup (stripe size, speed, controllers, etc.)
  • Using raw devices instead of file system files, depending on their efficiency on your system
  • Spreading database files across drives and controllers
  • Using asynchronous I/O if supported
  • If asynchronous I/O is not possible, using multiple database writers. This is done with the DB_WRITERS parameter in Oracle7, and the DBWR_IO_SLAVES parameter in Oracle8 and 9.
  • Using multiple database writer processes in Oracle8 and later via the DB_WRITER_PROCESSES parameter.
  • Using the "Multiple buffer pools" feature in Oracle8 and higher. See Note:135223.1 on the Oracle MetaLink website.

 

Achieving the optimal setup for DBWR is highly dependent on the characteristics of your platform. The maximum I/O size the platform supports and the ability to support Asynchronous I/O are examples of these characteristics.
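A quick sketch of checking the current writer configuration and whether asynchronous I/O is requested (parameter names as in 9i/10g; availability depends on platform and version):

```sql
-- Current database writer configuration.
SHOW PARAMETER db_writer_processes
SHOW PARAMETER dbwr_io_slaves

-- Whether the instance requests asynchronous I/O for datafiles.
SHOW PARAMETER disk_asynch_io
SHOW PARAMETER filesystemio_options
```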

DB Commit Time

Average number of milliseconds waiting for the “log file sync” event, indicating commit times for this database.

 

Solutions

If this is high, review solutions presented in the help file for “log file sync” in the wait time data.

Oracle Compressed Indexes

Key-compressed indexes are a way to index a low-cardinality column without taking up as much space storing the repetitive values. Please see the Oracle manuals for a full description of implementing compressed indexes.

 

Benefits

  • Fewer index leaf blocks need to be read since more rows fit into fewer blocks
  • Insert statements are faster since fewer leaf blocks need to be read to determine where the new row is inserted
  • Faster full index scans due to fewer leaf blocks
  • Space savings
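A minimal sketch of creating a key-compressed index (table and column names are hypothetical); COMPRESS n stores each distinct combination of the first n key columns once per leaf block instead of once per row:

```sql
-- STATUS is a low-cardinality leading column, so compressing the first
-- key column saves the most space.
CREATE INDEX orders_status_idx
  ON orders (status, order_date)
  COMPRESS 1;

-- An existing index can also be rebuilt in compressed form:
ALTER INDEX orders_status_idx REBUILD COMPRESS 1;
```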
