中午休闲时在itpub看到一个关于network buffer占用大量内存的求助帖,帖子原文如下:
各位大侠们,请教个问题。昨天遇到一个solaris10平台下的oracle10g(10.2.0.4)数据库报共享内存不足,发现数据库的sga_target才2512M,而在v$sgastat视图中查到的
shared pool–>NETWORK BUFFER就有1848744416字节,是什么引起network buffer这么大呢,在udmp目录下1分钟产生几个跟 ORA-4031相关的文件。==================
SQL> show parameter sgaNAME TYPE VALUE
———————————— ———– ——————————
lock_sga boolean FALSE
pre_page_sga boolean FALSE
sga_max_size big integer 2512M
sga_target big integer 2512M
SQL> show parameter shareNAME TYPE VALUE
———————————— ———– ——————————
hi_shared_memory_address integer 0
max_shared_servers integer
shared_memory_address integer 0
shared_pool_reserved_size big integer 72142028
shared_pool_size big integer 0
shared_server_sessions integer
shared_servers inte
NETWORK BUFFER对我们来说或许有些陌生,那是因为绝大多数场合都采用dedicated server模式,共享服务器模式下NETWORK BUFFER将被大量使用。MOS文档[741523.1]叙述了NETWORK BUFFER的主要用途:
On 10.2, after upgrading from 9iR2, the following error occurs:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
plus
Dispatcher Trace file contains an ORA-4031 Diagnostic trace, with:
Allocation request for: NETWORK BUFFER…followed by…
found dead dispatcher ‘D000’, pid = (12, 1)
The amount of memory used by NETWORK BUFFERs in the shared pool has significantly grown between 9.2 and 10.2. The side-effect is to run-out of Shared Pool memory (reporting an ORA-4031), when a large number of sessions are connecting to the server (in the order of 1000’s).
While a session is being established, we allocate 3 buffers each of 32k in size. After the session is established, we use the 3 SDU-sized buffers, however we do not deallocate the 3x32k buffer we allocated initially.
This issue has been logged in unpublished Bug 5410481.
Additionally, there is Bug 6907529.
NS buffers are allocated based on the SDU specified by the user. The negotiated SDU could be considerably lower. The difference between these two is wasted.
For example, the dispatcher specifies an SDU of 32k. Clients, by default, use an SDU of 8k. The remaining 24k is never used.
Issue in Bug 6907529 is fixed in 11.2.
Bug 5410481 is fixed in 10.2.0.3.
As a workaround to 5410481, the ADDRESS part of DISPATCHERS parameter can be used to specify a smaller SDU size.
For example:
DISPATCHERS=”(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp))(SDU=8192))”To implement the change;
- connect to the database as SYSDBA
- alter system set dispatchers='(address=(protocol=tcp)(host=IP-Address)(sdu=8192))(dispatchers=DispatcherCount)’ scope=spfile;
re-start the database
你可能会问SDU是什么?Oracle NET缓存的数据以SDU为基本单位,SDU即 session data unit,一般默认为8192 bytes。当这些数据单元被写满,或被client读取时,他们将被传递给Oracle Network层(oracle network layer)。譬如Data Guard环境中redo传输的每个Chunk往往要大于8192 bytes,那么默认的SDU就不太适用。当有大量重做数据要传输到standby库时,增大SDU buffer的大小可以改善Oracle的网络性能。你可以很方便的通过修改sqlnet.ora配置文件来修改SDU,如在该文件内加入以下条目:
DEFAULT_SDU_SIZE=32767 /*修改全局默认SDU到32k*/
当然你也可以在tnsnames.ora中定义服务别名时个别指定SDU,下文我们会用到。
如上文所述在版本10.2.0.3以前当会话建立时,Oracle会以dispatchers参数定义的SDU为单位,分配3个单位的NETWORK BUFFER,而实际上client端可能并未指定和dispatchers一致的SDU,若dispatchers中定义的SDU为32k,而client使用默认的8k SDU,则一个会话可能要浪费3*32-3*8=72k的NETWORK BUFFER。
为什么共享服务器模式下会用到共享池中的NETWORK BUFFER,而独享服务器模式下没有呢?因为在独享服务器模式下每个会话所分配的三个SDU是从PGA中获取的;当使用共享服务器模式时会话与服务进程形成一对多的映射关系,这三个SDU 的NETWORK BUFFER同UGA一样转移到了SGA中。
下面我们通过实践来进一步验证。
SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi PL/SQL Release 10.2.0.4.0 - Production CORE 10.2.0.4.0 Production TNS for Linux: Version 10.2.0.4.0 - Production NLSRTL Version 10.2.0.4.0 - Production /*实验服务器端采用10.2.0.4版本*/ SQL> show parameter dispatch NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ dispatchers string (address=(protocol=tcp)(host=1 92.168.1.103)(sdu=32768))(SERV ICE=cXDB)(dispatchers=10) /*dispatchers中指定了SDU为32k*/ C:\Windows\System32>tnsping cXDB TNS Ping Utility for 32-bit Windows: Version 11.2.0.1.0 - Production on 05-8月 -2010 22:51:27 Copyright (c) 1997, 2010, Oracle. All rights reserved. 已使用的参数文件: D:\tools\adminstratorg\orahome\network\admin\sqlnet.ora 已使用 TNSNAMES 适配器来解析别名 尝试连接 (DESCRIPTION = (SDU=8192) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.103)(PORT = 1521))) (CONNECT_DATA = (SERVER = SHARED) (SERVICE_NAME = cXDB))) OK (30 毫秒) /* client端采用11.2.0.1版本,定义了共享服务器模式的服务别名,显式指定SDU为8192字节*/
这里我们要用到一个简单的java程序,用来模拟大量会话登录;这个程序很傻瓜,但是总比你一个个开SQLPLUS要明智的多:
/*这是一个很简单的java程序,登录远程数据库,并尝试打开600个回话,并且都指定了SDU为8192*/ package javaapplication2; import oracle.jdbc.*; import java.sql.*; public class Main { public static void main(String[] args) throws SQLException { try { Class.forName("oracle.jdbc.driver.OracleDriver"); } catch(Exception e ) { } Connection cnn1=DriverManager.getConnection("jdbc:oracle:thin:@(DESCRIPTION = (SDU=8192) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.103)(PORT = 1521))) (CONNECT_DATA = (SERVER = SHARED) (SERVICE_NAME = cXDB)))", "system", "password"); Statement stat1=cnn1.createStatement(); ResultSet rst1=stat1.executeQuery("select * from v$version"); while(rst1.next()) { System.out.println(rst1.getString(1)); } Connection m[]=new Connection[2000]; Statement s[]=new Statement[2000]; ResultSet r[]=new ResultSet[2000]; int i=0; while(i<600) { try { m[i]=DriverManager.getConnection("jdbc:oracle:thin:@(DESCRIPTION = (SDU=8192) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.103)(PORT = 1521))) (CONNECT_DATA = (SERVER = SHARED) (SERVICE_NAME = cXDB)))", "system", "password"); } catch (Exception em) { System.out.println(em.getMessage()); } try { Thread.sleep(3); } catch (Exception e) { } s[i]=m[i].createStatement(); m[i].setAutoCommit(false); i++; System.out.println(i+"is ok !"); } System.out.println("We are waiting!"); try { Thread.sleep(1000); } catch (Exception e) { } } }
编译上面这段程序,尝试执行看看,执行的同时留意观察NETWORK BUFFER:
SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 328080 java -jar ora_network_buffer_test_8.jar /*启动编译后的测试程序*/ SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 69608200 SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 348960 /*会话终止后,NETWORK BUFFER回缩*/ 修改上述程序中的SDU到32k,重新编译后再次测试 java -jar ora_network_buffer_test_32.jar SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 328080 SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 99148576 /*可以看到同样的会话数量,client端SDU增大到32k后,NETWORK BUFFER有了大幅增长*/ 我们修改dispatchers参数中的SDU到8k看看 SQL> alter system set dispatchers=''; System altered. SQL> alter system set dispatchers='(address=(protocol=tcp)(host=192.168.1.103)(sdu=8192))(SERVICE=cXDB)(dispatchers=10)'; System altered. SQL> show parameter dispatchers NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ dispatchers string (address=(protocol=tcp)(host=1 92.168.1.103)(sdu=8192))(SERVI CE=cXDB)(dispatchers=10) SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 328080 java -jar ora_network_buffer_test_32.jar SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 99148552 /*看起来dispatcher中的SDU优先级并没有client中的高*/ 我们再来看看client中SDU为8k的情况 SQL> show parameter dispatchers NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ dispatchers string (address=(protocol=tcp)(host=1 92.168.1.103)(sdu=8192))(SERVI CE=cXDB)(dispatchers=10) SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 328080 java -jar ora_network_buffer_test_8.jar SQL> select name,pool,bytes from v$sgastat where name like '%NETWORK%'; NAME POOL BYTES -------------------------- ------------ ---------- NETWORK BUFFER shared pool 69608200 /*与dispatchers中为32k,而client为8k时一样*/
由以上实践可知10.2.0.4之后,NETWORK BUFFER的使用量由客户端设定的SDU和共享服务器会话数决定。我在之前的博文中曾经列出过TNS协议的几个基础类描述(见《Oracle 网络TNS协议的几个基础类描述》),其中Session包含了setSDU(int i)方法,其代码如下:
public void setSDU(int i) { if(i <= 0) sdu = 2048; else if(i > 32767) sdu = 32767; else if(i < 512) sdu = 512; else sdu = i; }
由以上代码可知,客户端设定的SDU时,其最大最小值分别为32k和512bytes,大于32k时被强制设为32k,而小于512bytes时被强制设为512bytes,若设定SDU<0,则被强制修正为2048 bytes,在512 bytes- 32767 bytes之间则为原值不变。