SGA_MAX_SIZE,SGA_TARGET以及PRE_PAGE_SGA参数

2010/08/26 by admin 4 Comments

10g引入ASMM后SGA_TARGET取代shared_pool_size,db_cache_size等参数，成为DBA关注的主要SGA内存管理参数；有不少情况下SGA_TARGET参数会设置为一个小于SGA_MAX_SIZE的值(这样做在多实例情况下更具灵活性)。但不少人会问，这样岂不是要浪费一部分物理内存吗？Oracle会为实例分配SGA_MAX_SIZE大小的内存的，SGA_TARGET要设得和SGA_MAX_SIZE一样大才合理啊！

让我们来看看实际的情况:

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

/*  linux上的10.2.0.4  */

SQL> show parameter sga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 3600M
sga_target                           big integer 368M

SQL> col component for a25;
SQL> select component, current_size / 1024 / 1024, last_oper_type
  2    from v$sga_dynamic_components
  3   where current_size != 0;

COMPONENT                 CURRENT_SIZE/1024/1024 LAST_OPER_TYP
------------------------- ---------------------- -------------
shared pool                                   96 STATIC
large pool                                    16 STATIC
java pool                                     16 STATIC
DEFAULT buffer cache                         224 INITIALIZING

/* 此时的SGA_TARGET为368M，而SGA_MAX_SIZE为3600M */

/* 我们来看一下实际的物理内存使用情况  */

/* 以root用户登录，因为我们需要用到清理文件系统缓存的命令 */

[maclean@rh2 ~]$ su - root
Password:

[root@rh2 ~]# sync
[root@rh2 ~]# sync

/* sync 命令用以写出文件系统脏缓存，类似于Oracle的checkpoint手动检查点 */

[root@rh2 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@rh2 ~]# echo 3 > /proc/sys/vm/drop_caches

/*  将proc文件系统下的drop_caches文件置为3，清理文件系统缓存，以免其干扰我们的实验 */

[root@rh2 ~]# free
             total       used       free     shared    buffers     cached
Mem:       4046352     429328    3617024          0        296     176100
-/+ buffers/cache:     252932    3793420
Swap:     10241428          0   10241428

/*  4g内存中仅有419M处于使用状态，其他均为free。可见Oracle没有为实例分配大小等于SGA_MAX_SIZE的内存空间，而接近于SGA_TARGET_SIZE的大小 */
[root@rh2 ~]# ps -ef|grep pmon|grep -v grep
maclean   6361     1  0 18:35 ?        00:00:00 ora_pmon_YOUYUS

[root@rh2 ~]# pmap -x 6361
6361:   ora_pmon_YOUYUS
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000  100412    7300       0 r-x--  oracle
000000000680f000     544     180      64 rwx--  oracle
0000000006897000     148     104     104 rwx--    [ anon ]
000000001e9d0000     672     532     532 rwx--    [ anon ]
0000000060000000 3688448    1044     388 rwxs-    [ shmid=0x390005 ]

/* 利用pmap工具探测Oracle后台进程的内存地址空间，可以看到这里虚拟共享内存段(也就是SGA)的大小为3602M */

[root@rh2 ~]# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x74018366 3112960    root      600        4          0
0x00000000 3473409    root      644        80         2
0x74018365 3080194    root      600        4          0
0x00000000 3506179    root      644        16384      2
0x00000000 3538948    root      644        280        2
0x1cdd16e8 3735557    maclean   640        3776970752 15

/* 使用ipcs 命令也可以观察到这个大小为3602M的共享内存段 */

/* 可以猜测Oracle在这里使用malloc函数或类似的方法实现虚拟内存的分配，没有像calloc函数那样对空间中的每一位都初始为0,保证了不浪费物理内存  */

/*  我们动态修改SGA_TARGET参数来看看*  /

SQL> alter system set sga_target=3000M;

System altered.

SQL> select component, current_size / 1024 / 1024, last_oper_type
  2    from v$sga_dynamic_components
  3   where current_size != 0;

COMPONENT                 CURRENT_SIZE/1024/1024 LAST_OPER_TYP
------------------------- ---------------------- -------------
shared pool                                   96 STATIC
large pool                                    16 STATIC
java pool                                     16 STATIC
DEFAULT buffer cache                        2864 INITIALIZING

/*   BUFFER_CACHE 的空间大幅增加 * /

SQL> !free
             total       used       free     shared    buffers     cached
Mem:       4046352    3187784     858568          0        536    2889920
-/+ buffers/cache:     297328    3749024
Swap:     10241428          0   10241428

/* used mem部分上升到3113M，随着SGA_TARGET的动态增长而增长了 */

/* 如果我们改小SGA_TARGET呢？ */

SQL> alter system set sga_target=368M;

System altered.

SQL> select component, current_size / 1024 / 1024, last_oper_type
  2    from v$sga_dynamic_components
  3   where current_size != 0;

COMPONENT                 CURRENT_SIZE/1024/1024 LAST_OPER_TYP
------------------------- ---------------------- -------------
shared pool                                   96 STATIC
large pool                                    16 STATIC
java pool                                     16 STATIC
DEFAULT buffer cache                         224 SHRINK

/* 可以看到BUFFER CACHE最近执行了SHRINK收缩操作，SIZE下降到224M */

QL> !free
             total       used       free     shared    buffers     cached
Mem:       4046352    3204608     841744          0       1596    2912888
-/+ buffers/cache:     290124    3756228
Swap:     10241428          0   10241428

/* 此时OS层仍认为used memory部分为3130M;但可以放心，它们是可被其他进程复用的  * /

官方对pre_page_sga参数的定义是”PRE_PAGE_SGA determines whether Oracle reads the entire SGA into memory at instance startup. Operating system page table entries are then prebuilt for each page of the SGA. This setting can increase the amount of time necessary for instance startup, but it is likely to decrease the amount of time necessary for Oracle to reach its full performance capacity after startup.” 它决定Oracle是否在启动阶段将整个SGA读入内存，实际上在启动阶段SGA的内容是十分稀疏的，当pre_page_sga为TRUE时Oracle所要做的是向OS所要SGA_MAX_SIZE大小的实际物理页。

我们来看一下这个参数的实际效果:

SQL> alter system set sga_max_size=2500M scope=spfile;
System altered.

SQL> alter system set pre_page_sga=true scope=spfile;
System altered.

SQL> startup force ;
ORACLE instance started.

Total System Global Area 2634022912 bytes
Fixed Size                  2086288 bytes
Variable Size            2382367344 bytes
Database Buffers          234881024 bytes
Redo Buffers               14688256 bytes
Database mounted.
Database opened.

[root@rh2 ~]# echo 3 > /proc/sys/vm/drop_caches

SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     TRUE
sga_max_size                         big integer 2512M
sga_target                           big integer 368M

SQL> !free
             total       used       free     shared    buffers     cached
Mem:       4046352    2895256    1151096          0        648    2623692
-/+ buffers/cache:     270916    3775436
Swap:     10241428     104936   10136492

/*  设置pre_page_sga参数为true后，Oracle实际分配的物理内存不再局限于SGA_TARGET，而是在实例生命周期内几乎恒等于SGA_MAX_SIZE指定的值了 * /

/*  当SGA_TARGET配合pre_page_sga使用时，这的确可能造成一种浪费 ! * /

10g中Oracle已经强烈不建议使用pre_page_sga参数了，在ASMM下它带来的问题可不小!

Filed Under: Oracle Tagged With: database, ORA, Oracle, RAC, sql

Comments

admin says

2010/08/26 at 20:09

dr: 9205576 10.2.0.4.0 RDBMS 10.2.0.4.0 UNKNOWN PRODID-5 PORTID-197
Abstract: CONNECTION TAKES MORE TIME WITH PRE_PAGE_SGA=TRUE IN 4 CELL COMPARED TO 3 CELL

pre_page_sga should just cause the shadow to loop across
all pages in the SGA and read a value from the start of
each page so that it has been touched within that process.
As per the notes this is done within ksmprepage() which is
where the time is shown.
To help check the time difference please get for EACH
configuration (3 and 4 cell):
a. Details of the init.ora parameter values
b. The tracefile output from:
oradebug setmypid
oradebug ipc
c. Details of opatch lsinventory -detail
From the uploaded init.ora and trace files there is no
difference in granule size, pagesize, subarea quantity,
subarea sizes etc.. between the two runs.

Hence in both configurations it looks like ksmprepage()
is executing on exactly the same values so it should be
performing the exact same sequence of assembler instructions
in each case.
This function just reads the first byte from every single
page in the SGA. For such a large SGA this represents a
loop of about 15 million iterations (which is why ksmprepage
is a hot function in itself on connect).

check.sh looks like it just runs a number of concurrent
sessions to get various bits of information from the DB.
The HP profile data shows ksmprepage() as the hot function
so it seems like the problem may be related to:

– A number of concurrent Unix processes running
– With a common large chunk of shared memory attached to them all
– Each process looping across the shared memory chunk and reading
the first byte of each page (where the pagesize is 0x1000).

In our source we are doing something like this:

dvoid *first;
ub8 remsize, cnt;
ub4 sum = (ub4)0;
volatile ub1 *p;

// first gets set to the start address
// remsize = num bytes in the shared memory chunk
// pagesize is 0x1000, zero is (ub8)0 , ub8 is 8 byte unsigned int

for (p = (ub1 *)first, cnt = (remsize/(ub8)(pagesize));
!(cnt==zero); –cnt)
{
/* Make some half-hearted attempt to avoid optimizing the read
away */
sum += *p;
p += pagesize;
}

Which looks like it has compiled to this sort of assembler loop:

:1 : mov.i ar.lc=r16
:2 : nop.i 0x0;;

:0 : ld1 r23=[r14],r45;;
:1 : lfetch [r14],r44
:2 : adds r24=32,r12

:0 : nop.m 0x0
:1 : nop.m 0x0
:2 : br.cloop.dptk.many ksmprepage+0x690;;

It may be sensible to work with HP to create a small C program that
creates a large chunk (60Gb) of shared memory (akin to the SGA),
then have a number of concurrent processes attach to that memory and
execute code similar to the above to see if it exhibits similar
issues between 3 and 4 cell configurations. You would want to try
to get similar assembler for the C code to be sure to be comparing
similar instruction sequences.

回复
admin says

2010/08/26 at 20:11

Hdr: 575023 7.3.3 RDBMS 7.3.3 A58506-01 PRODID-5 PORTID-319
Abstract: PRE_PAGE_SGA IS NOT RELEVANT IN AIX – PLEASE DOCUMENT THAT

Problem:
~~~~~~~~
Customer tried to lock the SGA into main memory, and set in his init.ora the
parameter
PRE_PAGE_SGA = TRUE

However, customer is still able to startup the database with a SGA bigger than
the machine physical memory. Obvioulsy, PRE_PAGE_SGA is not doing his job.
After some research, I found that PRE_PAGE_SGA is not used
—
For Oracle on AIX we can not gurantee locking the SGA in memory .
Yes we can bring up the instance even though complete SGA is not in the memory.

回复
admin says

2010/08/26 at 20:15

Hdr: 5072402 9.2.0.6 RDBMS 9.2.0.6 VOS PRODID-5 PORTID-59
Abstract: SLOW CONNECT TIME WITH SQLPLUS TO INSTNACE WITH 22GIG SGA AND PRE_PAGE_SGA
PROBLEM:
——–
With a 22gig sga and PRE_PAGE_SGA = true, the sqlplus “/ as sysdba”
can take up between 3 and 10 seconds.

DIAGNOSTIC ANALYSIS:
——————–
It seems to be a pause after ioctl

WORKAROUND:
———–
Connect through sqlnet.

RELATED BUGS:
————-

REPRODUCIBILITY:
—————-
WE are working on reproducing in house but it could be difficult to find a
machine capable of such a large SGA.

TEST CASE:
———-

STACK TRACE:
————
1141261744.684820 shmget(171386209, NULL, 0) …………. ERR#2 ENOENT
1141261744.685453 shmget(171386210, NULL, 0) …………. ERR#2 ENOENT
1141261744.685874 shmget(171386211, NULL, 0) …………. ERR#2 ENOENT
1141261744.686560 shmat(4009, NULL, 0) ………………. =
0xc000000040000000
##MIKE##1141261744.687048 shmctl(4009, IPC_STAT, 0x800003ffbfff2c90) = 0
##MIKE##1141261744.724370 ioctl(8, 0x80186103, 0x800003ffbfff2c78) = 0
##MIKE##1141261746.728660 In user-mode ……………………… [running]
##MIKE##1141261752.477845 getuid() …………………………. = 100
(100)
10
1141261754.280652 mmap(NULL, 532, PROT_READ,
MAP_SHARED|MAP_VARIABLE|MAP_FILE|MAP_ADDR32, 10, NULL) = 0x80013000
1141261754.377998 close(10) ………………………… = 0
1141261754.647684 socket(AF_UNIX, SOCK_DGRAM, 0) ……… = 10
1141261754.837198 getpid() …………………………. = 18682 (18655)

回复
admin says

2010/09/11 at 07:57

How SGA_MAX_SIZE Parameter Works
PURPOSE
——-

Give some concepts and explanation on SGA_MAX_SIZE
in particular when this init parameter is used on Windows OS

SCOPE & APPLICATION
——————-

Database Administrators and System Administrators

How SGA_MAX_SIZE Parameter Works
——————————–

– The Main Definition of SGA_MAX_SIZE parameter

SGA_MAX_SIZE specifies the maximum size of SGA for the lifetime of the instance.
In Oracle8i Database Administrators (DBAs) do not have any control over the
SGA size once the instance is started. Oracle9i allows a DBA to modify the SGA
size dynamically. This provides an SGA that will grow and shrink in response to
a DBA command.

This feature implements an infrastructure to allow the Oracle server to change
the SGA configuration without shutting down the instance. The Oracle Server
can modify the physical address space use to respond to the operating system’s
use of physical memory

– How SGA_MAX_SIZE Parameter Works

1. The components of sga are : fixed sga, variable sga, buffer cache, redo
buffers. The sga is allocated in multiples of granules.
2. The buffer cache and the variable sga are rounded up to the next granule
size. Shared pool, large pool and java pool are major components of
variable sga.
3. The granule size increases from 4m->8m->16m as sga size increases. The
sga size is set as sga_max_size which is by default calculated by adding
up all sga components. If sga_max_size is set in init.ora then sga of
sga_max_size or sum of sga components whichever higher is allocated at
the instance startup.
4. When sga_max_size is set to higher value resulting in change in granule
size, we see increased allocation for buffer cache and variable sga as
round off boundary has increased.
5. If sga_max_size is set to a value greater than the sum of sga components
then the excess amount is seen in variable sga. Because of this we see
variable sga size quite larger than the sum of its components.

– SGA_MAX_SIZE and some related concepts on Windows OS

o sga_max_size is the max size the SGA can get to.
As a result of this, Oracle must ‘reserve’ this size in the
process address space. If it does not, then something else may come along,
use the space and then the SGA cannot be expanded

o this is therefore reserved within the virtual address space and since the
physical memory has not yet been allocated, then this will not show up within
physical memory. It will ONLY show up in virtual memory

o the result of this, is that there may be further restrictions on certain resources
e.g. the number of concurrent oracle connections to this particular database

o this is due to the space being reserved, so the remaining memory
structures cannot grow as high as they could have done previously

o this is NOT a bug. Consider the virtual address space to be a memory ‘map’
which defines the layout of addressable memory
i.e. it is the template to which the process must conform

o the fact that this does not resemble physical memory used (yet) is the way
the memory management system works

o Oracle has to reserve space in the process address space and therefore
also in the virtual memory layout, but since it has not yet physically allocated the extra
memory, then it is correct that physical memory does not show this memory as being used

o sysadmins should consider the memory mapping as signifying how things can get
fitted in and what space they have left for other things.
Looking at physical memory (which is mapped to virtual address space via windows O/S)
shows how much physical memory is used/free at any one time,
but does not reflect the memory model.

o The 2 memory types, physical and virtual, although related, are not to be
compared directly.

– Virtual memory is a ‘model’ or template that signifies how processes
should structure and address the memory for their use and consists of the entire memory
mapping that a given process can access.
However, given that not every single page of a process’ memory will be
required at any one time, the amount of physical memory used by each process is typically a
very small subset of the virtual memory layout.
– However, Physical memory is where the actual memory is taken from when
needed. This physical memory is mapped to the relevant virtual memory ‘pages’ that are
required at that specific time.

o since windows uses a virtual memory system, then in reality, even though
there may be loads of virtual memory ‘reserved’ the reality is that only a small amount
of physical memory (a subset of the virtual if you like) is actually in use at any one time.
The store for the virtual memory is typically disk-based.

– thus with a large enough backing file, since not all the physical
memory is required simultaneously, we can have a lot more virtual memory ‘committed’ than
physical memory. So if there is sufficient virtual memory there could be 5GB of virtual
memory used even though the system only has 4GB of physical memory, as an example. The virtual
memory has not all been mapped to physical pages since it is not needed all the time.

o however, for any given process we must still maintain the template of memory so
that we know what kernel and application structures can be accomodated

o the main philosophy behind the virtual memory system is that we can allow
more applications to run than we physically have enough memory for,
were we to allocate it all in physical memory
concurrently; essentially we ‘share’ the physical memory between a
much larger number of consumers by only mapping the amount of physical memory that
is strictly needed at any one time.

– Question
Is there a way that Oracle can check the amount of available
memory and map, move, or fit address space that it
needs from virtual memory to available physical memory ?

– Answer
o this is essentially asking if we can use spare physical memory to house extra address
space that doesn’t fit in our existing process space.
Due to the sga_max_size virtual memory layout reducing the amount of space left
to map database connections, the question here is could we ‘borrow’ the space from
elsewhere where there is memory available.

The answer to this is ‘no’ (although VLM sort of does this, but only for
the buffer cache and within the O/S bounds)

– the reason is 2-fold:

1/ it is the operating system’s job to worry about how virtual/physical
memory is managed/mapped.
The application (in this case Oracle) just asks for memory and the O/S either
gives it to us or says there is insufficient.
An application does not want to have to worry about how to get memory aside
from just requesting it from the O/S, else it would have to know in detail the architecture,
underlying memory model structures and so on
2/ if applications were to start ‘borrowing’ memory from other places, bypassing
the O/S, then if something else wanted that memory the O/S may believe it to be free
and then problems will arise when something else asks for the memory

– there are minor exceptions to this, for example VLM, although this is not
quote the same thing

Summary
——-
Therefore memory layout needs to be managed in order to allow the most
flexible resource usage with the optimum SGA sizings required for the
application. To accomplish this and reach an appropriate compromise, the
following points should be considered when investigating the type of
setup/architecture required for a given application:

– don’t set sga_max_size
this will allow a larger piece of address space to be used for database connections
– use VLM this will allow for a larger buffer cache using certain areas of memory
that are outside of normal process address space, so the SGA can be increased
without sacrificing too many database connections, still do not set sga_max_size
– go to 64-bit architecture
this gives a much larger overall address space and solves many of these
restrictions (essentially it raises the restrictions to a higher level,
so they’re still there but the maximum figures are bigger)

回复

SGA_MAX_SIZE,SGA_TARGET以及PRE_PAGE_SGA参数

10g中Oracle已经强烈不建议使用pre_page_sga参数了，在ASMM下它带来的问题可不小!

Comments

Comment 取消回复