com/aix/how-oracle-uses-memory-on-aix-part-1-processes/
http://intermediatesql.com/aix/how-oracle-uses-memory-on-aix-part-2-sga/
http://intermediatesql.com/aix/how-oracle-uses-memory-on-aix-part-3-locking-sga/
In this post I am going to talk about how ORACLE allocates and uses memory when running on AIX, but I will
also talk about the power of approximation and how it can sometimes be misused for ill purposes.
At the outset, the ORACLE/AIX memory deal seems simple enough: obviously, ORACLE will use memory
when it runs, and many AIX commands (such as vmstat or ps) will show memory usage both system-wide and
for a particular process. But, as always, the devil is in the details, and the effect of those details may be
far from subtle.
So, why don't we go ahead and find that devil, shall we?
Well, let me tell you right away where the punch line is: this calculation is WRONG. In a typical case, it
overestimates ORACLE memory usage by a factor of 5-10 or more.
But where exactly have we made a mistake?
Have we identified ORACLE memory parts incorrectly? No, ORACLE memory does indeed consist of 2 parts:
SGA and the combined per-process memory of the instance processes.
Have we made a mistake in the summing formula somewhere? Well, not really, the formula is trivial and there is not
much room for error there.
The reason our answer is wrong is more subtle and is related to the fact that modern operating systems (AIX
included) employ a number of smart tricks to allocate and manage system memory.
For whatever reason, most AIX commands do not take these tricks into account and display memory sizes as
if nothing is going on (read: the way it was done in the 70s).
Nevertheless, the tricks are there and their effects are real, so let's see how we can out-trick the trickster and
find out the real memory allocation.
We will start with process memory.
To simplify a little, memory that is used by a typical AIX process is usually divided into 2 major parts:
1. User Data: variables, dynamically allocated data, function parameters, return values, etc.
2. Program Code: the program itself, shared libraries, etc. (this is also known as Text)
While user data is obviously unique to each process and will change in size slightly (or not so slightly) as the
process runs, the code part is different: it is static and, moreover, it is exactly the same for all processes that
run it.
Since all ORACLE processes that make up an ORACLE instance are instantiated from the same binary disk image,
$ORACLE_HOME/bin/oracle (you did know that, didn't you?), there is no reason for the operating system
to duplicate the ORACLE code segment. Instead, AIX loads it in memory once and then links it to each process.
Let's prove this:
The interesting observation here is that under normal circumstances, and especially for idle (ORACLE)
processes, the code segment will be much larger than data (yes, ORACLE has a lot of code!), reaching up to 90-95% of the size reported by ps. That means that for ORACLE processes:
ps -l usually significantly over-reports memory size, and the real size is MUCH less.
Alternatively, I guess, you could say that ps does report size properly, but only if this was the only process in the
system. Pick your poison.
A simple shortcut to see the real memory size of the process (excluding memory that is shared) is to use the ps v
command:
AIX> ps v 880802
PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
880802 - A 86:59 89065 7088 58020 xx 88839 52048 2.0 0.0 ora_s00
AIX> ps -elf | head -1;ps -elf | grep 880802 | grep -v grep
       F S      UID    PID PPID C PRI NI     ADDR    SZ WCHAN  STIME TTY  TIME CMD
  240001 A   oracle 880802    1 8  64 20 4d345400 95924       Mar 01   - 86:59 ora_s000_qaten
Here, the SIZE column shows the virtual size of the process DATA segment (we will talk about what virtual means
shortly) and, as you can see, it is much smaller (7088) than the SZ (95924) reported by ps -l.
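To put those two numbers side by side, here is a small shell sketch of the arithmetic. It assumes that both columns are reported in the same 1 KB units, and the function name is made up for this illustration.

```shell
# Share of the ps -l SZ figure that is actually private data, in percent.
# Assumption: SIZE (ps v) and SZ (ps -l) are in the same 1 KB units.
# Usage: private_share SIZE SZ
private_share() {
    awk -v size="$1" -v sz="$2" 'BEGIN { printf "%.1f\n", 100 * size / sz }'
}

# Figures taken from the ps output above
private_share 7088 95924
```

In other words, only about 7% of what ps -l reports for this process is private data.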
A more precise way is to use svmon command, that displays all process memory segments and can neatly
group them into SHARED and EXCLUSIVE categories (well, there is also SYSTEM, but that is another story).
AIX> svmon -P 880802 -O segment=category -O filterprop=data
...............................................................................
EXCLUSIVE segments                   Inuse      Pin     Pgsp  Virtual
                                      1486       22      306     1765

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
  144370        11 work text data BSS heap           s   1108     0  135    1211
   7d4e3  ffffffff work application stack            s    132     0   18     150
...
...............................................................................
SHARED segments                      Inuse      Pin     Pgsp  Virtual
                                   1528419        0   418040  1565766

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
  1e483c        10 clnt text data BSS heap,          s  13012     0    -       -
                        /dev/fslv04:911099
    30a0  90000000 work shared library text          s   4914     0   76    9685
...
OK, we can see now that the ps -l process size is bloated because it does not take sharable segments into account.
But is this the whole story?
Not quite. There is still one other notable trick in the AIX bag.
Trick #2: Some memory may be swapped away
Let me ask you this: would it be possible for a process to allocate 2 Gb of RAM on a system that only has
1 Gb of physical memory?
The answer is: of course, and it happens every day on many systems (albeit the ratio in this example is
somewhat extreme). That is, most modern operating systems (AIX is no exception) are designed to handle
workloads that require more memory than the system has.
So, right off the bat, when we are talking about memory, we may actually mean 2 quite different things:
1. Memory that is requested by system processes (we will call it Virtual)
2. Physical memory that the system has (we will call it Real or Physical)
It should be obvious that if Virtual > Real, something must happen to the portion of Virtual memory that does
not fit into Real memory for the system to continue working properly. What usually happens is that the excess
of Virtual memory (normally the oldest or least used pages) is saved to a special area on disk called SWAP (or
Paging Space in AIX).
I guess you see where I'm going with this: how do we know whether the memory allocated by ORACLE
processes is really in memory or has been swapped to disk?
If you do NOT want to do these calculations by hand, you can download omem_proc.sh and ora_mem.pl
tools from this site.
Finally, is there any way to control how much memory ORACLE instance processes are using ?
How to control ORACLE process memory usage
There are several process memory controls that can be implemented.
The simplest way is to use AIX ulimits: you can set maximum memory allocation limits separately for the
process data (User Data), stack and rss (resident memory) components. You can set these per-user settings in
/etc/security/limits (or through smit) and you can view them with the ulimit -a command:
AIX> ulimit -a
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
...
But limiting process memory usage in this way is like having a firing squad enforce parking rules: one small
mistake and you are dead! Plus, ORACLE processes do not really know that they are NOT supposed to exceed
AIX ulimits, and some of them (you know ORACLE) might need a lot more
memory from time to time.
So, we need another mechanism, one that is gentler and aware of what ORACLE processes are doing but, at the
same time, sane enough not to let ORACLE kill the system with unreasonable memory demands.
This mechanism is provided by ORACLE, and there are actually two of them:
The older one, manual process memory management, sets memory usage limits individually per process.
The MAX per-process memory allocation is controlled by the sort_area_size parameter (with sort_area_retained_size
acting as the required minimum). This is the only mechanism available up to ORACLE 8i, and it will still be
used in later versions if the workarea_size_policy parameter is set to MANUAL.
The newer one, automatic process memory management, sets the memory usage limit collectively for ALL
ORACLE server processes. It is controlled by the pga_aggregate_target parameter and works when
workarea_size_policy=AUTO.
A couple of things to remember about pga_aggregate_target:
- It is an advisory upper target that ORACLE will try to enforce, but might not be able to under
  extraordinary circumstances (e.g. if you have 5000 concurrent active sessions that need to sort data but
  only allocated 200M of pga_aggregate_target, there is no way the 200M target will be met).
- It might not cover all the process memory. Rather, what it covers is the various work areas: sort, hash,
  bitmap merge, etc. If you decide to allocate another 1,000,000-item PL/SQL array in your session,
  ORACLE has no choice but to let you use memory for that (however, subsequent sessions will have to
  use less memory for sorting, hashing, etc.).
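To see why the 5000-session example cannot work out, here is a back-of-the-envelope sketch in shell. It assumes that the target is split evenly among the sessions, which is my simplification and not how ORACLE actually apportions work areas; the function name is made up for illustration.

```shell
# Even share of pga_aggregate_target per active session, in KB (rounded down).
# Assumption: an even split, used here for illustration only.
# Usage: per_session_kb TARGET_MB SESSIONS
per_session_kb() {
    echo $(( $1 * 1024 / $2 ))
}

# 200M target shared by 5000 concurrent sessions
per_session_kb 200 5000
```

That leaves roughly 40 KB per session, which is hardly enough for any serious sort.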
OK, so I think we now have a better idea of how to see the real memory usage of ORACLE processes and how to
control that usage.
But what about the other (and arguably bigger) chunk of memory that ORACLE uses, the SGA? Stay tuned, as we
will talk about that in Part 2.
Useful Commands
# Regular AIX commands that should be available on any system ...
# ps v: Shortcut for real memory usage
AIX> ps v pid
# ps v statistics for all processes that belong to $ORACLE_SID instance
AIX> ps gvw | head -1; ps gvw |
egrep " oracle${ORACLE_SID} | ora_.*_${ORACLE_SID} " | grep -v egrep
# Detailed memory usage by process:
# All memory segments allocated to particular process
AIX> svmon -P pid
In the previous post we discussed memory usage by ORACLE processes and, if you remember, it took us some
effort to get to their actual memory usage.
In this post we are going to talk about the other large memory area that an ORACLE instance uses: the System
Global Area, or SGA.
In many respects, finding out SGA memory usage is going to be simpler, as we have to deal with only one large
entity (an AIX shared memory segment) instead of many small memory chunks in separate ORACLE processes.
But as we will see, this process is still rather involved, for AIX has a few tricks up its sleeve for shared
memory as well.
How much memory is used by ORACLE SGA. Really
Let's start with a simple question.
What happens if we try to allocate a 12 Gb SGA on a machine with only 8 Gb of physical memory?
There are three typical answers to this question:
1. The ORACLE instance would NOT start, as ORACLE would NOT find enough memory for the SGA
2. The ORACLE instance WOULD start, but performance would suck as the system would be heavily paging
most of the time
3. The ORACLE instance WILL start and nothing of significance will happen to ORACLE or AIX
performance
To most people, answer #2 (the instance will start, but the system will page) is the most logical. After all, this is
what virtual memory management is about: letting much bigger workloads run on limited physical resources.
But at the same time, there must be a penalty for using (much) more memory than we have, and that penalty is
paging.
Still, let's not rush the answer and rather run this experiment and see for ourselves.
Wow! According to the results above, answer #3 seems to be correct. But why? In other words, has
ORACLE lied to us by misrepresenting the SGA size? Or has AIX lied to us by not reporting obvious paging?
Well, of course, the answer is neither, and the real reason, once again, comes from the operating system's bag
of tricks.
When a process asks AIX to allocate a shared memory segment, it receives a pointer to the segment and a
promise from AIX that memory will be there when the process needs it (ain't AIX working like a good
salesman here?). That means that:
sga_max_size                         big integer 12G
sga_target                           big integer 10G
AIX> ipcs -bm | grep oracle
m   92274693 0xd9bf0b18 --rw-r-----   oracle      dba 12884930560
AIX> svmon -P $(ps -elf | egrep " ora_smon_${ORACLE_SID} " | grep -v egrep |
awk '{print $4}') | grep shmat
   33202  70000030 work default shmat/mmap           s  39274     0    0   39274
   131a6  70000001 work default shmat/mmap           s   4078     0    0    4078
   f321a  70000023 work default shmat/mmap           s   2842     0    0    2842
   ...                                               s   2055     0    0    2055
   ...                                               s   1511     0    0    1511
   ...                                               s     16     0    0      16
   ...                                               s     16     0    0      16
   ...                                               s      1     0    0       1
   ...                                               s      0     0    0       0
   ...                                               s      0     0    0       0
   ...
AIX> svmon -P $(ps -elf | egrep " ora_smon_${ORACLE_SID} " | grep -v egrep |
awk '{print $4}') | grep shmat | wc -l
49
As you can see, ipcs still reports the SGA shared memory segment to be 12 Gb in size, which is expected. The size,
however, now matches sga_max_size (and not sga_target). So, right off the bat, we can see that sga_target is a
logical (rather than physical) limit.
But the svmon results are strange: why do we see multiple memory segments allocated when we really only have
1 SGA? This requires a bit of explanation.
In AIX, virtual memory manager organizes memory into segments (which are further subdivided into pages).
Each segment has a maximum size of 256 Mb and therefore, if you allocate, say 1 Gb of shared memory AIX
gives you 4 segments.
Notice that in the example above, 49 segments are allocated in total. That is 48 (12*4) + 1, where 48, again,
matches the allocation for 12 Gb. (The +1 here is a little weird: whenever sga_max_size is set in gigabytes (say, 12G
as opposed to 12000M), this additional segment always pops up. I'm guessing this is the result of rounding
issues when translating sga_ parameters from ORACLE to AIX.)
Anyway, forgetting about rounding issue for the moment, the total size of all SGA segments should be 48 * 256
Mb = 12 Gb. But this is the requested size that, if you remember, AIX promised but may have not necessarily
delivered. In other words, it will only fully materialize when all requested memory is actually used.
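The segment arithmetic is easy to check in shell. This is only a sketch: the 256 Mb segment size comes from the text above, the function name is hypothetical, and the second byte count is the one ipcs printed earlier.

```shell
# Number of 256 Mb VMM segments needed to back a shared memory region.
# Usage: vmm_segments SIZE_BYTES   (rounds up to a whole segment)
vmm_segments() {
    echo $(( ($1 + 268435455) / 268435456 ))
}

vmm_segments 1073741824     # exactly 1 Gb
vmm_segments 12884930560    # SGA segment size as reported by ipcs
```

The second call is the interesting one: the size that ipcs reports is 28672 bytes over an even 12 Gb, and that small excess is enough to spill into a 49th segment, which may well be where the mysterious +1 comes from.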
So, how much memory are we really using right now? This is where the other svmon numbers come in.
Let's look at the first segment.
33202 70000030 work default shmat/mmap s 39274 0 0 39274
It allocates 39274 small pages (s = 4K), which translates to roughly 153 Mb out of a possible 256. The next
segment allocates 4078 pages, or ~16 Mb, and so on. Notice that memory allocation across segments is not
uniform: it quickly drops off for segments further down the line. As you might have already guessed, this
means that ORACLE does not yet need this memory and hence it has not yet been allocated.
When I summed these numbers up, I got a total memory allocation of ~ 900 Mb, far smaller than the requested 12
Gb (your mileage may vary, of course).
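If you want to redo this kind of sum yourself, a little awk does the job. The sketch below assumes 4K pages and takes one Inuse page count per line; the function name is my own invention, and the sample input is just the five largest page counts quoted in this post, not the full list of 49 segments.

```shell
# Total the svmon Inuse page counts (4K pages) and print the result in Mb.
# Reads one page count per line on standard input.
sum_pages_mb() {
    awk '{ pages += $1 } END { printf "%.1f\n", pages * 4 / 1024 }'
}

# Five largest shmat segments from the svmon listing
printf '%s\n' 39274 4078 2842 2055 1511 | sum_pages_mb
```

On a live system you would feed it the Inuse column of the svmon shmat lines instead of a hand-typed list.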
This explains why the system is NOT paging: it does not need to, as the entire ORACLE instance memory fits
into physical memory with ease (which is further corroborated by the absence of shmat pages in paging space
in the svmon report, 3rd column).
Also notice that, at the end, some segments do not allocate any pages, as in the example below.
   7b36b  70000003 work default shmat/mmap           s      0     0    0       0
   c323c  70000002 work default shmat/mmap           s      0     0    0       0
These are the segments that have been allocated beyond sga_target. In other words, when sga_max_size >
sga_target, AIX allocates some metadata (segments) for ORACLE, but does not fill them up and hence memory
is NOT used. Thus, regardless of what ipcs reports, setting sga_max_size beyond sga_target is a pretty safe
operation that does not actually use memory (but see below!).
There are a number of caveats here, as usual.
First of all, for segments below sga_target, some memory will always be allocated (even if it is one page).
My guess is that ORACLE needs to touch some pages in every shared segment that it can work with, and that
includes all segments up to sga_target.
Second, the examples above describe the default system setup. The rules of ORACLE memory allocation
can be adjusted both on the ORACLE side and the AIX side, which might change the allocation picture entirely (we will
discuss these changes and their implications in the next post).
Third, and this is important, remember that you set a certain SGA size for a reason. And that reason is that you want
the ORACLE instance to work efficiently, which is achieved, to a large degree, by having enough memory and
using it effectively. That means that if you size your SGA properly, memory WILL be used eventually, up
to and including your sga_target setting. And before ORACLE 11g this memory can never be released (short
of restarting the instance). So, do not assume that you can run five instances with a 4 Gb SGA each on a system
with 8 Gb of physical RAM without suffering the consequences eventually.
And lastly, this is not to say that you can allocate ANY amount of RAM to the instance just because that
memory may not be used (and hence allocated) immediately. At some point you will hit the dreaded ORA-00064:
object is too large to allocate on this O/S error, and you might or might not easily get around that. My
guess is that ORACLE did it mostly to discourage people from requesting 100 Tb SGAs on a 2 Gb machine.
This is, of course, all well and good: it is nice to know that memory is not used unless requested and that you can
enjoy plenty of resources in the system, for some time at least. But you have probably heard that sizing the SGA
properly from the start, and potentially pinning it in physical memory, is much better, as it prevents AIX from
paging out the most critical memory at the most inopportune times.
In the next post, we are going to discuss the ORACLE and AIX ways to do just that.
Useful Commands
# ORACLE command to show which AIX shared memory segment
# is "attached" to the instance
AIX> sysresv
# Basic information about SGA shared memory segment
# (use sysresv to find out segment id or key)
AIX> ipcs -ma | grep segment_id
# The list of VMM segments (virtual segment ids) that comprise SGA segment
AIX> ipcs -bmS1 | grep segment_id
# Detailed information about individual SGA segments
AIX> svmon -S $(ipcs -bmS1 | grep segment_id | perl -pe "s/(\S+\s+){7,7}//")
# The same information, but through a different avenue
# All ORACLE instance processes attach THE SAME (SGA)
# shared memory segments.
# So, we check shmat attachments in any process (i.e. SMON) for details
AIX> svmon -P $(ps -elf | egrep " ora_smon_${ORACLE_SID} " |
grep -v egrep | awk '{print $4}') | grep shmat
# All VMM shared memory segments that belong to $ORACLE_SID
# You can download the script from the TOOLS area on this site
AIX> omem_shared.sh $ORACLE_SID
In the previous post we discussed how ORACLE allocates shared memory for SGA in AIX, and one of the
conclusions was that AIX does not give all the requested memory to the ORACLE instance right away but merely
promises it.
While this technique uses memory more efficiently and (at least temporarily) lets you request more
memory for processes and shared segments than AIX physically has, it also has a rather unpleasant
consequence: when we get to the limit of physical memory, AIX will have no choice but to start paging
memory.
Paging is not necessarily a bad thing. Moving older and not-so-often used data out of memory is something
that will be done rather routinely; this is how AIX keeps a healthy system. However, when SGA memory
starts to page out (and, more importantly, page back in), things can go bad quickly as, well, ORACLE does not
really expect SGA to be a disk-based area (ORACLE would have called it SDA if that was the case).
You probably know that in the vast majority of configurations, it is strongly advised to size the SGA so that it fits
entirely into physical memory and never pages out. The question becomes: how can we accomplish that on
AIX?
Pinning ORACLE SGA into AIX Memory
It turns out that there are several ways to pin ORACLE SGA into AIX memory: some of them ORACLE-driven, some AIX-driven, and some a combination of both.
First of all, let's look at what ORACLE offers.
We will start by checking ORACLE's sga-related parameters:
SQL> SHOW parameter sga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 8G
sga_target                           big integer 8G
The first two parameters (lock_sga and pre_page_sga) look promising and, in fact, they can be used to control
how SGA memory is allocated.
Let's look at pre_page_sga first.
Controlling memory allocation with pre_page_sga
According to ORACLE documentation, when pre_page_sga is set to true, every process that starts must
access every page in the SGA. Obviously, when this happens, the entire SGA memory is used and thus allocated.
Let's see for ourselves. Before we even begin, let's remind ourselves how memory is allocated when this
parameter is NOT set.
As you can see, in the beginning, most of the memory is under-allocated (AIX promised it but did not yet
deliver), as not all of the memory has been used.
After setting pre_page_sga to TRUE and restarting the database, the picture changes:
Notice that all segments are allocated to the MAX. That is the result of instance processes reading and touching
all the memory pages during startup. This obviously has a direct effect on the time it takes to start up the
database: in my environment it took ~ 40 seconds (compared to ~ 12 seconds with default settings) for a 4 Gb
SGA. Presumably, however, this additional time has not been wasted: all further requests to SGA memory are
supposed to hit real physical memory and AIX will not need to do any additional allocations.
Still, there are two problems with this approach:
1. Notice that the memory, although fully allocated, is NOT really pinned. That means that if AIX starts
experiencing memory shortages, you can bet that it will start paging SGA memory out, with all the
unpleasant consequences.
2. A somewhat unexpected consequence is that it now takes more time for any ORACLE process to
start, as the touching is not done just during instance startup: it happens for any new
ORACLE process (e.g. a dedicated server). In my environment, average database connection time went
from ~ 0.2 seconds to ~ 0.8 seconds, a fourfold increase.
Given these downsides, it is really hard to find a good justification for using pre_page_sga to load the ORACLE
SGA into memory. I'm guessing this parameter is probably a relic of the past or, perhaps, a way to pre-load
memory on systems that do not support real memory pinning (remember that ORACLE can run on many
operating systems). But in modern AIX, I just do not see how it can be effectively used.
So, let's move on to the next parameter: lock_sga.
Controlling memory allocation with lock_sga
When lock_sga is set to true, ORACLE (based on what truss output shows) runs this additional call on the
(global, think ipcs) shared memory segment:
shmctl(..., SHM_LOCK, ...)
which pins the shared memory region into physical memory.
Let's see how it works. After setting lock_sga=true and restarting the database, here is what I see:
Notice that memory is not only fully allocated but also pinned, and this is really what we want to achieve.
The database startup still takes more time than without this parameter (on my system, ~ 34 seconds compared
to ~ 12 seconds, again, for a 4 Gb SGA), but normal database connections no longer suffer as, beyond
startup, ORACLE processes do not need to do any (major) extra work.
One note here: many ORACLE documentation sources recommend also setting the v_pinshm AIX vmo parameter
to enable memory pinning, as in:
vmo -p -o v_pinshm=1
However, this is no longer required, unless you are dealing with a really old version of ORACLE.
With versions up to 9i, ORACLE used a different call for memory pinning:
shmget(IPC_PRIVATE, shm_size, IPC_CREAT|SHM_PIN)
which required that v_pinshm also be set. As I mentioned, in 10g and beyond ORACLE uses:
shmctl(shm_id, SHM_LOCK, ...)
which completely ignores the v_pinshm setting (special thanks to Leszek Leszczynski for researching this in detail).
In my tests, ORACLE 10g/11g memory was pinned regardless of the value of v_pinshm. You can, of course,
still set it if you need it for ORACLE 9i or for other applications.
In any case, it looks like setting lock_sga=true (and v_pinshm=1, if needed) solves the problem of SGA pinning
to our satisfaction: memory is pinned and everybody is happy.
But I would submit that for larger SGAs (and what SGA these days is NOT large?) there is a better
way to work with AIX memory, and that is using AIX large memory pages.
Page size  Svmon symbol  Configuration           Pageable  AIX setup                                                How to use
4K         s             Traditional, automatic  YES       N/A                                                      By default
64K        m             Automatic               YES       N/A                                                      By default
16M        L             Manual                  NO        vmo -p -o lgpg_regions=2048 -o lgpg_size=16777216        chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
                                                                                                                    lock_sga=TRUE
16G        S             Manual                  NO        vmo -p -o lgpg_regions=10 -o lgpg_size=17179869184       chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
                                                                                                                    lock_sga=TRUE
As you can see, 4K pages are still there and still the default, but AIX has now also added new default 64K pages.
Default in this context means that you do not need to do anything special to either enable these pages or use
them: AIX will decide when to use 64K pages instead of 4K, and this will be done completely transparently to
programs (including ORACLE, of course). In fact, on modern hardware, you would most likely see 64K pages
used by the ORACLE SGA, as a (large) SGA size will definitely warrant them.
There is also an interesting development with AIX 6.1. While AIX 5.3 can allocate 64K pages from the start,
AIX 6.1 can take existing 4K memory regions and see if they can be collapsed from 4K to 64K pages.
svmon will show collapsed regions as sm.
But back to memory pages. Beyond medium (64K) pages, AIX also allows the use of even larger pages: 16M or
16G. However, there are two important differences here:
1. Large pages are NOT available by default. They require extra steps to enable them and (separately) to
use them.
2. Large pages are NOT pageable. Once allocated, they always stay in memory and cannot be paged in or
out (which is probably a good thing, but you do need to pay special attention to how you size them).
In addition to that, not all AIX hardware will support larger pages. To see if your particular hardware supports
them run:
AIX> pagesize -a
4096
65536
16777216
17179869184
The one problem with large pages is that they are somewhat cumbersome to use.
First of all, you have to pre-calculate the large page memory size and explicitly set it with the VMM (this will
take memory away from regular VMM operations and dedicate it to the large page region).
AIX> vmo -p -o lgpg_regions=2048 -o lgpg_size=16777216
AIX> bosboot -ad /dev/hdisk0; bosboot -ad /dev/hdisk1; bosboot -ad /dev/ipldevice
Personally, I do not see it as a major issue, as you have to assign a specific size for your SGA anyway, albeit
now you will have to do it on 2 levels: ORACLE and AIX.
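The pre-calculation itself is simple division. Here is a sketch that assumes 16M large pages and an SGA size given in whole megabytes; the function name is hypothetical, and on a real system you would probably want some headroom on top of the bare SGA size.

```shell
# Number of 16M large pages (lgpg_regions) needed to hold an SGA.
# Usage: lgpg_regions SGA_MB   (integer megabytes; rounds up to a whole page)
lgpg_regions() {
    echo $(( ($1 + 15) / 16 ))
}

lgpg_regions 32768    # a 32 Gb SGA
```

Read the other way around, lgpg_regions=2048 in the vmo command above reserves 32 Gb of large page memory.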
Second, even when allocated, large pages cannot be used unless you allow the user to skip regular VMM allocation
policies with this command:
AIX> chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE $USER
which, again, in my mind is only a minor nuisance.
(Of course, there are also bugs: in particular, ORACLE 10.2.0.4 will not use AIX large pages even if all
settings are made, unless one-off patch 7226548 is applied. But I digress.)
Anyway, now we know what large pages are, but why exactly do we want to use them?
Well, for larger SGA sizes the benefits should be fairly obvious: making page sizes larger reduces the number
of pages that AIX has to manage, and that makes managing memory more efficient. Think about this: for a 30
Gb SGA (which is not excessively big these days), the number of pages is reduced from 7,864,320 (for 4K
pages) or 491,520 (for 64K pages) to 1,920 if we switch to 16M pages, and that is, indeed, quite a saving.
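These page counts are easy to reproduce. A minimal sketch, assuming the SGA size is given in whole gigabytes and the page size in kilobytes (the function name is made up):

```shell
# Number of pages needed to map SIZE_GB gigabytes with PAGE_KB-sized pages.
# Usage: page_count SIZE_GB PAGE_KB
page_count() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

page_count 30 4        # 4K pages
page_count 30 64       # 64K pages
page_count 30 16384    # 16M pages
```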
For example, look at how this reduction affects database startup time (test results from one of my systems):
- the 30 Gb SGA database started in ~ 6 seconds with default settings (but remember that memory is not
  really fully allocated)
- lock_sga=TRUE with 64K pages changed that startup time to ~ 35 seconds
- lock_sga=TRUE + large (16M) pages drove the startup time back to ~ 6 seconds
On top of that, once you set up large pages, you have effectively shielded this memory from the rest of the system:
it will not be paged out or affected by regular memory operations, which is, ultimately, what you want to
achieve in most cases.
Finally, once allocated, how will large page memory be reported by svmon? Well, see for yourself:
This would normally conclude the AIX memory story, if not for one thing: ORACLE 11g made major changes
in this area, making SGA, in addition to PGA, much more dynamic.
In the next post we are going to have some fun with AIX and ORACLE 11g memory_targets.
How ORACLE Uses Memory on AIX. Part 4: Having Fun with 11g Memory_target
This is going to be a long post, but don't be discouraged: most of it will involve snapshots and screen examples,
so it shouldn't be too bad.
Anyway, here is a short recap of the previous 3 posts (Part 1, Part 2, Part 3):
- ORACLE Instance Memory consists of 2 parts: process memory and shared (SGA) memory.
- Process memory is a bunch of memory segments allocated in individual ORACLE processes, and their
  collective size is (attempted to be) managed by the pga_aggregate_target parameter. AIX improves process
  memory usage by identifying sharable segments (such as program or shared library text) and not
  duplicating them for each individual process.
- SGA memory is allocated as a single AIX shared memory segment (which, in reality, turns out to be a
  bunch of smaller VMM segments) and (in ORACLE 10g) is managed by the sga_target and sga_max_size
  parameters. AIX, by default, helps with shared memory usage by allocating it only as needed. However,
  you can override this behavior and force AIX to allocate all the shared memory at once and,
  additionally, put a pin on it in order to prevent paging.
If you read these descriptions carefully, you will notice that process and SGA memory, while being two sides
of the same coin, are very different from each other: they are allocated by AIX differently, they are managed by
ORACLE differently, and they, in a sense, almost feel different.
In other words, while these two memory regions are related to each other, they are by no means close relatives:
one is a big chunk of almost static memory that lives completely independently of any process, and the other is
an amorphous and ever-changing haze of small memory pieces that are completely privatized by individual
processes.
Still, memory is memory is memory, and the ultimate question for every DBA is:
The Ultimate Question ...
How to size ORACLE memory properly so that the database runs efficiently on this particular machine?
In other words, what we really want to know is the Total Memory that is taken by the instance. And the fact that we
have to (artificially) deal with two separate memory regions here brings a couple of complications.
The first complication is rather cosmetic: obviously, dealing with one thing is easier than with two. Yet
sizing two things is still pretty straightforward (plus, it gives us more control), so we could almost dismiss it as
merely an inconvenience, if not for the second complication.
The second complication is more fundamental: process and SGA memory regions have rather different
purposes. SGA memory is mostly used to cache database blocks, while process memory (the part that is manageable by
ORACLE) is used for various temporary areas: sorting, hashing, etc. It is unlikely that both SGA and PGA
will be used to the max simultaneously at any given time. Yet they are usually both sized to the max, as it is
likely that each may be used to the max individually at one time or another.
I guess you see where I'm going with this: with a Berlin Wall between the SGA and PGA memory pools, it is quite
possible that one pool will be starving while the other is only lightly used. Traditionally, this situation has been
addressed by increasing the physical memory of the system to fit both pools (which is the definition of waste), but
perhaps there is a better way.
And a better way is indeed what ORACLE introduced with 11g: an ability to manage these two pools together
and shift memory to the part where it is most needed.
But enough with the theory. Let's see exactly what ORACLE did.
Static SGA/PGA in ORACLE 10g
Let's start by establishing a baseline: before we see how memory is shifted between SGA and PGA in
11g, let's see how it is NOT shifted in ORACLE 10g (so that we can better appreciate this new 11g feature).
Here is how we are going to do that:
-- We request 2 Gb for SGA and 2 Gb for PGA
SQL> SHOW parameter target
...
pga_aggregate_target                 big INTEGER 2G
sga_target                           big INTEGER 2G
Let's now use SGA to the fullest. We could have done it by running a full table scan on our table, but, in this
case, we have an even easier way to fill up the SGA: collect table statistics.
SQL> EXEC dbms_stats.gather_table_stats(USER, 't', estimate_percent => NULL);
After statistics collection has conveniently filled up the database buffer cache, here is what our SGA looks like:
AIX> omem_shared.sh
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
6886f       s       0.00       0.00       0.00       0.00
206a6       s      64.03       0.00       0.00      64.03
8743        s      91.21       0.97       0.00      91.21
7876d       s     255.88       0.00       0.00     255.88
d0718       s     255.94       0.00       0.00     255.94
988b1       s     255.94       0.00       0.00     255.94
89e3        s     255.94       0.00       0.00     255.94
287a7       s     255.94       0.00       0.00     255.94
7036c       s     255.94       0.00       0.00     255.94
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           1690.81       0.97       0.00    1690.81
Requested SGA: 2048.01
Segments: 9
Notice that SGA is now using ~80% of the requested capacity (the unallocated portion is reserved for things
other than buffer cache).
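The ~80% figure can be checked directly from the TOTAL and Requested SGA lines of the output. A minimal sketch (pct_in_use is just an illustrative helper name, not part of the omem scripts):

```shell
#!/bin/sh
# Percentage of the requested SGA that is currently in memory.
# $1 = InMem total (Mb), $2 = Requested SGA (Mb), both taken from the output above.
pct_in_use() {
    awk -v used="$1" -v req="$2" 'BEGIN { printf "%.1f\n", used / req * 100 }'
}

pct_in_use 1690.81 2048.01
```

This prints 82.6, which is in line with the rough ~80% estimate.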
Ok, now let's test the sorting. To maximize requirements for PGA, we are going to start 10 parallel sessions,
each of which will be doing a full sort of the table:
AIX> cat order.sql
set autotrace traceonly
SELECT * FROM t ORDER BY c DESC;
exit
AIX> cat order.ksh
#! /usr/bin/ksh
integer i=0
while ((i < 10));
do
echo "Starting sqlplus: $i";
sqlplus user/password @order.sql &
(( i = i + 1));
done
AIX> order.ksh
And after a few minutes of work, here is what SGA and PGA memory look like:
AIX> omem_shared.sh;omem_proc.sh
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
6886f       s       0.00       0.00       0.00       0.00
206a6       s      64.03       0.00       0.00      64.03
8743        s     100.84       0.00       0.00     100.84
7876d       s     255.88       0.00       0.00     255.88
d0718       s     255.94       0.00       0.00     255.94
988b1       s     255.94       0.00       0.00     255.94
89e3        s     255.94       0.00       0.00     255.94
287a7       s     255.94       0.00       0.00     255.94
7036c       s     255.94       0.00       0.00     255.94
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           1700.43       0.00       0.00    1700.43
Requested SGA: 2048.01
Segments: 9
...
7036c       s     255.94       0.00       0.00     255.94
89e3        s     255.97       0.00       0.00     255.97
d0718       s     255.99       0.00       0.00     255.99
287a7       s     256.00       0.00       0.00     256.00
988b1       s     256.00       0.00       0.00     256.00
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           1804.65       0.00       0.00    1804.65
Requested SGA: 2048.01
Segments: 9
Sizing                      Result   Comment
sga_target UP
sga_target DOWN                      Nothing happens
pga_aggregate_target UP
pga_aggregate_target DOWN            Eventually
PGA is really pretty dynamic, even in 10g: it can be sized UP and DOWN, and this operation is
dynamic and almost immediate (depending on workload).
SGA is somewhat dynamic as well: it can be sized UP (and this is almost immediate if there is a pressing
need), but it cannot be sized DOWN. The memory that SGA grabs will stay with SGA forever (or, at least,
until the next database restart).
But a more important note is that there is really no communication and no relation between SGA and PGA in
ORACLE 10g. In other words:
The wall between SGA and PGA in ORACLE 10g is rock solid
Let's see if anything changed with ORACLE 11g ...
(Hopefully) Dynamic SGA/PGA in ORACLE 11g
With 11g, SGA and PGA are presumably managed as one memory area, so, to keep things honest, we are going to
set 11g memory_target to the same size as the combined SGA+PGA in our 10g example and repeat
the tests:
big INTEGER 4G
big INTEGER 0
big INTEGER 0
-- And, we are creating the same exact 3 Gb test table as with 10g database
SQL> CREATE TABLE t (n, c) NOLOGGING PARALLEL PCTFREE 90 PCTUSED 10
AS SELECT level, CAST(level AS CHAR(2000))
FROM dual CONNECT BY level <= 393216
/
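The "3 Gb" size of this table can be sanity-checked with a little arithmetic. This is only a rough sketch under two assumptions that are not stated in the text: the database uses 8 Kb (8192-byte) blocks, and PCTFREE 90 together with a 2000-byte row leaves exactly one row per block. The helper name table_size_mb is made up for this example.

```shell
#!/bin/sh
# Rough table size in Mb, counting one block per row.
# ASSUMPTIONS (not from the text): 8192-byte blocks, exactly one row per block.
table_size_mb() {
    # $1 = number of rows, $2 = block size in bytes
    awk -v rows="$1" -v bs="$2" 'BEGIN { printf "%d\n", rows * bs / 1048576 }'
}

table_size_mb 393216 8192
```

Under those assumptions the result is 3072 Mb, i.e. the 3 Gb mentioned in the comment.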
Let's look at the state of SGA memory initially:
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
988b1       s       0.00       0.00       0.00       0.00
6886f       s       0.00       0.00       0.00       0.00
206a6       s       0.00       0.00       0.00       0.00
287a7       s       0.00       0.00       0.00       0.00
8743        s       0.00       0.00       0.00       0.00
68a4f       s       0.00       0.00       0.00       0.00
608ce       s       0.06       0.00       0.00       0.06
98911       s       4.18       0.00       0.00       4.18
58ae9       s       9.56       0.00       0.00       9.56
f091c       s       9.56       0.00       0.00       9.56
d8919       s       9.56       0.00       0.00       9.56
58a69       s       9.56       0.00       0.00       9.56
e08fe       s       9.56       0.00       0.00       9.56
d0718       s      22.00       0.00       0.00      22.00
d0a78       s      24.96       0.00       0.00      24.96
d0938       s      62.19       0.00       0.00      62.19
387e5       s     110.36       0.00       0.00     110.36
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:            271.58       0.00       0.00     271.58
Requested SGA: 4096.01
Segments: 17
Notice the first major change: VMM segments are now allocated for the entire 4 Gb memory_target, that is,
ORACLE gives itself an option to use all of the instance memory for the SGA (of course, it is extremely
unlikely that this will ever happen). Also notice that the potential MAX size of the 7 VMM segments that are all (or
mostly) zeroes (7 * 256 Mb = 1792 Mb) is close to the current dynamic value of PGA Target (1728053248 bytes, or 1648 Mb):
SQL> SELECT component, current_size
FROM v$memory_dynamic_components
WHERE component LIKE '%Target'
/
COMPONENT       CURRENT_SIZE
--------------- ------------
SGA Target        2566914048
PGA Target        1728053248
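Since the query reports bytes while the segment listing is in Mb, a quick conversion helps to compare the two. A minimal sketch (to_mb is an illustrative helper, and 1 Mb is taken as 1048576 bytes):

```shell
#!/bin/sh
# Convert a byte count from v$memory_dynamic_components to Mb (1 Mb = 1048576 bytes).
to_mb() {
    awk -v b="$1" 'BEGIN { printf "%d\n", b / 1048576 }'
}

sga_mb=$(to_mb 2566914048)
pga_mb=$(to_mb 1728053248)
echo "SGA Target: $sga_mb Mb"
echo "PGA Target: $pga_mb Mb"
echo "Sum:        $((sga_mb + pga_mb)) Mb"
echo "7 empty segments at 256 Mb each: $((7 * 256)) Mb"
```

The two targets come out as 2448 Mb and 1648 Mb, which add up to exactly the 4096 Mb memory_target, while the 7 empty segments could hold up to 1792 Mb.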
Now let's give it a spin and load our test table into memory ...
Here we are actually faced with a slight problem: neither dbms_stats.gather_table_stats() nor SELECT /*+
full(t) */ * FROM t seems to fill the 11g cache completely. Instead, only a small portion of the cache is used and
memory allocation remains largely the same (this is consistent with the ORACLE documentation, which states that
full table scan blocks are kept at the end of the buffer cache and recycled).
By itself, this is a hell of a new feature, but it does screw up our test. So, as a workaround, we are going to
take a slightly longer road and fill the database cache with lots of smaller SQLs that touch only a few blocks each:
SQL> CREATE INDEX t_idx ON t(n) NOLOGGING PARALLEL;
DECLARE
TYPE tC_t IS TABLE OF CHAR(2000);
tC tC_t;
i NUMBER;
BEGIN
i := 1;
while i <= 393216 loop
SELECT c bulk collect INTO tC FROM t WHERE n BETWEEN i AND i+1000;
i := i+1000;
END loop;
END;
/
Ok, now we are talking: the buffer cache is fully allocated (again, the unallocated portion is reserved for the shared
pool, etc.):
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
988b1       s       0.00       0.00       0.00       0.00
6886f       s       0.00       0.00       0.00       0.00
206a6       s       0.00       0.00       0.00       0.00
287a7       s       0.00       0.00       0.00       0.00
8743        s       0.00       0.00       0.00       0.00
68a4f       s       0.00       0.00       0.00       0.00
608ce       s       0.06       0.00       0.00       0.06
d0718       s      22.01       0.00       0.00      22.01
98911       s     111.92       0.00       0.00     111.92
387e5       s     138.38       0.00       0.00     138.38
d0938       s     239.83       0.00       0.00     239.83
58ae9       s     255.81       0.00       0.00     255.81
f091c       s     255.81       0.02       0.00     255.81
d8919       s     255.81       0.00       0.00     255.81
58a69       s     255.81       0.00       0.00     255.81
e08fe       s     255.81       0.00       0.00     255.81
d0a78       s     255.82       0.00       0.00     255.82
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           2047.09       0.02       0.00    2047.09
... and we are ready to test memory shifting. Let's start the sorting sessions:
AIX> order.ksh
AIX> omem_shared.sh;omem_proc.sh
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
988b1       s       0.00       0.00       0.00       0.00
6886f       s       0.00       0.00       0.00       0.00
206a6       s       0.00       0.00       0.00       0.00
287a7       s       0.00       0.00       0.00       0.00
8743        s       0.00       0.00       0.00       0.00
68a4f       s       0.00       0.00       0.00       0.00
608ce       s       0.06       0.00       0.00       0.06
d0718       s      22.01       0.00       0.00      22.01
98911       s     111.92       0.00       0.00     111.92
387e5       s     126.69       0.00       0.00     126.69
d0938       s     239.83       0.00       0.00     239.83
58ae9       s     255.81       0.00       0.00     255.81
f091c       s     255.81       0.00       0.00     255.81
d8919       s     255.81       0.00       0.00     255.81
58a69       s     255.81       0.00       0.00     255.81
e08fe       s     255.81       0.00       0.00     255.81
d0a78       s     255.82       0.00       0.00     255.82
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           2035.40       0.00       0.00    2035.40
Requested SGA: 4096.01
Segments: 17
...
e08fe       s     255.81       0.00       0.00     255.81
d0a78       s     255.82       0.00       0.00     255.82
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           2168.62       0.00       0.00    2168.62
Requested SGA: 4096.01
Segments: 17
Shifting memory between SGA and PGA in ORACLE 11g does work.
However, by default, this process is very slow and gradual. Memory is not constantly moved back
and forth between SGA and PGA (that would kill performance for sure).
Still, if necessary, it is possible to override the default ORACLE behavior and move memory to the
other pool immediately.
If immediate changes are needed, sizing the target region UP (rather than sizing the other region DOWN)
will give you results much faster.
The last point begs for an explanation, and I believe this is a result of ORACLE being lazy (or, perhaps, smart).
In other words, ORACLE will not size a memory region DOWN immediately, because it does not have to
(remember that the PGA and SGA targets are minimum requirements). On the other hand, sizing SGA (or PGA)
UP makes ORACLE do it at once (and, in my tests, at least for SGA, it happened with NO workload present).
In this case, ORACLE really has to do it, as a new minimum requirement has to be met.
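For reference, "sizing a region UP" comes down to raising the corresponding target with ALTER SYSTEM. The sketch below only builds the statement text; resize_stmt is a hypothetical helper and the 2G value is a placeholder, not a recommendation. SCOPE=MEMORY is used here on the assumption that the change should affect only the running instance.

```shell
#!/bin/sh
# Build an ALTER SYSTEM statement that raises one memory target.
# SCOPE=MEMORY changes only the running instance (nothing is written to the spfile).
resize_stmt() {
    # $1 = parameter name (e.g. pga_aggregate_target), $2 = new size (e.g. 2G)
    printf 'ALTER SYSTEM SET %s=%s SCOPE=MEMORY;\n' "$1" "$2"
}

# Print the statement for review; to actually run it, pipe it to sqlplus, e.g.
#   resize_stmt pga_aggregate_target 2G | sqlplus -s "/ as sysdba"
resize_stmt pga_aggregate_target 2G
```

Printing the statement first, rather than executing it right away, leaves room to double check the parameter name and size before touching a live instance.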
What happens when you set pre_page_sga=TRUE with memory_target?
Honestly, what happens is rather weird: without manual tweaking, both SGA and PGA memory are allocated to
the MAX, at least, according to svmon:
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
...
10680       s     256.00       0.00       0.00     256.00
8a63        s     256.00       0.00       0.00     256.00
---------- -- ---------- ---------- ---------- ----------
Vsid       Pg      InMem        Pin     Paging    Virtual
---------- -- ---------- ---------- ---------- ----------
TOTAL:           4096.00       0.00       0.00    4096.00
Requested SGA: 4096.01
Segments: 17
However, when manually tweaked (i.e. when you increase pga_aggregate_target), the database behaves exactly
like it does with pre_page_sga=FALSE. In this particular example, part of SGA memory is de-allocated (and
shifted to PGA).
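The segment counts reported in these listings follow from the requested size. A minimal sketch, assuming that every segment holds at most 256 Mb (segments_needed is an illustrative helper name):

```shell
#!/bin/sh
# Number of segments needed to cover a requested SGA size, rounded up.
# ASSUMPTION: every segment can hold at most 256 Mb.
segments_needed() {
    # $1 = requested size in Mb, $2 = segment size in Mb
    awk -v req="$1" -v seg="$2" 'BEGIN {
        n = int(req / seg)
        if (n * seg < req) n++
        print n
    }'
}

segments_needed 2048.01 256    # the 10g example
segments_needed 4096.01 256    # the 11g example
```

This prints 9 and 17, matching the "Segments:" lines for the 10g and 11g instances: 16 full segments cover 4096 Mb, and the extra 0.01 Mb needs a 17th.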
What happens when you set lock_sga=TRUE with memory_target?
This one is simple. When you attempt to start the database, it gives you this nice error:
SQL> startup OPEN
ORA-00847: MEMORY_TARGET/MEMORY_MAX_TARGET and LOCK_SGA cannot be set together
Which also means that you cannot use memory_target with AIX large pages.
This should come as no surprise, though: you cannot have it both ways ...
This concludes my memory story. I hope that these modest explanations have been useful and have given you some
insight into how ORACLE uses AIX memory.
If you want to know more, please refer to these other excellent documents that describe memory behavior for
AIX as well as other UNIX operating systems.
Useful Links
Tanel Poder on Memory_Target in Linux
Tom Kyte on Memory Target
Tanel Poder on ORACLE Memory Usage in Solaris
AIX White Paper on Multiple AIX Page Support
Overview of AIX Process Memory Regions
AIX Performance Presentation by Steve Nasypany that includes an excellent overview of memory usage