
Introduction to performance tuning for HP-UX

Written by Geoff Wild
Thursday, 07 June 2007 22:57
NOTE: The following article was written by: James Tonguet

When considering the performance of any system it is important to determine a baseline of what is acceptable.
How does the system perform when there is no load from applications or users?
What are the system's resources in terms of memory, both physical and virtual?
How many processors does the system have?
What is their speed and PA-RISC level?
What is the layout of the data?
What are the key kernel parameters set to?
How are those resources being utilized?
What are the utilities to measure these?
Memory Resources
HP-UX utilizes both physical memory (RAM) and disk memory, referred to as swap.
There are three resources that can be used to determine the amount of RAM: syslog.log, dmesg, and adb (the absolute debugger).
The information dmesg reports comes from /var/adm/syslog/syslog.log.
While using dmesg is convenient, if the system has logged too many errors recently, the memory information may no longer be available.
Insufficient memory resources are a major cause of performance problems, and should be the first area to check.
The memory information from dmesg is at the bottom of the output.
example:
Memory Information:
physical page size = 4096 bytes, logical page size = 4096 bytes
Physical: 524288 Kbytes, lockable: 380880 Kbytes, available: 439312 Kbytes
Using adb reads the memory from a more reliable source, the kernel.
To determine the physical memory (RAM) using adb:
for HP-UX 10.X
example:
echo physmem/D | adb /stand/vmunix /dev/kmem
physmem:
physmem: 24576
for HP-UX 11.X systems running on 32 bit architecture:
example:
echo phys_mem_pages/D | adb /stand/vmunix /dev/kmem
phys_mem_pages:
phys_mem_pages: 24576
for HP-UX 11.X systems running on 64 bit architecture:
example:

echo phys_mem_pages/D | adb64 /stand/vmunix /dev/mem


phys_mem_pages:
phys_mem_pages: 262144
The results of these commands are in 4 Kb memory pages; to determine the size in bytes, multiply by 4096.
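For example, the 64-bit result above works out to 262144 pages x 4096 bytes = 1 Gb of RAM. A minimal sketch of doing the conversion in one pipeline (assuming the adb64 invocation shown above):

echo phys_mem_pages/D | adb64 /stand/vmunix /dev/mem | \
awk '$2 ~ /^[0-9]+$/ { printf "RAM: %d Mb\n", $2 * 4096 / 1048576 }'
# prints "RAM: 1024 Mb" for the 262144-page example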
To fully utilize all of the RAM on a system there must be a sufficient amount of virtual memory to accommodate all processes that will be opened on the system. The HP recommendation is that virtual memory be at least equal to physical memory plus application size.
This is outlined in the System Administration Tasks Manual.
To determine virtual memory configuration run the following command:
#swapinfo -tam
example:
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE PRI  NAME
dev        1024       0    1024    0%       0       -   1  /dev/vg00/lvol1
reserve       -     184    -184
memory      372      96     276   26%
total      1396     280    1116   20%       -       0   -
The key areas to monitor are reserve, memory and total.

For a process to spawn it needs a sufficient amount of virtual memory to be placed in reserve. There should be a sufficient amount of free device swap to open any processes that may be spawned during the course of operations. By subtracting the reserve from the device total you can determine this value.
In the example above, 184 Mb of device swap has been reserved; this leaves 840 Mb to open up processes or for paging to disk.
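A minimal sketch of scripting this check (field positions assume the -tam layout shown above; adjust them if your swapinfo output differs):

swapinfo -tam | awk '$1 == "dev" { avail += $2 } $1 == "reserve" { res += $3 }
    END { printf "%d Mb of device swap not yet reserved\n", avail - res }'
# prints "840 Mb of device swap not yet reserved" for the example above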
If there is an insufficient amount of device swap available, the system will use RAM to reserve memory for the fork call. This is an inefficient use of fast memory. If there is an insufficient amount of available memory to fork, you will receive the error: cannot fork : not enough virtual memory.
If this error is received, you will need to allocate more device swap. This should be configured on a disk with no other swap partitions, and ideally of the same size and priority as existing swap logical volumes to enable interleaving.
Refer to the Application Note KBAN00000218, Configuring Device Swap, for details on the procedure.
The memory line is enabled when the kernel parameter swapmem_on is set to 1. This allows a percentage of RAM to be allocated as pseudo-swap. This is the default and should be used unless the amount of lockable memory exceeds 25% of RAM.
You can determine the amount of lockable memory by running the command:
example:
echo total_lockable_mem/D | adb /stand/vmunix /dev/mem
total_lockable_mem:
total_lockable_mem: 185280

This will return the amount in Kbytes of lockable memory in use.


Divide this by 1024 to get the size in megabytes, then divide by the amount of RAM in megabytes to determine the percentage.
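A minimal sketch of the whole calculation (the RAM size of 512 Mb is taken from the dmesg example earlier; substitute your own value):

echo total_lockable_mem/D | adb /stand/vmunix /dev/mem | \
awk -v ram_mb=512 '$2 ~ /^[0-9]+$/ { printf "lockable memory: %.0f%% of RAM\n", ($2 / 1024) / ram_mb * 100 }'
# 185280 Kb / 1024 = 181 Mb, which is roughly 35% of 512 Mb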
To avoid memory contention between the buffer cache and pseudo-swap, dbc_max_pct should not be greater than the difference between lockable memory and pseudo-swap. Some overlap is acceptable under most conditions. As pseudo-swap is used to prevent paging to disk, and thus reduces disk I/O traffic, it should be a significant consideration when configuring memory.
If pseudo-swap is disabled by setting swapmem_on to 0, there will typically be a need to increase the amount of device swap in the system to accommodate paging and the reserve area. Ideally, in a modern system, paging to disk should be avoided. If there is significant paging to disk and the buffer cache has been adjusted to avoid contention, adding RAM would be advisable for maximum performance.
After physical and virtual memory are determined, we need to determine how much buffer cache has been configured and how much is being used. By default the system will use a dynamic buffer cache; the kernel will show bufpages and nbuf set to 0 in SAM. The parameters that govern the size of the dynamic buffer cache are dbc_min_pct and dbc_max_pct; these define the minimum and maximum percentage of RAM allocated. The default values are 5% minimum and 50% maximum.
On systems with small amounts of RAM these values may be useful for dedicated applications. Since the introduction of HP-UX 11.0 the amount of RAM a system can have has increased from 3.75 Gb to our newest systems with up to 256 Gb. Keeping the default values on systems with a large amount of RAM can have a negative impact on performance, due to the time taken by the lower-level routines that check on free memory in the cache.
To monitor the use of the buffer cache run the following command :
sar -b 5 100
You will see output similar to :
bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
      0      95     100       1       2      54       0       0
The statistical average will be reported at the end of the report. Ideally we want to see an average %wcache of 95 or greater. If the system consistently shows %wcache less than 75 it would be advisable to lower the value of dbc_max_pct. In 32-bit architecture, the buffer cache resides in quadrant 3, limiting the maximum size to 1 Gb. Large and volatile buffer caches can have a negative impact on performance; normally no more than 300 Mb is required to provide a sufficient buffer cache.
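If dbc_max_pct needs to be lowered, a minimal sketch of the change (the 20% figure is only an illustration; SAM can be used instead of kmtune, and the new value takes effect after the kernel is rebuilt and the system rebooted):

kmtune -s dbc_max_pct=20
mk_kernel
kmupdate
# schedule a reboot to activate the new kernel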
Keep in mind that many modern disk arrays buffer their writes with onboard memory; also, many databases use lockable memory to buffer within the database.
Note: Buffers remain in the cache even after a file is closed, as they could be used again in the future.
Trade-offs are associated with either a static or dynamic buffer cache.
If memory pressure exists, a static buffer cache cannot be reduced, potentially causing more important pages to be swapped out or processes to be deactivated. In contrast, some overhead exists in managing the dynamic buffer cache, such as the dynamic allocation of the buffers and managing the buffer cache address map or buffer cache virtual bitmap. Also, a dynamic buffer cache expands very rapidly, but contracts very slowly and only when memory pressure exists.
It is possible to bypass either static or dynamic buffer caches; in some instances this allows for faster disk I/O.
This can be accomplished with the Online JFS mount options mincache=direct and convosync=direct. Other options are raw I/O, asynchronous writes to raw logical volumes, discovered_direct_io, and ioctl. These topics are covered later in the text.
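A minimal sketch of mounting a file system with these options (the volume group, logical volume and mount point are hypothetical, and the options require Online JFS):

mount -F vxfs -o delaylog,mincache=direct,convosync=direct /dev/vg01/lvol3 /oradata

The same options can be added to the file system's entry in /etc/fstab so they persist across reboots.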
Tuning recommendations:
For databases, favor the database global area (SGA) over the buffer cache.
For most systems, 200-400 Mb of buffer cache is sufficient.
Current patches relating to the buffer cache:
10.20
PHKL_28866 (Critical, Reboot) s800 10.20 VM read-ahead panic, buffer cache, paging
PHKL_26767 (Critical, Reboot) s800 10.20 Buffer cache deadlock; write gets VX_ERETRY
11.0
PHKL_18543 (Critical, Reboot) s700_800 11.00 PM/VM/UFS/async/scsi/io/DMAPI/JFS/perf patch
11.11
PHKL_27808 s700_800 11.11 Filesystem buffer cache performance fix

Memory for applications


The following parameters configure memory resources:
maxdsiz = Maximum Data Segment Size (Bytes), 32 bit
maxdsiz_64bit = Maximum Data Segment Size (Bytes), 64 bit
maxssiz = Maximum Stack Segment Size (Bytes), 32 bit
maxssiz_64bit = Maximum Stack Segment Size (Bytes), 64 bit
maxtsiz = Maximum Text Segment Size (Bytes), 32 bit
maxtsiz_64bit = Maximum Text Segment Size (Bytes), 64 bit
shmmax = Maximum Shared Memory Segment Size (Bytes), 32 and 64 bit
For applications to have a sufficient amount of space for text, data and stack in memory, the kernel has to be tuned. The total size for text, data and stack for 32-bit systems using EXEC_MAGIC is in quadrants 1 and 2, and is at maximum 2 Gb less the size of the Uarea. These are represented by the kernel parameters maxtsiz, maxdsiz and maxssiz.
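A quick way to check the values currently built into the kernel (a sketch; SAM's kernel configuration screens report the same information):

kmtune | egrep 'maxdsiz|maxssiz|maxtsiz|shmmax'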

If there is 4 Gb of total memory, the cumulative size of data, stack and text is 1984 Mb. This represents quadrants 1 and 2 minus the Uarea in quadrant 2.
If there is less than 4 Gb of total memory, the quadrant size is 1/4 of total memory. For 64-bit systems, while the address space in each quadrant is 4 Tb, the size of the memory map is equal to the total memory of the system, and a quadrant is 1/4 of this value. When sizing memory parameters for 64 bit it is important to keep this in mind. The quadrant boundary rules still apply.
It is important to remember that the Uarea receives its memory allocation in quadrant 2 first, then stack; the remainder of available space is available for data. For HP-UX 11.X data can also occupy the free space in quadrant 1 that is not used by text. A single process cannot cross a quadrant boundary.
The last configurable area of memory to check is shared memory.
Any application running within the 32-bit architecture will have a limit of 1.75 Gb total for shared memory using EXEC_MAGIC, and 2.75 Gb using SHMEM_MAGIC.
Note: This is only true when the total memory on the system equals at least 4 Gb.
Individual processes cannot cross quadrant boundaries, so the logical 32-bit limit for maxtsiz, maxdsiz and shmmax is 1 Gb. For 64 bit, the quadrant size determines the logical limit.
Note: If a system is utilizing SHMEM_MAGIC, the additional 1 Gb of shared object space comes from quadrant 2; this means that the text, data, stack and Uarea must all come from quadrant 1.
This means maxtsiz, maxdsiz, maxssiz + Uarea can total no more than 1 Gb.
If these parameters are undersized, the system will report an error:
maxdsiz will return "out of memory"
maxssiz will return "stack growth failure"
maxtsiz will return "/usr/lib/dld.sl: Call to mmap() failed - TEXT"
As of HP-UX 11, the kernel stack (maxssiz) will receive its memory allocation before data (maxdsiz) or text (maxtsiz).
For 64-bit systems, the quadrant size is determined by dividing the total memory by 4.
It is important to determine if the application is running 32 bit or 64 bit
when troubleshooting 64 bit systems.
This can be done with the file command :
example :
file /stand/vmunix
/stand/vmunix: ELF-64 executable object file - PA-RISC 2.0 (LP64)
PA-RISC versions under 2.0 are 32 bit
For an overview of shared memory on 32-bit systems refer to the Application Note RCMEMKBAN00000027, Understanding Shared Memory on PA-RISC Systems.

The kernel parameter shmmax determines the maximum size of a shared memory segment. Unless patched, SAM will not allow this to be configured greater than 1 quadrant, or 1 Gb, even on 64-bit systems. If a larger shmmax value is needed for 64-bit systems it has to be set using a manual kernel build.
The current patches to address this problem are:
11.00: PHKL_24487
11.11: PHKL_24032
Please refer to the patch database found at http://itrc.hp.com for the latest revisions of these.
On a 64-bit system, 32-bit applications will only address the 32-bit shared memory region, and 64-bit applications will only address the 64-bit regions.
To determine shared memory allocation, use ipcs; this utility reports the status of interprocess communication facilities. Run the following command:
ipcs -mob
You will see output similar to this:
IPC status from /dev/kmem as of Tue Apr 17 09:29:33 2001
T      ID     KEY         MODE         OWNER   GROUP   NATTCH    SEGSZ
Shared Memory:
m       0  0x411c0359  --rw-rw-rw-    root    root         0      348
m       1  0x4e0c0002  --rw-rw-rw-    root    root         1    61760
m       2  0x412006c9  --rw-rw-rw-    root    root         1     8192
m       3  0x301c3445  --rw-rw-rw-    root    root         3  1048576
m    4004  0x0c6629c9  --rw-r-----    root    root         2  7235252
m       5  0x06347849  --rw-rw-rw-    root    root         1    77384
m     206  0x4918190d  --rw-r--rw-    root    root         0    22908
m    6607  0x431c52bc  --rw-rw-rw-    daemon  daemon       1  5767168
The two fields of most interest are NATTCH and SEGSZ.
NATTCH - The number of processes attached to the associated shared memory segment. Look for those that are 0; they indicate processes that have not released their shared memory segment.
If there are multiple segments showing an NATTCH of zero, especially if they are owned by a database, this can be an indication that the segments are not being efficiently released. This is due to the program not calling detachreg. These segments can be removed using ipcrm -m shmid.
Note: Even though there is no process attached to the segment, the data structure is still intact. The shared memory segment and the data structure associated with it are destroyed by executing this command.
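For example, in the ipcs output above the segment with ID 206 shows an NATTCH of 0. If it is confirmed to be orphaned, it can be removed and the removal verified (a sketch only; never remove a segment an application still expects to re-attach to):

ipcrm -m 206
ipcs -mob | grep ' 206 '
# no output from the grep means the segment is gone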
SEGSZ - The size of the associated shared memory segment in bytes. The total of SEGSZ for a 32-bit system using EXEC_MAGIC cannot exceed 1879048192 bytes (1.75 Gb), or 2952790016 bytes (2.75 Gb) for SHMEM_MAGIC.
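A minimal sketch of totalling the segment sizes from the ipcs output (SEGSZ is assumed to be the last column, as in the -mob layout shown above):

ipcs -mob | awk '$1 == "m" { total += $NF } END { printf "total SEGSZ: %d bytes\n", total }'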
If more than 1.75 Gb of total shared object space (shared memory) is required for 32-bit environments, memory windows can be implemented. This configuration allows discrete 1 Gb windows to be opened, up to the limit of the total amount of memory on the system, up to 8192 Gb.


For more information on memory windows refer to:
Memory Windows White Paper, Doc ID: HPUXWP19
Using Memory Windows with 11.0, Doc ID: KBAN00000306
These are available in the technical knowledge database at http://itrc.hp.com

CPU load
Once we have determined that the memory resources are adequate, we need to address the processors. We need to determine how many processors there are, what speed they run at, and what load they are under during a variety of system loads.
To find out the processor speed, run:
example:
echo itick_per_usec/D | adb -k /stand/vmunix /dev/mem
itick_per_usec:
itick_per_usec: 360
This will be the speed in MHz.
To find out how many processors are in use, run :
example:
echo runningprocs/D | adb -k /stand/vmunix /dev/mem
runningprocs:
runningprocs: 2
This can also be done by using sar -Mu
To find out the CPU load on a multi-processor system, run:
example:
sar -Mu 5 100
This will produce 100 data points, 5 seconds apart.
The output will look similar to the following, with one line per processor and a system line giving the average across all processors:

11:20:05     cpu    %usr    %sys    %wio   %idle
After all samples are taken, an average is printed.


This will return data on the cpu load for each processor:
cpu - cpu number (only on a multi-processor system and used with the -M option)
%usr - user mode
%sys- system mode
%wio - idle with some process waiting for I/O
(only block I/O, raw I/O, or VM pageins/swapins indicated)
%idle - other idle

Typically the %usr value will be higher than %sys. If the system is making many read/write transactions this may not be true, as those are system calls.
Out-of-memory errors can occur when excessive CPU time is given to system rather than user processes. These can also be caused when maxdsiz is undersized. As a rule, we should expect to see %usr at 80% or less, and %sys at 50% or less.
Values higher than these can indicate a CPU bottleneck.
The %wio should ideally be 0%; values less than 15% are acceptable. A low %idle over short periods of time is not a major concern; this is the percentage of time that the CPU is not running processes. However, a low %idle over a sustained period could be an indication of a CPU bottleneck.
If the %wio is greater than 15% and %idle is low, consider the size of the run queue (runq-sz). Ideally we would like to see values less than 4. If the runq-sz is high and the %wio is 0 then there is no bottleneck; this is usually a case of many small processes running that do not overload the processors.
If the system is a single-processor system under heavy load, the CPU bottleneck may be unavoidable.
If the CPU load appears high but the system is not heavily loaded, check the value of the kernel parameter timeslice. By default it is 10; if a Tuned Parameter Set was applied to the kernel, it will have changed timeslice to 1. This causes the CPU to context switch every 10 ms instead of every 100 ms. In most instances this will have a negative effect on CPU efficiency.
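A quick way to check the current setting (a sketch, assuming the kmtune query option is available on your release; SAM shows the same value):

kmtune -q timeslice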
To find out what the run queue load is, run :
sar -q 5 100
example:
          runq-sz  %runocc  swpq-sz  %swpocc
10:06:36      0.0        0      0.0        0
10:06:41      1.5       40      0.0        0
10:06:46      3.0       20      0.0        0
10:06:51      1.0       20      0.0        0
Average       1.8       16      0.0        0

runq-sz - Average length of the run queue(s) of processes (in memory and runnable)
%runocc - The percentage of time the run queue(s) were occupied by processes (in memory and runnable)
swpq-sz - Average length of the swap queue of runnable processes (processes swapped out but ready to run)
These CPU reports can be combined using sar -Muq.
Oversized system tables can negatively affect system performance.
Three of the most critical kernel resources are nproc, ninode and nfile.
These parameters govern the size of the process, HFS inode, and file tables. By default they are controlled by formulas based on the value of maxusers.
Ideally we want to keep these settings within 25% of the peak observed usage.
Using sar -v we can monitor the proc table and file table; the inode table reporting reflects the cache, not inodes in use.

The output of sar -v will show the usage/kernel value for each area.
example:
08:05:08 text-sz  ov  proc-sz     ov  inod-sz     ov  file-sz       ov
08:05:10     N/A   0  272/6420     0  3427/7668    0  5458/12139     0
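A minimal sketch of turning a sample into utilization percentages (field positions assume the sar -v layout shown above):

sar -v 5 1 | awk '$4 ~ /\// { split($4, p, "/"); split($8, f, "/");
    printf "proc table %.0f%% used, file table %.0f%% used\n", 100*p[1]/p[2], 100*f[1]/f[2] }'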

What do these parameters control?


nfile
Number of open files for all processes running on the system. Though each entry is relatively small, there is some kernel overhead in managing this table. Additionally, each time a file is opened it will consume an entry in nfile, even if the file is already opened by another process. When nfile entries are exhausted, a console and/or syslog error message will appear specifically indicating "File table full". The value should usually be 10-25% greater than the maximum number observed during peak load.
The per-user limit on open files is set by the kernel parameter maxfiles. This is bounded by the hard limit parameter maxfiles_lim, which by default is 2048.
ninode
The kernel parameter ninode only affects HFS file systems; JFS (VxFS) file systems allocate their own inodes dynamically (vx_ninode) based on the amount of available kernel memory. The true inode count is only incremented by each unique HFS file open, i.e. the initial open of a file; each subsequent open of that file increments the file-sz column and decrements the available nfile value.
This variable is frequently oversized, and can impose a heavy toll on the processor (especially on machines with multiple CPUs). It can also have a negative effect on the system memory map, in some cases causing fragmentation.
The HFS Inode Cache
The HFS inode cache contains information about the file type, size, timestamp, permissions and block map. This information is stored in the on-disk inode. The in-memory inode contains information from the on-disk inode, linked-list and other pointers, the inode number and lock primitives. One inode entry for every open file must exist in memory.
Closed-file inodes are kept on the free list.
The HFS inode table is controlled by the kernel parameter ninode .
Memory costs for the HFS inode cache, in bytes per inode/vnode/hash entry:
10.20          424
11.0  32 bit   444
11.0  64 bit   680
11i   32 bit   475
11i   64 bit   688
On 10.20 the inode table and dnlc (directory name lookup cache) are combined. The tunable parameter for the dnlc, ncsize, was introduced in patch PHKL_18335.
On 11.00 the dnlc is configurable using the ncsize and vx_ncsize kernel parameters.
By default ncsize = (ninode + vx_ncsize) + (8 * dnlc_hash_locks). The parameter vx_ncsize defines the memory space reserved for the VxFS directory path-name cache (in bytes). The default value for vx_ncsize is 1024; dnlc_hash_locks defaults to 512.
As of JFS 3.5, vx_ncsize is obsolete.
The JFS Inode Cache
A VxFS file system obtains the value of vx_ninode from the system configuration file used for building the kernel (/stand/system, for example). This value is used to determine the number of entries in the VxFS inode table. By default, vx_ninode initializes at zero; the file system then computes a value based on the system memory size (see the table below).
To change the computed value of vx_ninode, you can hard-code the value in SAM, for example:
Set vx_ninode=16000.
The number of inodes in the inode table is calculated according to the following table. The first column is the amount of system memory; the remaining columns are the number of inodes by JFS revision. If the available memory falls between two entries, the value of vx_ninode is interpolated.
The memory requirements for JFS depend on the revision of JFS and system memory.
Maximum VxFS inodes in the cache based on system memory:
System Memory (Mb)    JFS 3.1    JFS 3.3-3.5
   256                  18666          16000
   512                  37333          32000
  1024                  74666          64000
  2048                 149333         128000
  8192                 149333         256000
 32768                 149333         512000
131072                 149333        1024000

To determine the number of VxFS inodes allocated (these are not reported by sar), run:
example:
echo vxfs_ninode/D | adb -k /stand/vmunix /dev/mem
vxfs_ninode:
vxfs_ninode:        64000

For JFS 3.5, use the vxfsstat command:
vxfsstat -v / | grep maxino
vxi_icache_maxino          128000    vxi_icache_peakino         128002

The JFS daemon (vxfsd) scans the free list; if inodes have been on the free list for a given length of time, they are freed back to the kernel memory allocator. The amount of time this takes, and the amount freed, varies by revision.
          Max time in seconds before being freed   Max inodes to free per second
JFS 3.1   300                                       1/300th of current
JFS 3.3   500                                       50
JFS 3.5   1800                                      1-25

Memory cost in bytes per JFS inode, by revision, for inode/vnode/locks:
JFS 3.1   11.0    32 bit: 1220   64 bit: 2244
JFS 3.3   11.0    32 bit: 1494   64 bit: 1632
JFS 3.3   11.11   32 bit: 1352   64 bit: 1902
JFS 3.5   11.11                  64 bit: 1850

Tuning the maximum size of the JFS Inode Cache


Remember, each environment is different. There must be one inode entry for each file opened at any given time. Most systems will run fine with 2% or less of memory used for the JFS inode cache. Large file servers, i.e. web servers or NFS servers that randomly access a large set of inodes, benefit from a large cache. The inode cache typically appears full after accessing many files sequentially, e.g. during find, ll, or backups. The HFS ninode parameter has no impact on the JFS inode cache. While a static cache (setting a non-zero value for vx_ninode) may save memory, there are factors to keep in mind:
Inodes freed to the kernel memory allocator may not be available for immediate use by other objects.
Static inode caches keep inodes in the cache longer.
In 11.0 only, there is a parameter vx_noifree. The vx_noifree parameter controls whether to free memory from the VxFS inode cache. If set to zero (the default), inodes are eventually freed back to the general memory pool if they are unused. If vx_noifree is non-zero, then memory is never freed from the VxFS inode cache. It may seem counter-intuitive to hoard memory to prevent memory problems, but not freeing the 1 Kb buckets holding VxFS inodes to the general memory pool prevents building up large per-processor private bucket pools. Once the maximum size is reached for the inode cache, VxFS will always re-use older inodes.
nproc
This pertains to the number of processes system-wide. This is another variable affected by indiscriminate setting of maxusers. It is most commonly referenced when a ps -ef is run or when Glance/GPM and similar commands are initiated. The value should usually be 10-25% greater than the maximum number of processes observed under load, to allow for unanticipated process growth.
The per-user limit on processes is set by the parameter maxuprc; this value can be no greater than nproc - 4. Typically maxuprc should be set no higher than 60% of nproc. For example, with nproc at 6420 (as in the sar -v sample above), maxuprc should be no higher than about 3850.
For a complete overview of 11.X kernel parameters refer to :
http://www.docs.hp.com/hpux/onlinedocs/939/KCParms/KCparams.OverviewAll.html
Disk I/O
Disk bottlenecks can be caused by a number of factors. The buffer cache usage, CPU load and a high disk I/O load can all contribute to a bottleneck.
After determining the CPU and buffer cache load, check the disk I/O load.
To determine disk I/O performance, run:


sar -d 5 100
The output will look similar to:

device    %busy   avque   r+w/s   blks/s  avwait  avserv
c1t6d0     0.80    0.50       1        4    0.27   13.07
c4t0d0     0.60    0.50       1        4    0.26    8.60
There will be an average printed at the end of the report.


%busy - Portion of time the device was busy servicing a request
avque - Average number of requests outstanding for the device
r+w/s - Number of data transfers per second (reads and writes) from and to the device
blks/s - Number of bytes transferred (in 512-byte units) from and to the device
avwait - Average time (in milliseconds) that transfer requests waited idly on queue for the device
avserv - Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device
When the average wait (avwait) is greater than the average service time (avserv), it indicates the disk can't keep up with the load during that sample. When the average queue length (avque) exceeds the norm of 0.50 it is an indication of jobs stacking up. These conditions are considered to be a bottleneck. It is prudent to keep in mind how long these conditions last. If the queue flushes, or the avwait clears, in a reasonable time (e.g. 5 seconds), it is not a cause for concern.
Keep in mind that the more jobs in a queue, the greater the effect on wait on I/O, even if they are small. Large jobs, those greater than 1000 blks/s, will also affect throughput.
Also consider the type of disks being used. Modern disk arrays are capable of handling very large amounts of data in very short processing times, handling loads of 5000 blks/s or greater in under 10 ms. Older standard disks may show far less capability.
The avwait is similar to the %wio returned by sar -u for the CPU.
If a bottleneck is identified, run:
strings /etc/lvmtab
to identify the volume group associated with the disks.
lvdisplay -v /dev/vgXX/lvolX (where XX is the volume group number and X is the logical volume number)
to see what disks are associated with the logical volume.
bdf
to see if this volume group's file systems are full (greater than 85%).
cat /etc/fstab
to determine the file system type associated with the lvol/mount point.
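Putting these together, a minimal sketch of tracing a busy device back to its file system (the volume group vg01, disk c4t0d0, logical volume lvol3 and mount point /oradata are hypothetical):

strings /etc/lvmtab                      # shows that vg01 contains /dev/dsk/c4t0d0
lvdisplay -v /dev/vg01/lvol3 | grep dsk  # confirms lvol3 sits on c4t0d0
bdf /oradata                             # check whether the file system is over 85% full
grep vg01 /etc/fstab                     # confirms the file system type (vxfs, hfs, ...)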
How to improve disk I/O?
1. Reduce the volume of data on the disk to less than 90%.
2. Stripe the data across disks to improve I/O speed.
3. If you are using Online JFS, run fsadm -e to defragment the extents.
4. If you are using HFS file systems, implement asynchronous writes by setting the kernel parameter fs_async to 1, or consider converting to VxFS.
5. Reduce the size of the buffer cache (if %wcache is less than 90).
6. Consider changing the VxFS mount options to mincache=direct and nolog; these are available with Online JFS (see the mount example earlier in the text).
7. If you are using raw logical volumes, consider implementing asynchronous I/O.
The difference between asynchronous I/O and synchronous I/O is that async does not wait for confirmation of the write before moving on to the next task. This increases disk performance at the expense of robustness. Synchronous I/O waits for acknowledgement of the write (or of a failure) before continuing. The write can have physically taken place or could be in the buffer cache, but in either case acknowledgement has been sent. In the case of async, there is no waiting.

To implement asynchronous I/O on HP-UX for raw logical volumes:


* Set the async_disk driver (Asynchronous Disk Pseudo Driver) to IN in the HP-UX kernel; this will require generating a new kernel and rebooting.
* Create the device file:
# mknod /dev/async c 101 0x00000#
# = the minor number, which can be one of the following values:
0x000000   default - immediate reporting will not be used; a cache flush will be done before posting an I/O operation complete
0x000001   enable immediate reporting
0x000002   flush the CPU cache after reads
0x000004   allow disks to timeout
0x000005   a combination of 1 and 4
0x000007   a combination of 1, 2 and 4

Note: Contact your database vendor or product vendor to determine the correct minor number for your application.
Change the ownership to the appropriate group and owner:
chown oracle:dba /dev/async
Change the permissions:
chmod 660 /dev/async
Give the group MLOCK privileges by adding one line to /etc/privgroup:
vi /etc/privgroup
dba MLOCK
then apply the file:
/usr/sbin/setprivgrp -f /etc/privgroup
To verify that a group has the MLOCK privilege, execute:
/usr/bin/getprivgrp
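For example (the dba group follows the configuration above; the exact output format may vary):

/usr/bin/getprivgrp dba
dba: MLOCK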

The default number of available ports for asynchronous disks is 50; this is tuned with the kernel parameter max_async_ports. If more than 50 disks are being used, this parameter needs to be increased.
PATCHES
There are a number of OS performance issues that are resolved by current patches.
