
Linux I/O performance

An end-to-end methodology for maximizing Linux I/O performance on
IBM System x servers in a typical SAN environment.

David Quenzler
IBM Systems and Technology Group ISV Enablement
June 2012

Copyright IBM Corporation, 2012

Table of contents
Abstract
Introduction
External storage subsystem - XIV
External SAN switches
    Bottleneck monitoring
    Fabric parameters
    Basic port configuration
    Advanced port configuration
Host adapter placement rules
System BIOS settings
HBA BIOS settings
Linux kernel parameters
Linux memory settings
    Page size
    Transparent huge pages
Linux module settings - qla2xxx
Linux SCSI subsystem tuning - /sys
Linux XFS file system create options
Linux XFS file system mount options
Red Hat tuned
    ktune.sh
    ktune.sysconfig
    sysctl.ktune
    tuned.conf
Linux multipath
Sample scripts
Summary
Resources
About the author
Trademarks and special notices


Abstract
This white paper discusses an end-to-end approach for Linux I/O tuning in a typical data center
environment consisting of external storage subsystems, storage area network (SAN) switches,
IBM System x Intel servers, Fibre Channel host bus adapters (HBAs) and 64-bit Red Hat
Enterprise Linux.
Anyone with an interest in I/O tuning is welcome to read this white paper.

Introduction
Linux I/O tuning is complex. In a typical environment, I/O makes several transitions from the client
application out to disk and vice versa. There are many pieces to the puzzle.
We will examine the following topics in detail:

External storage subsystems
External SAN switches
Host adapter placement rules
System BIOS settings
Adapter BIOS settings
Linux kernel parameters
Linux memory settings
Linux module settings
Linux SCSI subsystem settings
Linux file system create options
Linux file system mount options
Red Hat tuned
Linux multipath

You should follow an end-to-end tuning methodology in order to minimize the risk of poor tuning.
Recommendations in this white paper are based on the following environment under test:

IBM System x 3850 (64 processors and 640 GB RAM)
Red Hat Enterprise Linux 6.1 x86_64
The Linux XFS file system
IBM XIV external storage subsystem, Fibre Channel (FC) attached

An architecture comprising IBM hardware and Red Hat Linux provides a solid framework for maximizing
I/O performance.


External storage subsystem - XIV


The XIV has few manual tunables. Here are a few tips:

Familiarize yourself with the XIV command-line interface (XCLI) as documented in the IBM XIV
Storage System User Manual.
Ensure that you connect the XIV system to your environment in the FC fully redundant
configuration as documented in the XIV Storage System: Host Attachment and Interoperability
guide from IBM Redbooks.

Figure 1: FC fully redundant configuration

Although you can define up to 12 paths per host, a maximum of six paths per host provides sufficient
redundancy and performance.
Useful XCLI commands:
# module_list -t all
# module_list -x
# fc_port_list

The XIV storage subsystem contains six FC data modules (4 to 9), each with 8 GB memory. The FC rate
is 4 Gbps and the data partition size is 1 MB.

Check the XIV HBA queue depth setting: The higher the host HBA queue depth, the more
parallel I/O is sent to the XIV system, but each XIV port can only sustain up to 1400 concurrent
I/Os to the same target or logical unit (LUN). Therefore, the number of connections
multiplied by the host HBA queue depth should not exceed that value. The number of
connections should take the multipath configuration into account.
Note: The XIV queue limit is 1400 per XIV FC host port and 256 per LUN per worldwide port
name (WWPN) per port.


Twenty-four multipath connections to the XIV system would dictate that host queue depth be set
to 58. (24*58=1392)
Check the operating system (OS) disk queue depth (see below)
Make use of the XIV host attachment kit for RHEL

Useful commands:
# xiv_devlist
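
As a rough illustration of the calculation above, the following sketch derives a host HBA queue depth from a path count; the 24-path value is only an example and should be replaced with your actual multipath configuration.
#!/bin/sh
# Sketch: keep paths * queue_depth at or below the 1400 concurrent I/Os
# that a single XIV FC port can sustain. The path count is an example value.
XIV_PORT_QUEUE_LIMIT=1400
paths=24
queue_depth=$((XIV_PORT_QUEUE_LIMIT / paths))   # 1400 / 24 = 58
echo "Suggested host HBA queue depth for ${paths} paths: ${queue_depth}"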


External SAN switches


As a best practice, set SAN switch port speeds to Auto (auto-negotiate).
Typical bottlenecks are:

Latency bottleneck
Congestion bottleneck

Latency bottlenecks occur when frames are sent faster than they can be received. This can be due to
buffer credit starvation or slow drain devices in the fabric.
Congestion bottlenecks occur when the required throughput exceeds the physical data rate for the
connection.
Most SAN Switch web interfaces can be used to monitor the basic performance metrics, such as
throughput utilization, aggregate throughput, and percentage of utilization.
The Fabric OS command-line interface (CLI) can also be used to create frame monitors. These monitors
analyze the first 64 bytes of each frame and can detect various types of protocols that can be monitored.
Some performance features, such as frame monitor configuration (fmconfig), require a license.
Some useful commands:

switch:admin>perfhelp
switch:admin>perfmonitorshow
switch:admin>perfaddeemonitor
switch:admin>fmconfig

Bottleneck monitoring
Enable bottleneck monitoring on SAN switches by using the following command:
switch:admin> bottleneckmon --enable -alert

Useful commands
switch:admin> bottleneckmon --status
switch:admin> bottleneckmon --show -interval 5 -span 300
switch:admin> switchstatusshow
switch:admin> switchshow
switch:admin> configshow
switch:admin> configshow -pattern "fabric"
switch:admin> diagshow
switch:admin> porterrshow


Fabric parameters
Fabric parameters are described in the following table. Default values are in brackets []:

BBCredit: Increasing the buffer-to-buffer (BB) credit parameter may increase performance by
buffering FC frames coming from 8 Gbps FC server ports and going to 4 Gbps FC ports on the
XIV system; SAN segments can run at different rates. Frame pacing (BB credit starvation)
occurs when no more BB credits are available. The AVG FRAME PACING delay counter should
always be zero; if it is not, increase the buffer credits. However, over-increasing the number of
BB credits does not increase performance. [16]
E_D_TOV: Error Detect TimeOut Value [2000]
R_A_TOV: Resource Allocation TimeOut Value [10000]
dataFieldSize: 512, 1024, 2048, or 2112 [2112]
Sequence Level Switching: Under normal conditions, disable for better performance (interleave
frames, do not group frames) [0]
Disable Device Probing: Set this mode only if N_Port discovery causes attached devices to fail [0]
Per-Frame Routing Priority: [0]
Suppress Class F Traffic: Used with ATM gateways only [0]
Insistent Domain ID Mode: fabric.ididmode [0]

Table 1: Fabric parameters (default values are in brackets)

Basic port configuration


Target rate limiting (ratelim) is used to minimize congestion at the adapter port caused by a slow-drain
device operating in the fabric at a slower speed (for example, a 4 Gbps XIV system).

Advanced port configuration


Turning on Interrupt Control Coalesce and increasing the latency monitor timeout value can improve
performance by reducing interrupts and processor utilization.


Host adapter placement rules


It is extremely important for you to follow the adapter placement rules for your server in order to minimize
PCI bus saturation.

System BIOS settings


Use recommended CMOS settings for your IBM System x server.
You can use the IBM Advanced Settings Utility (asu64) to modify the System x BIOS settings from the
Linux command line. It is normally installed in /opt/ibm/toolscenter/asu
ASU normally tries to communicate over the LAN through the USB interface. Disable the LAN over USB
interface with the following command:
# asu64 set IMM.LanOverUsb Disabled --kcs

The following settings can result in better performance (a scripted example appears at the end of this section):


uEFI.TurboModeEnable=Enable
uEFI.PerformanceStates=Enable
uEFI.PackageCState=ACPI C3
uEFI.ProcessorC1eEnable=Disable
uEFI.DDRspeed=Max Performance
uEFI.QPISpeed=Max Performance
uEFI.EnergyManager=Disable
uEFI.OperatingMode=Performance Mode

Depending on the workload, enabling or disabling Hyper-Threading can also improve application performance.

Useful commands:
# asu64 show
# asu64 show --help
# asu64 set IMM.LanOverUsb Disabled --kcs
# asu64 set uEFI.OperatingMode Performance
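
The individual set commands corresponding to the settings listed above might look like the following sketch. The setting names and values are taken from the list above, but they should be verified with asu64 show for your specific System x model, and --kcs may need to be appended when the LAN over USB interface is disabled.
#!/bin/sh
# Sketch: apply the recommended uEFI settings with asu64.
# Verify each setting name and value with 'asu64 show' before use.
asu64 set uEFI.TurboModeEnable Enable
asu64 set uEFI.PerformanceStates Enable
asu64 set uEFI.PackageCState "ACPI C3"
asu64 set uEFI.ProcessorC1eEnable Disable
asu64 set uEFI.DDRspeed "Max Performance"
asu64 set uEFI.QPISpeed "Max Performance"
asu64 set uEFI.EnergyManager Disable
asu64 set uEFI.OperatingMode "Performance Mode"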

HBA BIOS settings


You can use the QLogic SANSurfer command-line utility (scli) to show or modify HBA settings.


Display current HBA parameter settings: # scli -c
Display WWPNs only: # scli -c | grep WWPN
Display settings only: # scli -c | grep \: | grep -v WWPN | sort | uniq -c
Restore default settings: # scli -n all default

Table 2: Modifying HBA settings

WWPNs can also be determined from the Linux command line, for example with a small script:
#!/bin/sh
###
hba_location=$(lspci | grep HBA | awk '{print $1}')

for adapter in $hba_location
do
    cat $(find /sys/devices -name \*${adapter})/host*/fc_host/host*/port_name
done
Listing 1: Determining WWPNs

HBA parameters as reported by the scli command appear in the following table:

Connection Options: 2 - Loop Preferred, Otherwise Point-to-Point
Data Rate: Auto
Enable FC Tape Support: Disabled
Enable Hard Loop ID: Disabled
Enable Host HBA BIOS: Disabled
Enable LIP Full Login: Yes
Enable Target Reset: Yes
Execution Throttle: 16
Frame Size: 2048
Hard Loop ID: (no default reported)
Interrupt Delay Timer (100ms): (no default reported)
Link Down Timeout (seconds): 30
Login Retry Count: (no default reported)
Loop Reset Delay (seconds): (no default reported)
LUNs Per Target: 128
Operation Mode: (no default reported)
Out Of Order Frame Assembly: Disabled
Port Down Retry Count: 30 seconds

Table 3: HBA BIOS tunable parameters (sorted)

Use the lspci command to show which type(s) of Fibre Channel adapters exist in the system. For
example:
# lspci | grep HBA

Note: Adapters from different vendors have different default values.


Linux kernel parameters


The available options for the Linux I/O scheduler are noop, anticipatory, deadline, and cfq.
echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""
Listing 2: Determining the Linux scheduler for block devices

The Red Hat enterprise-storage tuned profile uses the deadline scheduler. The deadline scheduler can
be enabled by adding the elevator=deadline parameter to the kernel command line in grub.conf.

Useful commands:
# cat /proc/cmdline
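
As an alternative to changing grub.conf, the scheduler can also be switched at run time through sysfs. A minimal sketch, using the same FC device discovery pattern as the scripts later in this paper; the change does not survive a reboot:
#!/bin/sh
# Sketch: switch all FC-attached sd devices to the deadline scheduler at run time.
for dev in $(ls -l /dev/disk/by-path | grep -w fc | awk -F/ '{print $3}' | sort -u)
do
    echo deadline > /sys/block/${dev}/queue/scheduler
done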

Linux memory settings


This section covers the Linux memory settings that are relevant to I/O tuning.

Page size
The default page size for Red Hat Linux is 4096 bytes.
# getconf PAGESIZE

Transparent huge pages


The default size for huge pages is 2048 KB for most large systems.
echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""
Listing 3: Determining the Linux huge page setting

The Red Hat enterprise-storage tuned profile enables huge pages.
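
The policy can also be set manually through the same sysfs file, for example when experimenting without tuned. A minimal sketch; on RHEL 6 the accepted values typically include always and never:
#!/bin/sh
# Sketch: explicitly set and then verify the transparent huge page policy on RHEL 6.
echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled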


Linux module settings - qla2xxx


You can see the parameters for the qla2xxx module using the following script:
#!/bin/sh
###
for param in $(ls /sys/module/qla2xxx/parameters)
do
    echo -n "${param} = "
    cat /sys/module/qla2xxx/parameters/${param}
done
Listing 4: Determining qla2xxx module parameters

Disable QLogic failover. If the output of the following command shows the -k driver (not the -fo driver),
then failover is disabled.
# modinfo qla2xxx | grep -w ^version
version: <some_version>-k

QLogic lists the following highlights of the 2400 series HBAs:
150,000 IOPS per port
Out-of-order frame reassembly
T10 CRC for end-to-end data integrity

Useful commands:
# modinfo -p qla2xxx

The qla_os.c file in the Linux kernel source contains information on many of the qla2xxx module
parameters. Some parameters as listed by modinfo -p do not exist in the Linux source code. Others
are not explicitly defined but may be initialized by the adapter firmware.
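
Module parameters can be made persistent through a file in /etc/modprobe.d. The following sketch uses ql2xmaxqdepth purely as an illustration; the value 64 is an assumption, not a recommendation, and the module must be reloaded (or the initramfs rebuilt and the host rebooted) before the new value takes effect.
#!/bin/sh
# Sketch: set a qla2xxx module option persistently and verify it after a reload.
echo "options qla2xxx ql2xmaxqdepth=64" > /etc/modprobe.d/qla2xxx.conf
cat /sys/module/qla2xxx/parameters/ql2xmaxqdepth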

Descriptions of module parameters appear in the following table. Notes such as "does not exist",
"not explicitly defined", or "not defined" describe how the parameter appears (or does not appear)
in the Linux kernel source:

ql2xallocfwdump: Allocate memory for a firmware dump during HBA initialization. Default: 1 - allocate memory
ql2xasynctmfenable: Issue TM IOCBs asynchronously via the IOCB mechanism (does not exist). Default: 0 - issue TM IOCBs via the mailbox mechanism
ql2xdbwr: Scheme for request queue posting (does not exist). Default: 1 - CAMRAM doorbell (faster)
ql2xdontresethba: Reset behavior (does not exist). Default: 0 - reset on failure
ql2xenabledif: T10-CRC-DIF (does not exist). Default: 1 - DIF support
ql2xenablehba_err_chk: T10-CRC-DIF error isolation by the HBA (does not exist). Default: 0 - disabled
ql2xetsenable: Firmware ETS burst (does not exist). Default: 0 - skip ETS enablement
ql2xextended_error_logging: Extended error logging (not explicitly defined). Default: 0 - no logging
ql2xfdmienable: FDMI registrations. Default: 0 - no FDMI
ql2xfwloadbin: Location from which to load firmware (not explicitly defined). Default: 0 - use default semantics
ql2xgffidenable: GFF_ID checks of port type (does not exist). Default: 0 - do not use GFF_ID
ql2xiidmaenable: iIDMA setting. Default: 1 - perform iIDMA
ql2xloginretrycount: Alternate value for the NVRAM login retry count
ql2xlogintimeout: Login timeout value in seconds. Default: 20
ql2xmaxqdepth: Maximum queue depth for target devices; used to seed the queue depth for SCSI devices. Default: 32
ql2xmaxqueues: MQ. Default: 1 - single queue
ql2xmultique_tag: CPU affinity (not defined). Default: 0 - no affinity
ql2xplogiabsentdevice: PLOGI (not defined). Default: 0 - no PLOGI
ql2xqfulrampup: Time in seconds to wait before beginning to ramp up the queue depth for a device after a queue-full condition has been detected (does not exist). Default: 120 seconds
ql2xqfulltracking: Track and dynamically adjust the queue depth for SCSI devices (does not exist). Default: 1 - perform tracking
ql2xshiftctondsd: Control shifting of command type processing based on the total number of SG elements (does not exist)
ql2xtargetreset: Target reset (does not exist). Default: 1 - use hardware defaults
qlport_down_retry: Maximum number of command retries to a port in PORT-DOWN state (not defined)

Table 4: qla2xxx module parameters

Linux SCSI subsystem tuning - /sys


See /sys/block/<device>/queue/<parameter>
Block device parameter values can be determined using a small script:
#!/bin/sh
###

param_list=$(find /sys/block/sda/queue -maxdepth 1 -type f -exec basename '{}' \; | sort)
dev_list=$(ls -l /dev/disk/by-path | grep -w fc | awk -F \/ '{print $3}')
dm_list=$(ls -d /sys/block/dm-* | awk -F \/ '{print $NF}')

for param in ${param_list}
do
    echo -n "${param} = "
    for dev in ${dev_list} ${dm_list}
    do
        cat /sys/block/${dev}/queue/${param}
    done | sort | uniq -c
done

echo -n "queue_depth = "
for dev in ${dev_list}
do
    cat /sys/block/${dev}/device/queue_depth
done | sort | uniq -c
Determining block device parameters

To send down large-size requests (greater than 512 KB on 4 KB page size systems):

Consider increasing max_segments to 1024 or greater
Set max_sectors_kb equal to max_hw_sectors_kb

SCSI device parameters appear in the following table. Values that can be changed are marked (rw):

hw_sector_size (ro): Hardware sector size in bytes. Value: 512
max_hw_sectors_kb (ro): Maximum number of kilobytes supported in a single data transfer. Value: 32767
max_sectors_kb (rw): Maximum number of kilobytes that the block layer will allow for a file system request. Value: 512
nomerges (rw): Enable or disable lookup logic. Value: 0 - all merges are enabled
nr_requests (rw): Number of read or write requests which can be allocated in the block layer. Value: 128
read_ahead_kb (rw): Value: 8192
rq_affinity (rw): Always complete a request on the same CPU that queued it (1 - CPU group affinity, 2 - strict CPU affinity). Value: 1 - CPU group affinity
scheduler (rw): Value: deadline

Table 5: SCSI subsystem tunable parameters


Using max_sectors_kb:
By default, Linux devices are configured for a maximum 512 KB I/O size. When using a larger file system
block size, increase the max_sectors_kb parameter. The max_sectors_kb value must be less than or
equal to max_hw_sectors_kb.

The default queue_depth is 32 and represents the total number of transfers that can be queued to a
device. You can check the queue depth by examining /sys/block/<device>/device/queue_depth.
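
A minimal sketch that applies both recommendations to every FC-attached sd device; the queue depth of 58 reuses the earlier XIV example (24 paths) and is an assumption to be adjusted for your own path count:
#!/bin/sh
# Sketch: raise max_sectors_kb to the hardware limit and set the SCSI queue depth
# for every FC-attached sd device.
QUEUE_DEPTH=58

for dev in $(ls -l /dev/disk/by-path | grep -w fc | awk -F/ '{print $3}' | sort -u)
do
    max_hw=$(cat /sys/block/${dev}/queue/max_hw_sectors_kb)
    echo ${max_hw} > /sys/block/${dev}/queue/max_sectors_kb
    echo ${QUEUE_DEPTH} > /sys/block/${dev}/device/queue_depth
done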

Linux XFS file system create options


Useful commands:
# getconf PAGESIZE
# man mkfs.xfs

Note: XFS writes are not guaranteed to be committed unless the program issues an fsync() call
afterwards.

Red Hat: Optimizing for a large number of files

If necessary, you can increase the amount of space allowed for inodes using the
mkfs.xfs -i maxpct= option. The default percentage of space allowed for inodes varies by file
system size. For example, a file system between 1 TB and 50 TB in size will allocate 5% of the
total space for inodes.

Red Hat: Optimizing for a large number of files in a single directory

Normally, the XFS file system directory block size is the same as the file system block size.
Choose a larger value for the mkfs.xfs -n size= option, if there are many millions of directory
entries.

Red Hat: Optimizing for concurrency

Increase the number of allocation groups on systems with many processors.

Red Hat: Optimizing for applications that use extended attributes


1. Increasing inode size might be necessary if applications use extended attributes.
2. Multiple attributes can be stored in an inode provided that they do not exceed the maximum size
limit (in bytes) for attribute+value.


Red Hat: Optimizing for sustained metadata modifications


1. Systems with large amounts of RAM could benefit from larger XFS log sizes.
2. The log should be aligned with the device stripe size (the mkfs command may do this
automatically)

The metadata log can be placed on another device, for example a solid-state drive (SSD), to reduce disk
seeks.
Specify the stripe unit and width for hardware RAID devices (see the example after Listing 5).

Syntax (options not related to performance are omitted):

# mkfs.xfs [ options ] device

-b block_size_options
    size=<int> -- block size in bytes
    default 4096, minimum 512, maximum 65536 (must be <= PAGESIZE)

-d data_section_options
    More allocation groups imply that more parallelism can be achieved when
    allocating blocks and inodes.
    agcount=<int> -- number of allocation groups
    agsize, name, file, size, sunit, su, swidth, sw

-i inode_options
    size, log, perblock, maxpct, align, attr

-l log_section_options
    internal, logdev, size, version, sunit, su, lazy-count

-n naming_options
    size, log, version

-r realtime_section_options
    rtdev, extsize, size

-s sector_size
    log, size

-N
    Dry run. Print out the file system parameters without creating the file system.

Listing 5: Create options for XFS file systems
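
A hedged example of how these options might be combined for a multipath XIV LUN follows. The device name and all numeric values (allocation group count, log size, stripe geometry) are placeholders; derive su and sw from your real storage layout, and keep -N (dry run) until the reported parameters look correct.
#!/bin/sh
# Sketch: example mkfs.xfs invocation. Remove -N only after reviewing the output.
mkfs.xfs -N \
    -d agcount=32,su=1m,sw=4 \
    -l size=128m \
    /dev/mapper/mpathb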

Linux XFS file system mount options


Useful commands
# xfs_info
# xfs_quota
# grep xfs /proc/mounts
# mount | grep xfs

nobarrier

noatime

inode64: XFS is allowed to create inodes at any location in the file system. Starting with kernel 2.6.35,
XFS file systems will mount either with or without the inode64 option.

logbsize: Larger values can improve performance. Smaller values should be used with fsync-heavy
workloads.

delaylog: RAM is used to reduce the number of changes written to the log.

The Red Hat 6.2 Release Notes mention that XFS has been improved in order to better handle metadata
intensive workloads. The default mount options have been updated to use delayed logging.
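
A hedged example of mounting with these options; the device and mount point are placeholders, logbsize=256k is an example value, and on RHEL 6.2 and later delaylog is already the default:
#!/bin/sh
# Sketch: mount an XFS file system with the options discussed above.
mount -t xfs -o noatime,nobarrier,inode64,logbsize=256k,delaylog \
    /dev/mapper/mpathb /data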


Red Hat tuned


Red Hat Enterprise Linux has a tuning package called tuned which sets certain parameters based on a
chosen profile.

Useful commands:
# tuned-adm help
# tuned-adm list
# tuned-adm active
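
The enterprise-storage profile examined below can be activated and verified with:
# tuned-adm profile enterprise-storage
# tuned-adm active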

The enterprise-storage profile contains the following files. When comparing the enterprise-storage profile
with the throughput-performance profile, some files are identical:
# cd /etc/tune-profiles
# ls enterprise-storage/
ktune.sh  ktune.sysconfig  sysctl.ktune  tuned.conf

# sum throughput-performance/* enterprise-storage/* | sort
03295     2 throughput-performance/sysctl.s390x.ktune
08073     2 enterprise-storage/sysctl.ktune
15419     2 enterprise-storage/ktune.sysconfig
15419     2 throughput-performance/ktune.sysconfig
15570     1 enterprise-storage/ktune.sh
43756     1 enterprise-storage/tuned.conf
43756     1 throughput-performance/tuned.conf
47739     2 throughput-performance/sysctl.ktune
57787     1 throughput-performance/ktune.sh

ktune.sh
The enterprise-storage ktune.sh is the same as the throughput-performance ktune.sh but adds
functionality for disabling or enabling I/O barriers. The enterprise-storage profile is preferred when using
XIV storage. Important functions include:

set_cpu_governor performance -- uses cpuspeed to set the governor
enable_transparent_hugepages -- does what it says


remount_partitions nobarrier -- disables write barriers
multiply_disk_readahead -- modifies /sys/block/sd*/queue/read_ahead_kb

ktune.sysconfig
ktune.sysconfig is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/ktune.sysconfig throughput-performance/ktune.sysconfig | sort | uniq -c
2 ELEVATOR="deadline"
2 ELEVATOR_TUNE_DEVS="/sys/block/{sd,cciss,dm-}*/queue/scheduler"
2 SYSCTL_POST="/etc/sysctl.conf"
2 USE_KTUNE_D="yes"
Listing 6: Sorting the ktune.sysconfig file

sysctl.ktune
sysctl.ktune is functionally identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/sysctl.ktune throughput-performance/sysctl.ktune | sort | uniq -c
2 kernel.sched_min_granularity_ns = 10000000
2 kernel.sched_wakeup_granularity_ns = 15000000
2 vm.dirty_ratio = 40
Listing 7: Sorting the sysctl.ktune file

tuned.conf
tuned.conf is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/tuned.conf throughput-performance/tuned.conf | sort | uniq -c
12 enabled=False
Listing 8: Sorting the tuned.conf file


Linux multipath
Keep it simple: configure just enough paths for redundancy and performance.

Typical multipath output for an XIV LUN includes lines such as:
features='1 queue_if_no_path' hwhandler='0' wp=rw
policy='round-robin 0' prio=-1

With features='1 queue_if_no_path', I/O queues indefinitely when all paths are lost. To bound this
behavior, set 'no_path_retry N' and then remove the features='1 queue_if_no_path' option or set 'features 0'.
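
A small sketch for checking how queuing is actually configured after such a change, using the standard device-mapper-multipath tools on RHEL 6:
#!/bin/sh
# Sketch: show the queuing-related settings of the active multipath maps and the
# running multipathd configuration.
multipath -ll | grep -E "features|policy"
echo 'show config' | multipathd -k | grep -E "features|no_path_retry" | sort | uniq -c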

Multipath configuration defaults

polling interval: (no default listed)
udev_dir: /dev
multipath_dir: /lib/multipath
find_multipaths: no
verbosity: (no default listed)
path_selector: round-robin 0
path_grouping_policy: failover
getuid_callout: /lib/udev/scsi_id --whitelisted --device=/dev/%n
prio: const
features: queue_if_no_path
path_checker: directio
failback: manual
rr_min_io: 1000
rr_weight: uniform
no_path_retry: (no default listed)
user_friendly_names: no
queue_without_daemon: yes
flush_on_last_del: no
max_fds: determined by the calling process
checker_timer: /sys/block/sdX/device/timeout
fast_io_fail_tmo: determined by the OS
dev_loss_tmo: determined by the OS
mode: determined by the process
uid: determined by the process
gid: determined by the process

Table 6: Multipath configuration options

The default load balancing policy (path_selector) is round-robin 0. Other choices are queue-length 0 and
service-time 0.
Consider using the XIV Linux host attachment kit to create the multipath configuration file.
# cat /etc/multipath.conf
devices {
        device {
                vendor "IBM"
                product "2810XIV"
                path_selector "round-robin 0"
                path_grouping_policy multibus
                rr_min_io 15
                path_checker tur
                failback 15
                no_path_retry 5
                #polling_interval 3
        }
}

defaults {
        ...
        user_friendly_names yes
        ...
}
Listing 9: A sample multipath.conf file

Sample scripts
You can use the following script to query various settings related to I/O tuning:
#!/bin/sh
# Query scheduler, hugepages, and readahead settings for fibre channel scsi devices
###

#hba_pci_loc=$(lspci | grep HBA | awk '{print $1}')

echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""

echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""

echo "FC: max_sectors_kb"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | \
    xargs -n1 -i cat /sys/block/{}/queue/max_sectors_kb | sort | uniq -c
echo ""

echo "Linux: dm-* READAHEAD"
ls /dev/dm-* | xargs -n1 -i blockdev --getra {} | sort | uniq -c
blockdev --report /dev/dm-*
echo ""

echo "Linux: FC disk sd* READAHEAD"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | \
    xargs -n1 -i blockdev --getra /dev/{} | sort | uniq -c
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | \
    xargs -n1 -i blockdev --report /dev/{} | grep dev
echo ""
Listing 10: Querying scheduler, huge page, and readahead settings


Summary
This white paper presented an end-to-end approach for Linux I/O tuning in a typical data center
environment consisting of external storage subsystems, storage area network (SAN) switches, IBM
System x Intel servers, Fibre Channel HBAs and 64-bit Red Hat Enterprise Linux.
Visit the links in the Resources section for more information on topics presented in this white paper.


Resources
The following websites provide useful references to supplement the information contained in this paper:

XIV Redbooks
ibm.com/redbooks/abstracts/sg247659.html
ibm.com/redbooks/abstracts/sg247904.html
Note: IBM Redbooks are not official IBM product documentation.

XIV Infocenter
http://publib.boulder.ibm.com/infocenter/ibmxiv/r2

XIV Host Attachment Kit for RHEL can be downloaded from Fix Central
ibm.com/support/fixcentral

Qlogic
http://driverdownloads.qlogic.com
ftp://ftp.qlogic.com/outgoing/linux/firmware/rpms

Red Hat Enterprise Linux Documentation
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux

IBM Advanced Settings Utility
ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-ASU

Linux
Documentation/kernel-parameters.txt
Documentation/block/queue-sysfs.txt
Documentation/filesystems/xfs.txt
drivers/scsi/qla2xxx
http://xfs.org/index.php/XFS_FAQ


About the author


David Quenzler is a consultant in the IBM Systems and Technology Group ISV Enablement organization.
He has more than 15 years of experience working with the IBM System x (Linux) and IBM Power Systems
(IBM AIX) platforms. You can reach David at quenzler@us.ibm.com.


Trademarks and special notices


Copyright IBM Corporation 2012.
References in this document to IBM products or services do not imply that IBM intends to make them
available in every country.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked
terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these
symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries. A
current list of IBM trademarks is available on the Web at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Other company, product, or service names may be trademarks or service marks of others.
Information is provided "AS IS" without warranty of any kind.
All customer examples described are presented as illustrations of how those customers have used IBM
products and the results they may have achieved. Actual environmental costs and performance
characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published
announcement material, or other publicly available sources and does not constitute an endorsement of
such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly
available information, including vendor announcements and vendor worldwide homepages. IBM has not
tested these products and cannot confirm the accuracy of performance, capability, or any other claims
related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the
supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without
notice, and represent goals and objectives only. Contact your local IBM office or IBM authorized reseller
for the full text of the specific Statement of Direction.
Some information addresses anticipated future capabilities. Such information is not intended as a
definitive statement of a commitment to specific levels of performance, function or delivery schedules with
respect to any future products. Such commitments are only made in IBM product announcements. The


information is presented here to communicate IBM's current investment and development activities as a
good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending
upon considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve throughput or performance improvements equivalent to the
ratios stated here.
Photographs shown are of engineering prototypes. Changes may be incorporated in production models.
Any references in this information to non-IBM websites are provided for convenience only and do not in
any manner serve as an endorsement of those websites. The materials at those websites are not part of
the materials for this IBM product and use of those websites is at your own risk.

