Overview
- RHEL filesystems: EXT2/3/4, XFS, GFS2
- CPUs
- Memory
- File Systems
- Scalability
Memory
Scalability across releases, from RHEL 2.1 onward:
  Memory: 64GB -> 128GB -> 256GB -> 1TB -> 64TB
  CPUs:   16 -> 32 -> 255 -> 4096
Cgroups (kernels 2.6.18/2.6.29)
Efficient reclaim
Application Performance
[Chart: non-NUMA (interleaved) memory vs NUMA (default), with %NUMA gain. Workloads: Oracle OLTP(k) on 4-node Intel EX, Sybase OLTP(k) on 2-node AMD, SAS jobs/ksec on 8-node Intel EX; NUMA gains of roughly 1.13x to 1.54x shown.]
[Diagram: buffered I/O path - a read()/write() in user space is a memory copy to/from the pagecache (pages and buffers) in the kernel; the flush daemon later writes dirty pages down through the file system.]
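That buffered path can be exercised with dd: a plain write lands in the pagecache first, and conv=fsync forces the flush step before dd exits rather than leaving it to the flush daemon. The file path here is illustrative.

```shell
# Buffered write: 4 MiB goes through the pagecache; conv=fsync makes dd
# call fsync() at the end, pushing the dirty pages down to the filesystem.
dd if=/dev/zero of=/tmp/pagecache_demo bs=1M count=4 conv=fsync 2>/dev/null
stat -c %s /tmp/pagecache_demo   # file size in bytes: 4194304
rm -f /tmp/pagecache_demo
```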
Technology Innovation
RHEL 6 File Systems
- ext4
- XFS
- GFS2 (supports 2 to 16 nodes)
- BTRFS
On LVM or MD devices, or partitions w/ fdisk
IOzone commands
# iozone -a -f /perf1/t1              (in-cache)
# iozone -a -I -f /perf1/t1           (w/ direct I/O)
# iozone -s <2x memory> -f /perf1/t1  (out-of-cache: file larger than RAM)
RHEL 6.0 ext4 vs RHEL 5.5
[Chart: IOzone throughput (MB/s) and R6-vs-R5.5 ratio for In-Cache, Direct I/O, and Out-of-Cache runs; RHEL 6 leads in all three cases.]
[Charts: IOzone throughput (MB/s) by file system - ext3, ext4, xfs, gfs2 - comparing kernels 2.6.32-71 and 2.6.32-125 across several workloads.]
[Chart: SAS system time ("SAS-systime") by file system - ext3, ext4, xfs, gfs2 - RHEL 5 vs RHEL 6.]
Example:
# tuned-adm profile enterprise-storage
Memory Performance
Huge Pages
NUMA
Swap
CPU Performance
Multiple cores
Asynchronous I/O to File Systems
[Diagram: with synchronous I/O the application stalls between I/O request issue and I/O request completion at the device driver; with asynchronous I/O the application continues after issuing the request - no stall waiting for completion.]
RHEL 4/5: 4 tunable I/O Schedulers
- CFQ (elevator=cfq): Completely Fair Queuing, the default; balanced, fair for multiple LUNs, adaptors, SMP servers
- NOOP (elevator=noop): no operation in kernel; simple, low CPU overhead; leaves optimization to ramdisk, RAID controller, etc.
- Deadline (elevator=deadline): optimized for run-time-like behavior, low latency per I/O; balances issues with large I/O LUNs/controllers (NOTE: current best for FC5)
- Anticipatory (elevator=as): inserts delays to help the stack aggregate I/O; best on systems w/ limited physical I/O (SATA)
RHEL 4: set at boot time on the command line
RHEL 5: change on the fly:
# echo deadline > /sys/block/<sdx>/queue/scheduler
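To check which elevator is currently active, the kernel marks the in-use scheduler with brackets in the sysfs file (e.g. "noop [deadline] cfq"). A minimal sketch; the sda default path is illustrative and parameterized so it can be tried against a copy of the file:

```shell
# Print the active I/O scheduler for a block device.
# The kernel writes the list with the active one in [brackets].
sched_file="${1:-/sys/block/sda/queue/scheduler}"
sed -n 's/.*\[\(.*\)\].*/\1/p' "$sched_file"
```

Writing a scheduler name back to the same file (as in the echo example above) switches the elevator on the fly on RHEL 5 and later.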
Red Hat Performance NDA Required 2009
[Chart: CFQ vs Deadline - elapsed time (mm:ss) and % difference at loads of 16 and 32.]
OLTP (tpm)
[Chart: OLTP throughput (tpm) and % difference, 2-FiberChannel 4Gb vs FusionIO-duo, at user counts from 10U to 100U.]
1 SPECvirt Tile/core
Key Enablers:
- SR-IOV
- Huge Pages
- NUMA
- Node Binding
[Chart legend: blue = disk I/O, green = network I/O]
http://www.spec.org/virt_sc2010/results/
[Chart: SPECvirt_sc2010 score and Tiles/Core by system:
- RHEL 5.5 (KVM) / IBM x3650 M3 / 12 cores
- VMware ESX 4.1 / HP DL380 G7 / 12 cores
- RHEL 6.0 (KVM) / IBM HS22V / 12 cores
- RHEL 5.5 (KVM) / IBM x3690 X5 / 16 cores
- RHEL 6 (KVM) / IBM x3690 X5 / 16 cores
- VMware ESX 4.1 / HP BL620c G7 / 20 cores
- RHEL 6.1 (KVM) / HP BL620c G7 / 20 cores
Scores shown include 1169, 1221, 1367, 1369, 1763, 1811, and 1820.]
[Chart: SPECvirt_sc2010 score and Tiles/Core on larger systems:
- VMware ESX 4.1 / Bull SAS / 32 cores
- VMware ESX 4.1 / IBM x3850 X5 / 32 cores
- VMware ESX 4.1 / HP DL580 G7 / 40 cores
- RHEL 6 (KVM) / IBM x3850 X5 / 64 cores
- RHEL 6 (KVM) / IBM x3850 X5 / 80 cores
Scores shown include 2721, 2742, 3723, and 5466.]
Virtualization
Memory Enhancements
Transparent hugepages
Virtualization: RHEL 6 2.6.32 SAS on Intel EP (8 cores/48GB)
SAS multi-stream workload in a KVM guest (RHEL 5.5 vs RHEL 6.0)
Intel Nehalem, 8 cores, 48GB, 2 FC
Guest (8x 44GB virtIO, nocache)
[Chart: SAS system time per host/guest/filesystem combination, e.g. 5.5 host - 5.5 guest - ext3.]
[Diagram: KVM guest networking - left: virtio-net guest driver with tx/rx paths through QEMU and the kernel/hypervisor bridge to the physical NIC; right: SR-IOV, where VF NIC #1 and VF NIC #2 are assigned directly to the guest VMs.]
Virtualization: RHEL 6 2.6.32 SAS on Intel EP (12 CPU/24GB)
RHEL 6.1 SAS Mixed Analytics Workload - Bare-Metal vs KVM
Intel Westmere EP 12-core, 24 GB Mem, LSI 16 SAS drives
[Chart: SAS system and total time for KVM VirtIO, KVM/PCI-PassThrough, and Bare-Metal, with %virt relative to bare metal (values of 0.79 and 0.94 shown for the two KVM configurations).]
DVD Store Version 2 results
[Chart: Total OPM, scale 0 to 100,000 - values shown: 69,984; 86,469; 92,680 - including 1 database instance (bare metal).]
Summary
RHEL filesystems
Pagecache Tuning (RHEL)
Filesystem/pagecache Allocation
[Diagram: page lifecycle - accessed pages (pagecache under limit) sit on the ACTIVE list, age onto the INACTIVE list (new -> old), and are reclaimed to the FREE list.]
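The ACTIVE/INACTIVE lists can be observed on a live system, assuming a Linux /proc:

```shell
# The kernel reports the aggregate size of the active and inactive
# page lists in /proc/meminfo.
grep -E '^(Active|Inactive):' /proc/meminfo
```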
swappiness
/proc/sys/vm/swappiness
Database server with /proc/sys/vm/swappiness set to 60 (default)

procs -----------memory---------- ---swap--
 r  b    swpd    free  buff     cache   si   so
 5  1  643644  26788  3544  32341788  880  120
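A quick way to inspect and lower the knob, assuming a Linux /proc; the value 10 below is an illustrative choice, not a recommendation from this deck:

```shell
# Read the current swappiness; higher values make the kernel more willing
# to swap anonymous memory in favor of keeping pagecache.
cat /proc/sys/vm/swappiness

# Lower it for a database server (needs root; not persistent across reboot):
# echo 10 > /proc/sys/vm/swappiness
```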
zone_reclaim_mode
/proc/sys/vm/min_free_kbytes
Directly controls the page reclaim watermarks in KB
Defaults are higher when THP is enabled
# echo 1024 > /proc/sys/vm/min_free_kbytes
Node 0 DMA   free:4420kB  min:8kB    low:8kB    high:12kB
Node 0 DMA32 free:14456kB min:1012kB low:1264kB high:1516kB

# echo 2048 > /proc/sys/vm/min_free_kbytes
Node 0 DMA   free:4420kB  min:20kB   low:24kB   high:28kB
Node 0 DMA32 free:14456kB min:2024kB low:2528kB high:3036kB
/proc/sys/vm/dirty_background_ratio
/proc/sys/vm/dirty_background_bytes
Controls when dirty pagecache memory starts getting written asynchronously
Default is 10% (lower = start background writeback sooner; higher = later)

/proc/sys/vm/dirty_ratio
/proc/sys/vm/dirty_bytes
Controls when writers are throttled and forced to write out dirty pages
Default is 20%
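A hedged worked example of what those percentages mean in bytes, assuming a machine with 32 GiB of memory (an approximation: the kernel actually applies the ratio to reclaimable memory, not total RAM):

```shell
# Translate the default dirty ratios into approximate byte thresholds.
mem_bytes=$((32 * 1024 * 1024 * 1024))
bg=$((mem_bytes * 10 / 100))    # dirty_background_ratio: async writeback starts
hard=$((mem_bytes * 20 / 100))  # dirty_ratio: writers are throttled
echo "background writeback starts near $bg bytes"
echo "writers throttled near $hard bytes"
```

The *_bytes variants let you set these thresholds directly instead of as percentages; setting one clears the other.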
[Two /proc/meminfo excerpts:]
Slab:          415420 kB        Slab:          218208 kB
Hugepagesize:    2048 kB        Hugepagesize:    2048 kB
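Huge page support and pool size can be checked the same way on a running kernel, assuming a Linux /proc (on x86_64 the default huge page size is 2048 kB):

```shell
# Report the huge page size and how many huge pages are allocated.
grep -e '^Hugepagesize' -e '^HugePages_Total' /proc/meminfo
```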