
Performance Analysis and Tuning Part 1

Larry Woodman, Senior Consulting Engineer, RHEL/VM
Bill Gray, Principal Performance Engineer
Red Hat, June 13, 2013

Agenda: Performance Analysis Tuning Part II

Part I

Red Hat Enterprise Linux "tuned" profiles, top benchmark results
Scalability
CFS Scheduler tunables / Cgroups
Hugepages: Transparent Hugepages, 2MB/1GB
Non Uniform Memory Access (NUMA) and NUMAD
Network Performance and Latency-performance
Disk and Filesystem IO - Throughput-performance
System Performance/Tools: perf, tuna, systemtap

Part II

Q&A

Red Hat Enterprise Linux: Scale Up & Out

Traditional scale-out capabilities have been complemented over the past five years with scale-up capabilities
Brings open source value and flexibility to the x86_64 server market
Support for scalable architectures
  Multi-core and hyperthreading
  Kernel NUMA and SMP enhancements

[Diagram: Scale Up and Scale Out axes, labelled "1 CPU", "Up to 4096 CPUs", "1 node" and "1000s nodes".]

Red Hat Enterprise Linux 6 Benchmark platform of choice

Survey of Benchmark Results 6/2013

Standard Performance Evaluation Corporation (SPEC) - www.spec.org
  SPECcpu2006
  SPECvirt_sc2010, sc2013
  SPECjbb2013
Transaction Processing Council (TPC) - www.tpc.org
  TPC-H: 3 of top 6 categories
  TPC-C: top virtualization w/ IBM
STAC FSI workloads - www.stacresearch.com
SAP Sales and Distribution - www.sap.com/campaigns/benchmark

Red Hat Enterprise Linux 6 Benchmark platform of choice

[Chart: SPEC Benchmark Publications 2011 - 2013. Percentage using RHEL6 (as of June 1, 2013), vertical axis 0% to 120%, one bar per year (2011, 2012, 2013) for cpu2006, virt_sc2010, jEnterprise2010, virt_sc2013 and jbb2013.]

SPEC is a registered trademark of the Standard Performance Evaluation Corporation. For more information about SPEC and its benchmarks see www.spec.org

Red Hat Enterprise Linux 6 Benchmark platform of choice

[Chart: TPC Benchmark Publications 2011 - 2013. Percentage using RHEL6 (as of June 1, 2013), vertical axis 0% to 120%, one bar per year (2011, 2012, 2013) for TPC-C and TPC-H.]

For more information about the TPC and its benchmarks see www.tpc.org.

Red Hat Enterprise Linux 6.4 vs Windows Server 2012 LINPACK

Principled Technologies Inc. & Red Hat Inc. - Confidential

05/28/13

tuned Profile Comparison Matrix

Tunable                             default    enterprise-  virtual-  virtual-  latency-     throughput-
                                               storage      host      guest     performance  performance
kernel.sched_min_granularity_ns     4ms        10ms         10ms      10ms      -            10ms
kernel.sched_wakeup_granularity_ns  4ms        15ms         15ms      15ms      -            15ms
vm.dirty_ratio                      20% RAM    40%          10%       40%       -            40%
vm.dirty_background_ratio           10% RAM    -            5%        -         -            -
vm.swappiness                       60         -            10        30        -            -
I/O Scheduler (Elevator)            CFQ        deadline     deadline  deadline  deadline     deadline
Filesystem Barriers                 On         Off          Off       Off       -            -
CPU Governor                        ondemand   performance  -         -         performance  performance
Disk Read-ahead                     -          4x           -         -         -            -
Disable THP                         -          -            -         -         Yes          -
Disable C-States                    -          -            -         -         Yes          -

https://access.redhat.com/site/solutions/369093

Red Hat Enterprise Linux 6 Scheduler Tunables

Implements multilevel run queues for sockets and cores (as opposed to one run queue per processor or per system)

RHEL6 tunables
  sched_min_granularity_ns
  sched_wakeup_granularity_ns
  sched_migration_cost
  sched_child_runs_first
  sched_latency_ns

[Diagram: Scheduler Compute Queues. Sockets 0 to 2, each divided into cores and threads (Thread 0, Thread 1), with processes queued behind each thread.]

Finer grained scheduler tuning

/proc/sys/kernel/sched_*
Red Hat Enterprise Linux 6 tuned-adm will increase the quantum to be on par with Red Hat Enterprise Linux 5

echo 10000000 > /proc/sys/kernel/sched_min_granularity_ns
  Minimal preemption granularity for CPU bound tasks. See sched_latency_ns for details. The default value is 4000000 (ns).

echo 15000000 > /proc/sys/kernel/sched_wakeup_granularity_ns
  The wake-up preemption granularity. Increasing this variable reduces wake-up preemption, reducing disturbance of compute bound tasks. Lowering it improves wake-up latency and throughput for latency critical tasks, particularly when a short duty cycle load component must compete with CPU bound components. The default value is 5000000 (ns).
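
As a rough sketch of the two echo commands above, the helper below converts milliseconds to the nanosecond units these files expect and prints the current settings when the files exist. The function names ms_to_ns and show_sched_tunable are illustrative only, not part of tuned or any Red Hat tool, and the writes are left commented out because they need root.

```shell
#!/bin/sh
# Convert milliseconds to nanoseconds, the unit used by the sched_* tunables.
ms_to_ns() {
    echo $(( $1 * 1000000 ))
}

# Print a scheduler tunable if this kernel exposes it (read-only, no root needed).
show_sched_tunable() {
    f="/proc/sys/kernel/$1"
    if [ -r "$f" ]; then
        echo "$1 = $(cat "$f") ns"
    else
        echo "$1 is not available on this kernel"
    fi
}

show_sched_tunable sched_min_granularity_ns
show_sched_tunable sched_wakeup_granularity_ns

# Values matching the slide (10 ms and 15 ms); uncomment to apply as root:
# echo "$(ms_to_ns 10)" > /proc/sys/kernel/sched_min_granularity_ns
# echo "$(ms_to_ns 15)" > /proc/sys/kernel/sched_wakeup_granularity_ns
```

Working in milliseconds and converting at the last step makes it harder to drop a zero when typing the nanosecond values by hand.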

Load Balancing

Scheduler tries to keep all CPUs busy by moving tasks from overloaded CPUs to idle CPUs
Detect using "perf stat", look for excessive "migrations"

/proc/sys/kernel/sched_migration_cost
  Amount of time after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated, so increasing this variable reduces task migrations. The default value is 500000 (ns).
  If the CPU idle time is higher than expected when there are runnable processes, try reducing this value. If tasks bounce between CPUs or nodes too often, try increasing it.

Rule of thumb: increase by 2-10x to reduce load balancing
Increase by 10x on large systems when many CGROUPs are actively used (ex: RHEV/KVM/RHOS)
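
The rule of thumb can be turned into concrete numbers with a small calculation. This is only a sketch: the 500000 ns default comes from the slide, the helper name scaled_migration_cost is made up for illustration, and the write itself is commented out because it needs root.

```shell
#!/bin/sh
# Default sched_migration_cost from the slide, in nanoseconds.
DEFAULT_MIGRATION_COST_NS=500000

# Scale the default by an integer factor (the slide suggests 2 to 10).
scaled_migration_cost() {
    echo $(( DEFAULT_MIGRATION_COST_NS * $1 ))
}

for factor in 2 5 10; do
    echo "${factor}x -> $(scaled_migration_cost "$factor") ns"
done

# Count migrations system-wide for 10 seconds before and after a change:
# perf stat -e cpu-migrations -a sleep 10

# To apply as root (the file name can differ between kernel versions):
# echo "$(scaled_migration_cost 10)" > /proc/sys/kernel/sched_migration_cost
```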

Sched_Migration Cost

[Chart: RHEL6.3 effect of sched_migration cost on fork/exit, Intel Westmere EP 24cpu/12core, 24 GB mem. Left axis: usec/call (0 to 250); right axis: percent (0% to 140%). Series: usec/call default 500us, usec/call tuned 4ms, percent improvement. Categories: exit_10, exit_100, exit_1000, fork_10, fork_100, fork_1000.]

fork() behavior

sched_child_runs_first
  Controls whether parent or child runs first
  Default is 0: parent continues before children run.
  Default is different than RHEL5

Red Hat Enterprise Linux 6 Hugepages / VM Tuning

Standard HugePages 2MB
  Reserve/free via
    /proc/sys/vm/nr_hugepages
    /sys/devices/system/node/node*/hugepages/hugepages-*/nr_hugepages
  Used via hugetlbfs

GB Hugepages 1GB
  Reserved at boot time/no freeing
  Used via hugetlbfs

Transparent HugePages 2MB
  On by default via boot args or /sys
  Used for anonymous memory

[Diagram: TLB (128 data, 128 instruction) mapping the Virtual Address Space to Physical Memory.]

2MB standard Hugepages

# echo 2000 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo
MemTotal:       16331124 kB
MemFree:        11788608 kB
HugePages_Total:    2000
HugePages_Free:     2000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

# ./hugeshm 1000
# cat /proc/meminfo
MemTotal:       16331124 kB
MemFree:        11788608 kB
HugePages_Total:    2000
HugePages_Free:     1000
HugePages_Rsvd:     1000
HugePages_Surp:        0
Hugepagesize:       2048 kB
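
Choosing the value to write into nr_hugepages is simple arithmetic, sketched below. The helper names hugepages_needed and show_hugepage_counters are illustrative assumptions; the only system access is a read of /proc/meminfo.

```shell
#!/bin/sh
# Number of huge pages needed to back a region of the given size.
# $1 = region size in MB, $2 = huge page size in kB (2048 for 2MB pages).
hugepages_needed() {
    size_kb=$(( $1 * 1024 ))
    echo $(( (size_kb + $2 - 1) / $2 ))   # round up to whole pages
}

# Report the huge page counters from /proc/meminfo, if present.
show_hugepage_counters() {
    if [ -r /proc/meminfo ]; then
        grep -E '^(HugePages_|Hugepagesize)' /proc/meminfo || echo "no huge page counters found"
    fi
}

echo "4000 MB needs $(hugepages_needed 4000 2048) pages of 2048 kB"
show_hugepage_counters
```

With 2048 kB pages, the 2000 pages reserved on the slide correspond to 4000 MB of memory set aside for hugetlbfs users.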

1GB Hugepages

Boot arguments - default_hugepagesz=1G, hugepagesz=1G, hugepages=8

# cat /proc/meminfo | more
HugePages_Total:       8
HugePages_Free:        8
HugePages_Rsvd:        0
HugePages_Surp:        0

# mount -t hugetlbfs none /mnt
# ./mmapwrite /mnt/junk 33
writing 2097152 pages of random junk to file /mnt/junk
wrote 8589934592 bytes to file /mnt/junk

# cat /proc/meminfo | more
HugePages_Total:       8
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0

Transparent Hugepages

Disable: boot argument transparent_hugepage=never, or at run time:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
[root@dhcp-100-19-50 code]# time ./memory 15 0
real 0m12.434s
user 0m0.936s
sys 0m11.416s
# cat /proc/meminfo
MemTotal:       16331124 kB
AnonHugePages:         0 kB

Enable: boot argument transparent_hugepage=always (enabled by default), or at run time:
# echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# time ./memory 15GB
real 0m7.024s
user 0m0.073s
sys 0m6.847s
# cat /proc/meminfo
MemTotal:       16331124 kB
AnonHugePages:  15590528 kB

SPEEDUP 12.4/7.0 = 1.77x, 56%
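
To check which THP mode is active before timing a workload, the current setting can be read back from sysfs. This is a minimal sketch: the helper name thp_active_mode is made up here, and it assumes the usual format in which the active mode is shown in square brackets.

```shell
#!/bin/sh
# Extract the active setting from a THP "enabled" line such as
# "[always] madvise never" (the bracketed word is the current mode).
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# RHEL 6 uses the redhat_ prefixed directory; upstream kernels do not.
for f in /sys/kernel/mm/redhat_transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
        echo "$f: $(thp_active_mode < "$f")"
    fi
done

# Demonstration on a literal line, independent of the running kernel.
echo "[always] madvise never" | thp_active_mode
```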

Performance RHEL 6 / Sandy Bridge SPECjbb Java w/ 1GB huge pages

Sandy Bridge has 1GB hugepages
  Support in RHEL5.8 and 6.2
  Use 2M x86_64 page vs 4k page
  RHEL6, static use of hugepages
  Static pages wired-down
  Need application support DB/java etc

RHEL6 Transparent Huge pages
  Automatically use huge pages
  For all anonymous memory
  Daemon to gather free dynamically

[Chart: RHEL6.4 SPECjbb w/ 2M/1G hugepages, Intel Sandy Bridge 16core/32GB. Left axis: bops (sun_hotspot); right axis: %gain (0.0% to 14.0%), with data labels 12.6% and 9.1%. Categories: RHEL6.2 2M HugePage, RHEL6.2 (disable THP), RHEL6.2 1GB HugePage.]

Memory Zones

32-bit (top of memory down to 0):
  Up to 64 GB (PAE)
  Highmem Zone
  896 MB or 3968 MB
  Normal Zone
  16 MB
  DMA Zone
  0

64-bit (top of memory down to 0):
  End of RAM
  Normal Zone
  4 GB
  DMA32 Zone
  16 MB
  DMA Zone
  0

Split LRU pagelists

Separate page-lists for anonymous and pagecache
Prevents mixing of anonymous and file-backed pages on active and inactive LRU lists
Eliminates long pauses when all CPUs enter direct reclaim during memory exhaustion
Prevents swapping when copying very large files
Prevents swapping of database cache during backup.

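Per Node/Zone split LRU Paging Dynamics

[Diagram: User allocations take pages from FREE onto the ACTIVE anonLRU and fileLRU lists; page aging moves them to the INACTIVE anonLRU and fileLRU lists, from which they are either reactivated or reclaimed (swapout, flush) back to FREE; user deletions also return pages to FREE.]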

What is NUMA?

Non Uniform Memory Access
A result of making bigger systems more scalable by distributing system memory near individual CPUs....
All multi-socket x86_64 server systems are NUMA
  Most servers have 1 NUMA node / socket
  Recent AMD systems have 2 NUMA nodes / socket
Keep interleave memory in BIOS off (default)
  Else OS will see only 1 NUMA node!!!

Typical System Building Block

[Diagram: one NUMA node containing a memory controller and node RAM, Cores 0 to 3, a shared L3 cache, and QPI links, IO, etc.]

Two NUMA node system

[Diagram: Node 0 and Node 1, each with its own node RAM, Cores 0 to 3, L3 cache, and QPI links, IO, etc.]

Four NUMA node system, fully-connected topology

[Diagram: Nodes 0 to 3, each with its own node RAM, Cores 0 to 3, L3 cache, and QPI links, IO, etc.; every node is linked directly to every other node.]

Four NUMA node system, ring topology

[Diagram: Nodes 0 to 3, each with its own node RAM, Cores 0 to 3, L3 cache, and QPI links, IO, etc.; the nodes are linked in a ring.]

Per NUMA-Node Resources

Memory zones (DMA & Normal zones)
CPUs
IO/DMA capacity
Interrupt processing
Page reclamation kernel thread (kswapd#)
Lots of other kernel threads

NUMA Nodes and Zones

64-bit (top of memory down to 0):
  End of RAM
  Node 1: Normal Zone
  Node 0: Normal Zone
          4 GB
          DMA32 Zone
          16 MB
          DMA Zone
          0

zone_reclaim_mode

Controls NUMA specific memory allocation policy
When set and node memory is exhausted:
  Reclaim memory from local node rather than allocating from next node
  Slower allocation, higher NUMA hit ratio
When clear and node memory is exhausted:
  Allocate from all nodes before reclaiming memory
  Faster allocation, higher NUMA miss ratio
Default is set at boot time based on NUMA factor
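
A quick way to see which of the two behaviours a system is using is to read the tunable back. The sketch below models only the set/clear distinction from the slide; describe_zone_reclaim is an illustrative name, not an existing utility.

```shell
#!/bin/sh
# Map a zone_reclaim_mode value to the behaviour described on the slide.
# Only the set/clear distinction is modelled here.
describe_zone_reclaim() {
    if [ "$1" -eq 0 ]; then
        echo "clear: allocate from other nodes before reclaiming"
    else
        echo "set: reclaim from the local node first"
    fi
}

f=/proc/sys/vm/zone_reclaim_mode
if [ -r "$f" ]; then
    mode=$(cat "$f")
    echo "zone_reclaim_mode=$mode ($(describe_zone_reclaim "$mode"))"
else
    echo "zone_reclaim_mode is not available on this system"
fi
```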

Learn about CPUs via lscpu

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    1
Core(s) per socket:    10
CPU socket(s):         4
NUMA node(s):          4
. . . .
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0,4,8,12,16,20,24,28,32,36
NUMA node1 CPU(s):     2,6,10,14,18,22,26,30,34,38
NUMA node2 CPU(s):     1,5,9,13,17,21,25,29,33,37
NUMA node3 CPU(s):     3,7,11,15,19,23,27,31,35,39

Visualize CPUs via lstopo

# lstopo
(from hwloc package)

Learn NUMA layout via numactl

# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36
node 0 size: 65415 MB
node 0 free: 63482 MB
node 1 cpus: 2 6 10 14 18 22 26 30 34 38
node 1 size: 65536 MB
node 1 free: 63968 MB
node 2 cpus: 1 5 9 13 17 21 25 29 33 37
node 2 size: 65536 MB
node 2 free: 63897 MB
node 3 cpus: 3 7 11 15 19 23 27 31 35 39
node 3 size: 65536 MB
node 3 free: 63971 MB
node distances:
node   0   1   2   3
  0:  10  21  21  21
  1:  21  10  21  21
  2:  21  21  10  21
  3:  21  21  21  10
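
When planning placement it can help to pull just the free memory per node and the worst-case distance out of this output. The following is a sketch using awk; node_free_mb and max_node_distance are illustrative helper names, and the sample text is abbreviated from the slide to two nodes so the example runs without numactl installed.

```shell
#!/bin/sh
# Print "node free_MB" for each node, reading `numactl --hardware` output on stdin.
node_free_mb() {
    awk '$1 == "node" && $3 == "free:" { print $2, $4 }'
}

# Print the largest inter-node distance from the "node distances" matrix.
max_node_distance() {
    awk '
        /^node distances:/ { in_matrix = 1; next }
        in_matrix && $1 ~ /^[0-9]+:$/ {
            for (i = 2; i <= NF; i++) if ($i + 0 > max) max = $i + 0
        }
        END { print max + 0 }
    '
}

# Sample input abbreviated from the slide; on a real system pipe in
# `numactl --hardware` instead.
sample='available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36
node 0 size: 65415 MB
node 0 free: 63482 MB
node 1 cpus: 2 6 10 14 18 22 26 30 34 38
node 1 size: 65536 MB
node 1 free: 63968 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10'

printf '%s\n' "$sample" | node_free_mb
printf '%s\n' "$sample" | max_node_distance
```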

Sample remote access latencies

4 socket / 4 node: 1.5x
4 socket / 8 node: 2.7x
8 socket / 8 node: 2.8x
32 node system: 5.5x (30/32 inter-node latencies >= 4x)

Distance   Share of node pairs
10         32/1024  (3.1%)
13         32/1024  (3.1%)
40         64/1024  (6.2%)
48         448/1024 (43.8%)
55         448/1024 (43.8%)

Red Hat Enterprise Linux 6.4 SPECjbb2005 opt w/ numactl

[Chart: RHEL6.4 SPECjbb OJDK numactl, Intel Westmere EX 40core 4 socket 256 GB. Vertical axis: bops (0 to 1400000). Stacked series: inst1, inst2, inst3, inst4. Categories: Metal Default, Metal NumaCTL, KVM Default, KVM NumaCTL.]

So, what's the NUMA problem?

The Linux system scheduler is very good at maintaining responsiveness and optimizing for CPU utilization
Tries to use idle CPUs, regardless of where process memory is located.... Using remote memory degrades performance!
  Red Hat is working with the upstream community to increase NUMA awareness of the scheduler and to implement automatic NUMA balancing.
Remote memory latency matters most for long-running, significant processes, e.g., HPTC, VMs, etc.

Use numastat to see memory layout

Rewritten for Red Hat Enterprise Linux 6.4 to show per-node system and process memory information
100% compatible with prior version by default, displaying /sys...node<n>/numastat memory allocation statistics
Any command options invoke new functionality
  -m for per-node system memory info
  <pattern> for per-node process memory info
See numastat(8)

numastat: compatibility mode

# numastat
                   node0     node1     node2     node3
numa_hit         1655286    266159    314693    273846
numa_miss              0      2790         0         0
numa_foreign        2790         0         0         0
interleave_hit     14365     14354     14366     14348
local_node       1652364    249938    298463    257638
other_node          2922     19011     16230     16208

                   node4     node5     node6     node7
numa_hit          252059    529980    240696    375607
numa_miss              0         0         0         0
numa_foreign           0         0         0         0
interleave_hit     14367     14336     14333     14388
local_node        235903    513789    224511    361928
other_node         16156     16191     16185     13679

numastat: compressed display

# numastat -c
Per-node numastat info (in MBs):
                Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
                ------ ------ ------ ------ ------ ------ ------ ------ -----
Numa_Hit          6479   1040   1230   1070    985   2070    941   1468 15284
Numa_Miss            0     11      0      0      0      0      0      0    11
Numa_Foreign        11      0      0      0      0      0      0      0    11
Interleave_Hit      56     56     56     56     56     56     56     56   449
Local_Node        6468    977   1166   1007    922   2007    877   1415 14839
Other_Node          11     74     63     63     63     63     63     53   455

numastat: per-node meminfo

# numastat -mczs
Per-node system memory usage (in MBs):
                Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7  Total
                ------ ------ ------ ------ ------ ------ ------ ------ ------
MemTotal         32766  32768  32768  32768  32768  32768  32768  32752 262126
MemFree          31863  31965  32120  32086  32098  32080  32114  32062 256388
MemUsed            903    803    648    682    670    688    654    690   5738
FilePages           11     26      8     37     21     18      9     45    176
Slab                25     16      7     10     12     36     10     10    126
Active               5     13      4     25     10      9      6     41    113
Active(file)         4     11      3     23      8      6      3     40     99
SUnreclaim          19     10      6      6      9     33      7      7     97
Inactive             7     15      4     14     12     12      6      6     76
Inactive(file)       7     15      4     14     12     12      6      6     76
SReclaimable         7      6      2      4      3      3      3      2     29
Active(anon)         2      1      1      2      2      2      3      2     14
AnonPages            2      1      1      2      2      2      3      2     14
Mapped               0      0      0      1      4      3      1      1     11
KernelStack          9      0      0      0      0      0      0      0     10
PageTables           0      0      0      0      1      1      0      1      3
Shmem                0      0      0      0      0      0      0      0      0
Inactive(anon)       0      0      0      0      0      0      0      0      0

numastat shows unaligned guests

# numastat -c qemu
Per-node process memory usage (in Mbs)

PID              Node 0 Node 1 Node 2 Node 3 Total
---------------  ------ ------ ------ ------ -----
10587 (qemu-kvm)   1216   4022   4028   1456 10722
10629 (qemu-kvm)   2108     56    473   8077 10714
10671 (qemu-kvm)   4096   3470   3036    110 10712
10713 (qemu-kvm)   4043   3498   2135   1055 10730
---------------  ------ ------ ------ ------ -----
Total             11462  11045   9672  10698 42877

numastat shows aligned guests

# numastat -c qemu
Per-node process memory usage (in Mbs)

PID              Node 0 Node 1 Node 2 Node 3 Total
---------------  ------ ------ ------ ------ -----
10587 (qemu-kvm)      0  10723      5      0 10728
10629 (qemu-kvm)      0      0      5  10717 10722
10671 (qemu-kvm)      0      0  10726      0 10726
10713 (qemu-kvm)  10733      0      5      0 10738
---------------  ------ ------ ------ ------ -----
Total             10733  10723  10740  10717 42913
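
The difference between the two numastat displays can be checked mechanically. The sketch below flags any process whose largest single-node share of memory falls under a threshold; the helper name unaligned_pids and the 90 percent default are assumptions chosen for illustration, and the two sample rows are copied from the slides.

```shell
#!/bin/sh
# Flag processes whose largest single-node share of memory is below a threshold.
# Reads `numastat -c <pattern>` style rows on stdin:
#   PID (name) node0 node1 ... Total
# $1 = threshold percent (default 90).
unaligned_pids() {
    awk -v pct="${1:-90}" '
        $1 ~ /^[0-9]+$/ && $2 ~ /^\(.*\)$/ {
            total = $NF + 0; max = 0
            for (i = 3; i < NF; i++) if ($i + 0 > max) max = $i + 0
            if (total > 0 && max * 100 < total * pct) print $1
        }
    '
}

# One unaligned row and one aligned row taken from the slides.
sample='10587 (qemu-kvm) 1216 4022 4028 1456 10722
10629 (qemu-kvm) 0 0 5 10717 10722'

printf '%s\n' "$sample" | unaligned_pids 90
```

On a live host the same function could be fed directly, for example `numastat -c qemu | unaligned_pids 90`.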

Some KVM NUMA Suggestions

Don't assign extra resources to guests
  Don't assign more memory than can be used
  Don't make guest unnecessarily wide
    Not much point to more VCPUs than application threads
For best NUMA affinity and performance, the number of guest VCPUs should be <= number of physical cores per node, and guest memory < available memory per node
  Guests that span nodes should consider SLIT
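
The sizing rule above reduces to two comparisons, sketched here as a shell function. The name guest_fits_node is made up for illustration, and the example numbers are loosely based on the 4-node system in the lscpu and numactl slides (10 cores and about 63 GB free per node).

```shell
#!/bin/sh
# Return success if a guest fits within one NUMA node under the slide's rule:
#   VCPUs <= physical cores per node, and guest memory < free memory per node.
# $1 = guest VCPUs, $2 = guest memory (MB), $3 = cores per node, $4 = node free memory (MB)
guest_fits_node() {
    [ "$1" -le "$3" ] && [ "$2" -lt "$4" ]
}

if guest_fits_node 8 32768 10 63482; then
    echo "8 VCPUs / 32 GB: fits in one node"
fi
if ! guest_fits_node 16 32768 10 63482; then
    echo "16 VCPUs / 32 GB: spans nodes"
fi
```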

How to manage NUMA manually

Research NUMA topology of each system
Make a resource plan for each system
Bind both CPUs and Memory
  Might also consider devices and IRQs
Use numactl for native jobs:
  "numactl -N <nodes> -m <nodes> <workload>"
Use numatune for libvirt started guests
  Edit xml: <numatune> <memory mode="strict" nodeset="1-2"/> </numatune>
Use Cgroups w/ apps to bind cpu/mem to numa nodes
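
For native jobs, the point of the numactl form above is that the same node list is used for both CPU and memory binding. A small dry-run wrapper makes that explicit; numactl_cmd is a hypothetical helper and ./my_workload is a placeholder, so nothing is actually launched.

```shell
#!/bin/sh
# Build (but do not run) a numactl command that binds both the CPUs and the
# memory of a workload to the same node list.
# $1 = node list such as "1" or "1-2"; remaining args = the workload command.
numactl_cmd() {
    nodes=$1
    shift
    echo "numactl -N $nodes -m $nodes $*"
}

numactl_cmd 1-2 ./my_workload --threads 8   # ./my_workload is a placeholder
```

Printing the command first gives a chance to review the plan; the output can then be run with eval or pasted into a job script.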

Resource Management using cgroups

Ability to manage large system resources effectively

Control Group (Cgroups) for CPU/Memory/Network/Disk
Benefit: guarantee Quality of Service & dynamic resource allocation
Ideal for managing any multi-application environment
From back-ups to the Cloud

numad can help improve NUMA performance

New Red Hat Enterprise Linux 6.4 user-level daemon to automatically improve out of the box NUMA system performance, and to balance NUMA usage in dynamic workload environments
Was tech-preview in Red Hat Enterprise Linux 6.3
Improves NUMA performance for some workloads
Not enabled by default
See numad(8)

numad matches resource consumers with available resources

[Diagram: the Node Scanner (available CPUs, available memory) produces Available Resources Per Node; the Process Scanner (required CPUs, required memory) produces Consumed Resources Per Process; both feed the Numad Picker, which outputs a Node List For Process.]

numad aligns process memory and CPU threads within nodes

[Diagram: Before numad, Processes 37, 29, 19 and 61 are each spread across Nodes 0 to 3. After numad, Proc 29, Proc 19, Proc 61 and Proc 37 each sit within a single node.]

Numad - aligning memory and threads in nodes: Reduces memory latency, improves determinism

numad usage

numad is intended primarily for server consolidation environments
  Multiple applications running on the same server
  Multiple instances of the same application
  Multiple virtual guests
numad is most likely to have a positive effect when processes can be localized in a fractional subset of the system's NUMA nodes.
If the entire system is dedicated to a large in-memory database application, for example (especially if memory accesses will likely remain unpredictable), numad will probably not improve performance.
Similarly, very high bandwidth applications (those that really need all the system memory controllers) will likely not benefit from localization.

Start, stop numad, and set interval

# numad
-i 0 to terminate the numad daemon
-i [<min>:]<max> to specify interval seconds
  Default is "-i 5:15"
  Increasing the max interval will decrease overhead, but will also decrease responsiveness to changing loads.

To change utilization target

-u <n> to specify target utilization percent
  Default is "-u 85"
  Increase the utilization target to more fully utilize the entire resources on each node
  Decrease the utilization target to maintain more per-node resource margin for bursty loads
    Also could decrease to force processes across multiple nodes

Bare Metal - Java Workload

[Chart: Automatic Numad Improvement, multi-instance Java workload on a 4 socket, 8 node system. Left axis: BOPs (0 to 1800000); right axis: %Gain (-10 to 90); horizontal axis: WHs. Series: Default, Numad95, Numactl, %Gain.]

To get pre-placement advice

-w <CPUs>:<MBs> for node suggestions
  Output is a recommended node list, e.g. "1-2,4"
  Can be used regardless of whether numad is running as a daemon
    Will take a couple seconds if not running
  Used by libvirt for optional VM auto placement
  Could be used in shell script for automated job placement

numad -w shell script example

#!/bin/bash
# Usage: <script> <processes> <threads per process> <GB per process>
PROCESSES=$1; shift
THREADS=$1; shift
GIGABYTES=$1; shift
echo "Trying $PROCESSES fake 'guests' with $THREADS VCPUs and $GIGABYTES GB each."
echo "Note average work accomplished -- displayed in a few minutes."
# First pass: ask numad for advice but start the workload unbound.
for (( i=1; i <= $PROCESSES; i++ )); do
    NODES=`./numad -w $THREADS:${GIGABYTES}000`
    echo "numad advises to use nodes: $NODES -- but ignoring that and not binding."
    echo ../pig_tool/pig -t $THREADS -gm ${GIGABYTES}000 -s 60 -l mem
    ../pig_tool/pig -t $THREADS -gm ${GIGABYTES}000 -s 60 -l mem &
done
echo "Sleeping while the fake guests finish up..."
sleep 100
echo
echo "OK, now trying same size fake 'guests' using numad placement advice."
echo "Average work accomplished should be higher, stddev might be better too."
# Second pass: bind CPUs and memory to the nodes numad recommends.
for (( i=1; i <= $PROCESSES; i++ )); do
    NODES=`./numad -w $THREADS:${GIGABYTES}000`
    echo "numad advises to use nodes: $NODES"
    echo numactl -N $NODES -m $NODES ../pig_tool/pig -t $THREADS -gm ${GIGABYTES}000 -s 60 -l mem
    numactl -N $NODES -m $NODES ../pig_tool/pig -t $THREADS -gm ${GIGABYTES}000 -s 60 -l mem &
done
echo "Sleeping while the fake guests finish up..."
sleep 100
echo

numad -w shell script (the important part)

for (( i=1; i <= $PROCESSES; i++ )); do
    NODES=`./numad -w $THREADS:${GIGABYTES}000`
    ...
    numactl -N $NODES -m $NODES ../pig_tool/pig . . . . &
done

numad -w shell script (ignorant)

# ./pig_place_test.sh 5 6 7
Trying 5 fake 'guests' with 6 VCPUs and 7 GB each.
Note average work accomplished -- displayed in a few minutes.
numad advises to use nodes: 2 -- but ignoring that and not binding.
../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 7 -- but ignoring that and not binding.
../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 3 -- but ignoring that and not binding.
../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 3 -- but ignoring that and not binding.
../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 3 -- but ignoring that and not binding.
../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
Sleeping while the fake guests finish up...
Threads: 6 Avg: 39.5 Stddev: 7.6 Min: 33 Max: 54
Threads: 6 Avg: 35.8 Stddev: 5.5 Min: 31 Max: 44
Threads: 6 Avg: 39.7 Stddev: 4.2 Min: 35 Max: 45
Threads: 6 Avg: 49.8 Stddev: 12.5 Min: 33 Max: 62
Threads: 6 Avg: 62.3 Stddev: 13.3 Min: 49 Max: 80

numad -w shell script (advised)

OK, now trying same size fake 'guests' using numad placement advice.
Average work accomplished should be higher, stddev might be better too.
numad advises to use nodes: 2
numactl -N 2 -m 2 ../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 1
numactl -N 1 -m 1 ../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 3
numactl -N 3 -m 3 ../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 7
numactl -N 7 -m 7 ../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
numad advises to use nodes: 6
numactl -N 6 -m 6 ../pig_tool/pig -t 6 -gm 7000 -s 60 -l mem
Sleeping while the fake guests finish up...
Threads: 6 Avg: 105.0 Stddev: 0.0 Min: 105 Max: 105
Threads: 6 Avg: 106.0 Stddev: 0.0 Min: 106 Max: 106
Threads: 6 Avg: 106.0 Stddev: 0.0 Min: 106 Max: 106
Threads: 6 Avg: 105.0 Stddev: 0.0 Min: 105 Max: 105
Threads: 6 Avg: 104.0 Stddev: 0.0 Min: 104 Max: 104

Multiguest - KVM Java Workload

[Chart: Eight KVM Guests with Java Load, numad with RHEL6.4 host and 6.3 guests. Left axis: BOPS (0 to 2500000); right axis: % Gain (10 to 80); horizontal axis: Warehouses (wh2, wh4, wh6, wh8). Series: default, numad, numatune, % Gain.]

Multiguest Oracle OLTP Workload

[Chart: Oracle OLTP in TPM, 4 KVM Guests, Intel 32-cpu, 128 GB, 2 FC. Left axis: TPM (0 to 1400000); right axis: % Gain (0 to 35); horizontal axis: Users (20U, 40U, 80U). Series: default, numad, NUMA pinned, % Gain.]

numad future

Shipping in Red Hat Enterprise Linux 6.4
Potential future improvements:
  Device and IRQ affinity
  Related process hints
Future TBD pending upstream kernel efforts
  Perhaps complementary NUMA management roles, as systems will continue to grow in size and complexity

Summary / Questions

Red Hat Enterprise Linux 6 Performance Features
  "TUNED" tool: adjusts system parameters to match environments - throughput/latency.
  Transparent Huge Pages: auto select large pages for anonymous memory, static hugepages for shared mem
  Non-uniform Memory Access (NUMA)
    numastat enhancements
    numactl for manual control
    numad daemon for auto placement
  (...Come back for part 2...)
  TUNA: integration w/ Red Hat Enterprise Linux 6.4

cgroups Architecture

Cgroup default mount points

# cat /etc/cgconfig.conf
mount {
    cpuset  = /cgroup/cpuset;
    cpu     = /cgroup/cpu;
    cpuacct = /cgroup/cpuacct;
    memory  = /cgroup/memory;
    devices = /cgroup/devices;
    freezer = /cgroup/freezer;
    net_cls = /cgroup/net_cls;
    blkio   = /cgroup/blkio;
}

# ls -l /cgroup
drwxr-xr-x 2 root root 0 Jun 21 13:33 blkio
drwxr-xr-x 3 root root 0 Jun 21 13:33 cpu
drwxr-xr-x 3 root root 0 Jun 21 13:33 cpuacct
drwxr-xr-x 3 root root 0 Jun 21 13:33 cpuset
drwxr-xr-x 3 root root 0 Jun 21 13:33 devices
drwxr-xr-x 3 root root 0 Jun 21 13:33 freezer
drwxr-xr-x 3 root root 0 Jun 21 13:33 memory
drwxr-xr-x 2 root root 0 Jun 21 13:33 net_cls

Cgroup how-to

1GB/2CPU subset of a 16GB/8CPU system
# numactl --hardware
# mount -t cgroup xxx /cgroups
# mkdir -p /cgroups/test
# cd /cgroups/test
# echo 1 > cpuset.mems
# echo 2-3 > cpuset.cpus
# echo 1G > memory.limit_in_bytes
# echo $$ > tasks
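
The how-to steps can be parameterized so the same plan is reusable for other groups. The sketch below only prints the commands rather than running them, since the real writes need root and a mounted cgroup hierarchy; to_bytes and print_cgroup_plan are illustrative names, not existing tools.

```shell
#!/bin/sh
# Convert a size with an optional K/M/G suffix to bytes, mirroring the
# suffixes that memory.limit_in_bytes accepts.
to_bytes() {
    num=${1%[KkMmGg]}
    case $1 in
        *[Kk]) echo $(( num * 1024 )) ;;
        *[Mm]) echo $(( num * 1024 * 1024 )) ;;
        *[Gg]) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)     echo "$num" ;;
    esac
}

# Print the commands for a cpuset + memory limited group without running them.
# $1 = group directory, $2 = memory node(s), $3 = CPU list, $4 = memory limit
print_cgroup_plan() {
    echo "mkdir -p $1"
    echo "echo $2 > $1/cpuset.mems"
    echo "echo $3 > $1/cpuset.cpus"
    echo "echo $(to_bytes "$4") > $1/memory.limit_in_bytes"
    echo "echo \$\$ > $1/tasks"
}

print_cgroup_plan /cgroups/test 1 2-3 1G
```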

cgroups

[root@dhcp-100-19-50 ~]# forkmany 20MB 100procs &
[root@dhcp-100-19-50 ~]# top -d 5
top - 12:24:13 up 1:36, 4 users, load average: 22.70, 5.32, 1.79
Tasks: 315 total, 93 running, 222 sleeping, 0 stopped, 0 zombie
Cpu0  :  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 89.6%us, 10.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.2%hi,  0.2%si,  0.0%st
Cpu4  :  0.4%us,  0.6%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Cpu5  :  0.4%us,  0.0%sy,  0.0%ni, 99.2%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  16469476k total,  1993064k used, 14476412k free,    33740k buffers
Swap:  2031608k total,   185404k used,  1846204k free,   459644k cached

Verify correct bindings

# echo 0 > cpuset.mems
# echo 0-3 > cpuset.cpus
# numastat
                 node0     node1
numa_hit       1648772    438778
numa_miss        23459   2134520
local_node     1648648    423162
other_node       23583   2150136

# /common/lwoodman/code/memory 4
faulting took 1.616062s
touching took 0.364937s
# numastat
                 node0     node1
numa_hit       2700423    439550
numa_miss        23459   2134520
local_node     2700299    423934
other_node       23583   2150136

Incorrect bindings!

# echo 1 > cpuset.mems
# echo 0-3 > cpuset.cpus
# numastat
                 node0     node1
numa_hit       1623318    434106
numa_miss        23459   1082458
local_node     1623194    418490
other_node       23583   1098074

# /common/lwoodman/code/memory 4
faulting took 1.976627s
touching took 0.454322s
# numastat
                 node0     node1
numa_hit       1623341    434147
numa_miss        23459   2133738
local_node     1623217    418531
other_node       23583   2149354

JVM comparison on Red Hat SPECjbb2013
