Demystifying 10 Gb Ethernet Performance
AIX-VUG
Enhanced Technical Support (ETS)
paulalex@de.ibm.com
~1.3 Gbit/s
Warning: ftp client and ftpd server are single-threaded kernel processes!

[svmon excerpt: ftpd pid 9240598, Inuse 24497; ftp pid 8782016, Inuse 23847; Pin/Pgsp 9668; 16MB pages: N]
Run multiple ftp client sessions in parallel to reach the desired overall throughput.

Example:
# vi .netrc
machine 10gbench2 login root password foo
macdef init
put "| dd if=/dev/zero bs=1M count=1000" /dev/null
bye
<insert a blank line here>
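The .netrc macro automates one session; a small wrapper can start several in parallel. A minimal sketch (the host name 10gbench2 comes from the example above; the stand-in command at the end only demonstrates the mechanics):

```shell
# Start n copies of a command in the background and wait for all of them.
# In the benchmark each copy would be "ftp 10gbench2", i.e. one
# single-threaded session of ~1.3 Gbit/s; the aggregate throughput is the
# sum over all parallel sessions.
run_parallel() {
  cmd=$1
  n=$2
  i=0
  while [ "$i" -lt "$n" ]; do
    $cmd &               # one ftp session per background job
    i=$((i + 1))
  done
  wait                   # return only after every session has finished
}

# Real use would be: run_parallel "ftp 10gbench2" 8
run_parallel "echo session done" 3    # harmless stand-in
```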
iperf for AIX: http://www.oss4aix.org/download/RPMS/iperf/

# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------

# iperf -c localhost -t 60 -P 8
------------------------------------------------------------
Client connecting to localhost, TCP port 5001
TCP window size: 132 KByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[ 10] 0.0-60.0 sec  40.7 GBytes  5.82 Gbits/sec
[  3] 0.0-60.0 sec  40.6 GBytes  5.81 Gbits/sec
[  4] 0.0-60.0 sec  40.4 GBytes  5.79 Gbits/sec
[  5] 0.0-60.0 sec  40.5 GBytes  5.80 Gbits/sec
[  6] 0.0-60.0 sec  40.9 GBytes  5.86 Gbits/sec
[  7] 0.0-60.0 sec  40.7 GBytes  5.83 Gbits/sec
[  8] 0.0-60.0 sec  40.6 GBytes  5.82 Gbits/sec
[  9] 0.0-60.0 sec  40.7 GBytes  5.82 Gbits/sec
[SUM] 0.0-60.0 sec   325 GBytes  46.5 Gbits/sec
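As a quick sanity check, the eight per-stream figures should add up to the [SUM] line; a one-line awk sketch:

```shell
# Sum the eight iperf per-stream bandwidths shown above; the total should
# match the reported [SUM] value of 46.5 Gbits/sec.
awk 'BEGIN {
  n = split("5.82 5.81 5.79 5.80 5.86 5.83 5.82 5.82", bw, " ")
  total = 0
  for (i = 1; i <= n; i++) total += bw[i]
  printf "aggregate: %.2f Gbits/sec\n", total   # prints: aggregate: 46.55 Gbits/sec
}'
```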
Control panel with graphical output.
TCP settings: Buffer Length, Window Size, MSS, No Delay.
CLI output.

Benchmark system:
Throughput benchmark and TCP RTT benchmark
Adapters: FC 5284, FC 5287, FC 528
Benchmark environment: ~9 Gbit/s
Where is the problem?

[Diagram: Power 750 8408-E8D. AIX LPAR 1 -> Virt. Eth. (PVID 10) -> vSwitch -> SEA over an Etherchannel of 10GbE SR ports (FC 5287) -> 10 GbE network -> AIX LPAR 2. Only ~3 Gbit/s end to end. Forwarding in the sending direction is a memcpy at hypervisor level.]
Server LPAR:
Power 770 9117-MMB (same as client)
AIX 6.1 TL6 SP 3
EC=3.0 units, uncapped, 4 VPs
Virtual Ethernet Adapter, MTU 1500

[Diagram: uncapped client LPAR and capped server LPAR on one PHYP vSwitch; Virt. Eth. on PVID 1 / VLAN 1; traffic direction client -> server.]

MTU 1500:
Maximum throughput: ~1.25 Gbit/s

[Chart: throughput at MTU 1500 vs. CPU units (0 to 1.6); throughput saturates at ~1.25 Gbit/s. Detail chart: physc (0.68 to 0.78) vs. CPU units (0.59 to 0.79), comparing the physc=0.66 and physc=0.68 runs; annotated "minus 19 Mbit/s".]
mpstat -s, physc=0.66:
-------------------------------------------------------------
 Proc0                                  Proc4
 65.91%                                  0.01%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 32.81%  16.36%  8.24%  8.50%   0.00%  0.00%  0.00%  0.01%
-------------------------------------------------------------

mpstat -s, physc=0.68:
-------------------------------------------------------------
 Proc0                                  Proc4
 52.82%                                 14.43%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 26.09%  12.18%  7.21%  7.34%   5.33%  3.27%  3.02%  2.82%
-------------------------------------------------------------
Server LPAR:
Power 720 8202-E4C (Same as client)
AIX 7.1 TL1 SP 3
EC=3.0 Units, uncapped
4 VPs
Virtual Ethernet Adapter, MTU 1500
Maximum throughput: ~1.6 Gbit/s

[Chart: throughput (Gbps) vs. CPU units (0 to 1.8), comparing "TP 9117-MMB default" against "TP 8202-E4C default"; baseline on the E4C with 0.4 CPU units: 991 Mbit/s.]
[Chart: physc (0.8 to 1.05) vs. CPU units (0.36 to 0.76), comparing the physc=0.45 and physc=0.50 runs.]
mpstat -s, physc=0.45:
-------------------------------------------------------------
 Proc0                                  Proc4
 44.98%                                  0.01%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 22.39%  11.16%  5.62%  5.80%   0.00%  0.00%  0.00%  0.01%
-------------------------------------------------------------

mpstat -s, physc=0.50:
-------------------------------------------------------------
 Proc0                                  Proc4
 39.22%                                 10.72%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 19.37%   9.04%  5.35%  5.45%   3.96%  2.43%  2.24%  2.09%
-------------------------------------------------------------
Starting with a TCP trace analysis (tcptrace):

                          a->b                  b->a
total packets:            8081                  3639
ack pkts sent:            8081                  3639
pure acks sent:              0                  3639
sack pkts sent:              0                     0
dsack pkts sent:             0                     0
max sack blks/ack:           0                     0
unique bytes sent:    11698392                     0
actual data pkts:         8081                     0
actual data bytes:    11701288                     0
rexmt data pkts:             2                     0
rexmt data bytes:         2896                     0
zwnd probe pkts:             0                     0
zwnd probe bytes:            0                     0
outoforder pkts:             2                     0
pushed data pkts:            5                     0
SYN/FIN pkts sent:         0/0                   0/0
req 1323 ws/ts:            N/Y                   N/Y
urgent data pkts:            0                     0
urgent data bytes:           0                     0
mss requested:               0 bytes               0 bytes
max segm size:            1448 bytes               0 bytes
min segm size:            1448 bytes               0 bytes
avg segm size:            1447 bytes               0 bytes
max win adv:             32761 bytes           65522 bytes
min win adv:             32761 bytes           38734 bytes
zero win adv:                0 times               0 times
avg win adv:             32761 bytes           65509 bytes
initial window:           1448 bytes               0 bytes
initial window:              1 pkts                0 pkts
ttl stream length:          NA                    NA
missed data:                NA                    NA
truncated data:       11588154 bytes               0 bytes
truncated packets:        8081 pkts                0 pkts
data xmit time:          3.929 secs            0.000 secs
idletime max:            211.6 ms              211.6 ms
throughput:            2977123 Bps                 0 Bps

RTT samples:              2706                     0
RTT min:                  11.7 ms                0.0 ms
RTT max:                  46.9 ms                0.0 ms
RTT avg:                  25.0 ms                0.0 ms
RTT stdev:                 6.9 ms                0.0 ms
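The throughput figure in the a->b column can be cross-checked against two other fields of the same report (a sketch; the small deviation from the reported value comes from the rounded transmit time):

```shell
# The reported throughput is consistent with unique bytes sent divided by
# the data xmit time (a->b figures from the trace report above). tcptrace
# reports 2977123 Bps; "3.929 secs" is rounded in the report, so the
# quotient lands within a fraction of a percent of that value.
awk 'BEGIN {
  bytes = 11698392     # unique bytes sent
  secs  = 3.929        # data xmit time
  printf "%.0f Bps\n", bytes / secs
}'
```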
Hypervisor calls summary (curt):

                                          Count  Total Time  % sys  Avg Time  Min Time  Min ETime  Max ETime
                                                     (msec)   time    (msec)    (msec)     (msec)     (msec)
========================================  =====  ==========  =====  ========  ========  =========  =========
H_SEND_LOGICAL_LAN((unknown) 41977e8)     30419    157.0195  0.63%    0.0052    0.0004     0.0019     7.5224
H_ADD_LOGICAL_LAN_BUFFER((unknown) 4191   18173     26.2612  0.11%    0.0014    0.0005     0.0007     6.0742
H_PROD((unknown) 6ffb8)                    3189      2.7187  0.01%    0.0009    0.0005     0.0005     0.0050
H_XIRR((unknown) 41187cc)                   693      1.1146  0.00%    0.0016    0.0010     0.0010     0.0035
H_EOI((unknown) 41149b8)                    689      0.7688  0.00%    0.0011    0.0005     0.0005     0.0026
H_CPPR((unknown) 4112b08)                   689      0.3535  0.00%    0.0005    0.0003     0.0003     0.0046

Kernel trace curt report from client with 1.3 CPU units:

                                          Count  Total Time  % sys  Avg Time  Min Time  Min ETime  Max ETime
                                                     (msec)   time    (msec)    (msec)     (msec)     (msec)
========================================  =====  ==========  =====  ========  ========  =========  =========
H_SEND_LOGICAL_LAN((unknown) 41977e8)     27187    133.6836  4.39%    0.0049    0.0005     0.0020     0.0221
H_ADD_LOGICAL_LAN_BUFFER((unknown) 4191   13489     16.9196  0.56%    0.0013    0.0008     0.0008     0.0129
H_PROD((unknown) 6ffb8)                    2104      3.4490  0.11%    0.0016    0.0006     0.0006     0.0081
H_XIRR((unknown) 41187cc)                   502      0.7127  0.02%    0.0014    0.0009     0.0009     0.0026
H_EOI((unknown) 41149b8)                    501      0.5384  0.02%    0.0011    0.0007     0.0007     0.0157
H_CPPR((unknown) 4112b08)                   501      0.1983  0.01%    0.0004    0.0003     0.0003     0.0009
[Diagram: two mirrored systems. On each: LPAR -> Virt. Eth. (PVID 1) -> vSwitch -> SEA over an Etherchannel of two 10GbE SR ports -> 10 GbE network.]
Anatomy of seaproc

seaproc is a 64-bit, multithreaded kernel process.
Each active Shared Ethernet Adapter runs a dedicated seaproc instance.
seaproc needs CPU cycles for its bridging activity.
The efficiency of a particular Shared Ethernet Adapter depends on how well the corresponding seaproc threads can perform.

# ps -alk | grep seaproc
40303 A  0 3080304 1 0 37 -- 86c0bb190 1024 *  -   0:00 seaproc
40303 A 10 3801156 1 0 37 -- 87cc7f190 1024 *  -  22:47 seaproc
40303 A  0 3866764 1 0 37 -- 82c0cb190 1024 *  - 126:14 seaproc
ps -mo THREAD output for the busy seaproc instance:

 PPID       TID  ST  CP  PRI  SC  WCHAN             F      TT  BND  COMMAND
    1         -   A  71   37   7  *                 40303   -    -  seaproc
    -   5963991   S   0   37   1  f1000a001c3c1318  1400
    -   6160625   S   0   37   1  f1000a001bd00c78  1400
    -  15663143   S   0   37   1  f1000a001c060fc8  1400
    -  18153487   S   0   37   1  f1000a001beb0e20  1400
    -  18350131   R  36   37   1  -                 1000
    -  22413341   S   0   37   1  f1000a001c5714c0  1400
    -  24117287   R  35   37   1  -                 1400
[Chart: overall CPU utilization per component: Sending VIOS 0.74, Receiving VIOS 0.73, Server LPAR 0.95, Client LPAR 0.4 CPU units; total ~2.82.]

Numbers are dependent on the Power Systems model and hardware configuration.
Jumbo Frames

The term Jumbo Frame refers to a payload size of more than 1500 bytes and up to 9000 bytes encapsulated within one Ethernet frame.

Jumbo Frames can significantly reduce the CPU time spent on data forwarding.

Using Jumbo Frames has no effect on data packets carrying less than 1500 bytes of payload.

Jumbo Frames must be implemented on an end-to-end basis: networking equipment (physical and virtual) on all potential paths between sender and receiver must be configured for Jumbo Frames.

Internally on Power Systems (examples follow on the next slides):
Virtual Ethernet Adapters in client partitions
Shared Ethernet Adapters in VIO servers
Etherchannel devices
Physical network adapters

At data center level:
Access layer switches
Aggregation and core multilayer switches and routers
Security devices such as firewalls and intrusion detection systems

Layer 3 devices such as routers and firewalls can fragment Jumbo Frames into smaller MTU-sized data packets, but with an impact on performance (cf. Andrew S. Tanenbaum: Computer Networks).
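The CPU saving comes mainly from handling fewer frames per byte moved; the on-wire payload efficiency also rises. A rough sketch, assuming IPv4+TCP headers of 40 bytes plus standard Ethernet framing overhead (18 bytes header and FCS, 20 bytes preamble and inter-frame gap):

```shell
# Compare payload efficiency per frame at MTU 1500 vs. MTU 9000.
# Assumptions: 40 B IPv4+TCP headers inside the MTU, plus 18 B Ethernet
# header+FCS and 20 B preamble+inter-frame gap on the wire.
for mtu in 1500 9000; do
  awk -v mtu="$mtu" 'BEGIN {
    payload = mtu - 40           # TCP payload carried per frame
    wire    = mtu + 18 + 20     # bytes occupied on the wire per frame
    printf "MTU %4d: %.1f%% payload efficiency\n", mtu, 100 * payload / wire
  }'
done
# prints: MTU 1500: 94.9% ... MTU 9000: 99.1%
```

The bigger win is the frame count: at MTU 9000 the same data volume needs six times fewer frames, and therefore far fewer per-frame hypervisor calls and interrupts.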
Server LPAR:
AIX 7.1 TL1 SP 3, uncapped, weight 128, 4 VPs
Virtual Ethernet Adapter, MTU 9000

Virtual I/O Servers:
EC=2.0 units, uncapped, weight 255
PCIe2 2-port 10GbE SR Adapter (to be tuned for MTU 9000)

[Diagram: Power 720 8202-E4C: client LPAR -> Virt. Eth. (PVID 1) -> vSwitch1 -> SEA -> 10GbE SR -> 10 GbE network -> 10GbE SR -> SEA -> vSwitch2 -> Virt. Eth. (PVID 1) -> server LPAR.]

[Diagram: client and server LPAR on the same PHYP vSwitch, Virt. Eth. PVID 1 / VLAN 1.]

MTU 9000:
Server LPAR:
Power 770 9117-MMB (same as client)
AIX 6.1 TL6 SP 3
uncapped, EC=3.0 units, 4 VPs
Virtual Ethernet Adapter, MTU 9000

Significant throughput gain: scales by a factor of ~3.
Server LPAR:
Power 720 8202-E4C (same as client)
AIX 7.1 TL1 SP 3
EC=3.0 units, uncapped, 4 VPs
Virtual Ethernet Adapter, MTU 1500

[Diagram: uncapped client LPAR and capped server LPAR on one PHYP vSwitch; Virt. Eth. on PVID 1 / VLAN 1; traffic direction client -> server.]
Server LPAR:
Power 720 8202-E4C (same as client)
AIX 7.1 TL1 SP 3
EC=3.0 units, uncapped, 4 VPs
Virtual Ethernet Adapter, MTU 9000

[Chart: throughput (Gbps, 0 to 7) vs. CPU units (0 to 2), comparing "TP 8202-E4C MTU 9000" against "TP 8202-E4C MTU 1500".]
[Chart: throughput (Gb/s, 2.5 to 5.0) vs. CPU units (0.38 to 0.78), comparing the physc=0.45 and physc=0.60 runs.]
mpstat -s, physc=0.45:
-------------------------------------------------------------
 Proc0                                  Proc4
 44.96%                                  0.00%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 22.38%  11.16%  5.62%  5.80%   0.00%  0.00%  0.00%  0.00%
-------------------------------------------------------------

mpstat -s, physc=0.60:
-------------------------------------------------------------
 Proc0                                  Proc4
 47.04%                                 12.86%
 cpu0    cpu1    cpu2   cpu3    cpu4   cpu5   cpu6   cpu7
 23.23%  10.85%  6.42%  6.54%   4.75%  2.91%  2.69%  2.51%
-------------------------------------------------------------
Server LPAR:
Power 770 9117-MMD (same as client)
AIX 7.1 TL2 SP 2
EC=3.0 units, uncapped, 4 VPs
Virtual Ethernet Adapter, MTU 65390

[Diagram: uncapped client LPAR and capped server LPAR on one PHYP vSwitch; Virt. Eth. on PVID 1 / VLAN 1; traffic direction client -> server.]
[Diagram: Power 720 8202-E4C: capped client LPAR -> Virt. Eth. (PVID 1) -> vSwitch1 -> SEA -> 10GbE SR -> 10 GbE network -> 10GbE SR -> SEA (uncapped VIOS) -> vSwitch2 -> Virt. Eth. (PVID 1) -> server LPAR; adapters to be tuned.]

[Diagram: client LPAR -> Virt. Eth. (PVID 1 / VLAN 1) -> PHYP vSwitch -> SEA -> physical adapter -> switch.]
Benchmark with an effective throughput of 930 Mbit/s; overall CPU utilization:

[Chart: CPU units per component: Sending VIOS 0.74, Receiving VIOS 0.73, Server LPAR 0.95, Client LPAR 0.4; total ~2.82 CPU units.]

Results are dependent on the Power Systems model and hardware configuration.
Improved with appropriate tuning (with preemption = 0).
entstat output (excerpt):

Receive Statistics:
-------------------
Packets: 148017042933
Bytes: 141743288103445
Interrupts: 68921391370
Receive Errors: 0
Packets Dropped: 235745
Bad Packets: 0

Max Packets on S/W Transmit Queue: 321
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 1

Elapsed Time: 0 days 0 hours 0 minutes 0 seconds
Broadcast Packets: 107560097          Broadcast Packets: 215156995
Multicast Packets: 118240081          Multicast Packets: 252467976
No Carrier Sense: 0                   CRC Errors: 0
DMA Underrun: 0                       DMA Overrun: 0
Lost CTS Errors: 0                    Alignment Errors: 0
Max Collision Errors: 0               No Resource Errors: 235745
Late Collision Errors: 0              Receive Collision Errors: 0
Deferred: 0                           Packet Too Short Errors: 0
SQE Test: 0                           Packet Too Long Errors: 0
Timeout Errors: 0                     Packets Discarded by Adapter: 0
Single Collision Count: 0             Receiver Start Count: 0
[...]
Hypervisor Receive Failures: 235745

No Resource Errors can occur when the appropriate amount of memory cannot be added quickly enough to the virtual Ethernet buffer space. This has mainly two reasons: too much workload, or too little access to CPU time.
Receive Information
  Receive Buffers
    Buffer Type          Tiny   Small   Medium   Large   Huge
    Min Buffers           512     512      128      24     24
    Max Buffers          2048    2048      256      64     64
    Allocated             512     512      128      24     24
    Registered            512     512      128      24     24
    History
      Max Allocated       512    1750      128      24     24
      Lowest Registered   508     502      128      24     24

Tiny: max allocated = min buffers, < max buffers.
Small: max allocated > min buffers, < max buffers.
Receive Information
  Receive Buffers
    Buffer Type          Tiny   Small   Medium   Large   Huge
    Min Buffers           512     512      128      24     24
    Max Buffers          2048    2048      256      64     64
    Allocated             512     512      128      24     24
    Registered            512     512      128      24     24
    History
      Max Allocated       512     523      138      39     64
      Lowest Registered   509     502      123      19     18

Tiny: max allocated = min buffers.
Small, Medium, Large: max allocated > min buffers, < max buffers.
Huge: max allocated = max buffers.

[Diagram: VIOS with ent1 (VENT) and ent0 (phy) bridged by ent2 (SEA).]
chdev -l <VENT> -a max_buf_huge=128 -P
chdev -l <VENT> -a min_buf_huge=64 -P
chdev -l <VENT> -a max_buf_large=128 -P
chdev -l <VENT> -a min_buf_large=64 -P
chdev -l <VENT> -a max_buf_medium=512 -P
chdev -l <VENT> -a min_buf_medium=256 -P
chdev -l <VENT> -a max_buf_small=4096 -P
chdev -l <VENT> -a min_buf_small=2048 -P
chdev -l <VENT> -a max_buf_tiny=4096 -P
chdev -l <VENT> -a min_buf_tiny=2048 -P
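The same tuning can be generated with a loop over the buffer classes. A sketch: ent1 is a placeholder for the VIOS virtual Ethernet adapter name, and the echo makes it a dry run; -P defers the change until the device is next reconfigured:

```shell
# Dry-run generator for the buffer tuning above, with the same min/max
# values per buffer class. VENT=ent1 is a placeholder adapter name;
# remove the "echo" to actually change the adapter attributes.
VENT=ent1
for spec in tiny:2048:4096 small:2048:4096 medium:256:512 large:64:128 huge:64:128; do
  class=${spec%%:*}                 # buffer class name
  rest=${spec#*:}
  min=${rest%%:*}                   # min_buf_<class> value
  max=${rest##*:}                   # max_buf_<class> value
  echo chdev -l "$VENT" -a "min_buf_${class}=${min}" -a "max_buf_${class}=${max}" -P
done
```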
Receive Information
  Receive Buffers
    Buffer Type          Tiny   Small   Medium   Large   Huge
    Min Buffers          1024    1024      256      48     48
    Max Buffers          4096    4096      512     128    128
    Allocated            1024    1024      256      48     48
    Registered           1024    1024      256      48     48
    History
      Max Allocated      1024    1024      256      48     48
      Lowest Registered  1023    1024      256      48     48

After tuning, max allocated = min buffers (< max buffers) for every buffer class.
Flow control

# entstat -d ent5
PCIe2 2-port 10GbE SR Adapter (a21910071410d003) Specific Statistics:
--------------------------------------------------------------------
Link Status: Up
Media Speed Running: 10 Gbps Full Duplex
PCI Mode: PCI-Express X8
Relaxed Ordering: Disabled
TLP Size: 512
MRR Size: 4096
PCIe Link Speed: 5.0 Gbps
Firmware Operating Mode: Legacy
Jumbo Frames: Enabled
Transmit TCP segmentation offload: Enabled
Receive TCP segment aggregation: Enabled
Transmit and receive flow control status: Enabled
Number of XOFF packets transmitted: 0
Number of XON packets transmitted: 0
Number of XOFF packets received: 0
Number of XON packets received: 0
Latency

AIX localhost in an SPP (alignment: local send 8, remote recv 0; all offsets 0):

RoundTrip Latency   Trans Rate   Throughput (10^6 bits/s)
(usec/Tran)         (per sec)    Outbound   Inbound
30.769              32500.306    0.260      0.260
57.785              17305.403    0.138      0.138
185.065             5403.498     0.043      0.043
2930.229            341.270      0.003      0.003
[Charts: throughput (Gbit/s) vs. TCP buffer size, x-axis from 25000 to 262144 bytes; measured values range from ~5.61 to ~10.3 Gbit/s.]
Sending/receiving VIOS with dedicated donating CPUs (alignment: local send 8, remote recv 0; all offsets 0):

RoundTrip latency / transaction rate:
162.928 usec/Tran, 6137.665 per sec  ->  146.819 usec/Tran, 6811.107 per sec  (~11 % better transaction rate)
159.369 usec/Tran, 6274.746 per sec  ->  145.862 usec/Tran, 6855.812 per sec  (~9.2 % better latency)
[Summary matrix: tuning options (among them "#5: EC hashing") rated for throughput gain, latency gain, packet distribution, load distribution, and ease of implementation/risk on a HIGH/MED/GOOD/POOR scale, with callouts such as "~x3-4 throughput gain for incoming packets" and "sufficient ENTC" as a prerequisite for several options.]
[Diagram (recap): Power 750 8408-E8D: AIX LPAR 1 -> Virt. Eth. (PVID 10) -> vSwitch -> SEA over an Etherchannel of 10GbE SR ports (FC 5287) -> 10 GbE network -> AIX LPAR 2; ~3 Gbit/s.]
Fact is:
A 10 GbE adapter provides a physical line speed of up to 10 Gbit/s.
The network performance of an OS or an application depends on:
available CPU power for the application and the OS network stack
Maximum Transmission Unit size
distance between sender and receiver
offloading features
coalescing and aggregation features
TCP configuration

"I'll never get 10 Gig..."
THANK YOU!
VIELEN DANK!
Alexander Paul
paulalex@de.ibm.com
Enhanced Technical Support (ETS)
Meet you at
Enterprise2014