Abstract
This document provides a summary of useful Isilon OneFS
commands you can run to examine the performance metrics
available on an Isilon cluster.
Unless otherwise noted, the commands in this document apply to Isilon OneFS 8.0.x.
Best practices
Follow best-practice procedures during pre-deployment of an Isilon OneFS cluster to prevent issues from arising. Integrating changes after a system is deployed, to bring the system into best-practice compliance, can be operationally difficult. For more information, refer to the best practices for Isilon OneFS use cases.
Disk failures
To observe disk drives that are marked as smartfail, empty, stalled, or down, run the following command:
isi status
Workload
Protocol traffic and balance
You can identify the balance of protocol traffic within a workflow by running the following isi statistics
command and viewing the busiest protocols as returned by NumOps:
isi statistics protocol list --totalby Op,Proto --protocols all --output Proto,Ops --sort NumOps --long
Example output
NOTE: The Identify Registration Protocol (IRP) protocol runs across InfiniBand and should not be evaluated as
a client protocol.
Example output
The top section of the output shows the protocol command rates. The Read, Write, and Metadata operation rates together equal the Total. Using the isi statistics pstat output from the example, subtract the Read value (829 ops/s) and the Write value (333 ops/s) from the Total value (1256 ops/s), which leaves 94 ops/s for Metadata operations. In this example, the Read, Write, and Metadata ratio for this protocol is the following:
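The subtraction above can be checked with a short sketch; the operation counts come from the example, and the percentage split illustrates the ratio the text refers to.

```python
# Metadata ops are the remainder after subtracting Read and Write from Total.
# Values are taken from the isi statistics pstat example above.
total_ops = 1256
read_ops = 829
write_ops = 333

metadata_ops = total_ops - read_ops - write_ops
split = {name: round(100 * ops / total_ops, 1)
         for name, ops in (("Read", read_ops),
                           ("Write", write_ops),
                           ("Metadata", metadata_ops))}
print(metadata_ops)  # 94
print(split)         # {'Read': 66.0, 'Write': 26.5, 'Metadata': 7.5}
```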
Users by protocol
You can identify the top 20 users, and the external protocols they are using, by running the following
command:
isi statistics client --protocols external --no-footer | awk '{print $1 " " $6 " " $8}' | head -22
To establish if any users are dominant or out of balance with other users of the same protocol, observe the
Ops count for each user. To determine the difference between a busy user and a non-busy user, you can
increase the command output to observe at what point the user operation counts begin to decrease.
Isilon stores 1024 user name records for each 15-second window. A user name is displayed as UNKNOWN if it is not included in the 1024 records for the window in which the command was issued. An asterisk (*) is displayed in the UserName column for protocols that do not supply a user name.
The protocols are divided into external protocols and internal protocols, as shown in the following table.
External Internal
nfs4 smb2
isi statistics heat --nodes all --totalby path | awk '{print $1 " " $5}' | sort -n -r | head -20
To determine the difference between a busy file and a non-busy file, you can increase the command output to
observe at what point the file operation rates begin to decrease.
A path can be displayed as UNKNOWN when the path refers to one of the following:
• A system file
• A file with a path name that is too long
• A snapshot that no longer exists
• An unlinked file that is still referenced somewhere
Example 1 output displays entries for identical paths with different operation rates. Each entry for the same path is a different event (for example, a read, getattr, lookup, or other operation). Multiple instances of the same path aggregate to indicate the total operation rate for that path.
Run the following command to display all node operation rates and file paths.
isi statistics heat --nodes all --totalby path | awk '{print $1 " " $5}' | sort -n -r | head -20
Example 1 output
419.4 /ifs
357.0 /ifs
182.7 /ifs/.ifsvar
145.1 /ifs/.ifsvar
76.9 /ifs/Test
66.3 /ifs
48.5 /ifs/Test/3
46.4 UNKNOWN
42.5 /ifs/.ifsvar/modules
41.4 /ifs/.ifsvar/modules/tardis
34.1 /ifs/Test/3/vdb.1_12.dir/vdb.2_23.dir/vdb.3_9.dir
33.1 /ifs/Test/3/vdb.1_12.dir/vdb.2_18.dir/vdb.3_9.dir
31.5 /ifs/.ifsvar/modules
30.2 /ifs/Test/3/vdb.1_12.dir/vdb.2_23.dir/vdb.3_9.dir
29.8 /ifs/Test/3/vdb.1_12.dir/vdb.2_19.dir/vdb.3_9.dir
29.0 /ifs/Test/3/vdb.1_12.dir/vdb.2_13.dir/vdb.3_9.dir
28.6 /ifs/Test/3/vdb.1_12.dir/vdb.2_5.dir/vdb.3_9.dir
28.3 UNKNOWN
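Since each path can appear once per event type, the per-path totals the text describes can be obtained by summing the duplicates; a minimal sketch using a subset of the rows above:

```python
from collections import defaultdict

# Subset of the Example 1 output rows: (ops/s, path). Duplicate paths are
# separate events (read, getattr, lookup, and so on).
rows = [(419.4, "/ifs"), (357.0, "/ifs"), (182.7, "/ifs/.ifsvar"),
        (145.1, "/ifs/.ifsvar"), (66.3, "/ifs")]

totals = defaultdict(float)
for rate, path in rows:
    totals[path] += rate

print(round(totals["/ifs"], 1))          # 842.7
print(round(totals["/ifs/.ifsvar"], 1))  # 327.8
```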
To view the events that produce each path instance in Example 1 output, run the command to display node-by-node operation rates and file paths, and add the event column to the awk print statement. The following example runs on node 1.
Example 2
Run the following command to display one node (node 1), the operation rate, events, and file paths:
isi statistics heat --nodes 1 | awk '{print $1 " " $4 " " $5}' | sort -n -r | head -20
The output displays the events that are associated with each instance of duplicate paths.
Example 2 output
As an alternative, you can navigate to the /ifs/.ifsvar/modules/fsa/pub/latest directory. List the files within this directory and observe the size and date of the results.db file. Verify that the file has contents and note the date on which the last FSA job was run. If the file has contents, use the following query to obtain information about the 20 largest files in the following order: physical size in MB, logical size in MB, and the path.
Example output
15873.7 10563.6 data/pg/LS/reference/isaac2/iSAACIndex.20150312/Temp/neighbors.dat
15250.1 10148.6 data/pg/LS/reference/dbsnp_138.b37.vcf
10036 6678.19 SPEC/iobw.tst
5940.46 3953.25 data/pg/LS/src/parallel_studio_xe_2015_update2.tgz
4585.88 3051.67 data/pg/LS/reference/hg19.fa
4495.92 2991.83 data/pg/LS/reference/hg19.fa.bwt
2497.35 1661.91 data/pg/LS/reference/isaac2/iSAACIndex.20150312/hg19.fa-32mer-6bit
2497.35 1661.91 data/pg/LS/reference/isaac2/iSAACIndex.20150312/Temp/hg19.fa-32mer
To view the smallest 20 files from the 1,000 recorded files, change the desc parameter to asc, potentially
providing an indication of the file size distribution. The ratio of the physical to logical size can be used to gain
an indication of storage efficiency.
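As a worked example of that efficiency ratio, using the first row of the largest-files output above:

```python
# Physical vs. logical size (MB) from the first row of the example output.
phys_mb, logical_mb = 15873.7, 10563.6

# Roughly 1.5x as much physical space as logical data for this file.
overhead = phys_mb / logical_mb
print(round(overhead, 2))  # 1.5
```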
Example output
phys_size sum(file_cnt)
---------- -------------
0 3536
8192 599126
131072 14144
1048576 240301
10485760 63276
104857600 1177
1073741824 147
1073741824 10
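The histogram rows above can be summed to give the total number of files represented; a short sketch with the rows copied from the example output:

```python
# (phys_size bucket, file count) pairs from the example output.
rows = [(0, 3536), (8192, 599126), (131072, 14144), (1048576, 240301),
        (10485760, 63276), (104857600, 1177), (1073741824, 147),
        (1073741824, 10)]

total_files = sum(count for _, count in rows)
print(total_files)  # 921717
```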
OneFS 8.0.x records the FSA results into multiple databases for parallel access. The list of the 1,000 largest
files still exists. However, in OneFS 8.0.x, the list is in a separate database outside of the results.db file.
Navigate to the /ifs/.ifsvar/modules/fsa/pub/latest directory. List the files within this directory, and observe the size and date of the list_top_n_files_by_phys_size.db file. Verify that the file has contents and note the date on which the last FSA job was run. If the file exists, run the following query to obtain information about the 20 largest files by physical size in MB, logical size in MB, and the path:
Example output
You must identify the most recently completed FSA job and then query it for file size results.
With OneFS 8.0.x, you can use the Isilon OneFS Application Programming Interface (API) to interrogate the FSA database by including the ID number of the completed FSA job.
Example output
"begin_time" : 1463695240,
"content_path" : "/ifs/.ifsvar/modules/fsa/pub/job.378/results.db",
"delete_link" : "https://localhost:8080/platform/3/fsa/results",
"end_time" : 1463702519,
"fsa_state" : "publish",
"id" : 378,
"job_state" : [ "9", "STATE_FINISHED" ],
…
"version" : 3
Note the number for the top-most entry with "STATE_FINISHED." In this example the ID number is 378.
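Programmatically, the most recent finished job can be picked out of the API response; a sketch assuming the jobs are returned in a top-level list (the "results" wrapper key here is an assumption, not taken from the example):

```python
import json

# Hypothetical excerpt of the FSA job listing; the per-job field names
# follow the example output above, but the "results" key is assumed.
payload = json.loads("""
{"results": [
  {"id": 378, "job_state": ["9", "STATE_FINISHED"], "end_time": 1463702519},
  {"id": 371, "job_state": ["9", "STATE_FINISHED"], "end_time": 1463602519}
]}
""")

finished = [job for job in payload["results"]
            if "STATE_FINISHED" in job["job_state"]]
latest = max(finished, key=lambda job: job["end_time"])
print(latest["id"])  # 378
```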
To obtain a count of the files occurring within the 11 predefined file size buckets, insert the cluster root password and the ID number of the most recently completed FSA job into the following command.
1.5 GB   "value" : 54
60 MB    "value" : 405317
30 MB    "value" : 314076
15 MB    "value" : 35231545
         "value" : 47
7.5 MB   "value" : 57
5 MB     "value" : 41
         "value" : 16
2.5 MB   "value" : 2
.5 MB    "value" : 0
85 KB    "value" : 0
4 KB
> 4 KB
The misalignment is at the file level, not at the file system level, and results from a storage abstraction layer (such as virtual machine storage) not matching the OneFS storage blocking. The cost of each misaligned write request depends on many variables and causes additional I/O load, ranging from 10% to 20%.
Run the following command and record the misaligned write request counts per node. Wait 30 seconds and
then run the command again.
Example output
Calculate the difference between the misaligned write request counts from each command. Divide this
difference by the time between samples. The result is the rate-per-second of misaligned write requests.
The rate of the misaligned write requests must be weighed against the rate of write requests for the protocol
that is servicing the abstraction layer (probably VM) to see if an additional 10% to 20% I/O load is deemed
significant. You can run the following command to see write requests for a specific protocol.
For example, if the protocol write request rate is 200 writes per second, and the misaligned write requests are
30 per second, the overhead of misalignment might be significant and causing an impact. Conversely, if the
protocol write request rate is 200 writes per second, and the misaligned write request rate is 2 per second,
this is not a significant performance factor.
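The rate calculation described above can be sketched as follows; the two counter samples are hypothetical, and the protocol write rate is the 200 writes-per-second figure from the example:

```python
# Two samples of the misaligned write counter, 30 seconds apart
# (hypothetical values; substitute your own counter readings).
count_first, count_second = 1200, 2100
interval_s = 30

misaligned_per_s = (count_second - count_first) / interval_s
print(misaligned_per_s)  # 30.0

# Weigh against the protocol write rate (200 writes/s in the example).
protocol_writes_per_s = 200
overhead_pct = 100 * misaligned_per_s / protocol_writes_per_s
print(overhead_pct)  # 15.0 -- in the 10-20% range, possibly significant
```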
Blocked and Contended events tend to be correlated: the new lock requester is blocked, and the current lock holder receives the contended callback. Deadlock events are very different; they have no timeout and should be infrequent.
To obtain information on 50 recent lock events, you can run the following command:
Disk activity
You can capture an overview profile of disk drive activity by running the following command, and examining
the Max, Min, and Average values for the disk time in queue and the number in queue (queue depth). For
information on SAS drives, you can include SAS instead of SATA.
Time in Queue
isi statistics drive --nodes=all --degraded --no-header --no-footer | awk '/SATA/
{sum+=$8; n++; if (n==1 || $8<min) min=$8; if ($8>max) max=$8} END {print
"Min = ",min; print "Max = ",max; print "Average = ",sum/n}'
Number in Queue
isi statistics drive --nodes=all --degraded --no-header --no-footer | awk '/SATA/
{sum+=$9; n++; if (n==1 || $9<min) min=$9; if ($9>max) max=$9} END {print
"Min = ",min; print "Max = ",max; print "Average = ",sum/n}'
Network factors
A baseline of output values is essential prior to an in-depth performance investigation. Comparing the metrics
from a performance-acceptable timeframe can help you make decisions about network factors that might be
adversely affecting performance.
NOTE: Unless otherwise stated, you must determine the significance of network command output values
based on a known baseline.
The quality of a network connection to the cluster can be defined by Hop Count, Latency (Response Time or
Round Trip Time), Jitter (which is a variation in Latency), and Packet Loss. Maximum Transmission Unit (MTU)
and bandwidth are also important factors in assessing network health.
To display route and transit delays for packets over IP, run the traceroute command.
Example output
In the example, only one line of output exists, which means no hops were encountered. As requested, the
latency values of 5 connection attempts are displayed.
A successful run of this command indicates an MTU of 9000. A result of "ping: sendto: Message too long" indicates that the MTU is 1500.
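A 9000-byte MTU is commonly probed with a ping payload of 8972 bytes, because the IPv4 and ICMP headers consume 28 bytes of each frame's MTU; the arithmetic:

```python
# Payload size that exactly fills a 9000-byte MTU frame.
MTU = 9000
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8

payload = MTU - IPV4_HEADER - ICMP_HEADER
print(payload)  # 8972
```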
Bandwidth
To measure bandwidth, run the following iperf command. You must set up the target first.
Results are available on either the source or target. Close iperf on the target using Ctrl-C.
Example output
------------------------------------------------------------
Client connecting to 10.245.108.26, TCP port 5001
TCP window size: 19.3 KByte (default)
------------------------------------------------------------
[ 3] local 10.245.108.81 port 60141 connected with 10.245.108.26 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.10 GBytes 942 Mbits/sec
Results are available on either the source or target node. Close iperf on the target using Ctrl-C.
Example output
------------------------------------------------------------
Client connecting to 10.245.108.27, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 9.00 KByte (default)
------------------------------------------------------------
[ 3] local 10.245.108.26 port 44598 connected with 10.245.108.27 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.25 MBytes 1.05 Mbits/sec
[ 3] Sent 893 datagrams
[ 3] Server Report:
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-10.0 sec 1.25 MBytes 1.05 Mbits/sec 0.003 ms 0/ 893 (0%)
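The UDP report above is internally consistent, which can be checked with a short sketch (iperf counts MBytes as 2^20 bytes and Mbits as 10^6 bits):

```python
# Transfer and loss figures from the UDP example output.
transferred_mbytes = 1.25
interval_s = 10.0
lost, total = 0, 893

# 1.25 MBytes over 10 s is about 1.05 Mbit/s.
bits = transferred_mbytes * 2**20 * 8
mbits_per_s = bits / 1e6 / interval_s
print(round(mbits_per_s, 2))  # 1.05

loss_pct = 100 * lost / total
print(loss_pct)  # 0.0
```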
Fast Ethernet (100BASE-X)         100 Mbit/s   12.5 MB/s
Gigabit Ethernet (1000BASE-X)     1 Gbit/s     125 MB/s
10 Gigabit Ethernet (10GBASE-X)   10 Gbit/s    1.25 GB/s
40 Gigabit Ethernet (40GBASE-X)   40 Gbit/s    5 GB/s
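The MB/s column follows from dividing the line rate in bits by 8 bits per byte:

```python
# Line rate (Mbit/s) -> theoretical maximum throughput (MB/s).
rates_mbit = {"Fast Ethernet": 100, "Gigabit Ethernet": 1000,
              "10 Gigabit Ethernet": 10000, "40 Gigabit Ethernet": 40000}

mb_per_s = {name: mbit / 8 for name, mbit in rates_mbit.items()}
print(mb_per_s["Fast Ethernet"])        # 12.5
print(mb_per_s["40 Gigabit Ethernet"])  # 5000.0 (5 GB/s)
```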
Example output
Interpret the retransmission rate as a percentage of the total transmission. A retransmission rate below 0.1% (one tenth of one percent) of total transmitted bytes is acceptable for a local network. From the example output:
Since this retransmission rate is below 0.1%, the retransmission can be interpreted as not significant and not an issue.
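That percentage check can be sketched as follows; the counter values are hypothetical stand-ins for the figures from your own netstat output:

```python
# Hypothetical retransmission counters; substitute netstat values.
retransmitted = 4000
total_transmitted = 9_000_000

retrans_pct = 100 * retransmitted / total_transmitted
print(round(retrans_pct, 3))  # 0.044
print(retrans_pct < 0.1)      # True: acceptable for a local network
```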
Hostcache.list
The hostcache.list holds a cache of the recent host connection information allowing faster re-connection of
returning hosts. You can examine the IP address, Round Trip Time (RTT), and Variance in RTT (RTTVAR) to verify
whether the round trip time in the hostcache.list matches expectations for the IP address and its physical
location. High RTT or RTTVAR values compared to a known baseline can indicate a problem on that client or
subnet. To view the hostcache.list, run the following command.
Example output
net.inet.tcp.hostcache.list:
IP address MTU SSTRESH RTT RTTVAR BANDWIDTH CWND SENDPIPE RECVPIPE HITS UPD EXP
net.inet.tcp.hostcache.list:
IP address MTU SSTRESH RTT RTTVAR BANDWIDTH CWND SENDPIPE RECVPIPE HITS UPD EXP
192.168.70.158 0 0 18ms 32ms 0 32727 0 0 206290 3913 3600
10.245.108.28 0 0 18ms 31ms 0 32727 0 0 12838 3668 3300
127.0.0.1 0 0 4ms 8ms 0 39696 0 0 195836 54048 3600
net.inet.tcp.hostcache.list:
IP address MTU SSTRESH RTT RTTVAR BANDWIDTH CWND SENDPIPE RECVPIPE HITS UPD EXP
Notice: Do not use the clear, set, or interactive mode commands without instructions or knowledge
of the potential implications.
To observe response times (µs) for cluster DNS servers, run the following command. In the output, note large
variations in response times (or other metrics) between DNS servers.
Example output
To observe distributed cache response times (µs) from peer nodes in the cluster, run the following command.
In the output, note large variations in response times (or other metrics) between nodes.
Example output
To observe cache effectiveness (workload dependent), run the following command. In the output, observe the
hit rate, where above 70% is deemed effective use of cache. Check the count of expired entries, where fewer
entries shows that the workload is cache-friendly.
Example output
Cache:
entries: 10 - entries installed in the cache
max_entries: 33 - entries allocated, including for I/O and free list
expired: 12251 - entries that reached their TTL and were removed from the cache
probes: 56946 - count of attempts to match an entry in the cache
hits: 40879 (71%) - count of times that a match was found
updates: 3 - entries in the cache replaced with a new reply
response_time: 0.000004 - average turnaround time for cache hits
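The 71% hit rate in the output above is simply hits divided by probes, truncated to a whole percent:

```python
# Counters from the example cache statistics.
probes, hits = 56946, 40879

hit_rate_pct = 100 * hits / probes
print(int(hit_rate_pct))  # 71
print(hit_rate_pct > 70)  # True: effective use of cache per the guideline
```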
Protocol operations
Protocol operations most used
You can find which protocol operation is used most by running the following command.
Connection distribution
You can observe the balanced connection distribution across nodes of the cluster by running the following
command.
Slow authentication
You can detect slow or timed-out responses from Windows domain controllers by running the following
command.
Protocol latency
WARNING: In OneFS versions earlier than OneFS 8.x, SMB protocol latency (TimeAvg) numbers can be skewed
by Change Notify operations. Requests for change notifications receive an initial response but also a response
when a file changes in the folder that is being monitored. The response might be immediate, after 3 seconds,
after 3 hours, or never. To verify whether Change Notify is inflating latency numbers, report by class. If the majority of time is spent in file_state, Change Notify is skewing the numbers. This issue is resolved in OneFS 8.0.x and later.
The following table outlines the common expectations about protocol latency times.
< 10 ms   Good
> 20 ms   Bad
> 50 ms   Investigate
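The expectations in this table can be encoded as a small helper; TimeAvg values from isi statistics are reported in microseconds, and the 10–20 ms gap, which the table does not classify, is labeled borderline here as an assumption:

```python
def classify_latency(time_avg_us: float) -> str:
    """Map a protocol TimeAvg (microseconds) to the table's categories."""
    ms = time_avg_us / 1000.0
    if ms < 10:
        return "Good"
    if ms > 50:
        return "Investigate"
    if ms > 20:
        return "Bad"
    return "Borderline"  # 10-20 ms: not classified by the table

print(classify_latency(4200))   # Good (4.2 ms)
print(classify_latency(35000))  # Bad (35 ms)
print(classify_latency(65000))  # Investigate (65 ms)
```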
To observe which class of operation is taking the longest, run the following command. The output TimeAvg
must be converted to milliseconds (ms) if you are comparing to the standard expectations in the table. The
output of this command is only meaningful with active traffic.
Example output
To observe the free capacity on the cluster and then the storage pool, run the following commands.
CPU
To observe the 10 processes using the most CPU on each node, run the following command.
Memory
To observe the status of memory for each node, examine the output of the following commands.
To observe the consumption of memory by process, run the following command. Use the up and down controls in less to read the list before closing less.
isi_for_array -s 'uptime'
Example output
Node node.disk.xfers.rate.sum
---------------------------------
1 5.400000
2 7.000000
3 0.200000
average 4.200000
---------------------------------
Total: 4
To check the status of the nodes, run the following command and observe the Out column of Throughput to
determine the load of the nodes. Assess the throughput balance across the nodes in relation to the IP
connections.
isi status
Utilization/Busy     Average of time the disk was busy over the sample interval
Disk Percent Busy    Average of time the disk was busy over an interval, expressed as a percentage
Service Time         Time from the device controller request to the end of transfer, including delays due to queuing and latency
Response Time        Disk service time plus all other delays, such as network, until data is at the host
Throughput           Average amount of data transferred within a period of time (for example, MB/s)
Time in Queue        Average time requests waited in queue to be processed by the disk
OneFS reports metrics in a standard way. However, the OneFS I/O per second (IOPS) measurement is taken at the point where the file system dispatches requests to the device queue. I/O requests can be coalesced while in the device queue, and, for SAS drives, multiple drive operations can be executed in parallel. Usually, the end result is that the drive sees far fewer operations than are fed into the top of the device queue. OneFS, however, reports from the top of the device queue, which can result in IOPS numbers that are higher than those given by block devices and higher than expected. Take this fact into account when commenting on OneFS IOPS measurements.
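The coalescing effect can be illustrated with a toy model: adjacent or overlapping block requests in the device queue merge into a single drive operation, so the drive-level operation count is lower than the queue-level count that OneFS reports. This is an illustration only, not the OneFS implementation.

```python
def coalesce(requests):
    """Merge queued (start_block, length) requests whose ranges touch."""
    merged = []
    for start, length in sorted(requests):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            new_end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, length))
    return merged

# Five requests enter the device queue; two runs are sequential.
queued = [(0, 8), (8, 8), (16, 8), (100, 8), (108, 8)]
drive_ops = coalesce(queued)
print(len(queued), "queued ->", len(drive_ops), "drive ops")  # 5 queued -> 2 drive ops
```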
The following IOPS measurements are typical for each drive type.
To obtain the maximum, minimum, and average values for disk time in queue for SATA drives, run the
following command. For information on SAS drives, you can include SAS instead of SATA.
isi statistics drive --nodes=all --degraded --no-header --no-footer | awk '/SATA/
{sum+=$8; n++; if (n==1 || $8<min) min=$8; if ($8>max) max=$8} END {print
"Min = ",min; print "Max = ",max; print "Average = ",sum/n}'
To display time in queue for 30 drives sorted highest-to-lowest, run the following command:
To obtain the maximum, minimum, and average values for disk queue depth of SATA drives, run the following
command. For information on SAS drives, you can include SAS instead of SATA. If a large difference exists
between the maximum number and average number in the queue, conduct further investigation to see if an
individual drive is working excessively.
isi statistics drive --nodes=all --degraded --no-header --no-footer | awk '/SATA/
{sum+=$9; n++; if (n==1 || $9<min) min=$9; if ($9>max) max=$9} END {print
"Min = ",min; print "Max = ",max; print "Average = ",sum/n}'
To display queue depth for 30 drives sorted highest-to-lowest, run the following command:
isi statistics drive --nodes=all --degraded --no-header --no-footer | awk '/SATA/
{sum+=$10; n++; if (n==1 || $10<min) min=$10; if ($10>max) max=$10} END {print
"Min = ",min; print "Max = ",max; print "Average = ",sum/n}'
To display disk percent busy for 30 drives sorted highest-to-lowest, run the following command.
How to use iperf between a client and server to measure basic network throughput