
BEST PRACTICES GUIDE FOR IMPLEMENTING EMC ISILON IN A VIDEO SURVEILLANCE SOLUTION
ABSTRACT
This Guide provides technical information and recommendations to consider when
planning, designing, and implementing a surveillance solution with EMC Isilon scale-out
storage. It includes best practices for leveraging integration functionality between
the surveillance applications and the Isilon OneFS operating system and general
recommendations for optimizing performance and availability.
August 2014
EMC WHITE PAPER

Copyright 2014 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a
particular purpose.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
Part Number H13343
TABLE OF CONTENTS
ABSTRACT
EXECUTIVE SUMMARY
AUDIENCE
VALIDATED SOFTWARE AND TOPOLOGIES
TIER 1 VS TIER 2 NAS STORAGE
EVIDENCE AND VIDEO STORAGE
SERVER-TO-NODE RATIOS AND SIZING
HIGH AVAILABILITY DESIGNS IN SURVEILLANCE
Isilon Failure Scenarios
ISILON CLUSTER CONFIGURATIONS
SmartQuotas
Impact Policies
Networking
SmartConnect Settings
SyncIQ
Security Integration
REFERENCES
EXECUTIVE SUMMARY
This Best Practices Guide provides technical information to consider when planning and implementing EMC Isilon scale-out NAS
storage with version 7.0 or higher of the Isilon OneFS operating system to support a Surveillance system infrastructure. Its
fundamental intent is to help ensure that the Isilon storage configuration implemented will meet the overall capacity, availability, and
performance requirements of the surveillance workloads to be supported.
This Guide is intended to serve as a design and configuration guide for EMC Isilon and the associated video management system
(VMS) for EMC technical consultants and solutions architects. Additionally, this guide can assist those who wish to use the Isilon
Cluster Sizing tool (Web based tool available to EMC partners and employees) to create a properly sized and configured Isilon
storage cluster. While this EMC utility can be leveraged to compile a bill of materials for selling and building an Isilon storage cluster,
gathering storage capacity and performance requirements is still a necessary precursor to using the tool.
This Guide will assist in ensuring that the expected impacts of hosting a surveillance infrastructure on an Isilon storage cluster are
adequately identified, documented, and accounted for in the overall solution design. The Guide may also be of use in gathering and
assessing the performance requirements for other workloads that will run on the cluster, but the principles and concepts outlined
here will focus primarily on the surveillance system aspects of performance and capacity planning.
AUDIENCE
The Guide is written for system designers and administrators who are familiar with video surveillance technology, server technology,
and network storage administration. It is intended for use by EMC and EMC partner resources that have a working knowledge of
the following:
• Consolidated storage technologies, including Network-Attached Storage (NAS) and Storage Area Network (SAN) protocols and
  considerations
• EMC Isilon scale-out storage and the Isilon OneFS operating system
• Video management systems and video surveillance sensors, such as IP cameras
• Spreadsheet applications as appropriate for performance and capacity analysis, including data imports, exports, conversions,
  and creation of performance and capacity charts using those raw data sets
VALIDATED SOFTWARE AND TOPOLOGIES

The typical video surveillance deployment consists of multiple components:
• IP cameras/encoders that create the streaming video over the network
• Video management system (VMS) that is designed to receive, record, and distribute the video
• Viewers running in browsers, thick clients, and mobile operating systems
• Compute, storage, and network infrastructure to allow for processing, storing, and transporting the video and related data
One of the key attributes of these systems is the need to store the video data, as illustrated in Figure 1.
Figure 1: Surveillance system architecture
Standard NAS protocols such as SMB and NFS are the primary interface from the video management systems to EMC Isilon storage.
The use of NAS for these applications has many benefits (see footnote 1). The integration between the surveillance system
capabilities and the infrastructure is enabled through the VMS vendors. The limitation in most implementations is the ability of the
VMS software to handle the workload of ingesting video streams from disparate cameras, normalizing them, and then writing the
resulting files to storage, all while distributing the real-time and recorded video to clients across the networks.
Most VMS vendors support NAS protocols in some way, and EMC Labs works with the majority of these vendors to validate the
integration functionality and performance limitations. This information is captured in Technical Notes for each VMS vendor, which
are available on the Inside EMC site by searching for the VMS vendor name (Genetec, Verint, Milestone, Aimetis, Surveillus, or
DVTel).
EMC maintains a list of software validated by EMC Labs in the EMC Isilon Third Party Compatibility Guide. Figure 2 below is a
June 2014 snapshot of validated VMS vendors and versions. The document is updated regularly as VMS vendors add NAS to their
capabilities or as new EMC platforms and software versions become available. Five additional VMS products are in test at
this time (ISS, NICE, Axxonsoft, Digifort, and Cisco). Check this online document monthly for updated information:
https://support.emc.com/docu45932_Isilon-Third-Party-Software-and-Hardware-Compatibility-Guide.pdf?language=en_US
Footnote 1: http://www.emc.com/collateral/white-papers/h12546-wp-video-surveillance.pdf
Figure 2. Supported surveillance software partners (6/2014)
Validation is not simply basic integration (which most VMS applications can accommodate via NAS), but rather the ability to perform
to the minimum standards typically specified by the VMS vendor. The primary limitation for most vendors is the VMS I/O pipeline and
whether its implementation in the software can sustain the minimum throughput. How this information is used is covered in
the sizing section later in this document.
For VMS vendors not on this list, be wary and highly conservative when sizing and designing the system. For sizing, it is generally
more important to confirm that the VMS vendor supports NAS, and for which tier of storage (tier 1 or tier 2). In video
surveillance systems, the supported tier is important for sizing and for understanding the overall software architecture.
TIER 1 VS TIER 2 NAS STORAGE

The tier 1 storage approach, as defined for this white paper, is where the storage is specified in the VMS configuration; it
usually denotes that the software is streaming writes to the storage device in real time or near-real time. In some
software, the writes from the VMS are still performed on the local disk of the operating system and subsequently
transferred to the allocated tier 1 storage location after some minimal amount of time (1-15 minutes). An example of how
tier 1 storage is specified is shown below, as would be done in Genetec Security Center 5.2:
Figure 3. Screenshot of Genetec tier 1 configuration
The use of NAS as a tier 1 target allows simple calculations of aggregate bandwidth and capacity for the system, since bandwidth is
equal to the server ingest bandwidth (streaming video bandwidth) and capacity is determined by percent recorded and retention
time. Using multiple tier 1 targets simultaneously is possible and is typically referred to as dual writes from the VMS. Dual writes
from the VMS have a significant effect on the I/O pipeline and generally reduce the throughput per server by 50%.
Some customers still use this mechanism to accommodate two Isilon clusters in a redundant fashion.
The tier 2 storage approach for the VMS requires multiple storage locations so that the video files can transition from one location
to the other. Currently, the only vendor that does this and that EMC Labs has tested is Milestone XProtect Corporate 2013. Other
VMS vendors support NAS as the tier 2 target but typically support only block storage as the tier 1 target (Exacq Vision or
Avigilon ACC, for instance). For some, tier 1 storage holds only 2 hours of video and tier 2 storage holds 30 days or
more. This depends on the VMS vendor, as do the migration policy and implementation: migration may occur at
configurable intervals for video data that has aged beyond the tier 1 policy, or all video data may simply be migrated at that
interval. In the example illustrated below, a recorder uses the E drive as the primary tier, where video is initially recorded for
2 hours and then moved to a tier 2 target defined as a NAS share (\\video_archive).
What is consistent when these tier 2 targets are used is a hit on the I/O pipeline of the VMS servers, associated with the need to
read, process, and then write the data from the tier 1 to the tier 2 target. This decreases throughput on the VMS
servers by an amount that depends on the implementation. The tier 1 target has to handle the brunt of the workload because it is
written to and read from simultaneously, whereas the tier 2 target receives a bulk transfer from the tier 1 location. Below
is an example screenshot of the storage location configuration in Milestone XProtect.
Figure 4. Screenshot of Milestone tier 2 configuration
Using Isilon as a tier 1 target provides fairly uniform throughput over time from the VMS servers; using Isilon as the tier 2 target
manifests as a step function every period (as defined in the software: 15 minutes, 1 hour, 2 hours, etc.).
Using Isilon as tier 1 or tier 2 will affect the node type and sizing of the cluster. One of the primary differences is that a VMS
implementation using NAS as the tier 2 target is better able to absorb latency spikes in the Isilon cluster, which results in
less stringent constraints on performance during steady state and failure mode scenarios. This is similar to archive
workloads and is why Isilon NL400 nodes (the lower-end Isilon node platform) can be used on most occasions in the tier 2 scenario.
The main limitation in using Isilon as a tier 2 target typically lies in the tier 1 storage used by the VMS,
which is normally block storage because of VMS vendor limitations on the tier 1 target. IOPS storms are a common occurrence on
these tier 1 targets during the copy phases associated with moving the data. As a result, it is usually best practice to use 15K
or 10K RPM disks for the block storage supporting the tier 1 target (or refer to VMS vendor best practice documentation for these
scenarios).
For Isilon as a tier 1 target, the sensitivity of the VMS implementation to NAS protocol latencies is commonly due to the limited
buffer sizes per thread for each stream. This is why it is typical to use Isilon X400/X410 platforms, which have more robust CPU and
memory than the alternative NL400 nodes. This is especially true where the server-to-node ratio is greater than
1:1: for a ratio of X:1, the number of camera threads writing to an Isilon node is X times that of the 1:1 scenario. Actual sizing
for these systems is more complex, but this is a good general rule of thumb for selecting node type based on the tier. The
server-to-node ratio is another very relevant variable and is discussed in the sizing section.
If Isilon as a tier 1 target is not performing properly due to VMS, network, or Isilon misconfiguration, the result is generally buffer
overflows and lost frames on the VMS servers. Every VMS vendor reports this differently in its log files (buffer overflows, file
process failed, etc.), but the end result is that the cameras do not record for these periods of time. Where Isilon is used
as the tier 2 target, the usual outcome is that the archive phase within the VMS cannot keep up with the tier 1 ingest
throughput. This can eventually overload the tier 1 target and cause unplanned deletions of video. Proper design
of either approach is very important to ensuring an optimal surveillance system deployment.
Implementation Note: EMC InsightIQ can provide very valuable information about the health of the Isilon
system. For tier 2 deployments, the traffic between the VMS server and Isilon node should show a step
function going from 0-X Mbps. The key is that the step function gets to zero. If not, the VMS server is
not able to offload the data fast enough and will eventually overload the tier 1 target.
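The check described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the throughput samples are assumed to have been exported by hand into a list (no InsightIQ API is used or implied), and the function name, the samples-per-period value, and the idle threshold are all hypothetical.

```python
from typing import List, Sequence


def archive_windows_return_to_zero(
    samples_mbps: Sequence[float],
    samples_per_period: int,
    idle_threshold_mbps: float = 1.0,
) -> List[bool]:
    """For each archive period, report whether throughput fell back to idle.

    samples_mbps is a series of VMS-server-to-node throughput readings;
    samples_per_period is how many readings make up one archive interval.
    Both are assumptions about how the data was collected.
    """
    results = []
    for start in range(0, len(samples_mbps), samples_per_period):
        window = samples_mbps[start:start + samples_per_period]
        # A healthy tier 2 transfer finishes early, so the window contains
        # at least one reading at (or near) zero.
        results.append(min(window) <= idle_threshold_mbps)
    return results


# Hypothetical readings in Mbps, four samples per archive period.
healthy = [400, 400, 0, 0, 400, 400, 0, 0]
saturated = [400, 400, 400, 380, 400, 400, 400, 390]
print(archive_windows_return_to_zero(healthy, 4))    # [True, True]
print(archive_windows_return_to_zero(saturated, 4))  # [False, False]
```

A period that never reaches the idle threshold suggests the VMS server is still copying when the next archive interval begins.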
EVIDENCE AND VIDEO STORAGE

In video surveillance applications, VMS clients on desktops are used to review video and allow a user to create an export. An
export is a file created by the VMS client that can be distributed to client systems not running the VMS client software. An example
is a legal department that needs to examine the validity of a claim but is running Windows 7 without the specialized VMS
software. When this user is sent a standard video format viewable on Windows, such as .wmv or .mp4, the individual
can view it as necessary using native Windows video viewing applications. These exports are also commonly called evidence, or
potential evidence, for use in legal proceedings or collaboration among different organizations.
With Isilon as the video repository at \\video_archive, the same system can also have a share referred to as
\\evidence_archive. The evidence archive would be reachable by all distributed security personnel on the network and would provide a
centralized and secure location in which the VMS clients create exports.
Every site with a monitor (VMS client system) can export to the central evidence archive as shown below. High
volumes of these transactions are uncommon, but each transaction typically transfers a large file of between 5 MB and 5 GB. Most
users have this directory mapped as a network drive on their Microsoft Windows machine upon startup, and there are usually fewer
than 10 of these users on a system at any given time.
The workload on EMC Isilon for this use case is minimal, as are the optimizations needed to accommodate it. For scenarios where 250
users (security personnel) are active at one time, the performance of the share depends on the version of Windows in use (due to the
directory notifications from so many concurrent clients in certain versions) and on other shared directory limits. Refer to EMC
Isilon Home Directory Storage Solutions for NFS and SMB Environments. This scenario is uncommon but possible within large citywide
surveillance systems, where an event (concert, marathon, etc.) may require many organizations to access video at one time as part of
their collaboration efforts. In these cases it is beneficial to allocate shares per region or monitor to avoid suboptimal performance
(\\evidence_archive\zone1, \\evidence_archive\zone2, etc.).
Implementation Note: Use a separate SmartConnect zone and associated interfaces for the
evidence repository, sized based on the number of clients concurrently writing to and reading from this
directory.
Figure 5. Surveillance system architecture with evidence archives
SERVER-TO-NODE RATIOS AND SIZING

Sizing of the system can be complex, but if the VMS is validated by EMC Labs, the sizing should use the achievable throughput per
server (recorder) per Isilon platform type as the primary guiding factor. The sizing data is available to EMC partners and employees
on Inside EMC as specified in earlier sections: search for the VMS vendor (Genetec, Verint, Milestone, Aimetis, Surveillus, or
DVTel). New VMS products and new subversions of OneFS are added regularly.
To ensure that published bandwidths are achievable when sizing an Isilon cluster for a surveillance system, avoid deviating
significantly from the configurations published in the technical notes. EMC Labs attempts to reproduce realistic
environments for compute, network, and hypervisor components, but some VMS settings can affect the performance of
the full system. The following attributes are commonly part of a production surveillance system configuration and are what is tested
and documented in the technical notes.
SYSTEM ATTRIBUTE: Number and Bit Rate of Video Streams
DESCRIPTION: Resolution, frame rate, and codec will dictate the average bandwidth of the video feed, coupled with the complexity of
the scene. There is usually an upper bound on the number of video streams per server (usually below 200).
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Testing of a VMS uses between 64-150 video streams per server, as
specified in the Technical Notes, with video streams of between 1-6 Mbps, correlating to SD (4CIF@30fps) and HD (1080p@15fps)
video. As the number of video streams increases beyond ~100, the servers see degradations in achievable bandwidth. If a larger
number of cameras per server is specified, reduce the achievable bandwidth per server by 10%. As resolutions increase, the average
bit rate per thread increases and EMC typically sees enhanced aggregate throughputs. To protect against the unknown of additional
codec complexities for high resolutions (using 10MP cameras with JPEG2000, for instance), reduce achievable bandwidth by 10%.

SYSTEM ATTRIBUTE: Audio Recording
DESCRIPTION: Many cameras have audio microphones built in to allow for inband audio recording by the VMS. This is uncommonly used
due to legal issues with audio recording.
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Testing does not take audio recording into consideration, but it is
anticipated to have a minimal effect on the overall workload and on the performance capabilities for these scenarios.

SYSTEM ATTRIBUTE: Watermarking
DESCRIPTION: VMS vendors all have various capabilities for enhanced security. Watermarking is one that creates a digital hash for
every video file created, ensuring that the recording and files originate from the video management system (VMS) and that no
tampering of the video is present.
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Testing does not take watermarking into consideration for most VMS
vendors, but the effect on the server I/O can be dramatic because it requires more work by the VMS server per video stream. Reduce
the achievable bandwidth by 20%.

SYSTEM ATTRIBUTE: Motion Based Recording
DESCRIPTION: Most VMS vendors support motion-based recording, so recording disk space can be minimized to only when motion is
present in the frame. Cameras also support this function, but require the VMS vendor to allow the camera to specify if motion is
occurring versus evaluating it on the server directly.
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Testing does not generally take this into consideration because few
production systems use motion-based recording. Many of these environments rely on the cameras' motion detection algorithms, which
have a negligible effect on the server or storage infrastructure. For scenarios where the VMS is doing the motion detection on the
server, the server's load on CPU and memory is increased and there is an effect on the ability to perform I/O using NAS protocols.
Unless the Technical Note specifies this is tested, reduce the achievable bandwidth by 20%.

SYSTEM ATTRIBUTE: Dual Writes
DESCRIPTION: A VMS vendor could have multiple targets set up to record to simultaneously. This is often referred to as dual writes
when two targets are set up as the primary tier storage.
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Testing does not take this into consideration, but it is estimated that the
bandwidth will be reduced by the number of volumes the server is simultaneously writing to. If there were 2 volumes, the bandwidth
per server to each node would be half the achievable published bandwidth.

SYSTEM ATTRIBUTE: Server Hardware
DESCRIPTION: Every VMS vendor has recommended minimums of server CPU, memory, storage, and network configurations. All testing
and designs did and should align with these recommendations.
TESTING PROCEDURES AND EFFECT ON ACHIEVABLE BANDWIDTH: Since much of the testing done with VMS vendors used virtual machines,
the key was allocating the aligned amount of memory, OS storage, and network capabilities, as defined by the VMS vendor. Hardware
that is not optimized for VMs may not scale as well as others. In testing, EMC utilized Cisco UCS B-Series servers, which are
designed specifically for virtualization. Additionally, some VMS vendors only support bare metal implementations.

Table 1. Attributes that are commonly part of a production surveillance system configuration.
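As a rough illustration of how the reductions in Table 1 might be applied to a published per-server bandwidth, consider the Python sketch below. The percentages come from the table; the function and dictionary names are hypothetical, and compounding the reductions multiplicatively is an assumption, since the table does not say how several reductions combine.

```python
# Reductions from Table 1, as fractions of the published achievable bandwidth.
DERATING = {
    "high_stream_count": 0.10,             # more streams per server than tested
    "high_resolution_codec": 0.10,         # e.g. 10MP cameras with JPEG2000
    "watermarking": 0.20,
    "server_side_motion_detection": 0.20,  # unless the Technical Note covers it
}


def derated_bandwidth(published_mbps, active_attributes, write_targets=1):
    """Estimate achievable per-server bandwidth (MBps) to each target.

    Assumption: reductions compound multiplicatively. Dual writes divide the
    result by the number of simultaneous write targets, as Table 1 estimates.
    """
    bandwidth = float(published_mbps)
    for name in active_attributes:
        bandwidth *= 1.0 - DERATING[name]
    return bandwidth / write_targets


# Hypothetical case: 40 MBps published, watermarking enabled, dual writes.
print(round(derated_bandwidth(40, ["watermarking"], write_targets=2), 1))  # 16.0
```

Treat the result as a conservative planning figure rather than a measured value; the Technical Note for the specific VMS remains the reference.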
One common finding across all Isilon platforms is that per-server throughput degrades as additional servers are added to a node.
The average throughput per server is between 30-50 MBps with Isilon X400/X410/NL400 nodes with HDDs (no SSDs) when the
server-to-node ratio is 1:1. Once the ratio becomes 2:1, the average throughput drops to between 20-35 MBps, with a more
precipitous decline when using NL400 nodes. Note that all throughput is measured with 20% reads and with FlexProtect jobs active,
so node removals and node additions typically will not affect the published performance. This last point is important because it
assures security personnel and system administrators that the system can operate with no video loss even in drastic failure states
(the loss of a node is equal to 36 disks being lost).
The inputs necessary for sizing the Isilon cluster for a surveillance system are listed below (shown as green cells in the original
table). Two driving factors determine the number of nodes necessary in the cluster: performance and capacity. Determine the
number of nodes required for capacity and the number required for performance, and choose the higher of the two.
Step 1: Enter Camera Configurations and Associated Bit Rates
Question                                                            Input
Camera Group 1: Camera Configuration                                H.264 720p @ 15fps
Camera Group 1: Bitrate per Camera (Kbps)                           2,560
Camera Group 2: Camera Configuration                                Manual Entry
Camera Group 2: Bitrate per Camera (Kbps)                           -
Camera Group 3: Camera Configuration                                Manual Entry
Camera Group 3: Bitrate per Camera (Kbps)                           -
Camera Group 4: Camera Configuration                                Manual Entry
Camera Group 4: Bitrate per Camera (Kbps)                           -
Camera Group 5: Camera Configuration                                Manual Entry
Camera Group 5: Bitrate per Camera (Kbps)                           -

Step 2: Enter Camera Group Configurations for Recording
Question                                                            Input
Camera Group 1: Number of Cameras                                   2,500
Camera Group 1: Percentage Recording (motion or other)              100%
Camera Group 1: Retention Policy (Days)                             30.0
Camera Group 1: Min Required Useable Capacity (TiB)                 1,885.93
Camera Group 2: Number of Cameras                                   -
Camera Group 2: Percentage Recording (motion or other)              100%
Camera Group 2: Retention Policy (Days)                             -
Camera Group 2: Min Required Useable Capacity (TiB)                 -
Camera Group 3: Number of Cameras                                   -
Camera Group 3: Percentage Recording (motion or other)              100%
Camera Group 3: Retention Policy (Days)                             -
Camera Group 3: Min Required Useable Capacity (TiB)                 -
Camera Group 4: Number of Cameras                                   -
Camera Group 4: Percentage Recording (motion or other)              100%
Camera Group 4: Retention Policy (Days)                             -
Camera Group 4: Min Required Useable Capacity (TiB)                 -
Camera Group 5: Number of Cameras                                   -
Camera Group 5: Percentage Recording (motion or other)              100%
Camera Group 5: Retention Policy (Days)                             -
Camera Group 5: Min Required Useable Capacity (TiB)                 -

Step 3: Enter Additional Surveillance System Variables
Question                                                            Input
Percentage of Video Used for Evidence                               1.00%
Retention Time for Evidence (Days)                                  365
Max Number of Simultaneous Playback Viewing Workstations            15
Average # of Streams per Viewing Workstation (not used in Calcs)    4
Type of Video Analytics Used (not used in Calcs)                    On VMS Server
Number of Video Analytics Streams (not used in Calcs)               0%
# of Disaster Recovery Sites (Not Including Primary Site)           0
Video Management System (VMS) OR NAS Enabled Camera Vendor          Genetec Security Center 5.2
Number of VMS Recording Servers                                     50

Table 2. List of inputs for surveillance system sizing
To determine capacity, the base calculation for supporting the camera recording is Average Bit Rate x Retention Time x Percent
Recording. For many systems the capacity is also affected by the amount of additional evidence stored on the cluster, so the
resulting total minimum capacity needed is evidence recording + camera recording. The amount of evidence is
typically estimated as a percentage of the camera recording (e.g. 1-5%). When sizing an Isilon cluster for this total capacity, it is
important to ensure the cluster does not run at 100% capacity; targeting 85% full is ideal for surveillance systems.
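The capacity arithmetic can be checked with a short Python sketch. It reproduces the Camera Group 1 figure from Table 2 (2,500 cameras at 2,560 Kbps for 30 days) under the assumption that 1 Kbps is 1,000 bits per second and 1 TiB is 2^40 bytes. The function names are hypothetical, and treating evidence as a flat fraction of camera recording (ignoring the separate evidence retention time) is a simplifying assumption.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_TIB = 2 ** 40


def camera_capacity_tib(cameras, bitrate_kbps, retention_days, percent_recording=1.0):
    """Average Bit Rate x Retention Time x Percent Recording, in TiB."""
    bytes_per_second = cameras * bitrate_kbps * 1000 / 8  # Kbps -> bytes/s
    total_bytes = bytes_per_second * retention_days * SECONDS_PER_DAY * percent_recording
    return total_bytes / BYTES_PER_TIB


def cluster_capacity_tib(camera_tib, evidence_fraction=0.01, target_fill=0.85):
    """Camera recording plus evidence, scaled so the cluster sits at target_fill."""
    required = camera_tib * (1 + evidence_fraction)
    return required / target_fill


group1 = camera_capacity_tib(cameras=2500, bitrate_kbps=2560, retention_days=30)
print(round(group1, 2))  # 1885.93, matching Table 2
print(round(cluster_capacity_tib(group1), 1))  # usable TiB to provision at 85% full
```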
To determine the number of nodes necessary to meet the performance requirements:
1. Determine the Aggregate Bandwidth for the system and then derive the Per Server Bandwidth.
2. Each node in the cluster should be able to support the Per Server Bandwidth based on the load distribution.
The load distribution is ideally uniform and automated via either Round Robin or Connection Count using SmartConnect load
balancing. For example, a system with 10 VMS servers running at 30 MBps each needs (10) NL400-144TB nodes to satisfy
the capacity requirement, and the load distribution is 1:1 for server-to-node ratio.
3. For the system to be sized to support a node failure, evaluate the load distribution and the ability of the node
type to achieve the required Per Server Bandwidth.
For the same example, a node failure would leave (9) NL400-144TB nodes, and the load distribution would be 1:1 for 8 nodes of the
cluster and 2:1 for 1 node, where each server has a bandwidth of 30 MBps. This means the node platform has to support a 2:1
server-to-node ratio at 30 MBps per server. Many times this increased server-to-node ratio will require using X400 nodes.
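The node-failure example above can be reproduced with a small Python sketch. It assumes a perfectly even spread of servers across the surviving nodes, which is the ideal case described in step 2 and not something SmartConnect guarantees; the function names are hypothetical.

```python
def servers_per_node(num_servers, num_nodes):
    """Spread servers as evenly as possible and return a count per node."""
    base, extra = divmod(num_servers, num_nodes)
    return [base + 1] * extra + [base] * (num_nodes - extra)


def worst_case_ratio_after_failure(num_servers, num_nodes, failed_nodes=1):
    """Highest server-to-node ratio any surviving node must sustain."""
    surviving = num_nodes - failed_nodes
    if surviving < 1:
        raise ValueError("at least one node must survive")
    return max(servers_per_node(num_servers, surviving))


print(servers_per_node(10, 10))                # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(servers_per_node(10, 9))                 # [2, 1, 1, 1, 1, 1, 1, 1, 1]
print(worst_case_ratio_after_failure(10, 10))  # 2
```

The worst-case ratio, together with the Per Server Bandwidth, is the figure to compare against the published throughput for the candidate node type.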
SmartConnect Implementation Note: In production, a few key elements ensure ideal
operation based on the sizing, but the most relevant is the proper distribution (target < 4) of the VMS
servers to nodes. If round robin is used for SmartConnect initial connections with only a small number of
VMS servers, such as 10, the distribution of SMB/NFS connections could overload a few
nodes while starving others.
Implementation Note: While an Isilon cluster can handle virtualization workloads by hosting NFS
datastores for the VMS server OS partitions, the majority of the workload for surveillance can be handled
easily by HDD-only X400/NL400 platforms. Virtualization workloads are best handled by systems with
SSDs in the nodes; it is best to use a separate pool for the NFS datastores or use block-based storage for
the virtualized OS boot partitions.
HIGH AVAILABILITY DESIGNS IN SURVEILLANCE

In the majority of surveillance systems, the primary design requirement for high availability is to allow redundant sites to operate
in the event of a site going down for maintenance or other reasons. The use of dual streams from cameras to redundant VMS servers
is the most common mechanism to enable this functionality. This is because the root of the functionality for a VMS is a temporal
database that allows a user to view video for any given time, location, or event. Without dual streams, the VMS would therefore need
synchronous replication in order to continue operating in the event of a failure. Dual streaming from the camera, or using a
multicast stream from the cameras, allows real-time video to traverse the network to each VMS recorder. This is shown in Figure 6,
but how each VMS vendor accommodates this functionality varies somewhat.
Figure 6: Surveillance high availability design and data flow
The retention time for these second streams at the secondary site is generally shorter than at the primary site. For instance, the
retention time may be 30 days at the primary site, whereas it is 7 days at the secondary site. Sizing of the systems
at each site depends on the same inputs as previously reviewed. However, it is likely that the aggregate throughput and capacity will
differ between primary and secondary sites, in line with the use of lower resolutions for the secondary stream or the reduced
retention time at the secondary site.
ISILON FAILURE SCENARIOS

A proper design and implementation accounts for various failure scenarios when sizing and architecting the solution. Beyond
keeping redundant servers and clusters, most implementations require some form of local high availability
as well. Designing and implementing a system that can accommodate multiple disk failures or even a node failure is
important for surveillance systems that scale to the size Isilon is well positioned for. The biggest impacts from node failures are
the following:
Loss of Server to Node Connection
o With a Windows OS, the connection uses SMB2, a stateful protocol. Once the client sees a loss of the connection due to
timeout, the session reconnects to the Isilon cluster. Using SMB requires using an FQDN for the connection so that the
client can use SmartConnect to reconnect to another node in the cluster (using connection count or round robin as the load
balancing mechanism). This scenario creates a loss of connectivity, but typically for only a few seconds, so the VMS can
usually buffer the writes during this time and the failure is transparent to a user in video playback as well as video ingest
from the cameras.
o For VMS products that employ a Linux-based OS with NFSv3/4, the ability of the Isilon cluster to use Dynamic IP to move an IP
address to another node allows even faster recovery than SMB. This recovery is typically well within one second and uses
the Isilon SmartConnect load balancing mechanism for Dynamic IP (round robin or connection count is most common). The
latency in the NFS connection is so low in these scenarios that the VMS buffers are minimally affected.
SmartFail of a Node
o In the event of bad hardware or a corrupted journal event on a node, the need to SmartFail a node is uncommon but present.
When a node is smartfailed, all the existing data on the node is moved to other nodes and forward error correction codes
(FECs) are recalculated. This is accomplished by a job called FlexProtect in OneFS, and the overarching effect is greater
CPU, memory, and drive utilization. All testing and validation is done with FlexProtect active to ensure that the published
achievable bandwidth can still be accommodated during a node fail scenario.
o The Isilon system is typically not configured with every feature during validation because some features are uncommon in
surveillance systems. For instance, SyncIQ, SnapshotIQ, SmartPools, and SmartDedupe are not active on the system during the
validations. Expect some degradation in achievable bandwidth during a node fail scenario if these features are
implemented on the Isilon cluster in production.
o The overarching effect of FlexProtect and FlexProtectLin jobs is to increase protocol latencies across the cluster. Every
production environment is different and the effect on latencies will vary. To avoid video loss
in the VMS due to buffer overflows during these periods of increased latency, set the Impact Policy to Low for
these jobs and keep the priorities at default (1) as a best practice. Note that this will delay completion of the
SmartFail, which could take over a week for a node with 144TB or more. See the Isilon configuration section for more
information.
ISILON CLUSTER CONFIGURATIONS

SMARTQUOTAS
The majority of VMS vendors provide best practices, based on in-house testing and customer experience, that limit volume sizes
(usually to between 2TB and 16TB). The standard practice with EMC Isilon is to specify a share size equal to the total capacity
desired for each VMS server (100TB, for instance). Without SmartQuotas, the VMS administrators must anticipate the total write
rate to the cluster and adjust the Min Free Space field on each VMS server accordingly because of the way most VMS applications
delete files: either based on the age of the file as recorded in the temporal database or based on how full a particular volume
is (i.e., Min Free Space left). VMS vendors that support deletion of video based on age generally also support deletion based on
the volume percent full. When presenting a single volume of 1PB to 10 VMS servers, for instance, this could create a condition
where the VMS servers begin to delete (or groom) files when the cluster is 80% full rather than allowing optimal capacity
utilization. Isilon SmartQuotas resolves the issues caused by manual calculations and VMS misconfigurations.
To configure SmartQuotas, from File System Management choose the SmartQuotas tab and perform the following steps:
1. Set the hard threshold to the Archiver video file share limit.
a. Configure OneFS to show the available space as the size of the hard threshold.
b. Set the usage calculation method to show the user data only.
Figure 7. Screenshot of quota configuration
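The difference these quota settings make to what each VMS server sees can be sketched with a short calculation. The Python
below is a simplified illustration only: the function names, server names, and usage figures are made up, and the model
ignores protection overhead (which matches reporting user data only). It compares the free space reported to each server on
one shared 1PB volume with the free space reported under a 100TB hard threshold per server.

```python
def free_space_without_quota(cluster_capacity_tb, usage_by_server_tb):
    """Every server sees the same cluster-wide free space on the shared volume."""
    free = cluster_capacity_tb - sum(usage_by_server_tb.values())
    return {server: free for server in usage_by_server_tb}


def free_space_with_quota(hard_threshold_tb, usage_by_server_tb):
    """Each server sees only the space left under its own hard threshold."""
    return {server: hard_threshold_tb - used
            for server, used in usage_by_server_tb.items()}


# TB written so far by each VMS server (hypothetical values).
usage = {"vms01": 95.0, "vms02": 40.0, "vms03": 10.0}

shared_view = free_space_without_quota(1000.0, usage)
quota_view = free_space_with_quota(100.0, usage)

for server in usage:
    print(f"{server}: shared volume free={shared_view[server]:.0f}TB, "
          f"quota view free={quota_view[server]:.0f}TB")
```

In the shared view every server reports the same free space regardless of how much it has written itself, which is why Min
Free Space has to be calculated by hand. In the quota view the reported free space tracks each server's own usage, so the
server's own deletion settings apply to its own share.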
IMPACT POLICIES
The Impact Policy defines the number of parallel tasks or workers allowed to run at one time within OneFS. For best I/O performance
during a failure scenario, you should configure all background jobs (FlexProtect and FlexProtectLin especially) with the Impact Policy
set to Low. Do not change the priority of any job from the default setting unless the Technical Notes specify otherwise. This
configuration setting is located at: Operations > Jobs and Impact Policies.
In all cases, EMC recommends using OneFS 7.0 or later to maximize bandwidth and minimize video review response times. In
most cases, you may use the default Impact Policy with X200, X400, NL400, and greater. For scenarios where there are multiple
servers per node (e.g., 2:1) or where the per-server bandwidth is very high, using the Low Impact Policy is a good safeguard
against video loss during a Node Fail state.
Priority configuration
Even if the Impact Policy is modified, for example by setting all jobs to Low, the priorities of the jobs should remain at their
default settings.
I/O Optimization configuration

Set the default I/O Optimization setting to Streaming by choosing File System Management > SmartPools > Settings >
Default File Pool Policy Settings.
Note: A SmartPools license is NOT required for this setting to be active.
NETWORKING
The networking aspects of a video surveillance deployment are just as important as sizing the system with the right number of
nodes and can even affect that calculation, especially for systems using a VMS with Isilon as the tier 1 storage target. Latencies
are especially important for achieving the published bandwidths for the VMS implementations. The network between the VMS servers
and the Isilon cluster should have low latency (<10 ms) and low packet loss (<0.1%), and in many cases should avoid any routing
delays (this depends on the VMS, as some require the servers to be on the same subnet as the storage).
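As an illustration of how these two network targets might be checked during a site survey, the following Python sketch
compares measured values against the thresholds stated above. The LinkMeasurement type, the function name, and the sample
measurements are hypothetical; how latency and packet loss are actually measured is outside the scope of this sketch.

```python
from dataclasses import dataclass

MAX_LATENCY_MS = 10.0       # from the guidance above: latency below 10 ms
MAX_PACKET_LOSS_PCT = 0.1   # from the guidance above: packet loss below 0.1%


@dataclass
class LinkMeasurement:
    """One measured path between a VMS server and the cluster (hypothetical type)."""
    name: str
    latency_ms: float
    packet_loss_pct: float


def check_link(measurement):
    """Return a list of problems; an empty list means both thresholds are met."""
    problems = []
    if measurement.latency_ms >= MAX_LATENCY_MS:
        problems.append(
            f"{measurement.name}: latency {measurement.latency_ms} ms "
            f"is not below {MAX_LATENCY_MS} ms"
        )
    if measurement.packet_loss_pct >= MAX_PACKET_LOSS_PCT:
        problems.append(
            f"{measurement.name}: packet loss {measurement.packet_loss_pct}% "
            f"is not below {MAX_PACKET_LOSS_PCT}%"
        )
    return problems


# Example with made-up measurements.
for link in (LinkMeasurement("recorder01", 1.8, 0.0),
             LinkMeasurement("recorder02", 14.2, 0.3)):
    issues = check_link(link)
    print(link.name, "OK" if not issues else issues)
```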
The recommendation for optimal bandwidth for all VMS vendors is to use 10GbE interfaces across the Isilon cluster supporting the
\\video_archive share\mount. Most VMS are tested with both 1 GigE and 10GbE, and the achievable bandwidth per node differs
between the two network interfaces on every platform. These differences are specified in the Technical Notes for
each validation test. 10GbE also makes it easier to handle higher server-to-node ratios and generally results in higher
per-server bandwidth.
Network Interface Implementation Note: For scenarios where 1 GigE is used, no more than 2 server
connections per interface should be implemented and/or designed. Ideally, there is only one server
connection per 1 GigE interface based on test data.
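The interface note above translates into simple arithmetic when planning a 1 GigE design. The sketch below is an
illustration only; the function name and parameter are hypothetical, and the only rule it encodes is the limit of one
(ideal) or two (maximum) server connections per 1 GigE interface.

```python
import math


def min_1gige_interfaces(server_count, servers_per_interface=1):
    """Minimum number of 1 GigE interfaces for the given number of VMS servers.

    servers_per_interface is 1 (ideal) or 2 (maximum) per the note above.
    """
    if servers_per_interface not in (1, 2):
        raise ValueError("use 1 (ideal) or 2 (maximum) server connections per 1 GigE interface")
    if server_count < 0:
        raise ValueError("server_count must not be negative")
    return math.ceil(server_count / servers_per_interface)


print(min_1gige_interfaces(9))      # one server per interface
print(min_1gige_interfaces(9, 2))   # at most two servers per interface
```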
Link aggregation is not needed in video surveillance deployments, and testing generally avoids LACP and other link aggregation
mechanisms. Instead, for high availability, it is recommended to use SmartConnect and OSI Layer 7 load balancing rather than
Layer 2 with LACP. That said, many customers choose LACP to avoid losing network connectivity during access switch upgrades or
downtime, and we have not seen any specific degradation in these production environments.
Jumbo frames are helpful in achieving higher bandwidths per server. Unless the Technical Notes specify otherwise, testing is done
without jumbo frames to reflect the most common networks.
SMARTCONNECT SETTINGS
SmartConnect provides load balancing of connections to the Isilon cluster as well as failover handling of connections. The
configuration differs depending on the VMS, but most of the VMS vendors are Windows-based, so this configuration is
reviewed here. The main difference with a Linux-based system is the ability to use Dynamic IP as the Allocation Strategy in the
SmartConnect configuration, and thus not rely on DNS unless absolutely required.
With Windows-based VMS software (Verint Nextiva, Genetec Security Center, Milestone XProtect, Aimetis Symphony), SmartConnect
allows the use of a UNC path with a fully qualified domain name (FQDN) for the VMS Servers (Recorders), versus requiring manual
mapping of each node's IP address in the VMS Server configuration. This makes implementation easier, but it does not work
perfectly 100% of the time due to reliance on DNS as well as idiosyncrasies in the OneFS implementation.
SmartConnect uses DNS load balancing for the SMB connections. SmartConnect Advanced also allows for failovers, which reduces
the effect of a node failure on video playback. From the perspective of Isilon, when a node fails the Recorders time out the SMB
session according to the Windows OS settings and then attempt to reconnect to the target. SmartConnect load balancing directs
the reconnect to an available node, which keeps the loss of connectivity between Recorders and Isilon minimal and thus avoids
any loss of video playback for clients.
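One informal way to observe the DNS load balancing described above is to resolve the SmartConnect zone name several times
from a client and count the addresses returned. The Python sketch below is a hedged illustration, not a validated procedure:
the function name is hypothetical, and the result depends on the client's resolver, since local DNS caching can make
every lookup return the same address.

```python
import socket
from collections import Counter


def sample_zone_resolution(zone_name, attempts=20, resolve=socket.gethostbyname):
    """Resolve the zone name repeatedly and count how often each IP is returned.

    The resolve parameter exists so a different lookup function can be
    substituted (for example, in a lab without DNS access).
    """
    return Counter(resolve(zone_name) for _ in range(attempts))
```

For example, calling sample_zone_resolution("videoarchive.acme.com") from a Recorder returns a count per node IP address.
A single address in the result does not by itself indicate a fault; it may only reflect caching on the client.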
To configure SmartConnect from Cluster Management:
1. Select the Networking Configuration tab.
2. In Subnet Settings, set the SmartConnect IP address (SSIP). This is the IP address that is configured in a DNS server as the
Authoritative name server for the Isilon Cluster DNS name, such as videoarchive.acme.com.
3. In Pool settings:
a. Type the SmartConnect zone name to which clients will connect.
b. Select the subnet to use SmartConnect, which is the subnet that has the SSIP configured on the DNS server.
4. Set the IP Connection policy.
o Use the Connection Count policy if the cluster is used strictly for video storage. This policy distributes IP connections
evenly across all the active NICs. IP connections include Nextiva Recorders, management workstations logged into the
Isilon cluster, Isilon InsightIQ, or any other system using the cluster.
- If the system is connected to anything other than VMS Servers, those connections will affect the Connection Count load
balancing as of OneFS 7.1.0.
- If the system does not implement pools for the connections, then all clients will show up as connections on the video
pool, which would create very unpredictable connection distributions.
o Alternatively, use the Round Robin policy, but be aware that with a small number of servers (<10), the distribution of
connections across the cluster nodes may not be uniform.
In all cases OneFS load balancing does a good job, but it is not perfect due to limitations of the SMB and SMB2 (CIFS) protocols.
After any network, VMS server, or node-related event that causes IP reconnects, we recommend verifying the load distribution.
Load distribution can be verified on Isilon using the GUI, as shown in Figure 8 below.
Figure 8. Cluster overview client connections
5. Set the IP Allocation strategy to Static.
The Static setting for the IP allocation method is illustrated below. This setting maintains the IP address to NIC pairing.
Figure 9: Configuring SmartConnect
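When verifying load distribution after a reconnect event, the per-node connection counts read from the cluster overview can
be compared against an even split. The Python sketch below illustrates that check; the function name, the node names, the
counts, and the tolerance of one connection are all assumptions made for the example and are not taken from OneFS.

```python
def connection_imbalance(connections_by_node, tolerance=1):
    """Return the nodes whose connection count differs from an even split
    by more than the tolerance. An empty result means the spread is acceptable.
    """
    if not connections_by_node:
        return {}
    ideal = sum(connections_by_node.values()) / len(connections_by_node)
    return {node: count
            for node, count in connections_by_node.items()
            if abs(count - ideal) > tolerance}


# Connection counts noted from the GUI after a node event (made-up values).
observed = {"node1": 3, "node2": 3, "node3": 0, "node4": 2}
print(connection_imbalance(observed))
```

With eight connections across four nodes the even split is two per node, so only the node with no connections is reported.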
SYNCIQ
SyncIQ is not generally used in surveillance configurations because of the typical high availability architecture described in
the previous sections. It is used only when the Evidence archive needs to be replicated offsite. Replication is typically set up
to occur every day and only for the Evidence directory (\\Evidence_archive).
SECURITY INTEGRATION
Most VMS vendors support Active Directory integration for the VMS servers and client systems. As a result, validation is typically
done using the same domain for the VMS server and the Isilon cluster. An example configuration for Verint Nextiva is shown
below, where the Nextiva Recorder is set up with an account that is also configured on the Isilon cluster via the Active Directory
Provider (Cluster Management > Access Management > Active Directory). The share used by Nextiva was configured to provide Full
Control for the users specified in the Verint Nextiva configurations. Many VMS vendors can only have a single user specified
across hundreds of servers, so it is important to avoid password expiration and redundant usernames in the Local Provider
listings in OneFS.
Figure 10: Verint Nextiva permissions for share
Figure 11: Isilon security setting for share
Occasionally, when there is a permissions issue from the VMS server and there are multiple domains, and/or the VMS server is on
an Active Directory domain and Isilon is only permitted to use the Local Provider table, the SMB share has had to be configured
with the Run As Root option.
REFERENCES
OneFS Best Practices for Generic Windows Shares
http://www.emc.com/collateral/white-papers/h11152-isilon-home-directories-wp.pdf
OneFS Technical Overview
http://www.emc.com/collateral/hardware/white-papers/h10719-isilon-onefs-technical-overview-wp.pdf
Technical Notes for Genetec
https://www.emc.com/auth/rcoll/technicaldocument/h12087-genetec-security-center.pdf
Reference Architecture for Genetec
https://www.emc.com/collateral/solutions/technical-docs/h10583-emc-storage-physical-security-vnx-vnxe-isilon-genetec-security-center.pdf
Proven Infrastructure for Genetec
https://www.emc.com/auth/rcoll/technicaldocument/h12979-emcproveninfravideosurvgenetecsecuritycenter.pdf
Technical Notes for Milestone
https://www.emc.com/auth/rcoll/technicaldocument/h8236-milestone-exprotect.pdf
Reference Architecture for Milestone
https://www.emc.com/collateral/technical-documentation/h12078-emc-physical-security-milestone-xprotect-vnx-isilon-ra.pdf
Technical Notes for Verint
https://www.emc.com/auth/rcoll/technicaldocument/h8097-physical-security-verint-tn.pdf
Technical Notes for Surveillus
https://www.emc.com/collateral/hardware/white-papers/h10518-video-surviellance-surveillus-vsm-isilon-storage.pdf
Technical Notes for DVTel
https://www.emc.com/auth/rcoll/technicaldocument/h10563_dvtel_config_guide_tn.pdf
Technical Notes for Next Level
https://www.emc.com/auth/rcoll/technicaldocument/h10992-emc-storage-physical-security-nlss-gateways.pdf
Technical Notes for Aimetis
https://www.emc.com/auth/rcoll/technicaldocument/h13157-aimetis-symphony-emc-storage-tn.pdf