
Building SANs with Brocade Switches
Summary

compiled by Christopher Greene - Hewlett-Packard Storage Services

version 1.5

Table of Contents
1. Introduction to SANs
Background information
Data Sharing
SAN is the solution for
2. Fibre Channel Basics
Architecture
Fibre Channel Layers
Fibre Channel Classes of Service
Topologies
Fabric Services
3. SAN components
Cabling Distances
Hubs
Fibre Channel Switches
Connecting Hosts to the Fabric
4. Overview of Brocade Switches and Features
Entry Level switches
Scalable Fabric Switches
Brocade Fabric OS
5. The SAN design process
Lifecycle of a SAN
Tips and comments
Performance Gathering
Localizing and Groups
Port Count Determination
Complex considerations
6. SAN Applications and Configurations
High Availability Microsoft Cluster
Storage Consolidation
LAN-Free Backup Configuration
SAN Server-Free backup
SAN-based Third-Party Copy Data Movers
Remote Distance Solutions
7. Developing a SAN architecture
Identifying Fabric Topologies and SAN architectures
Useful Topologies
Core/Edge or Star Topologies
Working with the Core/Edge Topology
Determining Levels of Availability
Configuring Traffic Patterns
8. SAN Troubleshooting
Troubleshooting Approach: The SAN is a virtual Cable
Troubleshooting Tools
Troubleshooting the Fabric
Segmented Fabrics
Troubleshooting Devices that cannot be Seen
Troubleshooting Marginal Links
9. SAN Implementation, maintenance, and management
Installation Guidelines
10. What exactly is Fabric Assist?
General
Limitations and Considerations of Fabric Assist and QuickLoop
Configuration of Fabric Assist

1. Introduction to SANs
Background information
Veritas Volume Manager, Tivoli, and DataCore users are able to allocate and share
storage among multiple hosts.
Data movers allow SAN-based server-free backup directly from storage to tape.
The Virtual Interface (VI) protocol (Intel, Microsoft, Compaq) reduces CPU use for
network transfers; it emerged as the leading protocol for communication in clustered
environments.

Data Sharing
Resource Sharing
simplest form: a storage farm is shared among machines; access to the storage is
defined statically and ownership does not change often
resource partitioning can be done at the switch level (zoning), at the storage level
(LUN masking), at the HBA level (LUN masking), or through virtualization

Volume Sharing
sharing a LUN between hosts can cause corruption at the SCSI block level
needs coordinating software; clustered hosts have this software built in

File Level Sharing


reading and writing to the same volume from multiple hosts
requires a file system that allows for multiple readers and writers
often a NAS (NFS or CIFS) solution

SAN is the solution for


block level access
high bandwidth
need for expandability
required access to large arrays
need for redundant access to storage
clustered server configurations
distributed applications
need for disaster tolerance
backing up large amounts of data nightly
running clustered databases
centralized management

2. Fibre Channel Basics
Architecture
A SAN consists of initiating devices, interconnecting devices, and target devices
JBOD: each disk is visible as a separate entity on the SAN
RAID: a collection of disks in a RAID group appears as one LUN entity on the SAN
Initiator: seeks targets to interact with (hosts)
Target: disks (JBOD or RAID controller)
1 Gbps Fibre Channel transmits at 1.0625 Gbps; Gigabit Ethernet is at 1.25 Gbps
2 Gbps Fibre Channel transmits at 2.125 Gbps

Fibre Channel Layers


Layer   Function
FC-4    Upper layer protocol (FC, IP, VI)
FC-3    Advanced services (name server, time server, alias server)
FC-2    Fibre Channel framing (primitives, words, frame size)
FC-1    8b/10b encoding
FC-0    Physical

primitives control the flow of the fibre channel frame


frames are collections of words that contain headers and payload

FC-0
specifies how light is transmitted

FC-1
encoding layer (8b/10b): every 8 data bits are transmitted as 10 bits, which supports
error checking
FC-0 and FC-1 together are considered the signaling interface
bits are encoded into two kinds of characters, K and D
all primitives (LIP, SOF, OPN, CLS, IDLE) are delimited by K characters
D characters (data characters) are used to provide all other 8-bit values
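
The 8b/10b overhead translates directly into usable bandwidth. The following is a minimal sketch, assuming the nominal line rates of 1.0625 and 2.125 Gbps and ignoring frame headers and inter-frame gaps; the function name is illustrative only.

```python
def payload_rate_mb_per_s(line_rate_gbps: float) -> float:
    """Data rate in MB/s after 8b/10b encoding (8 data bits per 10 line bits)."""
    data_bits_per_s = line_rate_gbps * 1e9 * 8 / 10
    return data_bits_per_s / 8 / 1e6  # bits -> bytes -> megabytes


for rate in (1.0625, 2.125):
    print(f"{rate} Gbps line rate -> {payload_rate_mb_per_s(rate):.2f} MB/s")
```

This is why 1 Gbps Fibre Channel is usually quoted as roughly 100 MB/s and 2 Gbps Fibre Channel as roughly 200 MB/s.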

FC-2
framing and flow control
relies on primitives: a K character encoded at the FC-1 layer followed by 3 data
characters (D characters)
primitives drive loop initialization and arbitration
FC-2 handles flow control by sending the correct primitives to initiate transfers

Fibre Channel Classes of Service

Class 1 acknowledged connection-oriented (circuit switched) service


Class 2 acknowledged connectionless (packet switched) service
Class 3 unacknowledged connectionless service
Class 4 connection-oriented fractional bandwidth
Class F inter-switch communication format

Topologies
Point-to-point
Arbitrated Loop
Switched

Point to Point topology


used between two devices; no addressing is used
sometimes the connection between host and switch is point-to-point

Arbitrated Loop
all devices are connected in a loop and arbitrate for communication
each device receives an 8-bit Arbitrated Loop Physical Address (AL_PA) as its address
up to 127 devices can attach

Switched Topology
F_Port: fabric port (on the switch)
FL_Port: fabric loop port (on the switch); loop devices connect to these ports
Nodes are assigned a 24-bit address xxyyzz
o xx is the domain
o yy is the area (port on the switch)
o zz is the al_pa (00 for point to point)
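
The xxyyzz layout can be decoded with simple bit shifts. This is a minimal sketch; the type name, function name, and the sample address 0x010400 are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class FcAddress(NamedTuple):
    domain: int  # xx: the switch domain
    area: int    # yy: the port on the switch
    al_pa: int   # zz: loop address, 0x00 for point-to-point


def parse_fc_address(address: int) -> FcAddress:
    """Split a 24-bit fabric address into its three 8-bit fields."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("fabric addresses are 24 bits wide")
    return FcAddress(
        domain=(address >> 16) & 0xFF,
        area=(address >> 8) & 0xFF,
        al_pa=address & 0xFF,
    )


# Hypothetical device on domain 1, port 4, attached point-to-point.
print(parse_fc_address(0x010400))
```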

Fabric Services
Login Server
FFFFFE
a node is required to send a fabric login (FLOGI) to this address before it can
communicate with the rest of the fabric
the FLOGI frame is sent with only the AL_PA portion of the S_ID filled in; the response
from the server contains the domain and area portions filled in

Name Server
FFFFFC
gets information from a port login (PLOGI) at registration and from subsequent
registration frames
a common request is Register FC-4 Types (RFT_ID), which registers which FC-4
protocols the device can handle

Fabric / Switch Controller


FFFFFD
provides state change notifications to all hosts that ask to keep track of the topology
of the fabric
a device registers for State Change Notification (SCN) by sending a State Change
Registration (SCR)
when there is a change, the Fabric Controller sends a Registered State Change
Notification (RSCN) to all devices that registered for it
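
The three service addresses listed so far can be kept in a small lookup table, for example when reading frame traces. This is only an illustrative sketch: the dictionary covers just the addresses named in these notes, and the function name is hypothetical.

```python
# Fabric service addresses as listed in these notes (not an exhaustive list).
FABRIC_SERVICES = {
    0xFFFFFE: "Login Server (receives FLOGI)",
    0xFFFFFC: "Name Server (receives PLOGI and registrations such as RFT_ID)",
    0xFFFFFD: "Fabric Controller (receives SCR, sends RSCN)",
}


def describe_destination(d_id: int) -> str:
    """Name the fabric service behind a destination ID, if it is listed above."""
    return FABRIC_SERVICES.get(d_id, "not one of the services listed here")


print(describe_destination(0xFFFFFE))
```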

Management Server
provides a single access point for managing the fabric as well as three services
o Fabric configuration server: information to discover the topology
o Unzoned name server: access to the name server for nodes within all zones
o Fabric zone server: allows management entities to control zone participation

3. SAN components
Cabling Distances
Copper
Media Type       Speed (MB/s)   Distance (meters)
STP (active)     100            0-30
STP (passive)    100            0-15
video cable      100            0-25
STP (active)     200            0-10
video cable      200            0-10
STP (active)     50             0-40
video cable      50             0-40

Multimode Fiber
Media Type     Laser/LED (nm)   Speed (MB/s)   Distance (meters)
50 micron      850              100            2-500
62.5 micron    850              100            2-300
50 micron      850              200            2-300
62.5 micron    850              200            2-90
50 micron      850              50             2-1000
62.5 micron    850              50             2-400

Single Mode Fiber
Media Type     Laser/LED (nm)   Speed (MB/s)   Distance (meters)
9 micron       1300             100            2-10,000
9 micron       1300             50             2-10,000
9 micron       1300             200            2-2,000

Hubs
Simple
electrical hub that creates the loop in an arbitrated loop
if a device is down, the loop is broken

Managed Hub
performs more advanced services
provides frame switching between initiators and targets for throughput enhancement
can isolate an initiator that is having problems by bypassing its port, which keeps the
loop intact
typical capabilities
o LIP isolation (prevent LIP from affecting entire loop)
o automatic port bypass (if initiator is having problems)
o signal retiming
o loop zoning
o web interface
o telnet
o port-event logging
o snmp support

Fibre Channel Switches


Zoning
allows devices to see only those devices in the same zone
hard zoning is based on the port number on the switch
soft zoning is based on the WWN of the device and/or a defined alias
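
The visibility rule can be modeled as set membership: two devices see each other only if some zone contains both. This is a toy sketch, not Brocade's implementation; the zone names and member names are hypothetical, and members could equally be switch ports (hard zoning) or WWNs and aliases (soft zoning).

```python
from typing import Dict, Set

# Hypothetical zone configuration: zone name -> member identifiers.
ZONES: Dict[str, Set[str]] = {
    "nt_zone": {"nt_host", "array_port_1"},
    "unix_zone": {"unix_host", "array_port_2"},
}


def can_see(a: str, b: str, zones: Dict[str, Set[str]]) -> bool:
    """Return True if at least one zone contains both devices."""
    return any(a in members and b in members for members in zones.values())


print(can_see("nt_host", "array_port_1", ZONES))
print(can_see("nt_host", "array_port_2", ZONES))
```

A device may belong to several zones; adding "array_port_1" to "unix_zone" as well would make it visible to both hosts without letting the hosts see each other's other storage.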

Class of Service
most support Class 3 and F
few support Class 2 or even Class 1

Buffer Credits
the number of buffer credits per port is crucial, as it determines how many frames can
be sent before the sender must wait. This is especially critical for long-distance
applications

Connecting Hosts to the Fabric
HBA-based LUN Masking
capability of an HBA to selectively hide or mask storage on the network from the host
important for Windows, as that OS can write to storage owned by other operating
systems, causing data corruption

Persistent Binding
(LUN mapping) is the mapping of a Fibre Channel device into an operating system at a
specific device location.
important for some applications that use the SCSI address to address a device (raw
volume accessed by Oracle)

Remote Boot
allows a host to boot off a volume on the SAN
a binding between a specific WWN and LUN must be configured for this to work

Fibre Channel to DWDM


E_Ports can be connected together over long-haul fiber lines
additional frame buffers must be allowed for on the two ends of the link because of the
extra time needed for the transport
DWDM can extend distances in Fibre Channel up to 100 km.
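
The need for extra buffers over distance can be estimated from the round-trip time of the link. The following is a rough rule-of-thumb sketch, not a Brocade formula: it assumes a full-size frame of 2148 bytes, 8b/10b encoding, and about 5 microseconds of propagation delay per kilometer of fiber. All names are illustrative.

```python
import math

FRAME_BYTES = 2148          # assumed full-size frame, including delimiters
LINE_BITS_PER_BYTE = 10     # 8b/10b encoding sends 10 bits per data byte
FIBER_DELAY_US_PER_KM = 5   # assumed one-way propagation delay in fiber


def credits_needed(distance_km: float, line_rate_gbps: float) -> int:
    """Estimate the buffer credits needed to keep a link of this length busy."""
    # Time to put one full frame on the wire, in microseconds.
    frame_time_us = FRAME_BYTES * LINE_BITS_PER_BYTE / (line_rate_gbps * 1e3)
    # A credit is only returned after the frame arrives and the reply comes back.
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return math.ceil(round_trip_us / frame_time_us)


for rate in (1.0625, 2.125):
    print(f"100 km at {rate} Gbps: about {credits_needed(100, rate)} credits")
```

Under these assumptions the requirement grows linearly with both distance and link speed, which is why long-distance links need more credits than a standard port provides.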

4. Overview of Brocade Switches and Features
Entry Level switches
2010, 2040, and 2050 are the same physical 1U unit with different feature sets enabled
2210, 2240, and 2250 are the same physical 1.5U unit with different feature sets enabled
2000 series = 8 ports, no GBICs (connections built in)
2200 series = 16 ports with GBICs
power supply (one each) is built in, not a FRU
Ethernet and serial ports

2010 (8 port) and 2210 (16 port)


2x10 are arbitrated-loop-only switches, an alternative to hub-based solutions
2010: 8 fixed optical ports
2210: 16 GBIC slots
bundled with Brocade Zoning, Web Tools, name server, and QuickLoop
can be upgraded to full fabric with a license key

2040 (8 port) and 2240 (16 port)


support for entry-level fabrics
can be used in dual-switch configurations to support fabric or loop up to 30 ports. One
port per switch can be used as an E_Port to connect to a larger fabric
2040 comes with Zoning, Web Tools, and name server; QuickLoop must be purchased to
connect FC-AL (arbitrated loop) hosts

2050 (8 port) and 2250 (16 port)


full fabric switch that allows multiple E_Port configurations

Scalable Fabric Switches


2400 and 2800: redundant power supplies are replaceable
cooling fan is also a FRU
power supplies and fan can be replaced without taking the switch offline

2400 (8 port)
1U full fabric switch
ports can be E, F, or FL ports (they start as G_Ports)

2800 (16 port)


2U high
dual power supplies (same as the 2400)
copper or optical GBICs

6400 Integrated Fabric


six 2250 switches configured together as a solution
two switches operate as the core; the other four operate as the edge switches
64-port solution (96 total ports, 64 usable)

12000 core fabric switch


128 ports; acts as two separate switches within the same chassis
1 or 2 Gb/s ASIC technology
protocol-independent backplane will support iSCSI, FCIP, InfiniBand, etc.
will support an optional application platform blade that enables the deployment of
high-performance services such as virtualization and third-party copy

Brocade Fabric OS
Core OS functions
Automatic discovery of devices and entry into SNS (Simple Name Server). Translative
mode allows public initiators to log into loop targets.
Universal port support
continuous monitoring of ports for exception conditions

Services for Reconfiguration


Management Server: in-band discovery of topology
Simple Name Server
Alias Server: multicast service that sends data to members of an alias

Dynamic Routing Services


dynamic path selection via FSPF
load sharing to maximize throughput across multiple ISLs between switches
automatic path failover: reconfigures new paths when a link fails
in-order frame delivery
automatic rerouting of frames when a fault occurs
routing support for link costs: enables managers to manually configure link costs
support for high-priority protocol frames: useful for clustering
static routing support
automatic reconfiguration of ISLs

Extended Fabrics
allows switch to support the rigors of long distance I/O operations in such instances as
DWDM.

Fabric Watch
switch watches for faults and alerts based on thresholds

QuickLoop and Fabric Assist


QuickLoop connects ports on one or two switches with one or more private loop devices
(initiators) on those ports. Each port is a looplet. All devices on the looplets in a
QuickLoop form a logical PLDA, so they all share the same AL_PA address space.
QuickLoop has two modes: Hub emulation and Fabric Assist
Hub emulation actually builds private loops across a set of switch ports on one or two
switches.
Fabric Assist creates virtual loops capable of spanning the entire fabric, allowing private
hosts to function as if they were attached to a physical loop switch. Fabric OS assigns a
phantom address to each private loop target device.

Brocade ISL Trunking


allows up to four ISLs between two switches to be viewed as one logical connection.

5. The SAN design process
Lifecycle of a SAN
Data Collection
Data Analysis
Architecture Development
Prototype and Testing
Transition
Release to Production
Maintenance

Tips and comments


Brocade does not recommend managing the fabric from one Ethernet port (in-band
management); use out-of-band management utilizing all Ethernet ports instead
If possible, do not use private loop devices; use fabric-aware devices
it is recommended that dual redundant fabrics are used to design an implementation
a host has connections to storage through two paths, and the two paths are on separate
fabrics

Performance Gathering
Windows NT: use the diskmon feature (from the resource kit) or perfmon
Solaris: iostat utility

Localizing and Groups


If you can localize traffic into specific areas of a SAN, you directly improve the SAN's
performance and reliability.
A SAN with a great deal of known locality might be constructed out of many separate
fabrics with no ISLs whatsoever.
A SAN with little or no known locality might require a high-performance ISL architecture.
Grouping is done by matching targets with initiators.
[Figure: two fabrics, SAN A and SAN B, each divided into Groups 1 to 4; Servers 1 and 2
are shown with SAN A and Servers 3 and 4 with SAN B]

if you are able to make relatively small performance groups, your SAN will benefit greatly
from applying the principle of locality
The amount of locality determines the number of ports needed for ISLs: high locality
= low ISL count, low locality = high ISL count

Port Count Determination
Add up the total number of ports required by hosts and storage; this is the total
number of exposed ports.
Divide this number by the number of redundant fabrics (2 or 3); the result is the
number of exposed ports per fabric.
Then account for overhead ports per fabric: all ISL ports and ports left open for
future growth.

Complex considerations
if you have distance considerations, add two ISLs per switch
if you have high performance needs and little locality, add two ISLs per switch

estimated number of switches = Ports per fabric (P) / (Ports per switch (Ps) - ISLs per switch (I))
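
The port count steps and the switch estimate can be worked through in a few lines. This is a minimal sketch under stated assumptions: the function and parameter names are illustrative, growth ports are added per fabric before dividing, and results are rounded up to whole switches. The sample numbers are hypothetical.

```python
import math


def estimate_switches(exposed_ports: int, fabrics: int,
                      growth_ports_per_fabric: int,
                      ports_per_switch: int, isls_per_switch: int) -> int:
    """Estimate switches per fabric as P / (Ps - I), rounded up."""
    if isls_per_switch >= ports_per_switch:
        raise ValueError("ISLs per switch must leave at least one free port")
    # Exposed ports are split across the redundant fabrics.
    ports_per_fabric = math.ceil(exposed_ports / fabrics) + growth_ports_per_fabric
    usable_ports_per_switch = ports_per_switch - isls_per_switch
    return math.ceil(ports_per_fabric / usable_ports_per_switch)


# Hypothetical SAN: 96 host and storage ports, 2 fabrics, 12 spare ports per
# fabric, 16-port switches with 4 ports reserved for ISLs.
print(estimate_switches(96, 2, 12, 16, 4))
```

With these sample numbers, P = 48 + 12 = 60 and each switch contributes 12 usable ports, giving an estimate of 5 switches per fabric.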

6. SAN Applications and Configurations
High Availability Microsoft Cluster

[Figure: an active cluster server and a standby cluster server, each with two HBAs
(hba1, hba2), connected through Fabric A and Fabric B to a shared disk array]

Storage Consolidation
A major advantage of a SAN is that storage can come online without having to take down
the host and reconfigure the SCSI bus.
When storage is directly attached to servers, it is difficult to reallocate space.
When storage is on a SAN, it can be brought online and allocated without rebooting
(W2K, Solaris).

Problem
Windows NT will assume that it owns storage it encounters and write a signature to the
disk. If that storage is owned by UNIX, the data on it could become corrupt.
use Brocade zoning to ensure that hosts don't step on other hosts' storage
to truly add storage on the fly without taking the file system offline, another layer of
file system functionality, such as Veritas Volume Manager, is needed

Shared Storage for a Web Farm


front-end IP load balancing
a centralized read-only storage pool on the SAN provides a central repository
A shared file system is required for this solution, allowing read-only access from
several hosts
This reduces cost, as content is centralized and consolidated in one place

Storage Partitioning using Switch Zoning
allows the segmentation of initiators and targets on a fabric
with JBOD, this allows individual disks to be zoned out to specific hosts
if your SAN contains multi-LUN devices (RAID arrays), note that zoning does not
currently work at the LUN level

Storage Partitioning using Storage LUN Masking


This is accomplished on the storage device itself
done by determining which LUN can be seen by which host, based on the WWNs of the
HBAs on the host
This is enforced in hardware and cannot be circumvented unless the HBA WWN is
spoofed

Storage Partitioning using HBA LUN Masking


This is accomplished via software on the host
HBA masks LUNs that the host does not have access to
Prone to security issues as rogue hosts can overtake storage that is owned by other
hosts causing corruption.

Partitioning with Software


the solution exists as drivers loaded on top of the file system
like HBA LUN masking, the software needs to be loaded on every machine on the SAN
to work
some approaches allow several hosts to have read/write access to the same LUN,
similar to a shared volume or shared file system
examples: Veritas Volume Manager, Tivoli SANergy, HP LUN Manager, etc.

LAN-Free Backup Configuration


traditional backup uses directly connected tape systems
to coordinate the growing number of individual tape systems, LAN backups became
prevalent
LAN backup involves a dedicated backup server on the LAN
backup jobs require large amounts of block data to be moved
backup traffic that traverses the Fibre Channel SAN is more efficient, since the traffic
does not need to go through the TCP/IP stack
a dedicated backup server is still required to manage the data movement from storage
to tape

SAN Server-Free backup


involves a data mover, which reads from the disk and writes directly to SAN-based shared
tape resources
a server never needs to be in the data path
data movers can be Network Data Management Protocol (NDMP) based or use the
Extended Copy command, sometimes called third-party copy
NDMP is an open standard protocol for enterprise-wide backup of heterogeneous
storage. NDMP clients and servers pass metadata about the backup job as well as the
data itself. This was traditionally done with the network-attached storage model.
Extended Copy is a SCSI command that allows a remote block-level copy to occur. The
host requesting the copy does not send the command directly to the devices in question;
instead, it sends the request to a third-party device, which then sends the command to
the appropriate targets. In general, Fibre Channel tape drives and FC-SCSI bridges use
Extended Copy, while NDMP is used with legacy hosts.
The best solution is a server-free, Fibre Channel based backup

SAN-based Third-Party Copy Data Movers
All data movers support the SCSI Extended Copy command. The third-party copy
hardware actually moves the data from the disk to tape. The backup software controls
this operation without the need for the servers in the network to get involved in the
actual movement of data. Agents are not required to run on the server, and critical
servers are not occupied backing up the data.

Remote Distance Solutions


Tunneling
it is possible to tunnel fibre channel over ATM using Brocade Remote Switch product and
an appropriate Fibre Channel to ATM converter.
Remote Switch is an optionally licensed product

Metropolitan Area Network Solutions


DWDM can be used to provide longer distances between switches, up to 100 km
relies on Extended Fabric to add buffer credits to the ports doing the tunneling so that
communication can remain constant
done by installing the Extended Fabric license, then changing the long-distance fabric
setting to 1 in the switch configure command

7. Developing a SAN architecture
Identifying Fabric Topologies and SAN architectures
Fabric: one or more interconnected Fibre Channel switches. The term refers either to the
physical switches or to a set of global software components such as routing tables,
zoning configuration, etc.
SAN: consists of one or more related fabrics and the connected edge devices.
Fabric Topology: the arrangement of the switches that form a fabric. Used in the
context of ISLs; it does not relate to the way the nodes are connected to the fabric.
Resilient Core/Edge Fabric Topology: a topology where two or more switches act as a
core to interconnect multiple edge switches. End nodes are connected to the edge
switches. A shorthand form states the number of edge switches, the number of core
switches, and the number of ISLs. Example: 16e x 4c x 1i means 16 edge switches, each
connected to each of the four core switches by one ISL.

16e x 4c x 1i
16 edge 4 core 1 isl (per edge)

Node: any device that attaches to a fabric (storage or host)
Node Count: number of nodes attached to a fabric
Fabric Port Count: number of ports available for connection by nodes in a fabric; the
generic term "port count" refers to all ports in a fabric
SAN Port Count: number of ports available for connection by nodes in the entire SAN (not
just the fabric)
SAN Architecture: overall design or structure of a storage network solution. This
includes one or more related fabrics, each of which has a topology.
Hop Count: number of ISLs that a frame must traverse to reach its destination.
Latency: 2 microseconds per switch (1 microsecond is typical); can be treated as
inconsequential
Over-Subscription: situation where more nodes could contend for the use of a
resource, such as an ISL, than that resource can handle. Most nodes cannot sustain full
Fibre Channel speeds, typically running at 50 to 80 percent of the full 100 MB/s. For
performance-limiting congestion to occur, several devices must not only all operate at
their peak at the exact same time, but must also sustain that activity. Most traffic is
bursty as well as relatively random, which limits the chances of this happening. The
exception to this is video streaming.
ISL over-subscription ratio: number of ports that could contend for the use of an ISL's
throughput. For an edge switch in a core/edge SAN, this is calculated as the ratio of the
number of free (non-ISL) ports on that switch to the number of ISLs.
Example: in a 16e x 4c x 1i, each edge switch has 4 ISLs (one for each core). Since each
edge switch has 12 free ports and 4 ISLs, the ratio is 3:1. The worst-case scenario is
15:1, but in reality, if those 15 ports are all accessing the same storage node, the
over-subscription also happens at the storage level and no network design will rectify
this situation.
Congestion: realization of over-subscription, where multiple devices are contending for
throughput. Point of clarification, congestion versus blocking: with blocking, nothing
gets through, whereas with congestion you just have to wait a bit.
Fabric Shortest Path First (FSPF): routing protocol across Fibre Channel based SANs.
Provides for load sharing between equal-cost links. Fabrics with equal-cost links (as
described above) benefit from this through reduced congestion. Fabrics with many ISLs
but few equal-cost paths, like the full mesh, do not benefit from FSPF load sharing.
Single Point of Failure (SPOF): any component that could bring down an entire SAN
solution.
Fan-out: the ratio of host ports to a single storage port. It is the view of the
SAN from the storage's perspective. Example: 10 hosts using one storage port = 10:1
fan-out.
Fan-in: the ratio of storage ports to a single host port. This is the view of the SAN from
the host's perspective: how many different storage devices will the host be trying to
access from a single HBA?
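
The core/edge shorthand and the ISL over-subscription ratio defined above lend themselves to a small calculation. This is an illustrative sketch only: the function names are hypothetical, and it assumes every edge switch has the same port count and the same number of ISLs to each core switch.

```python
import re
from fractions import Fraction
from typing import Tuple


def parse_core_edge(shorthand: str) -> Tuple[int, int, int]:
    """Parse shorthand such as '16e x 4c x 1i' into (edge, core, ISLs per core)."""
    pattern = r"\s*(\d+)\s*e\s*x?\s*(\d+)\s*c\s*x?\s*(\d+)\s*i\s*"
    match = re.fullmatch(pattern, shorthand)
    if match is None:
        raise ValueError(f"not a core/edge shorthand: {shorthand!r}")
    edge, core, isls = (int(group) for group in match.groups())
    return edge, core, isls


def isl_oversubscription(ports_per_switch: int, core: int,
                         isls_per_core: int) -> Fraction:
    """Ratio of free (non-ISL) ports to ISL ports on one edge switch."""
    isl_ports = core * isls_per_core
    if not 0 < isl_ports < ports_per_switch:
        raise ValueError("ISL ports must be between 1 and the port count - 1")
    return Fraction(ports_per_switch - isl_ports, isl_ports)


edge, core, isls = parse_core_edge("16e x 4c x 1i")
ratio = isl_oversubscription(16, core, isls)
print(f"{edge} edge switches, ratio {ratio.numerator}:{ratio.denominator}")
```

For 16-port edge switches, the 16e x 4c x 1i design yields the 3:1 ratio from the example, and a single ISL per edge switch yields the 15:1 worst case.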

Useful Topologies
(There can be no more than seven hops between initiator and target)

Scalability
there are two metrics for measuring scalability: the size that the topology can scale to
(in terms of port count and switch count) and the ease of performing this process.

Cascade Topology
a line of switches in which the end switches are not connected to each other
inexpensive and easy to deploy, but limited scalability
Best for situations where most if not all traffic can be localized onto individual switches,
and the ISLs are used primarily for management traffic.
Limit of Scalability: 114 ports / 8 switches

Ring topology
like the cascaded fabric but with the ends connected
superior reliability since traffic can get around any one ISL failure
Best when localization is high
Good when implementing SAN over MAN or WAN where ring topology is already
dictated.
good for starting small and staying small
ISLs used more for management than data
Limit of Scalability: 112 ports / 8 switches

Full mesh topology
every switch is connected directly to every other switch. Using 16-port switches, the
largest useful full mesh consists of eight switches, each of which has nine available ports,
for a total of 72 available ports.
Adding more than 8 switches in a full mesh significantly reduces the number of ports
available.
Scaling can require unplugging edge devices. Make sure you account for this by leaving
ports open or by ensuring that devices can withstand downtime on each switch.
Best used when your fabric will not grow beyond 4 or 5 switches, since the cost of ISLs
becomes prohibitive after that.
Fabrics using 6 or more switches are good candidates for a core/edge design, which will
cost less, perform better, and scale better.
Performance, though, is good on a full mesh since there will only ever be one hop to the
destination.
Limit of scalability: 72 ports / 8 switches
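
The scalability limits quoted for the cascade, ring, and full mesh topologies can be reproduced with simple port arithmetic. The sketch below assumes 16-port switches and exactly one ISL per switch-to-switch link; the function names are illustrative only.

```python
def cascade_ports(switches: int, ports_per_switch: int = 16) -> int:
    # A line of n switches has n - 1 ISLs, each consuming one port at both ends.
    return switches * ports_per_switch - 2 * (switches - 1)


def ring_ports(switches: int, ports_per_switch: int = 16) -> int:
    # Closing the line into a ring adds one more ISL: n ISLs in total.
    return switches * ports_per_switch - 2 * switches


def full_mesh_ports(switches: int, ports_per_switch: int = 16) -> int:
    # Every switch spends one port on each of the other n - 1 switches.
    return switches * (ports_per_switch - (switches - 1))


print(cascade_ports(8))    # 114
print(ring_ports(8))       # 112
print(full_mesh_ports(8))  # 72
for n in range(2, 13):
    print(n, full_mesh_ports(n))
```

Under this one-ISL-per-link model the full mesh port count peaks at 72 (reached with eight or nine switches) and shrinks as further switches are added.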

Partial Mesh Topology
the common definition is broad enough to cover every topology that is not a full mesh
Similar to a full mesh but with some of the ISLs removed.
Can scale farther than full meshes
Meshes (partial or full) are recommended for networks that will change infrequently,
since they are difficult to scale without downtime. This is because ISLs may need to be
disconnected in the process, which is disruptive, especially if redundant fabrics are not
used.
May also be used as static components of a SAN (a backup fabric, for example) or as the
core of a core/edge design.
Limit of Scalability: 176+ ports / 16+ switches

Core/Edge or Star Topologies
Two or more switches reside in the center of the fabric (the core) and interconnect a
number of other switches (the edge).
Hosts and storage connect to the edge switches using free ports (edge ports)
Free ports (if any) should be used as ISL connections to the edge switches.
easy to grow without downtime
easy to transition to future large core fabric switches, and good at providing investment
protection
economical
simple and easy to understand
well tested
full utilization of the FSPF protocol
performance is deterministic: you can easily determine how much bandwidth any given
switch has to get to any other switch
scalable to hundreds of ports (thousands in the future)

Simple resilient core/edge topology


two or more core elements each of which consists of a single switch
two core elements are recommended to maintain resilience
Limit of Scalability: 224 ports / 20 switches

Complex core resilient core/edge topology
two or more core elements each of which consists of multiple switches
most of the time this complex scenario is not necessary; it is only needed when more
than 16 edge switches are required.
Limit of Scalability: 300+ ports / 28+ switches

NOTE: This topology does NOT replace a redundant fabric SAN. For true resilience, two of
these fabrics are needed, running in parallel. Maintenance on one fabric will then not cause
downtime, since the other fabric will remain operational.

Composite resilient core/edge topology


has two or more cores; each core consists of two or more single-switch elements.

Working with the Core/Edge Topology
Adding an edge switch
Step one: set up the new switch by itself in the location where it will attach to the SAN,
but do not attach it to the SAN yet. Power it on and assign a host name and IP address.
Configure the Domain ID and other configuration parameters to fit into the existing fabric.
Ensure that no zoning configuration is in effect.
Step two: connect the switch. Issue the portDisable command for each port you
wish to use as an ISL:
portdisable 0
portdisable 1
portdisable 2

Now physically connect the cables and issue the portEnable command on the first port
(port 0). The fabric will reconfigure itself. Issue the fabricshow command to ensure
that the switch is successfully integrated into the fabric.
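
The command sequence for step two can be generated for any set of ISL ports. This is a sketch only: it builds the command strings and does not talk to a switch, the function name is hypothetical, and enabling the remaining ISL ports after the fabricshow check is an assumption (the text above only describes enabling the first port).

```python
def edge_switch_isl_commands(isl_ports):
    """Return (before_cabling, after_cabling) command lists for the new edge switch."""
    ports = sorted(set(isl_ports))
    if not ports:
        raise ValueError("at least one ISL port is required")
    # Disable every future ISL port before any cable is connected.
    before = [f"portdisable {p}" for p in ports]
    # Enable the first port and verify that the switch joined the fabric.
    after = [f"portenable {ports[0]}", "fabricshow"]
    # Assumption of this sketch: the remaining ISL ports are enabled afterwards.
    after += [f"portenable {p}" for p in ports[1:]]
    return before, after


before, after = edge_switch_isl_commands([0, 1, 2])
print("\n".join(before))
print("\n".join(after))
```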

Upgrading the Core (upgrading core switch)

Step one: Physically install the new switch and set its configuration elements, including
switch name, Domain ID, and IP address. Make sure there is no zoning information in
the switch. Disable the entire switch with the switchDisable command. Now telnet
into one of the existing core switches and issue the switchDisable command. Ensure
that all traffic is successfully crossing the remaining core switch (assuming two core
switches). Remove the old core switch and cable the new switch to all existing edge
switches.
Step two: Issue the switchEnable command on the new core switch and allow the
fabric to reconfigure itself. Issue the fabricShow command to ensure that the new
switch is successfully integrated into the fabric.
Step three: repeat the above steps on all core switches that must be upgraded.

Determining Levels of Availability
Resiliency is the capability of a fabric topology to withstand failures.
Redundancy is the duplication of components, up to and including the entire fabric, to
prevent the failure of the SAN solution. A completely redundant fabric environment is
the recommended path of SAN design.

Four Primary categories of availability of SAN architecture.


Single fabric, nonresilient: all switches form one fabric, which contains a single point of
failure.
Single fabric, resilient: all switches form one fabric, but there is no single point of failure
that could cause disruption in the fabric. Ring, mesh, and core/edge are examples of
this.
Dual fabric, nonresilient: half the switches form one fabric while the other half forms
another, separate fabric. Within each fabric there is a single point of failure.
Dual fabric, resilient: half the switches form one fabric while the other half forms
another, separate fabric. There is no single point of failure within each fabric that will
cause disruption within the fabric. This is generally the best approach to the design of a
fabric: true HA.
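
The four categories follow from two questions: how many fabrics are there, and does each fabric contain a single point of failure? A minimal sketch of that classification (the function and parameter names are illustrative):

```python
def availability_category(fabric_count: int, spof_within_fabric: bool) -> str:
    """Map a design onto one of the four availability categories."""
    if fabric_count not in (1, 2):
        raise ValueError("only single and dual fabric designs are covered here")
    fabrics = "Single fabric" if fabric_count == 1 else "Dual fabric"
    resilience = "nonresilient" if spof_within_fabric else "resilient"
    return f"{fabrics}, {resilience}"


print(availability_category(1, True))
print(availability_category(2, False))
```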

[Figure: Host, SAN A and SAN B, Storage]

Configuring Traffic Patterns
Leveraging Tiers
In most SANs today, traffic predictably flows between hosts and storage and rarely
between hosts.
Since traffic flows between host and storage, removing unused ISLs frees up ports
without affecting resiliency.

The fabric below has no traffic going horizontally between switches:

[Figure: fabric diagram with Server and Disk array]

By removing some ISLs resiliency is not affected:

[Figure: fabric diagram with Server and Disk array]

You can also view this structure as tiered between storage and host tier:

[Figure: tiered fabric with Server, Host Tier, Storage Tier, and Disk array]

The benefit of tiered SANs is that they do not need ISLs or bandwidth
optimization between switches on the same tier.
You can also have three tiers, with one for the core. Again, ISL
optimization between switches at the core tier is not important.
Tiered SANs aid in administration: when you need to add more hosts, you just
add a host switch; when you need more storage, you just add another storage
switch.

Exploiting Locality
You can attain the best performance in any network by localizing. Localizing means
putting ports that need to communicate closer together.
The more locality within a SAN fabric, the fewer ISL are needed for data communication.
While smaller SANs will not benefit as greatly from the use of locality as large SANs will,
all SANs will benefit somewhat. However, for low bandwidth applications, the
management benefits of organizing your edge devices in a tiered fashion are significant,
and zero percent locality can be acceptable.

Congestion on a SAN occurs when:


The SAN application is extremely bandwidth-intensive. Examples are video applications that
use a large I/O size (over 64 KB) and typically consume 80 MB/s to 100 MB/s of
throughput. More common are applications such as online transaction processing or
e-commerce, which typically use 2 KB to 8 KB blocks and need only about 16 MB/s of bandwidth.
The majority of I/O is streaming, as opposed to bursty, at peak utilization.
The majority of SAN devices support 100 MB/s or 200 MB/s. Remember, though, that most tape
systems only operate at 14 MB/s, so the congestion is not on the SAN itself but on the
edge device.
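
To put the throughput figures above side by side, the following back-of-the-envelope sketch counts how many sustained streams of a given rate fit on one link. It ignores protocol overhead and burstiness, and the function name is illustrative.

```python
def streams_per_link(link_mb_s: int, stream_mb_s: int) -> int:
    """How many sustained streams of stream_mb_s fit on a link of link_mb_s."""
    if link_mb_s <= 0 or stream_mb_s <= 0:
        raise ValueError("rates must be positive")
    return link_mb_s // stream_mb_s


print(streams_per_link(100, 80))   # video stream at 80 MB/s on a 100 MB/s link
print(streams_per_link(100, 16))   # OLTP-style load at about 16 MB/s
print(streams_per_link(100, 14))   # tape drives at 14 MB/s
```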

8. SAN Troubleshooting
Troubleshooting Approach: The SAN is a virtual Cable
when troubleshooting a simple configuration (single switch, single host, and
single storage), you need to focus on the HBA, the GBIC, the host OS, the switch, and the
storage.

I cannot see my Disks


ensure host and storage are connected via the switchShow command. This command shows
what devices are connected and whether the ports are successfully initialized. If not
connected, focus on port initialization.
If the device is connected to the switch, ensure the device is entered in the Simple
Name Server (SNS) by issuing the nsShow command. This command shows all SNS
entries local to that switch; by contrast, nsAllShow gives entries for the global SNS.
you should start like this, in the middle of the path, and search outward

Where to Start and What to Gather


as stated before, SAN troubleshooting should begin in the middle of the SAN and
proceed outward.
Take a snapshot: describe the problem and gather information. Formulate a statement
describing the problem including the bad behavior and a statement about what you are
doing or have done.
Questions to ask in the beginning:
o can the problem be duplicated on demand?
o is the problem intermittent and if so how often?
o has anything changed recently on the fabric? if so what?
o is the problem localized or fabric-wide? is it only on this switch or every switch
in the fabric
o is this the initial install, or was it running and is now broken?
Also do:
o collect any and all error messages
o collect firmware and driver versions for HBAs
o external switch information including port LED state
o diagram of SAN configuration
o if long distance is included what distance? quality of links?
o supportShow output

Troubleshooting Tools
Using the switch LEDs
fast flickering green is good
identify a potentially disruptive device by a port that goes offline (no LED), sends light
(steady yellow), comes online (steady green), and then cycles through the process again.
This is a disruptive device. On QuickLoop this would be causing LIPs throughout the QuickLoop
a slowly flashing switch power LED indicates that the switch failed the power-on self test
(POST)
LED indications:

Ports                            Description
no light                         no light or signal; no GBIC module or cable
steady yellow                    receiving light but not yet online
slow yellow (flashes 2 secs)     disabled due to switchDisable or portDisable
fast yellow (flashes 1/2 sec)    error, fault with port
steady green                     online (connected with external device over cable)
slow green (flashes 2 secs)      online, but segmented
fast green (flashes 1/2 sec)     internal loopback
flickering green                 online and frames being forwarded

Switch Diagnostics
diagHelp             list of diagnostic commands
ramTest              system DRAM diagnostic
portRegTest          port register diagnostic
centralMemoryTest    central memory diagnostic
cmiTest              CMI bus connection diagnostic
camTest              QuickLoop CAM diagnostic
portLoopbackTest     port internal loopback diagnostic
sramRetentionTest    SRAM data retention diagnostic
cmemRetentionTest    central memory data retention diagnostic
crossPortTest        cross-connected port diagnostic
spinSilk             cross-connected line-speed exerciser
diagClearError       clear diag error on specified port
diagDisablePost      disable POST on reboot
diagEnablePost       enable POST on reboot
setGbicMode          enable tests only on ports with GBICs
setSplbMode          enable 0=dual, 1=single port LB mode
supportShow          print version, error, portLog, etc.
parityCheck          DRAM parity, 0=disabled 1=enabled
spinFab              ISL link diagnostic
loopPortTest         L_Port cable loopback diagnostic

Helpful commands
generally the help <command> will give a man page for the command

errShow command
list of 64 logged errors; the list is cleared upon reboot
also logs environmental errors
consider using the syslog facilities of the switch for persistent storage of errors:
syslogIpAdd, syslogIpRemove, and syslogIpShow

portErrShow command
effective for marginal ports
the key is to look for a high number of errors relative to frames transmitted and received.
The guideline is to look for errors in excess of 0.5 percent of the total number of frames
transferred.
enc_in: received data, number of 8b/10b encoding errors. Reinitialization of N_Ports
will cause this
crc_err: indicates the contents of a frame are no longer valid
too_long: frame larger than the allowable Fibre Channel frame size (header plus 2112-byte
payload)
bad_eof: number of frames received with a bad end of frame
enc_out: receive link, the number of 8b/10b encoding errors recorded outside frame
boundaries.
er_disc_c3: number of class 3 frames discarded (due to timeouts or unreachable
destinations)
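
The 0.5 percent guideline can be applied mechanically once the counters have been collected. The sketch below assumes the counters were already copied into a dictionary per port; the keys frames_tx and frames_rx are hypothetical names, and summing all error counters before comparing against the frame total is an assumption of this example rather than a documented rule.

```python
ERROR_COUNTERS = ("enc_in", "crc_err", "too_long", "bad_eof", "enc_out", "er_disc_c3")


def marginal_ports(stats, threshold=0.005):
    """Return ports whose summed errors exceed threshold * frames transferred."""
    flagged = []
    for port, counters in sorted(stats.items()):
        frames = counters["frames_tx"] + counters["frames_rx"]
        errors = sum(counters.get(name, 0) for name in ERROR_COUNTERS)
        if frames > 0 and errors / frames > threshold:
            flagged.append(port)
    return flagged


sample = {
    0: {"frames_tx": 1000, "frames_rx": 1000, "crc_err": 5},
    1: {"frames_tx": 1000, "frames_rx": 1000, "crc_err": 12, "enc_in": 8},
}
print(marginal_ports(sample))
```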

switchShow command
when troubleshooting issues that involve fabric services or the switch's ability to
participate in the fabric, the important parts of switchShow are:
o switchState (online, offline, testing, faulty)
o switchRole (principal, subordinate, disabled)
o switchDomain
if running in a fabric, switchState should be online
there should only be one principal switch in a fabric
there should be no duplicated switchDomains; valid domain ranges:
o 1000 series 0-31
o 2000 series 1-239
switchId is the 24 bit address of the switch in the fabric
switchType indicates which model the switch is:
o 1: 1000 series
o 2: 2800
o 3: 2400
o 4: 20x0
o 5: 22x0
the terms upstream and downstream indicate a switch's position relative to the principal
switch

switchName: brocade2b
switchType: 2.4
switchState: Online
switchRole: Principal
switchDomain: 1
switchId: fffc01
switchWwn: 10:00:00:60:69:10:5e:a7
port 0: id No_Light
port 1: id No_Light
port 2: id No_Light
port 3: id No_Light
port 4: id No_Light
port 5: id No_Light
port 6: id No_Light
port 7: id No_Light
port 8: id No_Light
port 9: id No_Light
port 10: id No_Light
port 11: -- No_Module
port 12: -- No_Module
port 13: -- No_Module
port 14: id No_Light
port 15: id No_Light
brocade2b:admin>
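
Output like the sample above is easy to split into header fields and per-port states for scripted checks. This is a minimal sketch that assumes the "key: value" layout shown in the sample and that the port state is the second field of each port line; other firmware versions may format the output differently.

```python
SAMPLE = """\
switchName: brocade2b
switchType: 2.4
switchState: Online
switchRole: Principal
switchDomain: 1
switchId: fffc01
switchWwn: 10:00:00:60:69:10:5e:a7
port 0: id No_Light
port 11: -- No_Module
brocade2b:admin>
"""


def parse_switchshow(text):
    """Return (header, ports): header fields and a port-number -> state mapping."""
    header, ports = {}, {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line
        key, value = key.strip(), value.strip()
        if key.startswith("port "):
            fields = value.split()
            # Assumption: the state is the second field, as in the sample.
            ports[int(key.split()[1])] = fields[1] if len(fields) > 1 else ""
        elif key.startswith("switch"):
            header[key] = value
        # Anything else (for example the CLI prompt) is ignored.
    return header, ports


header, ports = parse_switchshow(SAMPLE)
print(header["switchState"], header["switchRole"], header["switchDomain"])
print(ports)
```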

nsShow command
the most important thing to check here is whether the device you are concerned with
exists in the local SNS
shows devices in the local SNS only, not the global one
for the total number of Name Server entries in the fabric, use nsAllShow

topologyShow command
displays the fabric topology as seen by the local switch
lists all the switches in the fabric and all the possible paths to reach those switches
Port addressing:
XX 1Y ZZ
XX is a value between 0x01 and 0xef that indicates the domain ID of the switch
1 will always exist (native mode) with 2000 series switches
Y is the port number
ZZ is the AL_PA for a loop device or 00 for an F_Port
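
The XX 1Y ZZ layout can be decoded with a few bit operations. A sketch under the layout described above; the function name and the returned dictionary keys are illustrative.

```python
def decode_port_address(address: int) -> dict:
    """Split a 24-bit address laid out as XX 1Y ZZ into its parts."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    area = (address >> 8) & 0xFF          # the middle byte, 1Y
    return {
        "domain": (address >> 16) & 0xFF,  # XX: domain ID of the switch
        "native_mode": (area >> 4) == 1,   # leading 1 of the middle byte
        "port": area & 0x0F,               # Y: port number
        "al_pa": address & 0xFF,           # ZZ: AL_PA, or 00 for an F_Port
    }


print(decode_port_address(0x011500))
```

For 0x011500 this yields domain 1, port 5, and AL_PA 0, which is an F_Port on port 5 of the switch with domain ID 1.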

Troubleshooting the Fabric


What to look for:
verify that hosts can see all the storage they should, via format in Unix, Disk Admin in
Windows, or /var/adm/messages in Solaris
look for a switch with an unconfirmed domain in the switchShow output; this means
the switch was unable to communicate with the principal switch to obtain a domain ID.
by running the topologyShow command, verify that you can see all the switches. If
one is missing, log into that switch and issue a portDisable followed by a portEnable on
the E_Port to reset the ISL.

Timeouts
if the SAN experienced a reconfiguration, a host that is online might time out. Most
hosts will retry, but verify with the specific host.
if there are genuine timeouts, you can increase the Resource Allocation Time Out Value
(R_A_TOV) or the Error-Detect Time Out Value (E_D_TOV)

Timeout of edge devices during Fabric Bring Up


if you suspect a PLOGI/FLOGI timeout failure during the fabric convergence, you can
confirm your suspicions by reviewing the host logs.
You can determine the SAN state by issuing a topologyShow command

Port Configuration Conflict or Missing Fabric License


if the switch does not have a fabric license you cannot have E_Ports. switchShow will
indicate that the E_Ports are unknown.
issue the licenseShow command to ensure that the fabric license is installed.
it is possible to configure the switch so that other switches cannot ISL into it, by
using portCfgEport to disable E_Port functionality

Segmented Fabrics
You will know when a fabric segments (slow blinking green LED ISL lights).
when issuing the switchShow command the E_Port will show up as unknown

Zone Conflict
A fabric will segment when the zoning configuration is not consistent. In most cases it is
easier to clear the configuration of one switch (most likely the new one) and let it absorb
the existing configuration.
o Multiple zoning configurations enabled will create a zoning conflict; only one
configuration may be active at one time.
o Zone definition type conflict: happens when a configuration is defined but the
definition types (alias, zone) are in conflict. Example: cfg1 has red as an alias
where cfg2 has red as a zone
o Zone definition content conflict: happens when a configuration is defined with no
type conflicts, but the content of the configuration is in conflict (different port
definitions in an alias, for example)
These can be remedied by:
o cfgClear <configuration you want to delete> followed by a
o cfgDisable <active configuration you want to delete>
keep in mind that all elements in the configure command must be identical (except
domain ID of course), otherwise the fabric will segment.
if there is a domain conflict:
o switchDisable followed by a switchEnable to gain a new domain ID
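
The type and content conflicts described above can be checked offline before two fabrics are merged. The sketch below assumes each switch's zoning definitions were exported by hand into a dictionary mapping a name to a (type, members) pair; this data model and the function names are hypothetical and are not a switch API.

```python
def type_conflicts(cfg_a, cfg_b):
    """Names defined on both sides with different definition types (alias vs zone)."""
    shared = cfg_a.keys() & cfg_b.keys()
    return sorted(n for n in shared if cfg_a[n][0] != cfg_b[n][0])


def content_conflicts(cfg_a, cfg_b):
    """Names with the same type on both sides but different member lists."""
    shared = cfg_a.keys() & cfg_b.keys()
    return sorted(
        n for n in shared
        if cfg_a[n][0] == cfg_b[n][0] and set(cfg_a[n][1]) != set(cfg_b[n][1])
    )


cfg1 = {"red": ("alias", ["1,2"]), "blue": ("alias", ["1,3"])}
cfg2 = {"red": ("zone", ["1,2"]), "blue": ("alias", ["1,4"])}
print(type_conflicts(cfg1, cfg2))     # red is an alias in cfg1 and a zone in cfg2
print(content_conflicts(cfg1, cfg2))  # blue has different port definitions
```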

Message Queue Errors


you can detect a message queue error (MQ) by looking for either M or Q in the error
message.
can result in edge devices dropping from the Name Server or preventing a switch from
joining the fabric.

Troubleshooting Devices that cannot be Seen


first step is to determine whether the missing device problem is a fabric or a local issue.
ensure all switches are online on the fabric
o topologyShow
o nsAllShow to determine whether you see all the devices you should on the
fabric.
verify the missing devices can be seen via the switchShow command (you can see
their WWN). If you can see the device(s) with this command, troubleshoot the fabric
as a virtual cable.
if you cannot see the device, ensure that the ports in question are initialized properly.
You may have a port initialization problem or a marginal link problem.
if the port is not online or is showing up as a G_Port, this is analogous to a disconnected
cable. A quick way of troubleshooting is to inspect the LEDs on the devices in question.
you may need to lock in the port to a specific configuration
o portCfgEport
o portCfgFAport
o portCfgLport
ensure that the port also is not connected to a port that is in QuickLoop
o qlShow
o qlDisable
o qlPortDisable <port #>
if your end device does not show up in the Name Server, reinitialize the port with
portDisable followed by portEnable

Troubleshooting Marginal Links
A marginal link is defined as a port which is receiving a marginal incoming signal, or whose
switch receiver is not functioning properly.
A failed Nx_Port would be caused by a bad GBIC or cable
A failed Fx_Port would be caused by a failed switch optical component or switch port

Nx_Port (host/storage) behavior with a marginal port in the loop


when a marginal port interrupts a loop (LIP), there is latency on the loop
hosts are described as slow
green lights mixed with yellow lights indicate that the ports are resetting themselves.

Marginal GBIC /Cable


you can determine marginal GBIC or cable via er_enc_out statistic

LIP count
a port (on a loop) with a larger Lip_in than a Lip_out count indicates that the
associated device is guilty of the LIP activity.
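
The Lip_in versus Lip_out comparison lends itself to a quick scripted check. The sketch assumes the two counters were already collected per port into (lip_in, lip_out) pairs; the function name and data layout are illustrative.

```python
def lip_suspects(counts):
    """Return loop ports whose Lip_in count is larger than their Lip_out count."""
    return sorted(port for port, (lip_in, lip_out) in counts.items() if lip_in > lip_out)


# Made-up counters: port 3 receives far more LIPs from its device than it sends out.
print(lip_suspects({3: (120, 4), 4: (4, 120), 5: (0, 0)}))
```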

9. SAN Implementation, maintenance, and management
Installation Guidelines
Brocade recommends using out of band (Ethernet) connectivity to each and every switch
even though it is possible to manage in band (IP over Fibre Channel)
Hard or Soft Zoning: if you define a zone (or alias) with a WWN, you are using soft
zoning. A combination of port number and WWN still indicates soft zoning. Port
numbers alone indicate hard zoning.
when adding a new switch to a fabric, ensure that it is empty zoning-configuration-wise:
delete the defined zones, then disable the configuration (so it is not in local memory),
then save the configuration
o cfgClear
o cfgDisable
o cfgSave
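
The hard versus soft zoning rule above can be expressed as a small check over a zone's member list. The sketch assumes members are written either as "domain,port" pairs or as colon-separated WWNs; the function name and the regular expressions are illustrative.

```python
import re

WWN = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){7}$")
PORT = re.compile(r"^\d+,\d+$")


def zoning_type(members):
    """Return 'hard' if every member is a (domain,port) pair, otherwise 'soft'."""
    if not members:
        raise ValueError("a zone needs at least one member")
    kinds = set()
    for member in members:
        member = member.strip()
        if WWN.match(member):
            kinds.add("wwn")
        elif PORT.match(member):
            kinds.add("port")
        else:
            raise ValueError(f"unrecognized member: {member!r}")
    # Any WWN in the zone, alone or mixed with ports, means soft zoning.
    return "hard" if kinds == {"port"} else "soft"


print(zoning_type(["1,3", "2,5"]))
print(zoning_type(["1,3", "10:00:00:60:69:10:5e:a7"]))
```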

10. What exactly is Fabric Assist?
General
QuickLoop allows a private host initiator (8 bit address) to communicate with a public
target.
Remember: Translative mode allows a public host initiator to communicate with a
private target; this is the default on a Brocade switch.
How QuickLoop works:
o places a phantom address of the private host (initiator) on the public target's
loop.
o with this, the public target (storage) can now only participate on the
QuickLoop. It CANNOT participate with the rest of the Fabric
Fabric Assist allows storage to participate with QuickLoop and the rest of the Fabric at
the same time.
How Fabric Assist works:
o places a phantom address of the public target (storage) on the private
initiator's loop.
o with this, the storage is not locked into a QuickLoop. The storage can
participate with the QuickLoop and with the rest of the Fabric.
How to configure Fabric Assist:
o The private host must be placed into an FA zone
o Public storage is not put into QuickLoop, but it is placed into the same FA zone
o Public storage can also be placed in any other fabric zone to communicate
successfully with public hosts (initiators)

Limitations and Considerations of Fabric Assist and QuickLoop
private storage cannot be on the same switch as private host in fabric assist
public storage can be on the same switch as private host in fabric assist
version 2.4.x is minimum version supporting FA
only one host (initiator) is allowed in an FA zone, but the zone may have multiple targets
a port that is in QuickLoop cannot be in Fabric Assist
The node identified by the H{ } syntax must be a single private host
If the private host is identified by its WWN, then it must support probing
If the host rejects the probe from the switch then it can only be zoned via (domain,port)
An FA host must be on a port by itself, otherwise it will fail
Private targets and a private host on the same switch cannot be included in a QLFA zone
If only 1 switch exists, then the classic QuickLoop must be used
Fabric and public targets may exist anywhere (Same switch is allowed)
Fabric Assist hosts may not exist on the same switch as QuickLoop
Fabric Assist targets may co-exist on a switch that has QL enabled
A maximum of 125 unique targets may be zoned with private hosts on any single switch

Configuration of Fabric Assist
A new set of commands starting with fazone mirrors the other zoning commands and works
with the existing alias commands
Commands:
fazoneCreate    creates a new Fabric Assist zone
fazoneAdd       adds members to a Fabric Assist zone
fazoneDelete    removes a Fabric Assist zone
fazoneRemove    removes members from a Fabric Assist zone
faShow          shows the PID of each Fabric Assist host on the switch and a list of each
                zoned target port with assigned AL_PA values
faStatShow      displays Fabric Assist statistics
You must also use the existing cfg commands to add the zones to the configuration and
save the configuration.
Example: fazoneCreate fazone1, H{1,5} ; 1,3 ; 2,5
Remember that the Fabric Assist host (only one per FA zone) must be designated by
H{ }
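
The one-host rule can be enforced when the command line is assembled. The sketch below only builds a string in the same form as the example above; the function name is hypothetical, and the exact quoting a given firmware release expects on the command line is not covered here.

```python
def fazone_create(name, host, targets):
    """Build a fazoneCreate command with exactly one H{ } host member."""
    if not host:
        raise ValueError("an FA zone needs exactly one host")
    if not targets:
        raise ValueError("an FA zone needs at least one target")
    # The single private host goes first, wrapped in H{ }; targets follow.
    members = ["H{" + host + "}"] + list(targets)
    return f"fazoneCreate {name}, " + " ; ".join(members)


print(fazone_create("fazone1", "1,5", ["1,3", "2,5"]))
```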
