
Best practice [ETERNUS DX S3 Storage Cluster] FUJITSU CONFIDENTIAL


Best practice FUJITSU Storage ETERNUS DX S3 Storage Cluster – Technical Info

This document will give you some technical information about the new ETERNUS “Storage Cluster” feature. It will help you to understand how to configure, manage and use this new function which enables an ETERNUS DX S3 storage system to get high availability by connecting two ETERNUS DX S3 storage devices.

Content
Introduction
Overview
Requirements
Software
Licenses
Storage Cluster setup and configuration
Storage Cluster configuration
Storage Cluster allocating Business Volumes
Storage Cluster Controller setup
Storage Cluster processing
Storage Cluster bi-directional information and setup
Recovery procedure caused by defect RAID Group
Preconditions
1. Step
2. Step
3. Step
4. Step
5. Step
6. Step
Appendix
Fibre Channel Switch read-only discovery
Status of TFO Group Information
Recommendations
TFO Checklist
Abbreviations


Introduction

The ETERNUS “Storage Cluster” is a high-availability feature of the ETERNUS DX S3 family of storage devices. Volumes used in a Storage Cluster configuration, named Transparent Failover Volumes (TFOV), are paired and mirrored from the Primary storage system to the Secondary storage system by Remote Equivalent Copy (array-based replication using the synchronous REC mode of the Advanced Copy function). In the normal state, the Fibre Channel (FC) ports configured on the Primary site are linked up and the ones on the Secondary site are linked down, so that the business servers issue I/O to the Primary storage system. The Storage Cluster Controller is connected by LAN to both ETERNUS DX S3 storage devices for heartbeat monitoring. In a Storage Cluster configuration set up in automatic failover mode, it is responsible for avoiding any kind of “split-brain” scenario. If the Primary storage system crashes, the Storage Cluster Controller is required for the decision to switch operation over to the Secondary storage system (automatic failover). If the Storage Cluster Controller is not connected, a user needs to perform a manual failover to the Secondary storage system. When the failover is invoked, the Fibre Channel (FC) ports configured on the Primary storage system link down and the ones on the Secondary storage system link up, taking over the volume information including the WWN/WWPN of the Primary site, so that the business servers issue I/O to the Secondary storage system. To achieve this functionality, a user needs to configure the “Storage Cluster” feature using ETERNUS SF.

Overview

The ETERNUS “Storage Cluster” is a function which gives the storage system high availability by connecting two ETERNUS DX S3 storage systems. One of them is the “Primary” storage system and the other is the “Secondary” storage system. If the Primary (active) storage system is no longer available due to a hardware failure or an unexpected disaster, the I/O paths (host connections) of the working business servers are switched to the mirrored Secondary (standby) storage system. In “Auto Mode” configuration this failover is transparent for both servers and applications and ensures uninterrupted operations. Additionally, a user can initiate a manual failover from the Primary (active) storage system to the Secondary (standby) storage system at any time. This may be necessary when a RAID Group hosting the volumes (TFOV) used in the “Storage Cluster” configuration is destroyed due to several disk failures while the ETERNUS DX S3 storage system itself is still up and running. Another reason for a manual failover is storage system downtime due to hardware maintenance or firmware upgrades.


The picture above illustrates the functional design of the Storage Cluster feature in a single-sided Transparent Failover (TFO) configuration.
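The failover behaviour described in the Introduction and Overview can be summarised as a small decision function. This is only an illustrative sketch and not how the ETERNUS firmware or the Storage Cluster Controller is implemented; the names `FailoverMode` and `failover_action` are hypothetical, and the function encodes nothing beyond the rules stated in this document.

```python
from enum import Enum


class FailoverMode(Enum):
    AUTO = "auto"
    MANUAL = "manual"


def failover_action(primary_available: bool,
                    mode: FailoverMode,
                    controller_connected: bool) -> str:
    """Return what happens next for a TFO Group (illustrative model only)."""
    if primary_available:
        # Normal state: Primary ports are linked up, Secondary ports are linked down.
        return "none"
    if mode is FailoverMode.AUTO and controller_connected:
        # The Storage Cluster Controller takes the switchover decision.
        return "automatic failover"
    # In manual mode, or without a connected controller, a user must act.
    return "manual failover required"
```

For example, `failover_action(False, FailoverMode.AUTO, False)` yields "manual failover required", which reflects the statement that a user has to perform the failover when the Storage Cluster Controller is not connected.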


Requirements

For the Storage Cluster feature you need to connect two ETERNUS DX S3 storage systems used as a pair. Each of them can be an ETERNUS DX100 S3, ETERNUS DX200 S3, ETERNUS DX500 S3 or ETERNUS DX600 S3. The Storage Cluster feature requires firmware version V10L20-000 or later. In addition you need a server running the ETERNUS SF V16.1 Manager software. The operating system used on that server can be Windows, Linux or Solaris. Read the “ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide” for details about the supported versions of each operating system.

It is strongly recommended to use a dedicated server for running the Storage Cluster Monitoring (ETERNUS SF V16.1 Storage Cruiser Agent) software. Note: The operating system of this server must be Windows-based. This might change in the future.

The zoning at the Fibre Channel (FC) switches for the business server connections to the ETERNUS DX S3 storage systems must be WWPN-based Fibre Channel (FC) zoning only.

Note: Fibre Channel (FC) ports used by the Storage Cluster feature must not be members of any “Port Group” on either ETERNUS DX S3 storage system and should have exactly the same settings (speed, topology etc.) on the Primary (active) and the Secondary (standby) storage system. In addition, “Host Affinity” must be enabled for these Fibre Channel (FC) ports. This can be checked within ETERNUS SF V16.1 for each ETERNUS DX S3 storage system under Connectivity -> FC Port.


The volumes (TFOV) used by the “Storage Cluster” feature must be created with identical size, and the host LUN numbers used in each LUN Group must be identical on both ETERNUS DX S3 storage systems. While configuring the Storage Cluster functionality, make sure that nobody holds a lock on (that is, is working with the ETERNUS DX S3 HW-GUI of) either of the two ETERNUS DX S3 storage systems involved in the Storage Cluster feature.

Ports used for the REC Path for the Storage Cluster feature can be configured as “RA” or “CA/RA”. The latter is not recommended because it affects the performance of the Storage Cluster volumes (TFOV) used by the business servers. In addition, you cannot attach business servers using the Storage Cluster feature to “CA/RA” ports. These business server connections need dedicated “CA” ports.

The REC Path must be configured using ETERNUS SF V16.1 or the ETERNUS DX S3 HW-GUI; otherwise the Storage Cluster setup can’t be configured.

Note: Don’t remove LUNs from a “LUN Group” used by a “TFO Group” which is in Phase = Maintenance!

Software

The Storage Cluster functionality can be set up, configured, managed and checked through the Web Console of the ETERNUS SF V16.1 Manager software. There are two options for initiating a failover from the Primary (active) storage system to the Secondary (standby) storage system.

Automatic Failover

Manual Failover

The Storage Cluster Monitoring function is provided by the ETERNUS SF V16.1 Storage Cruiser Agent software.

Licenses

For each discovered ETERNUS DX S3 storage system used for the Storage Cluster feature you need to purchase and register the following licenses:

ETERNUS SF Storage Cruiser V16 Standard License

ETERNUS SF Storage Cruiser V16 Storage Cluster Option

ETERNUS SF AdvancedCopy Manager V16 Remote Copy License


Storage Cluster setup and configuration

Install the ETERNUS SF V16.1 Manager software on one of your servers. Additional information related to the installation can be found in the “ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide”. Afterwards you need to discover your two ETERNUS DX S3 storage systems and register all licenses required by the Storage Cluster feature for both of them. It is strongly recommended to discover the Fibre Channel (FC) switches as well, so that SNMP traps are received in case of problems at the switches. This can be done in a read-only way so that ETERNUS SF V16.1 isn’t able to modify any switch configuration. See the Appendix for details about the read-only discovery of Fibre Channel (FC) switches. It is also recommended to install the ETERNUS SF V16 Storage Cruiser Agent software on your business servers and to discover these servers in ETERNUS SF V16.1. This enables the graphical end-to-end correlation view within the ETERNUS SF V16 Manager GUI for these servers. Set up the WWPN zoning between the ETERNUS DX S3 storage systems and your business servers at the Fibre Channel (FC) switches first. Afterwards start your setup of the Storage Cluster functionality.

The setup and configuration of the Storage Cluster feature needs to be done at the ETERNUS SF V16.1 Manager GUI. All related settings needed for the Storage Cluster setup and configuration can be found under the “Connectivity” and the “Storage Cluster” selection in the category pane of a discovered ETERNUS DX S3 storage system in the ETERNUS SF V16.1 Manager GUI.


As a rule of thumb you should use self-explanatory names for all configuration elements used by the Storage Cluster functionality, such as FC Hosts (e.g. PRI_SRV01_HBA0 and SEC_SRV01_HBA0), LUN Groups (e.g. PRI_SRV01_LG and SEC_SRV01_LG) and TFO Groups (e.g. DX600_to_DX500 or DX500#1_DX500#2). Please be aware that any name you use for the configuration of the Storage Cluster feature should not exceed 16 characters. In addition, you should always start the setup of the Storage Cluster feature at the Primary (active) ETERNUS DX S3 storage system within the ETERNUS SF V16.1 Manager GUI.
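The 16-character limit is easy to overlook when names are planned in a spreadsheet beforehand. The following minimal sketch checks a list of planned names against that limit; the helper name `too_long_names` is hypothetical and the check covers only the length rule stated above, not any character restrictions the storage system may additionally enforce.

```python
MAX_NAME_LENGTH = 16  # limit stated in this document


def too_long_names(names):
    """Return the planned names that exceed the 16-character limit."""
    return [name for name in names if len(name) > MAX_NAME_LENGTH]
```

Calling `too_long_names(["PRI_SRV01_HBA0", "DX500#1_DX500#2", "PRIMARY_SERVER01_HBA0"])` returns only the last entry, because the first two names are 14 and 15 characters long.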


Storage Cluster configuration

Switch to the Primary (active) ETERNUS DX S3 storage system within the ETERNUS SF V16.1 Manager GUI and enter the “Storage Cluster” section. There you will find everything related to the Storage Cluster “TFO Group” and an entry point for creating a REC Path used by the Storage Cluster feature.


You should start to create a REC Path first. The REC Path configuration will be done by the well-known REC Path configuration wizard. Additional information about the creation of an ETERNUS DX S3 REC Path is available in the ETERNUS SF V16 documentation. Supported protocols used by the REC Path for the Storage Cluster feature are “FC” and “iSCSI”. It is strongly recommended to use at least one port of each CM of the two ETERNUS DX S3 storage systems for the REC Path configuration.

Note: Because the REC configuration always runs in synchronous mode, you should change the “Priority Level” on each ETERNUS DX S3 storage system involved in the Storage Cluster functionality to the highest number.

This setting can be done using the ETERNUS DX S3 HW-GUI only. Please refer to “Advanced Copy” -> “Settings” -> “Copy Path” -> “Modify REC Multiplicity” to modify the “Priority Level”.


After the REC Path configuration is done, you should start with the creation of your Storage Cluster “TFO Group”. The “Set” button in the “Action” pane could be used to create a new or modify an existing and selected Storage Cluster “TFO Group”.


Select the “Remote Disk Array” from the list of available storage systems. Because we started the creation of our Storage Cluster “TFO Group” at the Primary (active) ETERNUS DX S3 storage system, the “Local” option must be selected for the “Primary Disk Array”. Enter the name of this “TFO Group” and choose your “Failover Mode”.


The “Split Mode” settings are related to the status of the REC Path. To achieve application consistency for any case of automatic failover to the Secondary (standby) ETERNUS DX S3 storage system, you may select “Read” as the “Split Mode”. Note: If you select the “Read” option the business servers will get an I/O error for write requests in case the REC Path is broken.


Last but not least you need to select the Fibre Channel (FC) port pairs used by the Storage Cluster feature.


Note: You can’t use “CA/RA” ports for the creation of the Fibre Channel (FC) port pairs used by the Storage Cluster feature.


Storage Cluster allocating Business Volumes

The next step is to register all WWNs of your business servers’ Host Bus Adapters (HBA). Open the “Connectivity” category, select “Host” and press the “Add FC Host” button to set up the FC Host.


You should find all WWNs of your business servers’ HBAs already connected to Fibre Channel (FC) ports (e.g. CM#0 CA#0 Port#3) at the Primary (active) ETERNUS DX S3 storage system.


Therefore identify the Channel Adapter (CA) port on that ETERNUS DX S3 storage system and register a name for each WWN. As mentioned above, use dedicated names (e.g. PRI_SRV01_HBA0) to identify these FC Hosts for future reference. You should note down the WWNs for the manual registration of each “FC Host” at the Secondary (standby) ETERNUS DX S3 storage system later on.


Enter the name of this “FC Host”, select the “Host Response”, press the “Next” button and confirm your settings on the next screen. Repeat these steps for all WWNs of your business servers’ Host Bus Adapters (HBA) connected to the Primary (active) ETERNUS DX S3 storage system. After you have finished this part, you need to set up the LUN Group including the volumes for your business servers. Select “Affinity/LUN Group” in the “Connectivity” category of the Primary (active) ETERNUS DX S3 storage system and create the LUN Group.



Again you should choose a self-explanatory name for the LUN Group used by the Storage Cluster feature. Enter the “Host LUN Number” for the selected volumes and add them to the list of “Assigned Volumes”. Press the “Next” button and confirm your settings at the next screen.


Note: You should write down the “LUN No.” including the “Capacity” of each volume added to the list of “Assigned Volumes” for the creation of the corresponding LUN Group at the Secondary (standby) ETERNUS DX S3 storage system later on.

Note: After adding the volumes to the LUN Group you need to check the reservation status of each volume using the ETERNUS DX S3 HW-GUI. If there are still persistent reservations left over you need to remove them from each volume first. Select the volume and use the “Release Reservation” Action button for this purpose. The picture below will show details about “Reservation”.



After finishing this part of the Storage Cluster feature setup, you need to create the Host Affinity for your business servers. Switch to the “Host Affinity” section in the “Connectivity” category pane of the Primary (active) ETERNUS DX S3 storage system.


Press the “Create” button and start configuring the Host Affinity using the created “FC Host” and the associated “LUN Group” attached to the Fibre Channel (FC) port of the Primary (active) ETERNUS DX S3 storage system. You need to repeat this process for each WWN of your business servers’ Host Bus Adapters (HBA).


Note: You cannot create the “Host Affinity” using “Host Group”, “Port Group” and “LUN Group” in the ETERNUS DX S3 HW-GUI. This won’t work with the Storage Cluster feature.

Important Note: After removing a TFO volume from the LUN Group at the Secondary (standby) ETERNUS DX S3 storage system you must change the unique identifier (UID) of that volume to use it as a “Standard” Volume. For this purpose you can use the ETERNUS DX HW CLI “set volume” command using the “-uid” parameter. (See the “ETERNUS CLI User's Guide” for details)


The Storage Cluster feature configuration is nearly done at the Primary (active) ETERNUS DX S3 storage system. Now you need to set up the corresponding settings at the Secondary (standby) ETERNUS DX S3 storage system. To do so, manually register the WWNs of each Host Bus Adapter (HBA) belonging to your business servers, which you noted down while creating each Host at the Primary (active) ETERNUS DX S3 storage system. Use self-explanatory names (e.g. SEC_SRV01_HBA0, SEC_SRV01_HBA1) for this process.


Enter all needed information in the input fields of each “FC Host” and add it to the list. Press the “Next” button to confirm your settings.


Afterwards create the corresponding “Affinity/LUN Group” and all the “Host Affinity” settings for your business servers’ Host Bus Adapters (HBA) at the Secondary (standby) ETERNUS DX S3 storage system. Keep in mind that the corresponding “Affinity/LUN Group” (e.g. SEC_SRV01_LG) must use the same number of volumes, with exactly the same “LUN No.” and exactly the same “Capacity” for each volume added to the list of “Assigned Volumes”. The procedure for all of these tasks is the same as at the Primary (active) ETERNUS DX S3 storage system.
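Since the LUN numbers and capacities are noted down by hand, a quick comparison before creating the TFO pairing can catch transcription errors. The sketch below assumes that each LUN Group has been written down as a mapping from host LUN number to capacity (here in MB, an arbitrary choice); the function name `lun_group_mismatches` and this data layout are hypothetical and not part of any ETERNUS tool.

```python
def lun_group_mismatches(primary, secondary):
    """Compare two LUN Groups given as {host LUN number: capacity} mappings.

    Returns a list of human-readable differences. An empty list means the
    groups match in volume count, LUN numbers and capacities.
    """
    problems = []
    for lun in sorted(primary.keys() | secondary.keys()):
        if lun not in secondary:
            problems.append(f"LUN {lun} missing on Secondary")
        elif lun not in primary:
            problems.append(f"LUN {lun} missing on Primary")
        elif primary[lun] != secondary[lun]:
            problems.append(
                f"LUN {lun} capacity differs: {primary[lun]} vs {secondary[lun]}"
            )
    return problems
```

An empty result for the PRI_SRV01_LG and SEC_SRV01_LG notes indicates that the precondition on identical LUN numbers and capacities is met on paper; the actual configuration still has to be verified in the GUI.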


Storage Cluster Controller setup

As already mentioned above, you should install the ETERNUS SF V16.1 Storage Cruiser Agent software used for Storage Cluster Monitoring on a dedicated server. Information about the installation of that software can be found in the “ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide”. After the installation has succeeded you need to modify two configuration files. This enables the ETERNUS SF V16.1 Storage Cruiser Agent to act as the Storage Cluster Controller for your environment. The default installation directory of the ETERNUS SF V16.1 Storage Cruiser Agent software is “C:\ETERNUS_SF”. With the default installation the two files (Correlation.ini and TFOConfig.ini) are located in the “C:\ETERNUS_SF\ESC\Agent\etc” directory.

Add the following lines at the end of the “Correlation.ini” file:

#----------------
# Storage Cluster Controller Server configuration
#----------------
StorageClusterController=ON

The “TFOConfig.ini” file is responsible for identifying the two ETERNUS DX S3 storage systems used for the Storage Cluster functionality. Therefore you need to add the Master IP address of each ETERNUS DX S3 storage system into that file.

Here is an example of what the entries should look like:

IP=192.168.100.60
IP=192.168.200.50

After both files have been modified, you need to restart the ETERNUS SF V16.1 Storage Cruiser Agent for the settings to take effect. Further details can be found in the “ETERNUS SF Storage Cruiser V16 Operation Guide”. In addition you should discover the Storage Cluster Controller (using the Storage Cruiser Agent functionality) within the ETERNUS SF V16 Manager software.
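A typo in one of the Master IP addresses would leave the Storage Cluster Controller monitoring the wrong target, so it can be worth validating the addresses before editing “TFOConfig.ini”. The sketch below only builds the `IP=` lines in the form shown in the example above and does not touch any file; the function name `tfo_config_lines` is hypothetical, and whether the file accepts anything other than plain `IP=` entries is not covered here.

```python
import ipaddress


def tfo_config_lines(master_ips):
    """Build the IP= entries for TFOConfig.ini from Master IP addresses."""
    if len(set(master_ips)) != len(master_ips):
        raise ValueError("duplicate Master IP address")
    lines = []
    for ip in master_ips:
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
        lines.append(f"IP={ip}")
    return lines
```

With the two addresses used in this document, `tfo_config_lines(["192.168.100.60", "192.168.200.50"])` produces exactly the two lines of the example.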


Storage Cluster processing

Management of a “TFO Group” (such as changing the TFO Group Name, Failover Mode or Split Mode) can be done either at the Primary (active) or at the Secondary (standby) ETERNUS DX S3 storage system. Select the “TFO Group” and press the “Set” button in the Action pane for this purpose. Modifications of LUN Groups used by the Storage Cluster feature (such as adding or removing volumes to/from the LUN Group) should always be started at the Primary (active) ETERNUS DX S3 storage system. After this is done you should modify the corresponding LUN Group at the Secondary (standby) ETERNUS DX S3 storage system.

You need to check the status of your Storage Cluster configuration at the Storage Cluster Controller. If you are using the default installation of the ETERNUS SF V16.1 Storage Cruiser Agent software, you will find the CLI script here: “C:\ETERNUS_SF\ESC\Agent\bin”. Here is an example output of the “agtpatrol.bat” CLI script:

C:\ETERNUS_SF\ESC\Agent\bin> agtpatrol.bat
--------------------------------------------------------------------------------
INTERVAL=1000
TARGET IP:
192.168.100.60
192.168.200.50
--------------------------------------------------------------------------------
TARGET TFO GROUP:
IP ADDRESS=192.168.100.60 GROUP NAME=DX600_to_DX500 TYPE=Primary PAIR IP ADDRESS=192.168.200.50 PAIR GROUP NAME=DX600_to_DX500 STATUS=Normal
INTERVAL=1000
UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014
IP ADDRESS=192.168.200.50 GROUP NAME=DX600_to_DX500 TYPE=Secondary PAIR IP ADDRESS=192.168.100.60 PAIR GROUP NAME=DX600_to_DX500 STATUS=Normal
INTERVAL=1000
UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014

Note: “INTERVAL” is the heartbeat rate in milliseconds configured on each ETERNUS DX S3 storage system.
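If this status check is to be repeated regularly, the script output can be evaluated automatically. The following sketch parses the TFO Group lines and reports every group whose STATUS is not “Normal”. It assumes that all fields of a group are printed on one line starting with `IP ADDRESS=`, as in the excerpt above; if the real output puts each field on its own line, the parsing has to be adapted. The function names are hypothetical.

```python
import re

# Longer keys first so that "PAIR IP ADDRESS" is not read as "IP ADDRESS".
KEY_RE = re.compile(
    r"(PAIR IP ADDRESS|PAIR GROUP NAME|IP ADDRESS|GROUP NAME|TYPE|STATUS)="
)


def parse_group_line(line):
    """Split one 'KEY=value KEY=value ...' line into a dict."""
    matches = list(KEY_RE.finditer(line))
    fields = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        fields[match.group(1)] = line[match.end():end].strip()
    return fields


def parse_tfo_groups(output):
    """Return one dict per TFO Group line found in the agtpatrol output."""
    return [
        parse_group_line(line.strip())
        for line in output.splitlines()
        if line.strip().startswith("IP ADDRESS=")
    ]


def abnormal_groups(output):
    """Return the groups whose STATUS differs from 'Normal'."""
    return [g for g in parse_tfo_groups(output) if g.get("STATUS") != "Normal"]
```

An empty list from `abnormal_groups` corresponds to the healthy state shown in the example, where both the Primary and the Secondary entry report STATUS=Normal.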


Always use the “Refresh” button in the Action pane to update the “TFO Group Status” and get the current status of your TFO Groups. This creates a job running in the background that updates the “TFO Group Status”.


Manual Failover can only be triggered using the Storage Cluster section at the Primary (active) ETERNUS DX S3 storage system. You won’t be able to press the “Failover” or “Force-Failover” button at the Secondary (standby) ETERNUS DX S3 storage system.



After a failover has taken place, either manually or in auto mode, you will want to switch the business servers’ host connections back to the original ETERNUS DX S3 storage system. Before you can start this operation you need to check some preconditions. Make sure that the Primary (active) ETERNUS DX S3 storage system is up and running without any hardware-related issues. The REC Path connection between the two ETERNUS DX S3 storage systems must be available and the volumes used by the Storage Cluster feature must be in sync (Equivalent). The latter must be checked using the details of the associated TFO Group. There you can verify the status of the REC Copy process for each volume belonging to this TFO Group (switch the view from “Ports” to “Volumes”). If everything is ready for switching the business servers’ host connections back to the Primary (active) ETERNUS DX S3 storage system (Status = Active and Phase = Equivalent), you can start the failback. As you can see in the picture below, this action isn’t available at the Primary (active) ETERNUS DX S3 storage system.


Therefore you need to switch the ETERNUS SF V16.1 GUI to the Secondary (standby) ETERNUS DX S3 storage system and start the failback action using the associated TFO Group from there.
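The failback preconditions listed above can be written down as a simple checklist function. This is a sketch for documentation or runbook purposes only; it does not query the storage systems, and the function name `failback_blockers` and its parameters are hypothetical. The values passed in have to be read from the ETERNUS SF GUI by the operator.

```python
def failback_blockers(primary_healthy, rec_path_available,
                      group_status, group_phase, volume_phases):
    """Return the reasons why a failback should not be started yet.

    volume_phases maps a volume name to the phase shown in the "Volumes" view.
    """
    blockers = []
    if not primary_healthy:
        blockers.append("Primary storage system is down or has hardware issues")
    if not rec_path_available:
        blockers.append("REC Path between the storage systems is not available")
    if group_status != "Active" or group_phase != "Equivalent":
        blockers.append(
            f"TFO Group is {group_status}/{group_phase}, expected Active/Equivalent"
        )
    for volume, phase in volume_phases.items():
        if phase != "Equivalent":
            blockers.append(f"volume {volume} is not in sync (phase {phase})")
    return blockers
```

An empty list means that all preconditions named in this section are fulfilled and the failback can be started from the Secondary (standby) storage system.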



Storage Cluster bi-directional information and setup

The Storage Cluster feature can also be configured bi-directionally. There is no need to create a new REC Path configuration for this purpose; the existing REC Path between the two ETERNUS DX S3 systems can be shared. However, you must use dedicated Fibre Channel (FC) ports for the second TFO Group on each ETERNUS DX S3 system. You can’t share Fibre Channel (FC) ports among TFO Groups. As a rule of thumb you should use dedicated RAID Groups for “active” and “passive” TFO Volumes on each ETERNUS DX S3 system involved in a bi-directional Storage Cluster setup. The additional TFO Group, including all required resources (Volumes, LUN Groups, FC Hosts and Host Affinity), needs to be set up in the same way as described for the single-sided configuration. The picture below gives you an example of what such a configuration could look like.


Note: The status of the two TFO Groups above differs. To always have the latest status, press the “Refresh” button of the “TFO Group Status”, which creates a job to update all your TFO Groups. This needs to be done from time to time.


Recovery procedure caused by defect RAID Group

In case a Transparent Failover took place (automatic or manual) due to a broken RAID Group at the Primary (active) ETERNUS DX S3 storage system, special treatment is needed to recover the Storage Cluster configuration after the broken RAID Group has been repaired. All related configuration steps need to be done at the Secondary (passive) ETERNUS DX S3 storage system. First of all you must log in to the CLI using a User Account with the Maintainer Role, because all these steps must be executed using CLI commands of that ETERNUS DX S3 storage system.

Preconditions

If your TFO Group is configured with Failover Mode = Manual, you need to start the failover in advance by using “Failover Force”. Identify your TFO Group and then start the manual failover. In any case you must make sure that all TFO Volumes are hosted by the Secondary (passive) ETERNUS DX S3 storage system. (The Status of the Secondary TFO Group must be “Active”.)

CLI> show tfo-groups
TFO Group No.      [0]
TFO Group Name     [DX600_to_DX500]
Type               [Secondary]
Status             [Standby]
Phase              [Maintenance]
Condition          [Normal]
Failover Mode      [Manual]
Split Mode         [Read/Write]
Monitor Interval   [-]
Pair Box ID        [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Own <-> Pair Port  [CM#0 CA#0 Port#1 <-> CM#0 CA#0 Port#1]
                   [CM#1 CA#0 Port#1 <-> CM#1 CA#0 Port#1]

CLI> forced tfo-group-activate -tfog-number 0 -active-mode manual-failover
CLI> show tfo-groups
TFO Group No.      [0]
TFO Group Name     [DX600_to_DX500]
Type               [Secondary]
Status             [Active]
Phase              [Maintenance]
Condition          [Normal]
Failover Mode      [Manual]
Split Mode         [Read/Write]
Monitor Interval   [-]
Pair Box ID        [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Own <-> Pair Port  [CM#0 CA#0 Port#1 <-> CM#0 CA#0 Port#1]
                   [CM#1 CA#0 Port#1 <-> CM#1 CA#0 Port#1]
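When the CLI session is captured to a log, the precondition “Secondary TFO Group is Active” can be verified from the captured text. The sketch below assumes that `show tfo-groups` prints one `Label [value]` pair per line; lines that do not fit this pattern (for example continuation lines of the port list) are ignored. The function names are hypothetical and the parser is not an official tool.

```python
import re

FIELD_RE = re.compile(r"^([^\[\]]+?)\s*\[\s*([^\[\]]*?)\s*\]$")


def parse_bracket_fields(text):
    """Collect 'Label [value]' lines into a dict, trimming padding spaces."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def secondary_is_active(fields):
    """True if the group is the Secondary one and its Status is Active."""
    return fields.get("Type") == "Secondary" and fields.get("Status") == "Active"
```

For the second output above, `secondary_is_active(parse_bracket_fields(output))` evaluates to True, whereas the first output (Status Standby) does not pass the check.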

Afterwards stop the Transparent Failover Replication of the TFO Volumes (Status = “Error Suspend”) located at the broken RAID Group of the Primary (active) ETERNUS DX S3 storage system. Follow these steps to fulfill this requirement:

1. Step

Identify volumes located on the broken RAID Group which are in “Error Suspend” status.

CLI> show tfo-pair -tfog-number 0
<TFO Group Info #0>
TFO Group Name [DX600_to_DX500]
<Port Info CM#0 CA#0 Port#1>
Host No.   [8]
Host Name  [SEC_SRV01_HBA0]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
11    RM_TFO_VOL00                     9           13    Error Suspend Equivalent       0x00
16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00
<Port Info CM#1 CA#0 Port#1>
Host No.   [10]
Host Name  [SEC_SRV01_HBA1]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
11    RM_TFO_VOL00                     9           13    Error Suspend Equivalent       0x00
16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00
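With many TFO Volumes it becomes tedious to pick the “Error Suspend” rows out of this listing by eye. The sketch below slices the table rows of a captured `show tfo-pair` output, using the dashed ruler line to find the column boundaries. It assumes the fixed-width, seven-column layout suggested by the ruler line; the column keys in `COLUMNS` and the function names are hypothetical labels chosen for this example.

```python
import re

COLUMNS = ["own_no", "own_name", "pair_no", "sid", "status", "phase", "error_code"]


def parse_pair_rows(lines):
    """Return one dict per table row found below a dashed ruler line."""
    rows, spans = [], None
    for line in lines:
        text = line.rstrip()
        if text.lstrip().startswith("<"):
            spans = None  # a new <...> section begins; wait for its ruler line
        elif text and set(text) <= {"-", " "}:
            # Ruler line: every run of dashes marks one column.
            spans = [(m.start(), m.end()) for m in re.finditer(r"-+", text)]
        elif spans and text.strip():
            # Let the last column run to the end of the line.
            bounds = spans[:-1] + [(spans[-1][0], len(text))]
            cells = [text[a:b].strip() for a, b in bounds]
            rows.append(dict(zip(COLUMNS, cells)))
    return rows


def error_suspended(rows):
    """Keep only the rows whose Status column reads 'Error Suspend'."""
    return [row for row in rows if row.get("status") == "Error Suspend"]
```

The `sid` value of each returned row is the session ID to inspect in the next step, and `own_no` is the volume number needed when releasing the pair.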


2. Step

Get detail information about the TFO Copy session with “Status = Error Suspend”.

CLI> show tfo-pair -session-id 13
<Session Info #13>
Own Volume No.               [11]
Own Volume Name              [RM_TFO_VOL00]
Pair Volume No.              [9]
Status                       [Error Suspend]
Phase                        [Equivalent]
Error Code                   [0x26]
Source Block Address         [0x0000000000000000LBA]
Destination Block Address    [0x0000000000000000LBA]
Total Data Size              [30720MB]
Copied Data Size             [29184MB]
Direction                    [From Local/To Remote]
Sync                         [Sync]
Recovery Mode                [Automatic]
Split Mode                   [Automatic]
Remote Session-ID            [13]
Remote Box-ID                [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Time Stamp                   [2014-08-25 16:45:29]
Elapsed Time                 [31 day 7 hour 36 min 30 sec]
Copy Range                   [Totally]
Secondary Access Permission  [Read Only at Equivalency]
Concurrent Suspend Status    [Normal]
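The “Total Data Size” and “Copied Data Size” fields show how far a copy session has progressed. As a small worked example, the sketch below extracts both values from captured session output and converts them into a percentage. The function name `copy_progress_percent` is hypothetical, and the sketch assumes both sizes are reported in MB as in the output above.

```python
import re


def copy_progress_percent(session_text):
    """Return Copied Data Size as a percentage of Total Data Size."""
    total = re.search(r"Total Data Size\s*\[(\d+)MB\]", session_text)
    copied = re.search(r"Copied Data Size\s*\[(\d+)MB\]", session_text)
    if not total or not copied:
        raise ValueError("size fields not found in session output")
    total_mb = int(total.group(1))
    if total_mb == 0:
        raise ValueError("Total Data Size is zero")
    return 100.0 * int(copied.group(1)) / total_mb
```

For session 13 above this gives 29184 / 30720 = 95 %, meaning about 1.5 GB of the 30 GB volume had not been copied when the session went into “Error Suspend”.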

3. Step

Release the copy sessions of TFO Volumes which have the “Status = Error Suspend”.

CLI> release tfo-pair -port 001 -host-number 8 -volume-number 11

4. Step

Restore the broken RAID Group and the associated volumes used as the TFO Volumes at the Primary (active) ETERNUS DX S3 storage system. Please see the maintenance manual for RAID Group recovery. There are 2 possibilities related to the failed RAID Group:

- RAID Forced Recovery
- Recovery by [DISK Hot Maintenance]

You can use the “RAID Forced Recovery” option if you think the disks are still OK and the RAID Group failure was caused by another event, e.g. a DE failure. If you think the disks are really broken, choose Recovery by [DISK Hot Maintenance].

The next screenshots are examples for Recovery by [DISK Hot Maintenance] from the ETERNUS DX HW-GUI.



Identify and exchange the broken disks.


After the broken disks have been exchanged, the status of the RAID Group is “Available”, but the volumes are in status “Readying”. The next screenshot is an example of how this information is shown in the ETERNUS DX HW-GUI.

how this information will be seen in the ETERNUS DX HW-GUI. All volumes which are in

All volumes which are in status “Readying” must be formatted first.

Note:

The volumes must be formatted using the ETERNUS CLI or the ETERNUS SF V16.x Manager. If you try to perform the format using the ETERNUS HW-GUI, you will get the following error message:

5. Step

Go back to the CLI of the Secondary (passive) ETERNUS DX S3 storage system using a User Account of the Maintainer Role and restart the Transparent Failover Replication of the TFO Volumes.

CLI> recover tfo-pair -port 001 -host-number 8 -volume-number 11 -recovery-target primary

This starts a new initial copy of the TFO Volumes hosted by the formerly broken RAID Group. Once the copy has completed successfully, you can switch back (failback) the host access to the Primary (active) ETERNUS DX S3 storage system.


6. Step

Afterwards you may want to check the status of the TFO Group and the associated TFO Volumes belonging to that TFO Group.

CLI> show tfo-pair -tfog-number 0

<TFO Group Info #0>
TFO Group Name [DX600_to_DX500]

<Port Info CM#0 CA#0 Port#1>
Host No.   [8]
Host Name  [SEC_SRV01_HBA0]

Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
11    RM_TFO_VOL00                     9           2     Copying       Equivalent       0x00
16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00

<Port Info CM#1 CA#0 Port#1>
Host No.   [10]
Host Name  [SEC_SRV01_HBA1]

Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
11    RM_TFO_VOL00                     9           2     Copying       Equivalent       0x00
16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00

CLI> show tfo-pair -session-id 2

<Session Info #2>
Own Volume No.              [11]
Own Volume Name             [RM_TFO_VOL00]
Pair Volume No.             [9]
Status                      [Active]
Phase                       [Copying]
Error Code                  [0x00]
Source Block Address        [0x0000000000000000LBA]
Destination Block Address   [0x0000000000000000LBA]
Total Data Size             [30720MB]
Copied Data Size            [6144MB]
Direction                   [From Local/To Remote]
Sync                        [Sync]
Recovery Mode               [Automatic]
Split Mode                  [Automatic]
Remote Session-ID           [6]
Remote Box-ID               [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Time Stamp                  [0000-00-00 00:00:00]
Elapsed Time                [0 day 0 hour 1 min 51 sec]
Copy Range                  [Totally]
Secondary Access Permission [Read Only at Equivalency]
Concurrent Suspend Status   [Normal]
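The “Total Data Size”, “Copied Data Size” and “Elapsed Time” values of the session output above can be turned into a rough progress figure. The following Python sketch is only an illustration and not part of the ETERNUS CLI: the function name copy_progress is hypothetical, and the remaining time is a simple linear extrapolation that assumes the elapsed time counts from the start of the initial copy and that the copy rate stays constant.

```python
def copy_progress(total_mb, copied_mb, elapsed_s):
    """Return (percent copied, estimated remaining seconds or None)."""
    percent = 100.0 * copied_mb / total_mb
    # Average copy rate so far in MB/s (assumption: constant rate).
    rate = copied_mb / elapsed_s if elapsed_s > 0 else 0.0
    remaining_s = (total_mb - copied_mb) / rate if rate > 0 else None
    return percent, remaining_s


# Values taken from the session output above:
# Total 30720MB, Copied 6144MB, Elapsed 0 day 0 hour 1 min 51 sec.
percent, remaining = copy_progress(30720, 6144, 1 * 60 + 51)
print(f"{percent:.0f}% copied, roughly {remaining:.0f} s remaining")
```

For the values shown, this gives 20% copied and an estimate of roughly 444 seconds until the initial copy is complete.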


Appendix

This appendix gives some helpful hints for working with the Storage Cluster feature of ETERNUS DX S3 storage systems. Examine the Fibre Channel (FC) zone configuration of your FC switches from time to time. In particular, check the ports of the FC switches involved in the Storage Cluster functionality for issues like “Duplicate Port WWN detected”. See this example for details:

Switch01:admin> switchshow
switchName:     Switch01
switchType:     66.1
switchState:    Online
switchMode:     Native
switchRole:     Subordinate
switchDomain:   169
switchId:       fffca9
switchWwn:      10:00:00:05:1e:83:12:aa
zoning:         ON (My_Fabric2)
switchBeacon:   OFF
FC Router:      OFF
FC Router BB Fabric ID: 1
Address Mode:   0
Fabric Name:    My_Fabric1

Index Port Address Media Speed State       Proto
==================================================
   0   0   a90000   id    8G   Online      FC  F-Port  10:00:00:90:fa:50:34:52
   1   1   a90100   id    8G   Online      FC  F-Port  10:00:00:90:fa:50:3e:60
   2   2   a90200   id    8G   Online      FC  F-Port  21:00:00:24:ff:53:36:6f
   3   3   a90300   id    8G   No_Sync     FC  Disabled (Persistent)
   4   4   a90400   id    8G   No_Sync     FC  Disabled (Persistent)
   5   5   a90500   id    8G   Online      FC  F-Port  21:00:00:24:ff:53:36:71
   6   6   a90600   id    8G   No_Sync     FC  Disabled (Persistent)
   7   7   a90700   id    8G   In_Sync     FC  Disabled (Persistent)
   8   8   a90800   id    8G   No_Light    FC  Disabled (Persistent)
   9   9   a90900   id    8G   In_Sync     FC  Disabled (Persistent)
  10  10   a90a00   id    8G   Online      FC  F-Port  50:00:00:e0:da:80:68:20
  11  11   a90b00   id    8G   Online      FC  F-Port  50:00:00:e0:da:80:43:20
  12  12   a90c00   id    8G   No_Light    FC
  13  13   a90d00   id    8G   Online      FC  F-Port  10:00:00:90:fa:50:34:1d
  14  14   a90e00   id    8G   No_Sync     FC  Disabled
  15  15   a90f00   id    8G   Online      FC  F-Port  50:00:00:e0:da:80:43:23
  16  16   a91000   id    8G   No_Sync     FC  Disabled (Persistent) (Duplicate Port WWN detected)
  17  17   a91100   --    8G   No_Module   FC
  18  18   a91200   --    8G   No_Module   FC
  19  19   a91300   --    8G   No_Module   FC
  20  20   a91400   --    8G   No_Module   FC
  21  21   a91500   --    8G   No_Module   FC
  22  22   a91600   --    8G   No_Module   FC
  23  23   a91700   --    8G   No_Module   FC
  24  24   a91800   id    N8   Online      FC  F-Port  50:00:00:e0:d4:00:01:91
  25  25   a91900   id    N8   Online      FC  F-Port  50:00:00:e0:d4:00:01:92
  26  26   a91a00   id    8G   No_Light    FC
  27  27   a91b00   id    8G   No_Light    FC
  28  28   a91c00   --    8G   No_Module   FC
  29  29   a91d00   --    8G   No_Module   FC
  30  30   a91e00   --    8G   No_Module   FC
  31  31   a91f00   --    8G   No_Module   FC
  32  32   a92000   id    8G   No_Light    FC
  33  33   a92100   id    8G   No_Light    FC
  34  34   a92200   id    N8   Online      FC  E-Port  10:00:00:27:f8:3d:bb:a7 "Switch99" (upstream)(Trunk master)
  35  35   a92300   id    N8   Online      FC  E-Port  (Trunk port, master is Port 34 )
  36  36   a92400   id    8G   Online      FC  F-Port  21:00:00:24:ff:53:36:58
  37  37   a92500   id    8G   Online      FC  F-Port  21:00:00:24:ff:53:37:2a
  38  38   a92600   id    N8   Online      FC  F-Port  50:00:00:e0:d4:00:00:90
  39  39   a92700   id    8G   No_Light    FC
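If the switchshow output is saved as text, the periodic check for “Duplicate Port WWN detected” can be scripted. The Python sketch below is an illustration only: it assumes the output was captured with one port per line, as the switch prints it, and the function name find_duplicate_wwn_ports and the three sample lines (taken from the example above) are chosen just for this demonstration.

```python
import re

# Index, Port, Address, Media, Speed, State, Proto, then free-text remarks.
PORT_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+([0-9a-f]{6})\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*(.*)$"
)


def find_duplicate_wwn_ports(switchshow_text):
    """Return (index, address) for every port flagged with a duplicate WWN."""
    flagged = []
    for line in switchshow_text.splitlines():
        match = PORT_LINE.match(line)
        if match and "Duplicate Port WWN detected" in match.group(8):
            flagged.append((int(match.group(1)), match.group(3)))
    return flagged


SAMPLE = """\
  15  15   a90f00   id    8G   Online      FC  F-Port  50:00:00:e0:da:80:43:23
  16  16   a91000   id    8G   No_Sync     FC  Disabled (Persistent) (Duplicate Port WWN detected)
  17  17   a91100   --    8G   No_Module   FC
"""

for index, address in find_duplicate_wwn_ports(SAMPLE):
    print(f"Port index {index} (address {address}): duplicate port WWN reported")
```

Header lines such as “switchName:” do not match the port pattern and are skipped, so the complete captured output can be passed in unchanged.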

As mentioned above, use the “Refresh” button in the Storage Cluster “Overview” section of the ETERNUS SF V16.1 Manager GUI to update the status of your TFO Groups. The “Set” action can be used to create a new TFO Group or to modify an existing TFO Group; an existing TFO Group must be checked (selected) before you press the “Set” button.


Fibre Channel Switch read-only discovery

First you need to configure your Fibre Channel (FC) switch. Log in to the switch using an administrator account and create the user account that ETERNUS SF V16 will use later on. If you are using the CLI of the switch, the command for creating the user looks like this:

MySwitch:admin> userconfig --add ETSFuser -r user

[Syntax: userconfig --add <New User Account Name> -r user]

Afterwards you need to set a password for this user. The CLI command for this would be:

MySwitch:admin> passwd ETSFuser

[Syntax: passwd <New User Account Name>]

The last configuration step at the Fibre Channel (FC) switch is to create a read-only SNMP community. Again you can use the CLI of the switch to create the dedicated read-only SNMP community that ETERNUS SF V16 will use later on. ETERNUS SF V16 requires an SNMPv1 community. For this purpose you can modify the well-known read-only community “public” and change it to e.g. “ETSFsnmp”. The CLI command for this would be:

MySwitch:admin> snmpconfig --set snmpv1

Community (rw): [Secret C0de]
Trap Recipient's IP address : [0.0.0.0]
Community (rw): [OrigEquipMfr]
Trap Recipient's IP address : [0.0.0.0]
Community (rw): [private]
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [public] ETSFsnmp
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [common]
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [FibreChannel]
Trap Recipient's IP address : [0.0.0.0]

You might need to call “snmpconfig --set accessControl” to set or change access-control-related parameters afterwards.


Open the ETERNUS SF V16 Manager GUI and discover the Fibre Channel (FC) switch using the settings just created on that switch.


Status of TFO Group Information

Active/Standby


*1: "Unknown" has a meaning common to all the statuses, so is omitted hereinafter.

Phase


Status


Halt Factor


Recommendations

Notes on the multipath settings of your Business Servers:

Linux:

no_path_retry

Specifies the number of retries before queueing is disabled. Use “fail” for immediate failure (no queueing) or “queue” to never stop queueing. Default is 0.

For the Storage Cluster function with an ETERNUS DX S3 storage system you need to specify:

"no_path_retry 10"

fast_io_fail_tmo

Specifies the fast_io_fail_tmo setting for an FC remote port in seconds. If an rport has vanished from the fabric, all I/O to the devices on that port is terminated after this timeout. The value should be smaller than the dev_loss_tmo setting. Default is 5.

Information from the “(Fibre Channel/FCoE/iSCSI/SAS) for Linux device-mapper multipath” document:

"fast_io_fail_tmo 1"

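As a rough aid, the two Linux values quoted above can be checked against a copy of a multipath configuration file. The Python sketch below is an illustration and not part of any Fujitsu or device-mapper tool: it uses a deliberately simple line scan that ignores section nesting (if a key occurs several times, the last occurrence wins), and the function names and the sample text are assumptions made for this example.

```python
# Values quoted in this document for the Storage Cluster function.
RECOMMENDED = {"no_path_retry": "10", "fast_io_fail_tmo": "1"}


def read_settings(conf_text):
    """Collect 'keyword value' lines; comments and section braces are skipped."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "{" in line or "}" in line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip().strip('"')
    return settings


def deviations(conf_text):
    """Return the keywords whose value is missing or differs from RECOMMENDED."""
    found = read_settings(conf_text)
    return {
        key: found.get(key)
        for key, wanted in RECOMMENDED.items()
        if found.get(key) != wanted
    }


# Hypothetical sample text, not taken from a real system.
SAMPLE = """\
defaults {
    no_path_retry 10
    fast_io_fail_tmo 5   # differs from the value quoted above
}
"""

print(deviations(SAMPLE))
```

An empty result means both keywords were found with the quoted values; a value of None means the keyword was not found at all.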
Windows:

Windows Server® 2012 R2/ Windows Server® 2012/ Windows Server® 2008 R2/ Windows Server® 2008 Standard Multipath Driver (msdsm) Notes

Various settings, such as the load balance policy and retry count, can be adjusted by using the standard multipath drivers (msdsm) for Windows Server® 2012 R2, Windows Server® 2012, Windows Server® 2008 R2 or Windows Server® 2008. However the following settings should not be changed from their default values.

Screen name: MPIO tab of Multi-Path Disk Device properties (Load balance policy, [Details] button, [Edit] button)

Parameters that may not be changed:

- Details of DSM: Timer counter (path checking period, enable path checking, number of retries, retry interval, PDO deletion period)
- Details of MPIO paths: Path status

Notes for Host Response Settings:

Do not use different Host Response settings (Active-Active (A-A) or Active-Active Preferred (A-A/P)) for the Primary (active) and the Secondary (passive) ETERNUS DX S3 storage system. Both systems must use the same setting.


TFO Checklist


Quick Checklist for TFO Configurations

Step | Action | Important Note
1 | Check the firmware of both ETERNUS DX S3 systems | A minimum of V10L20-0000 is required
2 | Check that the latest version of ETERNUS SF Manager including all latest patches are installed | ETERNUS SF V16.1 or higher
3 | Discover both ETERNUS DX S3 systems in ETERNUS SF Manager |
4 | Discover the FC Switches | Read-Only Mode is recommended
5 | Check the licenses for both ETERNUS DX S3 systems | ETERNUS SF Storage Cruiser V16 Standard License, ETERNUS SF Storage Cruiser V16 Storage Cluster Option, ETERNUS SF AdvancedCopy Manager V16 Remote Copy License
6 | Configure FC Zoning or Direct Cabling for the REC Path | A minimum of 1 path per CM is recommended
7 | Configure the REC Path with ETERNUS SF Manager or the ETERNUS DX S3 HW-GUI | “RA” only Ports are recommended
8 | Configure a Storage Cluster “TFO Group” | Use “Split Mode” --> “Read” to achieve Application Consistency. Only CA Ports can be used.
9 | Configure FC Zoning from the Business Server(s) to the Primary ETERNUS DX S3 system | Only WWPN based zoning is supported for TFO
10 | Create the Business LUNs on both ETERNUS DX S3 systems | Be sure that the LUNs on both ETERNUS DX S3 systems have the same size
11 | Register the HBAs of the Business Server(s) on the Primary ETERNUS DX S3 | Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Note down the used Host WWPNs for later usage.
12 | Create an Affinity/LUN Group on the Primary ETERNUS DX S3 and add the Business LUNs | Note down the used Host LUN Numbers for later usage
13 | Check the reservation status of each volume using the ETERNUS DX S3 HW-GUI | Remove existing reservations from each volume
14 | Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Primary ETERNUS DX S3 | It is not possible to use “Host Group”, “Port Groups”, “LUN Group” mechanism from the HW-GUI in TFO configurations
15 | Register the HBAs of the Business Server(s) on the Secondary ETERNUS DX S3 | Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Add the Host WWPNs manually (info from step 11)
16 | Create an Affinity/LUN Group on the Secondary ETERNUS DX S3 and add the Business LUNs | Be sure to use the same Host LUN Numbers as configured for the Primary ETERNUS DX S3
17 | Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Secondary ETERNUS DX S3 | It is not possible to use “Port Groups” in TFO configurations. Be sure to use the Standby Ports for this Host Affinity.
18 | Check the TFO Group and TFO Volume Status |
19 | Install the ETERNUS SF Storage Cruiser Agent as Monitoring instance (Storage Cluster Controller) | Only Windows OS is supported
20 | Modify the ETERNUS SF Storage Cruiser Agent Configuration files | Correlation.ini & TFOConfig.ini
21 | Restart the ETERNUS SF Storage Cruiser Agent Service |
22 | Discover the Storage Cluster Controller Server in the ETERNUS SF Manager |
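Several checklist items (steps 10, 11, 12, 15 and 16) require the Primary and Secondary systems to match: same LUN sizes, same Host Response setting, same Host WWPNs and same Host LUN Numbers. If you note these values down as the checklist suggests, the comparison can be done with a few lines of Python. The sketch below is only an illustration; the ArrayConfig record, the function name and all sample values (including the WWPN) are hypothetical and are not read from an ETERNUS system.

```python
from dataclasses import dataclass


@dataclass
class ArrayConfig:
    host_response: str   # name of the Host Response setting
    host_wwpns: set      # registered host WWPNs
    lun_sizes_mb: dict   # Host LUN Number -> volume size in MB


def checklist_mismatches(primary, secondary):
    """Return a list of differences between the two noted configurations."""
    problems = []
    if primary.host_response != secondary.host_response:
        problems.append("Host Response settings differ (steps 11 and 15)")
    if primary.host_wwpns != secondary.host_wwpns:
        problems.append("Host WWPNs differ (steps 11 and 15)")
    if set(primary.lun_sizes_mb) != set(secondary.lun_sizes_mb):
        problems.append("Host LUN Numbers differ (steps 12 and 16)")
    for lun in sorted(set(primary.lun_sizes_mb) & set(secondary.lun_sizes_mb)):
        if primary.lun_sizes_mb[lun] != secondary.lun_sizes_mb[lun]:
            problems.append(f"Host LUN {lun}: sizes differ (step 10)")
    return problems


# Hypothetical notes for both systems.
primary = ArrayConfig("A-A", {"10:00:00:00:00:00:00:01"}, {0: 30720, 1: 30720})
secondary = ArrayConfig("A-A/P", {"10:00:00:00:00:00:00:01"}, {0: 30720, 1: 20480})

for problem in checklist_mismatches(primary, secondary):
    print(problem)
```

An empty list means the noted values are consistent; it does not replace the status check of the TFO Group and TFO Volumes in step 18.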

 


Abbreviations

Shortcut | Description
CA | Abbreviation of ETERNUS DX Channel Adapter
CM | Abbreviation of ETERNUS DX Controller Module
LUN | Abbreviation of Logical Unit Number
TFO | Abbreviation of Transparent Failover. For the Storage Cluster feature, it means a failover that is transparent to the operation server.
TFOV | Abbreviation of TFO Volume. A volume assigned in a Storage Cluster configuration.
TFO Group | A group managing connection configuration, policies, states and maintenance for failover. It includes one or more Fibre Channel (FC) CA ports and the volumes that may be accessed through these CA ports. The state of a TFO Group is “Active” (accessible from the operation server) or “Standby” (not accessible from the operation server).
CA Port Pair | The Storage Cluster feature performs failover by sharing a common WWN/WWPN between the Fibre Channel (FC) CA ports of two ETERNUS DX S3 storage systems and controlling the link state of each FC CA port. This operation is called “CA Port Pairing”, and a pair of FC CA ports sharing a common WWN/WWPN is called a “CA Port Pair”.
WWN / WWPN | Abbreviation of World Wide Name / World Wide Port Name

The diagram below illustrates the different components used by a TFO Group of the Storage Cluster feature, such as:

TFO Group including TFOVs, Affinity Groups and CA Port Pairs

[Figure: Primary storage (Active) and Secondary storage (Standby), each with CA #0 and CA #1, TFOV#0 to TFOV#2, Affinity Group #0 and #1 and a TFO Group; the CA ports of both systems form CA Port Pairs and the TFO Groups correspond to each other.]

Contact

FUJITSU Limited
Address: Shiodome City Center, 5-2, Higashi-shimbashi 1-Chome, Minato-ku, Tokyo 105-7123, Japan

Website: www.fujitsu.com/eternus

© 2014 Fujitsu, the Fujitsu logo, [other Fujitsu trademarks /registered trademarks] are trademarks or registered trademarks of Fujitsu Limited in Japan and other countries. Other company, product and service names may be trademarks or registered trademarks of their respective owners. Technical data subject to modification and delivery subject to availability. Any liability that the data and illustrations are complete, actual or correct is excluded. Designations may be trademarks and/or copyrights of the respective manufacturer, the use of which by third parties for their own purposes may infringe the rights of such owner.