Ha Vcs 410 101a 2 10 srtpg2 130918134117 Phpapp01

VERITAS Cluster Server for
UNIX, Fundamentals
(Appendixes)
HA-VCS-410-101A-2-10-SRT (100-002149-B)
COURSE DEVELOPERS Disclaimer
Bilge Gerrits
The information contained in this publication is subject to change without
Siobhan Seeger notice. VERITAS Software Corporation makes no warranty of any kind
Dawn Walker with regard to this guide, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose.
VERITAS Software Corporation shall not be liable for errors contained
herein or for incidental or consequential damages in connection with the
furnishing, performance, or use of this manual.
LEAD SUBJECT MATTER
EXPERTS
Copyright
Geoff Bergren
Paul Johnston Copyright 2005 VERITAS Software Corporation. All rights reserved.
Dave Rogers No part of the contents of this training material may be reproduced in any
form or by any means or be used for the purposes of training or education
Jim Senicka
without the written permission of VERITAS Software Corporation.
Pete Toemmes
Trademark Notice
VERITAS, the VERITAS logo, and VERITAS FirstWatch, VERITAS
Cluster Server, VERITAS File System, VERITAS Volume Manager,
TECHNICAL VERITAS NetBackup, and VERITAS HSM are registered trademarks of
CONTRIBUTORS AND VERITAS Software Corporation. Other product names mentioned herein
REVIEWERS
may be trademarks and/or registered trademarks of their respective
Billie Bachra companies.
Barbara Ceran
VERITAS Cluster Server for UNIX, Fundamentals
Bob Lucas
Participant Guide
Gene Henriksen
April 2005 Release
Margy Cassidy
VERITAS Software Corporation
350 Ellis Street
Mountain View, CA 94043
Phone 6505278000
www.veritas.com
Table of Contents
Appendix A: Lab Synopses
Lab 2 Synopsis: Validating Site Preparation ........................................................... A-2
Lab 3 Synopsis: Installing VCS ............................................................................... A-6
Lab 4 Synopsis: Using the VCS Simulator............................................................ A-18
Lab 5 Synopsis: Preparing Application Services................................................... A-24
Lab 6 Synopsis: Starting and Stopping VCS......................................................... A-29
Lab 7 Synopsis: Online Configuration of a Service Group.................................... A-31
Lab 8 Synopsis: Offline Configuration of a Service Group.................................... A-38
Lab 9 Synopsis: Creating a Parallel Service Group .............................................. A-47
Lab 10 Synopsis: Configuring Notification ............................................................ A-52
Lab 11 Synopsis: Configuring Resource Fault Behavior....................................... A-55
Lab 13 Synopsis: Testing Communication Failures .............................................. A-60
Lab 14 Synopsis: Configuring I/O Fencing............................................................ A-66
Appendix B: Lab Details

Lab 2: Validating Site Preparation........................................................................... B-3
Lab 3: Installing VCS............................................................................................. B-11
Lab 4: Using the VCS Simulator ........................................................................... B-21
Lab 5: Preparing Application Services .................................................................. B-29
Lab 6: Starting and Stopping VCS ........................................................................ B-37
Lab 7: Online Configuration of a Service Group ................................................... B-41
Lab 8: Offline Configuration of a Service Group ................................................... B-57
Lab 9: Creating a Parallel Service Group.............................................................. B-73
Lab 10: Configuring Notification ............................................................................ B-85
Lab 11: Configuring Resource Fault Behavior ...................................................... B-93
Lab 13 Details: Testing Communication Failures................................................ B-101
Lab 14: Configuring I/O Fencing ......................................................................... B-111
Appendix C: Lab Solutions

Lab 2 Solutions: Validating Site Preparation........................................................... C-3
Lab 3 Solutions: Installing VCS............................................................................. C-13
Lab 4 Solutions: Using the VCS Simulator............................................................ C-35
Lab 5 Solutions: Preparing Application Services .................................................. C-51
Lab 6 Solutions: Starting and Stopping VCS ........................................................ C-63
Lab 7 Solutions: Online Configuration of a Service Group.................................... C-67
Lab 8 Solutions: Offline Configuration of a Service Group.................................... C-89
Lab 9 Solutions: Creating a Parallel Service Group............................................ C-109
Lab 10 Solutions: Configuring Notification .......................................................... C-125
Table of Contents i
Copyright 2005 VERITAS Software Corporation. All rights reserved.
Lab 11 Solutions: Configuring Resource Fault Behavior .................................... C-133
Lab 13 Solutions: Testing Communication Failures............................................ C-149
Lab 14 Solutions: Configuring I/O Fencing ......................................................... C-163
Appendix D: Job Aids

Cluster System States............................................................................................. D-2
Resource States and Transitions ............................................................................ D-4
Service Group Configuration Procedure ................................................................. D-5
Resource Configuration Procedure......................................................................... D-6
List of Notifier Events and Traps ............................................................................. D-7
Example Bundled Agent Reference Guide Entries ............................................... D-10
SCSI-3 Persistent Reservations............................................................................ D-17
Best Practices ....................................................................................................... D-18
New Features in VCS 4.1...................................................................................... D-22
New Features in VCS 4.0...................................................................................... D-24
Appendix E: Design Worksheet: Template
ii VERITAS Cluster Server for UNIX, Fundamentals

Appendix A
Lab Synopses
Lab 2: Validating Site Preparation
Visually
Visuallyinspect
inspectthe
theclassroom
classroomlablabsite.
site.
Complete
Complete and validate the designworksheet.
and validate the design worksheet.
Use
Usethe
thelab
labappendix
appendixbest
bestsuited
suitedto
toyour
your
experience
experiencelevel:
level:
?? Appendix
AppendixA:
A:Lab
LabSynopses
Synopses
?? Appendix
AppendixB:
B:Lab
LabDetails
Details
?? Appendix
AppendixC:
C:Lab
LabSolutions
Solutions
train2
train1
System Definition Sample Value Your Value
System train1
System train2
See
Seethe
thenext
nextslide
slidefor
forlab
labassignments.
assignments.
Lab 2 Synopsis: Validating Site Preparation

In this lab, work with your partner to prepare the systems for installing VCS.
Step-by-step instructions for this lab are located on the following page:
Lab 2: Validating Site Preparation, page B-3
Solutions for this exercise are located on the following page:
Lab 2 Solutions: Validating Site Preparation, page C-3
A2 VERITAS Cluster Server for UNIX, Fundamentals

Lab Assignments
Use the table to record your cluster values as you work through the lab.
Object Sample Value Your Value
A
Your system host name train1
your_sys
Partner system host name train2

their_sys
name prefix for your bob

objects
Interconnect link 1 Solaris: qfe0

Sol Mob: dfme0
AIX: en2
HP-UX lan1
Linux: eth2
VA bge2
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth3
VA bge3
Public network interface Solaris: eri0
interface Sol Mob: dfme0
AIX: en1
HP-UX lan0
Linux: eth1
VA bge0
Admin IP address for 192.168.xx.xxx
your_sys

their_sys
Appendix A Lab Synopses A3

Verifying the Network Configuration
1 Verify that the Ethernet network interfaces for the two cluster interconnect
links are cabled together using crossover cables.
Note: In actual implementations, each link should use a completely separate
infrastructure (separate NIC and separate hub or switch). For simplicity of
configuration in the classroom environment, the two interfaces used for the
cluster interconnect are on the same NIC.
Four NodeUNIX
Classroom LAN 192.168.XX, where XX=27, 28, or 29
Software Share
192.168.XX.100
train6 train7
192.168.XX.106 Hub/Switch 192.168.XX.107
train5 Hub/Switch train8
192.168.XX.105 192.168.XX.108
train4 train9
192.168.XX.104 192.168.XX.109
train3 train10
Hub/Switch
Hub/Switch
Hub/Switch
Hub/Switch
192.168.XX.103 192.168.XX.110
SAN
train2 Disk train11
192.168.XX.102 Array 192.168.XX.111
train1 SAN train12
192.168.XX.101 Tape 192.168.XX.112
Library
LAN LAN
2 Verify that the public interface is cabled and accessible on the classroom public
network.
Virtual Academy
Skip this step.

Other Checks
1 Check the PATH environment variable. If necessary, add the /sbin, /usr/
sbin, /opt/VRTS/bin, and /opt/VRTSvcs/bin directories to your
A
PATH environment variable.
2 Check the VERITAS licenses to determine whether a VERITAS Cluster Server

license is installed.
Checking PackagesLinux Only
When installing any Storage Foundation product or VERITAS Volume Replicator,

the VRTSalloc package (the VERITAS Volume Manager Intelligent Storage
Provisioning feature) requires that the following Red Hat packages are installed:
compat-gcc-c++-7.3-2.96.128
compat-libstdc++-7.3-2.96.128
Version 7.3-2.96.128 is provided with Red Hat Enterprise Linux 3 Update 2 (i686).
Configuring Secure ShellLinux Only
Verify that ssh configuration files are set up in order to install VCS on Linux or to
run remote commands without prompts for passwords.
If you do not configure ssh, you are required to type in the root passwords for all
systems for every remote command issued during the following services
preparation lab and the installation procedure.
If you do not want to use ssh with automatic login using saved passphrases on a
regular basis, run the following commands at the command line. This is in effect
only for this session.
exec /usr/bin/ssh-agent $SHELL
ssh-add
Save your passphrase during your GNOME session.
Setting Up a Console WindowLinux Only
1 Open a console window so you can observe messages during later labs.
2 Open a System Log Display tool.

Lab 3: Installing VCS
vcs1
Link 1:______
Link 1:______ Link 2:______
Link 2:______
Public:______ Public:______
train1 train2
4.x ## ./installer
./installer Software
4.x
location:_______________________________
Pre-4.0
Pre-4.0 ## ./installvcs
./installvcs
Subnet:_______
Lab 3 Synopsis: Installing VCS

In this lab, work with your lab partner to install VCS on both systems.
Lab 3: Installing VCS, page B-11
Lab 3 Solutions: Installing VCS, page C-13
Obtaining Classroom Information

Use the following table to collect information you need to install VCS. Your
instructor may also ask you to install VERITAS Volume Manager and VERITAS
File System.

Cluster Definition These values define cluster properties and are required to install
VCS.
Attributes/Properties Sample Value Your Values
A
Node names, cluster train1 train2 vcs1 1 train1
name, and cluster ID train3 train4 vcs2 2 train2
train5 train6 vcs3 3 vcs1
train7 train8 vcs4 4 1
train9 train10 vcs5 5
Cluster interconnect Ethernet interface for Solaris: qfe0
interconnect link #1 Sol Mob: dmfe0
AIX: en2
HP-UX lan1
Linux: eth1
VA: bge2
Ethernet interface for Solaris: qfe1
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
Public network Solaris: eri0
interface Sol Mob: dmfe0
interface AIX: en1
HP-UX lan0
Linux: eth1
VA: bge0

VCS.
Web GUI IP Address:
train1 train2 192.168.xxx.91
Subnet Mask 255.255.255.0

Network interface Solaris: eri0
Sol Mob: dmfe0
AIX: en0
HP-UX lan0
Linux: eth0
VA: bge0
NetworkHosts (HP-UX see instructor
only)
Installation software
location
install_dir
License
Administrator account Name admin

Password password

Installing VERITAS Cluster Server Software
1 Obtain the location of the installation software from your instructor.
A
Installation software location:
_____________________________________________________________
2 This first step is to be performed from only one system in the cluster. The install
script installs and configures all systems in the cluster.
a Change to the install directory.
b Run the installer script (VERITAS Product Installer) located in the

directory specified above. For versions of VCS before 4.0, use
installvcs. Use the information in the previous table or design
worksheet to respond to the installation prompts.
Note: For VCS 4.x, install Storage Foundation (Volume Manager and File
System).
c If a license key is needed, obtain one from your instructor and record it
here.
License Key: ____________________________________________
d Install all optional packages (including Web console and Simulator).
e Accept default of Y to configure VCS.
f Do not configure a third heartbeat link at this time.
g Do not configure a low-priority heartbeat link at this time.
h Do not configure VERITAS Security Services.
i Do not set any user names or passwords.
j Retain the default admin user account and password.
k Configure the Cluster Server Cluster Manager.
l Do not configure SMTP Notification.

m Do not configure SNMP Notification.
n Select the option to install all packages simultaneously on all systems.

o Do not set up enclosure-based naming for Volume Manager.
p Start Storage Foundation Enterprise HA processes.
q Do not set up a default disk group.
3 If you did not install the Java GUI package as part of the installer (VPI)
process (or installvcs for earlier versions of VCS), install the VRTScscm
Java GUI package on each system in the cluster. The location of this package is
in the pkgs directory under the install location directory given to you by your
instructor.
Installing Other Software
1 If your instructor indicates that additional software, such as VCS patches or

updates, is required, obtain the location of the installation software from your
instructor.
Installation software directory:
_____________________________________________________________
2 Install any VCS patches or updates, as directed by your instructor. Use the
operating system-specific command.
3 Install any other software indicated by your instructor. For example, if your
classroom uses VCS 3.5, you may be directed to install VERITAS Volume
Manager and VERITAS File System.

Viewing VERITAS Cluster Server Installation Results
You can use the worksheet at the end of this lab synopsis to verify and record your
cluster configuration.
A
1 Verify that VCS is now running using hastatus.
If hastatus -sum shows the cluster systems in a running state and a

ClusterService service group is online on one of your cluster systems, VCS has
been properly installed and configured.
2 Perform additional verification (generally only necessary if there is a problem

displayed by hastatus -sum).
a Verify that all packages are loaded.
b Verify that LLT is running.
c Verify that GAB is running.
Exploring the Default VCS Configuration
View the configuration files set up by the VCS installation procedure.
1 Explore the LLT configuration.
2 Explore the GAB configuration.
3 Explore the VCS configuration files.

Verifying Connectivity with the GUIs
Verify GUI connectivity with the Java GUI and the Web GUI. Both GUIs can
connect to the cluster with the default user of admin and password as the default
password.
1 Use a Web browser to connect to the Web GUI.
2 Start the Java GUI and connect to the cluster using these values:
3 Browse the cluster configuration.

Design Worksheet: Cluster Interconnect Configuration
First system:
A
/etc/VRTSvcs/comms/llttab Sample Value Your Value
set-node train1
(host name)
set-cluster 1
(number in host name of odd
system)
link Solaris: qfe0
Sol Mob: dfme0
AIX: en2
HP-UX lan1
Linux: eth2
VA: bge2
link Solaris: qfe1
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth3
VA: bge3
/etc/VRTSvcs/comms/llthosts Sample Value Your Value

train1
train2
/etc/VRTSvcs/comms/sysname Sample Value Your Value

train1

Second system:

set-node train2
set-cluster 1
link Solaris: qfe0
Sol Mob: dfme0
AIX: en2
HP-UX lan1
Linux: eth2
VA: bge2
link Solaris: qfe1
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth3
VA: bge3

train1
train2

train2
Cluster Configuration (main.cf)
Types Definition Sample Value Your Value

Include types.cf

Cluster Definition Sample Value Your Value
Cluster vcs1
Required Attributes
A
UserNames admin=password
ClusterAddress 192.168.xx.91
Administrators admin
Optional Attributes
CounterInterval 5

System train1 (odd)
System train2 (even)
Service Group Definition Sample Value Your Value

Group ClusterService
Required Attributes
FailoverPolicy Priority
SystemList train1=0 train2=1
Optional Attributes
AutoStartList train1
OnlineRetryLimit 3
Tag CSG

Resource Definition Sample Value Your Value
Service Group ClusterService
Resource Name webip
Resource Type IP
Required Attributes
Device eri0
Address 192.168.xx.91
Optional Attributes
Netmask 255.255.255.0
Critical? Yes (1)
Enabled? Yes (1)

Resource Name csgnic
Resource Type NIC
Required Attributes
Device <platform specific>
Critical? Yes (1)
Enabled? Yes (1)

Resource Name VCSWeb
Resource Type VRTSWebApp
Required Attributes
AppName vcs
InstallDir /opt/VRTSweb/VERITAS
TimeForOnline 5
Critical? Yes (1)
Enabled? Yes (1)

Resource Dependency Definition
Parent Resource Requires Child Resource
A
VCSWeb webip
webip csgnic

Lab 4: Using the VCS Simulator
1. Start the Simulator Java GUI.
hasimgui &
2. Add a cluster.
3. Copy the preconfigured
main.cf file to the new
directory.
4. Start the cluster from the
Simulator GUI.
5. Launch the Cluster Manager
Java Console
6. Log in using the VCS
account oper with password
oper.
This account demonstrates
different privilege levels in
VCS.
Seenext
See next slide for classroom values
See nextslide
slidefor
forlab
labassignments.
assignments.
Lab 4 Synopsis: Using the VCS Simulator

This lab uses the VERITAS Cluster Server Simulator and the Cluster Manager
Java Console. You are provided with a preconfigured main.cf file to learn about
managing the cluster.
Lab 4: Using the VCS Simulator, page B-21
Lab 4 Solutions: Using the VCS Simulator, page C-35

Use the following table to record the values for your classroom.
Attribute Sample Value Your Value

Port 15559
VCS user account/ oper/oper
password

File Locations
Type of File Location

Lab main.cf file:
cf_files_dir
A
Simulator
configuration
directory:
sim_config_dir

Starting the Simulator on UNIX
4 Add /opt/VRTScssim/bin to your PATH environment variable after any

/opt/VRTSvcs/bin entries, if it is not already present.
5 Set the VCS_SIMULATOR_HOME environment variable to /opt/

VRTScssim, if it is not already set.
6 Start the Simulator GUI.
7 Add a cluster.
8 Use these values to define the new simulated cluster:

Cluster Name: vcs_operations
System Name: S1
Port: 15559
Platform: Solaris
WAC Port: -1
9 In a terminal window, change to the simulator configuration directory for the

new simulated cluster named vcs_operations.
10 Copy the main.cf, types.cf, and OracleTypes.cf files provided by

your instructor into the vcs_operations simulation configuration directory.
11 From the Simulator GUI, start the vcs_operations cluster, launch the VCS Java
Console for the vcs_operations simulated cluster, and log in as oper with
password oper.
Note: While you may use admin/password to log in, the point of using oper is to
demonstrate the differences in privileges between VCS user accounts.

Viewing Status and Attributes
1 How many systems are members of the cluster?
A
2 Determine the status of all service groups.
Service Group Status on S1 Status on S2 Status on S3

AppSG
OracleSG
ClusterService
3 Which service groups have service group operator privileges set for the oper
account?
4 Which resources in the AppSG service group have the Critical resource
attribute enabled?
5 Which resource is the top-most parent in the OracleSG service group?
6 Which immediate child resources does the Oracle resource in the OracleSG
service group depend on?

Manipulating Service Groups
1 Attempt to take the ClusterService group offline on S1.
What happens?
2 Attempt to take the AppSG service group offline on S1.
What happens?
3 Attempt to take the Oracle service group offline on S1.
What happens?
4 Take all service groups that you have privileges for offline everywhere.
5 Bring the AppSG service group online on S2.
6 Bring the OracleSG service group online on S1.
7 Switch service group AppSG to S1.
8 Switch the OracleSG service group to S2.
9 Bring all service groups that you have privileges for online on S3.

Manipulating Resources
1 Attempt to take the OraListener resource in OracleSG offline on S3.
A
What happens to the OracleSG service group?
2 Bring the OraListener resource online on S3.
3 Attempt to take the OraMount resource offline on system S3.
What happens?
4 Attempt to bring only the OraListener resource online on S1.
What happens?
5 Fault the Oracle (oracle) resource in the OracleSG service group.
6 What happens to the service group and resource?
7 View the log entries to see the sequence of events.
8 Attempt to switch the OracleSG service group back to S3.
What happens?
9 Clear the fault on the Oracle resource in the OracleSG service group.
10 Switch the OracleSG service group back to S3.
11 Save and close the configuration, log off from the GUI, and stop the simulator.

Lab 5: Preparing Application Services
/bob1/loopy /sue1/loopy
while true while true

NIC NIC
do do
IP Address echo echo IP Address
done done
bobDG1 sueDG1
/bob1 bobVol1
disk1 sueVol1 /sue1
disk2
Disk/Lun Disk/Lun
See
Seenext
nextslide
slidefor
forclassroom
classroomvalues.
values.
Lab 5 Synopsis: Preparing Application Services

The purpose of this lab is to prepare the loopy process service for high availability.
Lab 5: Preparing Application Services, page B-29
Lab 5 Solutions: Preparing Application Services, page C-51
Lab Assignments
Use the design worksheet to gather and record the values needed to complete the
preparation steps.

Service Group nameSG1
A
Resource Name nameNIC1
Resource Type NIC
Required Attributes
Device Solaris: eri0
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
NetworkHosts* 192.168.xx.1 (HP-UX
only)
Critical? No (0)
Enabled? Yes (1)

Resource Name nameIP1
Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth1
VA: bge0
Address 192.168.xx.5* see table
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)

System IP Address
train1 192.168.xx.51
train2 192.168.xx.52
train3 192.168.xx.53
train4 192.168.xx.54
train5 192.168.xx.55
train6 192.168.xx.56
train7 192.168.xx.57
train8 192.168.xx.58
train9 192.168.xx.59
train10 192.168.xx.60
train11 192.168.xx.61
train12 192.168.xx.62
Configuring Storage for an Application
1 Create a disk group using the convention specified in the worksheet.
2 Create a 2 GB volume and a vxfs file system.
3 Create a mount points, mount the file system on your cluster system, and verify
that it is mounted.
Configuring Networking for an Application
1 Verify that an IP address exists on the base interface for the public network.
2 Configure a virtual IP address on the public network interface using the IP

address from the design worksheet.

Setting up the Application
A script named loopy is used as the example application for this lab exercise.
1 Obtain the location of the loopy script from your instructor.
A
loopy script location:
__________________________________________________________
2 Copy this file to a file named loopy on the file system you created.
3 Start the loopy application in the background.
4 Verify that the loopy application is working correctly.

Manually Migrating the Application
Complete the following steps to migrate the application to the other system.
1 Stop all resources used in this service to prepare to manually migrate the
service.
a Stop your loopy process.
b Stop all storage resources.
c Unconfigure the virtual IP address.
2 On the other cluster system, import your disk group and bring up the remaining
storage resources and the virtual IP address.
3 Start the loopy application and verify that it is running.
4 After you have verified that all resources are working properly on the second
system, stop all resources.

Lab 6: Starting and Stopping VCS
A
vcs1
train1 train2
## hastop
hastop all
all -force
-force
Lab 6 Synopsis: Starting and Stopping VCS

The following procedure demonstrate how the cluster configuration changes states
during startup and shutdown, and shows how the .stale file works.
Lab 6: Starting and Stopping VCS, page B-37
Lab 6 Solutions: Starting and Stopping VCS, page C-63
Note: Complete this section with your lab partner.
1 Verify that there is no .stale file in the configuration directory.
2 Open the cluster configuration and verify that the .stale file has been
created.
3 Try to stop VCS.
4 Stop VCS forcibly and leave the applications running.

5 Start VCS on each system in the cluster and check the cluster status.
Why are all systems in the STALE_ADMIN_WAIT state?
6 Verify that the .stale file is present.
7 Return all systems to a running state (from one system in the cluster). View the
build process to see the LOCAL_BUILD and REMOTE_BUILD system
states.
8 Verify that there is no .stale file.

Lab 7: Online Configuration of a Service Group
Use the Java GUI to:
A
Create a service
group.
Add resources to
the service group
from the bottom of
the dependency
tree.
Substitute the
name you used to
create the disk
group and volume.
Lab 7 Synopsis: Online Configuration of a Service Group

The purpose of this lab is to create a service group while VCS is running using
either the Cluster Manager graphical user interface or the command-line interface.
Lab 7: Online Configuration of a Service Group, page B-41
Lab 7 Solutions: Online Configuration of a Service Group, page C-67

Creating a Service Group
Fill in the design worksheet with values appropriate for your cluster and use the
information to create a service group.

Group nameSG1
Required Attributes
FailOverPolicy Priority
Optional Attributes
1 Create the service group using the values in the table.
2 Save the cluster configuration and view the configuration file to verify your
changes.

Adding Resources to a Service Group
Add NIC, IP, DiskGroup, Volume, and Process resources to the service group
using the information from the design worksheets.
A
After each resource is added:
Bring each resource online.
Save the cluster configuration.

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)

Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Address 192.168.xx.** see table
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.51
train2 192.168.xx.52
train3 192.168.xx.53
train4 192.168.xx.54
train5 192.168.xx.55
train6 192.168.xx.56
train7 192.168.xx.57
train8 192.168.xx.58
train9 192.168.xx.59
train10 192.168.xx.60
train11 192.168.xx.61
train12 192.168.xx.62

Resource Name nameDG1
A
Resource Type DiskGroup
Required Attributes
DiskGroup nameDG1
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)

Resource Name nameVol1
Resource Type Volume
Required Attributes
Volume nameVol1
DiskGroup nameDG1
Critical? No (0)
Enabled? Yes (1)

Resource Name nameMount1
Resource Type Mount
Required Attributes
MountPoint /name1
BlockDevice /dev/vx/dsk/nameDG1/
nameVol1 (no spaces)
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)

Resource Name nameProcess1
Resource Type Process
Required Attributes
PathName /bin/sh
Optional Attributes
Arguments /name1/loopy name 1
Critical? No (0)
Enabled? Yes (1)

Linking Resources in the Service Group
After you have verified that all resources are online, link the resources as shown in
worksheet.
A
nameVol1 nameDG1
nameMount1 nameVol1
nameIP1 nameNIC1
nameProcess1 nameMount1
nameProcess1 nameIP1
Testing the Service Group
1 Test the service group by switching it between systems.
2 Set each resource to critical.
changes.
4 Close the cluster configuration after all students working in your cluster are
finished.

Lab 8: Offline Configuration of a Service Group
nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

Mount1 IP1 IP2 Mount2
name name name name

AppVol
Vol1 NIC1 NIC2 Vol2
name name
App
DG1 Working DG2
Workingtogether,
together,follow
followthe
theoffline
offline DG
configuration
configurationprocedure.
procedure.
Alternately,
Alternately,work
workalone
aloneand
anduse
usethe
the
GUI
GUIto
tocreate
createaanew
newservice
servicegroup.
group.
Lab 8 Synopsis: Offline Configuration of a Service Group

The purpose of this lab is to add a service group by copying and editing the
definition in main.cf for nameSG1.
Lab 8: Offline Configuration of a Service Group, page B-57
Lab 8 Solutions: Offline Configuration of a Service Group, page C-89
Lab Assignments
Complete the following worksheet for the resources managed by the service
groups you create in this lab. Then follow the procedure to configure the resources.

Your system host name Use the same system as
your_sys previous labs
A
Partner system host name Use the same system as
their_sys previous labs
Name prefix for your name
objects
Disk assignment for disk Solaris: c#t#d#

group AIX: hdisk##
HP-UX: c#t#d#
Linux: sd##
Disk group name nameDG2
Volume name nameVol2 (2Gb)
Mount point /name2
Application script location

Prepare Resources
Use the values in the table to prepare resources for VCS.

1 Create a disk group using the convention specified in the worksheet.
2 Create a 2 GB volume and a vxfs file system.
3 Create a mount points, mount the file system on your cluster system, and verify
it is mounted.
4 Copy the loopy script to this file system.
5 Start the loopy and verify that the application is working correctly.
6 Stop the resources to prepare to place them under VCS control in the next
section of the lab.

Completing the Design Worksheet
In the design worksheet, record information needed to create a new service group
using the offline process described in the next section.
A
Group nameSG2
Required Attributes
Optional Attributes

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)

Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.71
train2 192.168.xx.72
train3 192.168.xx.73
train4 192.168.xx.74
train5 192.168.xx.75
train6 192.168.xx.76
train7 192.168.xx.77
train8 192.168.xx.78
train9 192.168.xx.79
train10 192.168.xx.80
train11 192.168.xx.81
train12 192.168.xx.82

A
Required Attributes
DiskGroup nameDG2
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)

Required Attributes
Volume nameVol2
DiskGroup nameDG2
Critical? No (0)
Enabled? Yes (1)

Resource Type Mount
Required Attributes
MountPoint /name2
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)

Required Attributes
PathName /bin/sh
Optional Attributes
Critical? No (0)
Enabled? Yes (1)

A
nameVol2 nameDG2
nameMount2 nameVol2
nameIP2 nameNIC2
Modifying a VCS Configuration File
1 Working with your lab partner, verify that the cluster configuration is saved
and closed.
2 Make a test subdirectory of the configuration directory.
3 Create copies of the main.cf and types.cf files in the test subdirectory.
Linux
Also copy the vcsApacheTypes.cf file.
4 One student at a time, modify the main.cf file in the test directory on one
system in the cluster.
a Copy the first students nameSG1 service group structure to a nameSG2

and rename all of the resources within the nameSG1 service group to end
with 2 instead of 1, as shown in the following table.

Existing Name Change To New Name
nameProcess1 nameProcess2
nameIP1 nameIP2
nameNIC1 nameNIC2
nameMount1 nameMount2
nameVol1 nameVol2
nameDG1 nameDG2
b Copy and modify the dependency section.
c Repeat this for the other students service group.
5 Edit the attributes of each copied resource to match the design worksheet
values shown earlier in this section.
6 Verify the cluster configuration and fix any errors found.
7 Stop VCS on all systems, but leave the applications still running.
8 Copy the main.cf file from the test subdirectory into the configuration
directory.
9 Start the cluster from the system where you edited the configuration file and
start the other system in the stale state.
10 Bring the new service group online on your system. Students can bring their
own service groups online.
11 Verify the status of the cluster.

Lab 9: Creating a Parallel Service Group
A
nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

name name name name

DBVol
Vol1 Proxy1 Proxy2 Vol2
name name
DB
DG1 Network Network DG2
DG
NIC Phantom NetworkSG
NetworkSG
Lab 9 Synopsis: Creating a Parallel Service Group

The purpose of this lab is to add a parallel service group to monitor the NIC
resource and replace the NIC resources in the failover service groups with Proxy
resources.
Lab 9: Creating a Parallel Service Group, page B-73
Lab 9 Solutions: Creating a Parallel Service Group, page C-109

Creating a Parallel Network Service Group
Work with your lab partner to create a parallel service group containing network
resources using the information in the design worksheet.

Group NetworkSG
Required Attributes
Parallel 1
Optional Attributes
AutoStartList train1 train2

Adding Resources
Use the values in the following tables to create NIC and Phantom resources and
then bring them online. Remember to save the cluster configuration.
A
Service Group NetworkSG
Resource Name NetworkNIC
Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Critical? No (0)
Enabled? Yes (1)

Resource Name NetworkPhantom
Resource Type Phantom
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Replacing NIC Resources with Proxy Resources
Working on your own, use the values in the tables to replace the NIC resources
with Proxy resources and create new links.

Resource Name nameProxy1
Resource Type Proxy
Required Attributes
TargetResName NetworkNIC
Critical? No (0)
Enabled? Yes (1)

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Resource Name csgProxy
Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Linking Resources and Testing the Service Group
1 Use the values in the tables to replace the NIC resources with Proxy resources
and create new links.
A
2 Switch each service group (nameSG1, nameSG2, ClusterService) to ensure
that they can run on each system.
3 Set all resources to critical.
4 Save and close the cluster configuration.

nameIP1 nameProxy1

nameIP2 nameProxy2

webip csgProxy

Lab 10: Configuring Notification
nameSG1 nameSG2
ClusterService
NotifierMngr
Optional Lab
resfault
resfault
Triggers
Triggers nofailover
nofailover SMTP
SMTPServer:
Server:
resadminwait
resadminwait
___________________________________
___________________________________
Lab 10 Synopsis: Configuring Notification

The purpose of this lab is to configure notification.
Lab 10: Configuring Notification, page B-85
Lab 10 Solutions: Configuring Notification, page C-125

Configuring the NotifierMngr Resource
1 Work with your lab partner to add a NotifierMngr type resource to the
A
ClusterService service group using the information in the design worksheet.
2 Bring the resource online and test the service group by switching it between
systems.
3 Set the notifier resource to critical.
4 Save and close the cluster configuration and view the configuration file to
verify your changes.

Resource Name notifier
Resource Type NotifierMngr
Required Attributes
SmtpServer localhost
SmtpRecipients root Warning
PathName /xxx/xxx (AIX only)
Critical? No (0)
Enabled? Yes (1)
Note: In the next lab, you will see the effects of configuring notification and
triggers when you test various resource fault scenarios.

Optional Lab: Configuring Triggers
Use the following procedure to configure triggers for notification. In this lab, each
student creates a local copy of the trigger script on their own system. If you are
working alone in the cluster, copy your completed triggers to the other system.
1 Create a text file in the /opt/VRTSvcs/bin/triggers directory named

resfault. Add the following lines to the file:
#!/bin/sh
echo `date` > /tmp/resfault.msg
echo message from the resfault trigger >> /tmp/
resfault.msg
echo Resource $2 has faulted on System $1 >> /tmp/
resfault.msg
echo Please check the problem. >> /tmp/resfault.msg
/usr/lib/sendmail root </tmp/resfault.msg
rm /tmp/resfault.msg
2 Create a nofailover trigger using the same script, replacing resfault

with nofailover.
3 Create a resadminwait trigger using the same script, replacing resfault

with resadminwait.
4 Ensure that all trigger files are executable.
5 If you are working alone, copy all triggers to the other system.

Lab 11: Configuring Resource Fault Behavior
Critical=0
A
Critical=1
FaultPropagation=0
nameSG1 FaultPropagation=1
nameSG2
ManageFaults=NONE
ManageFaults=ALL
RestartLimit=1
Note:
Note:Network
Networkinterfaces
interfacesfor
forvirtual
virtualIP
IPaddresses
addresses
are
areunconfigured
unconfiguredtotoforce
forcethe
theIP
IPresource
resourcetotofault.
fault.
In
Inyour
yourclassroom,
classroom,the
theinterface
interfaceyou
youspecify
specifyis:______
is:______
Replace
Replacethe
thevariable
variableinterface
interfacein
inthe
thelab
labsteps
stepswith
withthis
this
value.
value.
Lab 11 Synopsis: Configuring Resource Fault Behavior

The purpose of this lab is to observe how VCS responds to faults in a variety of
scenarios.
Lab 11: Configuring Resource Fault Behavior, page B-93
Lab 11 Solutions: Configuring Resource Fault Behavior, page C-133
Non-Critical Resource Faults
This part of the lab exercise explores the default behavior of VCS. Each student
works independently in this lab.
1 Verify that all resources in the nameSG1 service group are currently set to
critical; if not, set them to critical.
2 Set the IP and Process resources to not critical in the nameSG1 service group.

3 Change the monitor interval for the IP resource type to 10 seconds and the
offline monitor interval for the IP resource type to 30 seconds.
4 Verify that your nameSG1 service group is currently online on your system.
5 Unconfigure the interface corresponding to the virtual IP addressoutside of

VCS.
What happens?
6 Clear any faults.
7 Bring the IP and Process resources back online on your system.
8 Set the IP and process resource to critical in the nameSG1 service group.
Critical Resource Faults
critical.
2 Verify that your nameSG1 service group is currently online on your system.

VCS.
What happens?
4 Without clearing faults from the last failover, unconfigure the virtual IP
address on their system.
What happens?
5 Clear the nameIP1 resource on all systems and bring the nameSG1 service
group online on your system.

Faults within Frozen Service Groups
critical.
A
2 Verify that your nameSG1 Service group is currently online on your system.
3 Freeze the nameSG1 service group.

VCS.
What happens?
5 Bring up the virtual IP address outside of VCS.
What happens?
6 Unconfigure the virtual IP address outside of VCS to fault the IP resource

again. While the resource is faulted, unfreeze the service group.
7 Did unfreezing the service group cause a failover or any resources to come
offline? Explain why or why not.
8 Clear the fault and bring the resource online.

Effects of ManageFaults and FaultPropagation
critical.
2 Set the FaultPropagation attribute for the nameSG1 service group to off (0).

VCS.
What happens?
4 Clear the faulted resource and bring the resource back online.
5 Set the ManageFaults attribute for the nameSG1 service group to NONE and
set the FaultPropagation attribute back to one (1).

VCS.
What happens?
7 Recover the resource from the ADMIN_WAIT state.

VCS.
What happens?
9 Recover the resource from the ADMIN_WAIT state by faulting the service
group.
10 Clear the faulted nameIP1 resource and switch the nameSG1 service group
back to your system.
11 Set ManageFaults back to ALL for the nameSG1 service group and save the

RestartLimit Behavior
This section illustrates failover behavior of a resource type using restart limits.
1 Verify that all resources in the nameSG1 service group are set to critical.
A
2 Set the RestartLimit Attribute for the Process resource type to 1.
3 Stop the loopy process running in the nameSG1 service group by sending a
kill signal.
What happens?
kill signal.
Note: The effects of stopping loopy can take up to 60 seconds to be detected.
What happens?
5 Clear the faulted resource and switch the nameSG1 service group back to your
system.
6 When all students have completed the lab, save and close the configuration.

Lab 13: Testing Communication Failures
1. Configure the InJeopardy trigger (optional).
2. Configure a low-priority link.
3. Test failures.
trainxx
trainxx
O trainxx
trainxx
Optional Lab
Trigger
Trigger injeopardy
injeopardy
Lab 13 Synopsis: Testing Communication Failures

The purpose of this lab is to configure a low-priority link and then pull network
cables and observe how VCS responds.
Lab 13 Details: Testing Communication Failures, page B-101
Lab 13 Solutions: Testing Communication Failures, page C-149

Optional Lab: Configuring the InJeopardy Trigger
Use the following procedure to configure triggers for jeopardy notification. In this
lab, students create a local copy of the trigger script on their own systems.
A
injeopardy. Add the following lines to the file:
#!/bin/sh
echo `date` > /tmp/injeopardy.msg
echo message from the injeopardy trigger >> /tmp/
injeopardy.msg
echo System $1 is in Jeopardy >> /tmp/injeopardy.msg
echo Please check the problem. >> /tmp/injeopardy.msg
/usr/lib/sendmail root </tmp/injeopardy.msg
rm /tmp/injeopardy.msg
2 Make the trigger file executable.
3 If you are working alone, copy the trigger to the other system.
4 Continue with the next lab sections. The Multiple LLT Link Failures
Jeopardy section of this lab shows the effects of configuring the InJeopardy
trigger.

Adding a Low-Priority Link
Working with your lab partner, use the procedures to create a low-priority link and
then fault communication links and observe what occurs in a cluster environment
when fencing is not configured.

Public Ethernet interface Solaris: eri0
for link low-pri Sol Mob: dmfe0
AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0
Cluster interconnect link 1 Solaris: qfe0
Sol Mob: dmfe0
AIX: en2
HP-UX lan1
Linux: eth1
VA: bge2
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
Host name for sysname file train1
for your_sys
for their_sys
2 Shut down VCS, leaving the applications running on all systems in the cluster.
3 Unconfigure GAB and LLT on each system in the cluster.

4 Edit the /etc/llttab LLT configuration file on each system to add a
directive for a low-priority LLT link on the public network.
Solaris Mobile
Skip this step for mobile classrooms. There is only one public interface and it
A
is already configured as a low-priority link.
5 Start LLT and GAB on each system.
6 Verify GAB membership.
7 Start VCS on each system.
Single LLT Link Failure
Note: For Solaris mobile classrooms, skip this section.
1 Copy the lltlink_enable and lltlink_disable utilities from the

location provided by your instructor into the /tmp directory.
_____________________________________________________________
2 Change the NIC resource type MonitorInterval attribute to 3600 seconds

temporarily for communications testing. This prevents the NetworkNIC
resource from faulting during this lab when the low-priority LLT link is pulled.
3 Throughout this lab, use the lltlink_disable command to simulate

failure of an LLT link where you are instructed to remove a link.
Notes:
Use lltlink_enable to restore the LLT link.
The utilities prompt you to select an interface.
These classroom utilities are provided to enable you to simulate
disconnecting and reconnecting Ethernet cables without risk of damaging
connectors.
Run the utility from one system only, unless otherwise specified.

4 Using the lltlink_disable utility, remove one LLT link and watch for
the link to expire in the console or system log file.
5 Restore communications using lltlink_enable.
Multiple LLT Link FailuresJeopardy
1 Verify the status of GAB.
2 Remove all but one LLT link and watch for the link to expire in the console.
Solaris Mobile
Remove only the one high-priority LLT link (dmfe1).
4 Restore communications by replacing the LLT link cables.
Multiple LLT Link FailuresNetwork Partition
1 Verify the status of GAB from each system.
Solaris Mobile
3 From each system, verify that the links are down by checking the status of
GAB.
4 Remove the last LLT link and watch for the link to expire in the console.

5 What is the status of service groups running on each system?
6 Recover from the network partition.
A
7 Change the NIC resource type MonitorInterval attribute back to 60 seconds.

Lab 14: Configuring I/O Fencing
Work with your lab partner to configure fencing.
trainxx trainxx
Disk 1:___________________
Disk 2:___________________ Coordinator Disks
Disk 3:___________________
nameDG1, nameDG2
Lab 14 Synopsis: Configuring I/O Fencing

Use the lab instructions in one of the following appendixes.
Lab 14: Configuring I/O Fencing, page B-111
Lab 14 Solutions: Configuring I/O Fencing, page C-163

Appendix B
Lab Details
B2 VERITAS Cluster Server for UNIX, Fundamentals
B
Lab 2: Validating Site Preparation B3

Visually
Visuallyinspect
inspectthe
theclassroom
site.
Complete
Use
Usethe
thelab
labappendix
appendixbest
bestsuited
suitedto
toyour
your
experience
experiencelevel:
level:
?? Appendix
AppendixA:
A:Lab
LabSynopses
Synopses
?? Appendix
AppendixB:
B:Lab
LabDetails
Details
?? Appendix
AppendixC:
C:Lab
LabSolutions
Solutions
train2
train1
System train1
System train2
See
Seethe
thenext
nextslide
slidefor
forlab
labassignments.
assignments.
In this lab, you work with your partner to prepare the systems for installing VCS.
Brief instructions for this lab are located on the following page:
Lab 2 Synopsis: Validating Site Preparation, page A-2
Lab 2 Solutions: Validating Site Preparation, page C-3
Lab Assignments
Fill in the table with the applicable values for your lab cluster.

your_sys

their_sys
objects

Sol Mob: dfme0
AIX: en2
HP-UX lan1
Linux: eth2
VA bge2

Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth3
VA bge3
B
AIX: en1
HP-UX lan0
Linux: eth1
VA bge0
your_sys

their_sys

Four NodeUNIX
Software Share
192.168.XX.100
train6 train7
192.168.XX.106 Hub/Switch 192.168.XX.107
192.168.XX.105 192.168.XX.108
train4 train9
192.168.XX.104 192.168.XX.109
train3 train10
Hub/Switch
Hub/Switch
Hub/Switch
Hub/Switch
192.168.XX.103 192.168.XX.110
SAN
train2 Disk train11
192.168.XX.102 Array 192.168.XX.111
train1 SAN train12
192.168.XX.101 Tape 192.168.XX.112
Library
LAN LAN
2 Verify that the public interface is cabled, as shown in the diagram.

Virtual Academy
Skip this step.
3 Determine the host name of the local system.
4 Determine the base IP address configured on the public network interface for
both your system and your partners system.
5 Verify that the public IP address of each system in your cluster is listed in the
/etc/hosts file.

6 Test connectivity to your partners system on the public network.
Other Checks
B
When you install any Storage Foundation product or VERITAS Volume

Replicator, the VRTSalloc package (the VERITAS Volume Manager Intelligent
Storage Provisioning feature) requires that the following Red Hat packages are
installed:
compat-gcc-c++-7.3-2.96.128
To determine whether these library versions are installed, type:
# rpm -qi compat-gcc-c++
# rpm -qi compat-libstdc++

To configure ssh:
1 Log on to your system.
2 Generate a DSA key pair on this system by running the following command:
ssh-keygen -t dsa
3 Accept the default location of ~/.ssh/id_dsa.
4 When prompted, do not enter a passphrase.
5 Change the permissions of the .ssh directory by typing:
# chmod 755 ~/.ssh
6 The file ~/.ssh/id_dsa.pub contains a line beginning with ssh_dss and

ending with the name of the system on which it was created.
a Copy this line to the /root/.ssh/authorized_keys2 file on all

systems where VCS is to be installed.
b Ensure that you copy the line to the other systems in your cluster.
c To ensure easy accessibility, include all of the ssh_dss lines in the

authorized_keys2 file on each system in the cluster. This allows
commands to be run from any system to any system.

ssh-add
To save your passphrase during your GNOME session, follow these steps:
1 The openssh-askpass-gnome package should be loaded on your system.
To confirm this, type:
B
rpm -q openssh-askpass-gnome
If it is not installed, see your instructor.
2 If you do not have a $HOME/.Xclients file (you should not have one after
installation), run switchdesk to create it. In your $HOME/.Xclients file,
edit the following:
exec $HOME/.Xclients-default
Change the line so that it reads:
exec /usr/bin/ssh-agent $HOME/.Xclients-default
3 From the Red Hat icon, select Preferences>More Preferences>Sessions.
a Click the Startup Programs Tab and Add and enter /usr/bin/ssh-add
in the Startup Command text area.
b Set the priority to a number higher than any existing commands to ensure
that it is executed last. A good priority number for ssh-add is 70 or
higher. The higher the priority number, the lower the priority. If you have
other programs listed, this one should have the lowest priority.
c Click OK to save your settings, and exit the GNOME Control Center.
4 Log out and then log back into GNOME; in other words, restart X.
After GNOME is started, a dialog box is displayed, prompting for your

passphrases. Enter the passphrase requested. If you have both DSA and RSA
key pairs configured, you are prompted for both. From this point on, you
should not be prompted for a password by ssh, scp, or sftp.
For more information, see the Linux Customization Guide.

a Select Run from the main menu or Gnome foot icon.
b Type this command:
xterm -C -fg white -bg black -sl 2000 &
This opens a console window with a white foreground, a black background,

and a scroll line buffer of 2000 lines.
From the RedHat icon, select System Tools>System Logs.

B
Lab 3: Installing VCS B11

vcs1
Link 1:______
Link 1:______ Link 2:______
Link 2:______
train1 train2
4.x ## ./installer
4.x
location:_______________________________
Pre-4.0
./installvcs
Subnet:_______
In this lab, you work with your lab partner to install VCS on both systems.
Lab 3 Synopsis: Installing VCS, page A-6
Lab 3 Solutions: Installing VCS, page C-13

File System.

VCS.
B
AIX: en2
HP-UX lan1
Linux: eth1
VA: bge2
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
Public network Solaris: eri0
interface AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0

VCS.
Web GUI IP Address:

Sol Mob: dmfe0
AIX: en0
HP-UX lan0
Linux: eth0
VA: bge0
only)
location
install_dir
License

Password password

____________________________________________________________
B
2 This step is to be performed from only one system in the cluster. The install

directory specified above.
Notes:
For VCS 4.x, install Storage Foundation HA (which includes VCS,
Volume Manager, and File System).
Use the information in the previous table or design worksheet to
respond to the installation prompts.
Sample prompts and input are provided at the end of the lab solution in
Appendix C.
For versions of VCS before 4.0, use installvcs.
here.
License Key: _________________________________
e Accept the default of Y to configure VCS.

3 If you did not install the Java GUI package as part of the installer (CPI)
instructor.


instructor.
B
_______________________________________
install_dir
operating system-specific command, as shown in the following examples.
Solaris
pkgadd -d /install_dir/pkgs VRTSxxxx
HP
swinstall -s /install_dir/pkgs VRTSxxxx
AIX
installp -a -d /install_dir/pkgs/VRTSxxxx.rte.bff
VRTSxxxx.rte
Linux
rpm -ihv VRTSxxxx-x.x.xx.xx-GA_RHEL.i686.rpm


ClusterService service group is online on one of your cluster systems, VCS has
been properly installed and configured.


a Verify that the cluster ID, system names, and network interfaces specified
during install are present in the /etc/llttab file.
B
b Verify the system names in the /etc/llthosts file.
Verify that the number of systems in the cluster matches the value for the
-n flag set in the /etc/gabtab file.
Verify the cluster name, system names, and IP address for the Cluster Manager
in the /etc/VRTSvcs/conf/config/main.cf file.

password.

The URL is http://ipaddress:8181/vcs.
The IP address is given in the design worksheet and was entered during
installation to configure the Cluster Manager.
Cluster alias: nameCluster
Host name: ip_address (used during installation)
Failover retries: 12 (retain default)

B
Lab 4: Using the VCS Simulator B21

hasimgui &
2. Add a cluster.
directory.
Simulator GUI.
Java Console
oper.
VCS.
Seenext
See nextslide
slidefor
forlab
labassignments.
assignments.
Lab 4 Synopsis: Using the VCS Simulator, page A-18
Lab 4 Solutions: Using the VCS Simulator, page C-35


Port 15559
password

File Locations

Lab main.cf file
cf_files_dir
Local Simulator
config directory
sim_config_dir



4 Add a cluster.

System Name: S1
Port: 15559
Platform: Solaris
WAC Port: -1

Specify this directory in place of sim_config_dir variable elsewhere in

the lab.

your instructor into the vcs_operations simulation configuration directory.
Source location of main.cf, types.cf, and OracleTypes.cf files:
___________________________________________
cf_files_dir
8 From the Simulator GUI, start the vcs_operations cluster.
9 Launch the VCS Java Console for the vcs_operations simulated cluster.

10 Log in as oper with password oper.


AppSG
OracleSG
ClusterService
account?
attribute enabled?

What happens?
What happens?
B
What happens?

1 Attempt to take the OraListener resource in OracleSG offline on S3.
2 Bring the OraListener resource online on S3.
What happens?
What happens?
5 Fault the Oracle (oracle) resource in the OracleSG service group.
What happens?
11 Save and close the configuration.
12 Log off from the GUI.
13 Stop the simulator.

B
Lab 5: Preparing Application Services B29


NIC NIC
do do
done done
bobDG1 sueDG1
/bob1 bobVol1
disk1 sueVol1 /sue1
disk2
Disk/Lun Disk/Lun
See
Seenext
nextslide
slidefor
forclassroom
classroomvalues.
values.
Lab 5 Synopsis: Preparing Application Services, page A-24
Lab 5 Solutions: Preparing Application Services, page C-51
Lab Assignments

your_sys
their_sys
objects

group: disk_dev AIX: hdisk##
HP-UX: c#t#d#
Linux: sd##

Volume name nameVol1
Mount point /name1
Public network interface: Solaris: eri0

AIX: en1
HP-UX lan0
B
Linux: eth0
VA bge0
IP Address train1 192.168.xxx.51
ipaddress train2 192.168.xxx.52
train3 192.168.xxx.53
train4 192.168.xxx.54
train5 192.168.xxx.55
train6 192.168.xxx.56
train7 192.168.xxx.57
train8 192.168.xxx.58
train9 192.168.xxx.59
train10 192.168.xxx.60
train11 192.168.xxx.61
train12 192.168.xxx.62
class_sw_dir

1 Verify disk availability for Volume Manager.
2 Determine whether any disks are already in use in disk groups.
3 Initialize a disk for Volume Manager using the disk device from the worksheet.
4 Create a disk group with the name from the worksheet using the initialized
disk.
5 Create a 2 GB volume in the disk group.
6 Create a vxfs file system on the volume.
7 Create a mount point on each system in the cluster.
8 Mount the file system on your cluster system.
9 Verify that the file system is mounted on your system.

Complete the following steps to set up a virtual IP address for the application.
2 Configure a virtual IP address on the public network interface. Use the IP

B
3 Verify that the virtual IP address is configured.

__________________________________________________________
2 Copy or type this code to a file named loopy on the file system you created
previously in this lab.
3 Verify that you have a console window open to see the display from the script.

1 Stop your loopy process by sending a kill signal. Verify that the process is
stopped.
2 Remove the virtual IP address configured earlier in this lab. Verify that the IP
B
address is no longer configured.
3 Unmount your file system and verify that it is no longer mounted.
4 Stop the volume and verify that it is disabled.
5 Deport your disk group and verify that it is deported.
6 Log in to the other system.
7 Update VxVM so that the disk group is visible.
8 Import your disk group and verify that it imported.
9 Start your volume and verify that it is enabled.
10 Verify that your mount point directory exists. Create it if it does not exist.
11 Mount your file system and verify that it is mounted.
12 Configure your virtual IP address and verify that it is configured.
13 Start the loopy application and verify that it is running.

Bringing the Services Offline
Complete the following steps to bring the application offline on the other system
so that it is ready to be placed under VCS control.
1 While still logged into the other system, stop your loopy process by sending a
kill signal. Verify that the process is stopped.

B
Lab 6: Starting and Stopping VCS B37

vcs1
train1 train2
## hastop
hastop all
all -force
-force
Lab 6 Synopsis: Starting and Stopping VCS, page A-29
Lab 6 Solutions: Starting and Stopping VCS, page C-63
1 Change to the /etc/VRTSvcs/conf/config directory.
2 Verify that there is no .stale file in the /etc/VRTSvcs/conf/config

directory. This file should not exist yet.
3 Open the cluster configuration.
4 Verify that the .stale file has been created in the directory,
/etc/VRTSvcs/conf/config.
5 Attempt to stop VCS using the hastop -all command.

6 Stop the cluster using the hastop -all -force command from one system
only to stop VCS forcibly and leave the applications running.
7 Start VCS on each system in the cluster.
9 Why are all systems in the STALE_ADMIN_WAIT state?
B
10 Verify that the .stale file is present in the /etc/VRTSvcs/conf/config
directory. This file should exist.
11 Return all systems to a running state (from one system in the cluster).
12 Watch the console during the build process to see the LOCAL_BUILD and
REMOTE_BUILD system states.
13 Check the status of the cluster.

directory. This file should have been removed.
Lab 6: Starting and Stopping VCS B39

B
Lab 7: Online Configuration of a Service
Group
Lab 7: Online Configuration of a Service Group B41

Create a service
group.
Add resources to
the service group
from the bottom of
the dependency
tree.
Substitute the
name you used to
create the disk
group and volume.
Lab 7 Synopsis: Online Configuration of a Service Group, page A-31
Lab 7 Solutions: Online Configuration of a Service Group, page C-67
Classroom-Specific Values
Fill in this table with the applicable values for your lab cluster.

Service group prefix name
name
your_sys
their_sys


Group nameSG1
Required Attributes
B
Optional Attributes
1 If you are using the GUI, start Cluster Manager and log in to the cluster.
3 Create the service group.
4 Modify the SystemList to allow the service group to run on the two systems
specified in the design worksheet.
5 Modify the AutoStartList attribute to allow the service group to start on your
system.
6 Verify that the service group can autostart and that it is a failover service group.
changes.

Complete the following steps to add NIC, IP, DiskGroup, Volume, and Process
resources to the service group using the information from the design worksheet.
Adding an NIC Resource

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)
1 Add the resource to the service group.
2 Set the resource to not critical.
3 Set the required attributes for this resource, and any optional attributes, if
needed.
4 Enable the resource.

5 Verify that the resource is online. Because this is a persistent resource, you do
not need to bring it online.
changes.

Adding an IP Resource

Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth1
VA: bge0
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.51
train2 192.168.xx.52
train3 192.168.xx.53
train4 192.168.xx.54
train5 192.168.xx.55
train6 192.168.xx.56
train7 192.168.xx.57
train8 192.168.xx.58
train9 192.168.xx.59
train10 192.168.xx.60
train11 192.168.xx.61
train12 192.168.xx.62

needed.
B
5 Bring the resource online on your system.
6 Verify that the resource is online.
changes.

Adding a DiskGroup Resource

Required Attributes
DiskGroup nameDG1
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)
1 Add the resource to the service group using either the GUI or CLI.
needed.
6 Verify that the resource is online in VCS and at the O/S level.
changes.

Adding a Volume Resource

Required Attributes
B
Volume nameVol1
DiskGroup nameDG1
Critical? No (0)
Enabled? Yes (1)
needed.
6 Verify that the resource is online in VCS and at the operating system level.
changes.

Adding a Mount Resource

Resource Type Mount
Required Attributes
MountPoint /name1
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)
needed.
changes.

Adding a Process Resource

Required Attributes
B
PathName /bin/sh
Optional Attributes
Critical? No (0)
Enabled? Yes (1)
needed.
5 Ensure that you have the console or a terminal window open for loopy output.
changes.


nameVol1 nameDG1
nameMount1 nameVol1
nameIP1 nameNIC1
1 Link resource pairs together based on the design worksheet.
2 Verify that the resources are linked properly.
changes.

Complete the following steps to test the service group on each system in the
service group SystemList.
1 Test the service group by switching away from your system in the cluster.
2 Verify that the service group came online properly on their system.
B
3 Test the service group by switching it back to your system in the cluster.
4 Verify that the service group came online properly on your system.

Setting Resources to Critical
changes.
finished.

Partial Sample Configuration File
group nameSG1 (
SystemList = { train1 = 0, train2 = 1 }
AutoStartList = { train1 }
)
B
DiskGroup nameDG1 (
DiskGroup = nameDG1
)
IP nameIP1 (
Device = eri0
Address = "192.168.27.51"
)
Mount nameMount1 (
MountPoint = "/name1"
BlockDevice = "/dev/vx/dsk/nameDG1/
nameVol1"
FSType = vxfs
FsckOpt = "-y"
)
Process nameProcess1 (
PathName = "/bin/sh"
Arguments = "/name1/loopy name 1"
)
NIC nameNIC1 (
Device = eri0
)

Volume nameVol1 (
Volume = nameVol1
DiskGroup = nameDG1
)
nameIP1 requires nameNIC1

nameMount1 requires nameVol1
nameProcess1 requires nameIP1
nameProcess1 requires nameMount1
nameVol1 requires nameDG1

B
Lab 8: Offline Configuration of a Service
Group
Lab 8: Offline Configuration of a Service Group B57

nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

name name name name

AppVol
Vol1 NIC1 NIC2 Vol2
name name
App
DG1 Working DG2
Workingtogether,
together,follow
followthe
theoffline
offline DG
configuration
procedure.
Alternately,
Alternately,work
workalone
aloneand
anduse
usethe
the
GUI
GUIto
tocreate
createaanew
newservice
servicegroup.
group.
Lab 8 Synopsis: Offline Configuration of a Service Group, page A-38
Lab 8 Solutions: Offline Configuration of a Service Group, page C-89
Lab Assignments

objects

group AIX: hdisk##
HP-UX: c#t#d#
Linux: sd##

Mount point /name2
B
class_sw_dir

Prepare Resources
disk.
5 Create a vxfs file system on the volume.
9 Copy the loopy script to your file system created in this lab.
10 Start the new loopy application.
11 Verify that the new loopy application is working correctly.

section of the lab.
a Stop the loopy process by sending a kill signal. Verify that the process is
stopped.
b Unmount your file system and verify that it is no longer mounted.
c Stop the volume and verify that it is disabled.
B
d Deport your disk group and verify that it is deported.

In the design worksheet, record information needed to create a new service group
using the offline process described in the next section.

Group nameSG2
Required Attributes
Optional Attributes

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)

Resource Type IP
Required Attributes
Sol Mob: dmfe0
B
AIX: en1
HP-UX: lan0
Linux: eth0
VA bge0
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.71
train2 192.168.xx.72
train3 192.168.xx.73
train4 192.168.xx.74
train5 192.168.xx.75
train6 192.168.xx.76
train7 192.168.xx.77
train8 192.168.xx.78
train9 192.168.xx.79
train10 192.168.xx.80
train11 192.168.xx.81
train12 192.168.xx.82

Required Attributes
DiskGroup nameDG2
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)

Required Attributes
Volume nameVol2
DiskGroup nameDG2
Critical? No (0)
Enabled? Yes (1)

Resource Type Mount
Required Attributes
MountPoint /name2
B
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)

Required Attributes
PathName /bin/sh
Optional Attributes
Critical? No (0)
Enabled? Yes (1)

nameVol2 nameDG2
nameMount2 nameVol2
nameIP2 nameNIC2

Note: You may choose to use the GUI to create the nameSG2 service group. If so,
skip this section and complete the Alternate Lab section instead.
and closed.
B
2 Change to the VCS configuration directory.
3 Make a subdirectory named test.
4 Copy the main.cf and types.cf files into the test subdirectory.
Linux
5 Change to the test directory.
6 Edit the main.cf file in the test directory on one system in the cluster.
a For each students service group, copy the nameSG1 service group
structure to a nameSG2.
b Rename all of the resources within the nameSG1 service group to end with
2 instead of 1, as shown in the following table.

nameIP1 nameIP2
nameNIC1 nameNIC2
nameVol1 nameVol2
nameDG1 nameDG2
c Copy and modify the dependency section.

10 Verify that the loopy applications are still running.
directory.
12 Start the cluster from the system where you edited the configuration file.
13 Start the cluster in the stale state on the other system in the cluster (where the
configuration was not edited).
15 View the build process to see the LOCAL_BUILD and REMOTE_BUILD

system states.

Alternate Lab: Using the GUI to Create the Service Group
Use the information in the design worksheet in the previous section to create a new
service group, using the GUI to copy resources from the nameSG1 service group.
1 Start Cluster Manager and log in to the cluster.
B
system.
changes.
8 Copy all resources from the nameSG1 service group to nameSG2.
Note: When you paste a copied resource or resource tree, the Name Clashes
window is displayed, which enables you to rename each resource you are
pasting.
Change the resource names as shown in the table:

nameIP1 nameIP2
nameNIC1 nameNIC2
nameVol1 nameVol2
nameDG1 nameDG2
9 Set each resource to not critical.
10 Modify each resource to set the attribute values as specified in the worksheet.
changes.
12 Enable each resource.
13 Bring the nameSG2 resources online, starting from the bottom of the
dependency tree.
Note: In the GUI, the Close configuration action saves the configuration
automatically.

group nameSG2 (
)
DiskGroup nameDG2 (
B
DiskGroup = nameDG2
)
IP nameIP2 (
Device = eri0
Address = "192.168.27.71"
)
Mount nameMount2 (
nameVol2"
FSType = vxfs
FsckOpt = "-y"
)
)
NIC nameNIC2 (
Device = eri0
)

Volume nameVol2 (
Volume = nameVol2
DiskGroup = nameDG2
)


B
Lab 9: Creating a Parallel Service Group B73

nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

name name name name

DBVol
name name
DB
DG
NetworkSG
resources.
Lab 9 Synopsis: Creating a Parallel Service Group, page A-47
Lab 9 Solutions: Creating a Parallel Service Group, page C-109

Group NetworkSG
Required Attributes
Parallel 1
Optional Attributes

3 Modify the SystemList to allow the service group to run on the systems
4 Modify the AutoStartList attribute to allow the service group to start on both
B
systems.
5 Modify the Parallel attribute to allow the service group to run on both systems.
6 View the service group attribute settings.

Adding Resources
Use the values in the following tables to create NIC and Phantom resources.

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Critical? No (0)
Enabled? Yes (1)

Required Attributes
Critical? No (0)
Enabled? Yes (1)
1 Add the NIC resource to the service group.
needed.

5 Verify that the resource is online. Because it is a persistent resource, you do not
need to bring it online.
6 Add the Phantom resource to the service group.
B
9 Verify that the status of the NetworkSG service group now shows as online.
10 Save the cluster configuration and view the configuration file.

Use the values in the tables to replace the NIC resources with Proxy resources and
create new links.

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

1 Delete all NIC resources in the ClusterService, nameSG1, and nameSG2
service groups.
Note: Only one student can delete the ClusterService NIC resource.
2 Add a proxy resource to each failover service group using the service group
naming convention:
nameProxy1
nameProxy2
csgProxy
B
3 Set the value for each Proxy TargetResName attribute to NetworkNIC.
4 Set the resources to not critical.
5 Enable the resources.
6 Verify that the Proxy resources are in an online state.
7 Save the cluster configuration.

create new links.

nameIP1 nameProxy1

nameIP2 nameProxy2

webip csgProxy
1 Link the Proxy resources as children of the corresponding IP resources of each

service group.


Sample Configuration File
include "types.cf"
cluster vcs (
UserNames = { admin = ElmElgLimHmmKumGlj }
ClusterAddress = "192.168.27.51"
Administrators = { admin }
B
CounterInterval = 5
)
system train1 (
)
system train2 (
)
group ClusterService (
AutoStartList = { train1, train2 }
OnlineRetryLimit = 3
Tag = CSG
)
IP webip (
Device = eri0
Address = "192.168.27.42"
NetMask = "255.255.255.0"
)
Proxy csgProxy (
TargetResName = NetworkNIC
)
VRTSWebApp VCSweb (

Critical = 0
AppName = vcs
InstallDir = "/opt/VRTSweb/VERITAS"
TimeForOnline = 5
)
VCSweb requires webip

webip requires csgProxy
group NetworkSG (
Parallel = 1
AutoStartList = ( train1, train2 }
)
NIC NetworkNIC (
Device = eri0
)
Phantom NetworkPhantom (
)
group nameSG1 (
)
DiskGroup nameDG1 (
DiskGroup = nameDG1
)
IP nameIP1 (
Device = eri0
Address = "192.168.27.51"
)

Mount nameMount1 (
nameVol1"
FSType = vxfs
FsckOpt = "-y"
)
B
)
Proxy nameProxy1 (
)
Volume nameVol1 (
Volume = nameVol1
DiskGroup = nameDG1
)
nameIP1 requires nameProxy1

group nameSG2 (
)
DiskGroup nameDG2 (

DiskGroup = nameDG2
)
IP nameIP2 (
Device = eri0
Address = "192.168.27.71"
)
Mount nameMount2 (
nameVol2"
FSType = vxfs
FsckOpt = "-y"
)
)
Proxy nameProxy2 (
)
Volume nameVol2 (
Volume = nameVol2
DiskGroup = nameDG2
)


B
Lab 10: Configuring Notification B85

nameSG1 nameSG2
ClusterService
NotifierMngr
Optional Lab
resfault
resfault
Triggers
Triggers nofailover
nofailover SMTP
SMTPServer:
Server:
resadminwait
resadminwait
___________________________________
___________________________________

Lab 10 Synopsis: Configuring Notification, page A-52
Lab 10 Solutions: Configuring Notification, page C-125

Work with your lab partner to add a NotifierMngr type resource to the

B
Required Attributes
Critical? No (0)
Enabled? Yes (1)
4 Set the required attributes for this resource and any optional attributes, if
needed.
6 Link the notifier resource to csgproxy.
7 Bring the resource online on the system running the ClusterService service
group.


1 Test the service group by switching it to the other system in the cluster.
2 Verify that the service group came online properly on the other system.
3 Test the service group by switching it back to the original system in the cluster.
B
4 Verify that the service group came online properly on the original system.

#!/bin/sh
resfault.msg
resfault.msg

nofailover. Add the following lines to the file.
#!/bin/sh
echo `date` > /tmp/nofailover.msg
echo message from the nofailover trigger >> /tmp/
nofailover.msg
echo no failover for service group $2 >> /tmp/
nofailover.msg
echo Please check the problem. >> /tmp/nofailover.msg
/usr/lib/sendmail root </tmp/nofailover.msg
rm /tmp/nofailover.msg

resadminwait. Add the following lines to the file.
#!/bin/sh
echo `date` > /tmp/resadminwait.msg
echo message from the resadminwait trigger >> /tmp/
resadminwait.msg
echo Resource $2 on System $1 is in adminwait for
Reason $3 >> /tmp/resadminwait.msg
echo Please check the problem. >> /tmp/resadminwait.msg
B
/usr/lib/sendmail root </tmp/resadminwait.msg
rm /tmp/resadminwait.msg

B
Lab 11: Configuring Resource Fault
Behavior
Lab 11: Configuring Resource Fault Behavior B93

Critical=0
Critical=1
FaultPropagation=0
nameSG2
ManageFaults=NONE
ManageFaults=ALL
RestartLimit=1
Note:
Note:Network
Networkinterfaces
interfacesfor
forvirtual
virtualIP
IPaddresses
addresses
are
areunconfigured
forcethe
theIP
IPresource
resourcetotofault.
fault.
In
Inyour
yourclassroom,
classroom,the
theinterface
interfaceyou
youspecify
specifyis:______
is:______
Replace
Replacethe
thevariable
variableinterface
interfacein
inthe
thelab
labsteps
stepswith
withthis
this
value.
value.
scenarios.
Lab 11 Synopsis: Configuring Resource Fault Behavior, page A-55
Lab 11 Solutions: Configuring Resource Fault Behavior, page C-133

This part of the lab exercise explores the default behavior of VCS. Each student
works independently in this lab.
B
6 Verify that your nameSG1 service group is currently online on your system. If
it is not, bring it online or switch it to your system.

VCS.
a What happens to the resources?
b Does the service group fail over?
c Did you receive e-mail notification?
8 Clear any faults.

critical.
2 Set all resources to critical, if they are not already set, and save the cluster
configuration.
it is not online locally, bring it online or switch it to your system.

VCS.

critical.
configuration.
B
3 Verify that your nameSG1 Service group is currently online on your system. If

VCS.

What happens?


This section illustrates service group failover behavior using the ManageFaults
and FaultPropagation attributes.
critical.
configuration.

VCS.

VCS.
8 Recover the resource from the ADMIN_WAIT state by bringing up the IP

address outside of VCS and clearing the AdminWait attribute without a fault.

VCS.
B
group.

1 Verify that all resources in the nameSG1 service group are set to critical.
2 Set all resources to critical and save the cluster configuration.
kill signal.
kill signal.
Note: The effects of stopping loopy can take up to 60 seconds to be detected.
system.

B
Lab 13 Details: Testing Communication
Failures
Lab 13 Details: Testing Communication Failures B101

3. Test failures.
trainxx
trainxx
O trainxx
trainxx
Optional Lab
Trigger
Trigger injeopardy
injeopardy
Lab 13 Synopsis: Testing Communication Failures, page A-60
Lab 13 Solutions: Testing Communication Failures, page C-149

lab, students create a local copy of the trigger script on their own systems. If you
are working alone in the cluster, copy your completed triggers to the other system.

B
#!/bin/sh
injeopardy.msg
trigger.

Adding a Low-Priority Link
Working with your lab partner, use the procedures to create a low-priority link and
then fault communication links and observe what occurs in a cluster environment
when fencing is not configured.

AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0
Sol Mob: dmfe0
AIX: en2
HP-UX lan1
Linux: eth1
VA: bge2
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
for your_sys
for their_sys
3 Unconfigure GAB on each system in the cluster.
4 Unconfigure LLT on each system in the cluster.

Solaris Mobile
6 Start LLT on each system.
7 Verify that LLT is running.
B
8 Start GAB on each system.
11 Verify that VCS is running.


_____________________________________________________________
2 Change to the /tmp directory.
cd /tmp


Notes:
connectors.
6 Verify that the link is down.
7 Restore communications using the lltlink_enable utility.
8 Verify that the link is now up and communications are restored.

2 Use lltlink_disable to remove all but one LLT link and watch for the
link to expire in the console.
Solaris Mobile
B
3 Verify that the links are down.

Solaris Mobile
3 Verify that the links are down from each system.
6 Verify that all links are down from each system.
a Stop HAD on one system but leave services running.
Note: If you have more than two systems in the cluster, you must stop
HAD on all systems on either side of the network partition.
b If you physically unplugged cables, restore communications reconnecting

the LLT link cables.
Note: If you used lltlink_disable to simulate link failure, skip this

step.
c Verify that the LLT connections are up.

d Verify that GAB has proper membership.
e Start VCS on the system where you stopped it.
f Verify that each service group is autoenabled.

B
Lab 14: Configuring I/O Fencing B111

trainxx trainxx
Disk 1:___________________
Disk 3:___________________
nameDG1, nameDG2
The purpose of this lab is to set up I/O fencing in a two-node cluster and simulate
node and communication failures.
Lab 14 Synopsis: Configuring I/O Fencing, page A-66
Lab 14 Solutions: Configuring I/O Fencing, page C-163

Lab Assignments
Working with your lab partner, use the following procedure and the information
provided in the table to configure fencing for your cluster.

Disk assignments for cXtXdXsX
coordinator disk group cXtXdXsX
cXtXdXsX
Disk group name oddfendg
B
or
evenfendg
/etc/vxfendg oddfendg
or
evenfendg
UseFence cluster attribute SCSI3

Configuring Disks and Fencing Driver
1 Configure a disk group for the coordinator disks.
a Initialize three disks for use in the disk group.
b Display your cluster ID. Your cluster ID determines your coordinator disk
group name.
c Initialize the disk group.

If your cluster ID is odd, use oddfendg for the disk group name.
If your cluster ID is even, use evenfendg for the disk group name.
Note: Replace the placeholder string "______fendg" with the

appropriate odd or even coordinator disk name throughout the remainder of
this lab.
d Deport the disk group.
2 Optional for the classroom: Use the vxfentsthdw utility to verify that the
shared storage disks support SCSI-3 persistent reservations.
Notes:
For the purposes of this lab, you do not need to test the disks. The disks
used in this lab support SCSI-3 persistent reservations. The complete steps
are given here as a guide for real-world use.
To see how the command is used, you can run vxfentsthdw on a disk
not in use; this will enable you to continue with the lab while the
vxfentsthdw is running.
Create a test disk group with one disk and run vxfentsthdw on that test
disk group.
Use the -r option to perform read-only testing of data disks.
3 Enter the coordinator disk group name in the /etc/vxfendg fencing

configuration file on each system in the cluster.
4 Start the fencing driver on each system using the vxfen init script.

5 Verify that the /etc/vxfentab file has been created on each system and it
contains a list of the coordinator disks.
6 Verify the setup of the coordinator disks.
a Verify that port b GAB membership is listed for both nodes.
b Verify that registrations are assigned to the coordinator disks.
How many keys are present for each disk and why?
B
c

Configuring VCS for Fencing
1 Verify that you have a Storage Foundation Enterprise license installed on each
system for fencing support using vxlicrep.
2 Working together, verify that the cluster configuration is saved and closed.
4 Make a subdirectory named test, if one does not already exist.
7 Edit the main.cf file on that one system to set UseFence to SCSI3.
8 Verify the cluster configuration and correct any errors found.
9 Stop VCS and shut down the applications. The disk groups must be reimported
for fencing to take effect.
directory.
14 Verify that the UseFence cluster attribute is set.

Verifying Data Disks for I/O Fencing
1 If the service groups with disk groups did not come online at cluster startup,
bring them online now. This imports the disk groups, which initiates fencing
on the data disks. Each student can perform these steps on their service groups.
2 Verify registrations and reservations on the data disks.

Testing Communication Failures
In most cases, the following sections require that you work together with your lab
partner to observe how fencing protects data in a variety of failure situations.
Steps you can perform on your own are indicated within the procedure.
Scenario 1: Manual Concurrency Violation

Students can try this scenario on their own. Try to import a disk group imported on
one system on another system using vxdg with the -C option.
1 On the system where nameDG1 is not imported, attempt to manually import it

clearing the host locks.
2 Were you successful? Describe why or why not.
Scenario 2: Response to System Failure

Work with your lab partner to observe how VCS responds to system failures.
1 Verify that the nameSG1 and nameSG2 service groups are online on your
system if two students are working on the cluster. If you are working alone,
ensure that you have a service group online on each system. This scenario
requires that disk groups be imported on each system. Switch them, if
necessary.
2 Verify the registrations on the coordinator disks for both systems.
3 Verify the registrations and reservations on the data disks for the disk groups
imported on each system.
4 Fail one of the systems by removing power or hard booting the system.
Observe the failure.
5 Verify the registrations on the coordinator disks for the remaining system.
6 Verify that the service groups that were running on the failed system have
failed over to the remaining system.

7 Verify that the registrations and reservations on the data disks are now for the
remaining system.
8 Boot the failed system and observe it rejoin cluster membership. Verify cluster
membership and verify that the coordinator disks have registrations for both
systems again.
Scenario 3: Response to Interconnect Failures

Work with your lab partner to observe how VCS responds to cluster interconnect
B
failures.
1 If you did not already perform this step in the Testing Communication
Failures lab, copy the lltlink_enable and lltlink_disable
utilities from the location provided by your instructor into the /tmp directory.
_____________________________________________________________

temporarily for the purposes of communications testing. This prevents the
NetworkNIC resource from faulting during this lab when the low-priority LLT
link is pulled.
requires that one disk group be imported on each system. Switch the service
groups, if necessary.
7 Using the lltlink_disable utility, remove all cluster interconnect links

from one system. Watch for the link to expire in the console.

8 Observe LLT and GAB timeouts and membership change.
9 What happens to the systems?
10 On one system, view the registrations for the coordinator disks.
11 What happens to the service groups?
remaining system.
13 When the system that rebooted is running, check the status of GAB and HAD.
14 Verify that the coordinator disks have registrations for the remaining system
only.
15 Recover the system that rebooted.
a Shut down the system.
b Reconnect the cluster interconnects.
c Reboot the system.
16 Verify that cluster membership has been established for both systems and both
systems are now registered with the coordinator disks.
17 Set the NIC resource type's monitor interval to back to 60.

Optional: Removing the Fencing Configuration
Note: Do not complete this section unless directed by your instructor.
1 Verify that the cluster configuration is saved and closed.
2 Stop VCS and all service groups.
B
3 Unconfigure the fencing driver.
4 From one system, import and remove the coordinator disk group.
5 Use the offline configuration procedure to set the UseFence cluster attribute to
the value NONE in the main.cf file and restart the cluster with the new
configuration.
Note: You cannot set UseFence dynamically while VCS is running.
a Change to the configuration directory.
b Copy the main.cf file into the test subdirectory.
c Edit the main.cf file in the test directory on one system in the cluster to
set the value of UseFence to NONE.
7 Copy the main.cf file back into the /etc/VRTSvcs/conf/config

directory.

Appendix C
Lab Solutions
C2 VERITAS Cluster Server for UNIX, Fundamentals
Lab 2 Solutions: Validating Site
Preparation
Lab 2 Solutions: Validating Site Preparation C3

Visually
Visuallyinspect
inspectthe
theclassroom
site.
Complete
Use
Usethe
thelab
labappendix
appendixbest
bestsuited
suitedto
toyour
your
experience
experiencelevel:
level:
?? Appendix
AppendixA:
A:Lab
LabSynopses
Synopses
?? Appendix
AppendixB:
B:Lab
LabDetails
Details
?? Appendix
AppendixC:
C:Lab
LabSolutions
Solutions
train2
train1
System train1
System train2
See
Seethe
thenext
nextslide
slidefor
forlab
labassignments.
assignments.
In this lab, you work with your partner to prepare the systems for installing VCS.
Lab 2 Synopsis: Validating Site Preparation, page A-2
Lab 2: Validating Site Preparation, page B-3
Lab Assignments
Fill in the following table with the applicable values for your lab cluster.

your_sys

their_sys

objects

Sol Mob: dfme0
AIX: en2
HP-UX lan1
Linux: eth2
VA bge2

Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth3
VA bge3
AIX: en1
HP-UX lan0
Linux: eth1
VA bge0
C
your_sys

their_sys

Four NodeUNIX
Software Share
192.168.XX.100
train6 train7
192.168.XX.106 Hub/Switch 192.168.XX.107
192.168.XX.105 192.168.XX.108
train4 train9
192.168.XX.104 192.168.XX.109
train3 train10
Hub/Switch
Hub/Switch
Hub/Switch
Hub/Switch
192.168.XX.103 192.168.XX.110
SAN
train2 Disk train11
192.168.XX.102 Array 192.168.XX.111
train1 SAN train12
192.168.XX.101 Tape 192.168.XX.112
Library
LAN LAN
2 Verify that the public interface is cabled, as shown in the diagram.

Virtual Academy
Skip this step.
3 Determine the host name of the local system.
hostname
4 Determine the base IP address configured on the public network interface for
both your system and your partners system.
ifconfig public_interface

5 Verify that the public IP address of each system in your cluster is listed in the
/etc/hosts file.
cat /etc/hosts
6 Test connectivity to your partners system on the public network.
ping public_IP_address

Other Checks
echo $PATH | grep VRTSvcs
If you are using the Bourne Shell (sh, ksh, or bash), use the following
command:
$PATH=/sbin:/usr/sbin:/opt/VRTS/bin:/opt/VRTSvcs/
bin:$PATH;
export PATH
If you are using the C Shell (csh or tcsh), use the following command:
% setenv PATH /sbin:/usr/sbin:/opt/VRTS/bin:/opt/
VRTSvcs/bin:$PATH

vxlicrep -s
License Key = P2EE-TCBU-FSUN-NDOR-3JEP-CWEO

Product Name = VERITAS Cluster Server
License Type = DEMO
Demo End Date = Sat Nov 8 00:00:00 2003
(15.3 days from now).

When you install any Storage Foundation product or VERITAS Volume

Replicator, the VRTSalloc package (the VERITAS Volume Manager Intelligent
Storage Provisioning feature) requires that the following Red Hat packages are
installed:
compat-gcc-c++-7.3-2.96.128
To determine whether these library versions are installed, type:
# rpm -qi compat-gcc-c++
# rpm -qi compat-libstdc++

1 Log on to your system.
2 Generate a DSA key pair on this system by running the following command:
ssh-keygen -t dsa
3 Accept the default location of ~/.ssh/id_dsa.
4 When prompted, do not enter a passphrase.
5 Change the permissions of the .ssh directory by typing:
# chmod 755 ~/.ssh
6 The file ~/.ssh/id_dsa.pub contains a line beginning with ssh_dss and

ending with the name of the system on which it was created.
a Copy this line to the /root/.ssh/authorized_keys2 file on all

systems where VCS is to be installed.
b Ensure that you copy the line to the other systems in your cluster.
c To ensure easy accessibility, include all of the ssh_dss lines in the

authorized_keys2 file on each system in the cluster. This allows
commands to be run from any system to any system.

ssh-add
To save your passphrase during your GNOME session, follow these steps.
1 The openssh-askpass-gnome package should be loaded on your system.

To confirm this, type:
rpm -q openssh-askpass-gnome
If it is not installed, see your instructor.
2 If you do not have a $HOME/.Xclients file (you should not have one after
C
installation), run switchdesk to create it. In your $HOME/.Xclients file,
edit the following:
exec $HOME/.Xclients-default
Change the line so that it reads:
exec /usr/bin/ssh-agent $HOME/.Xclients-default
3 From the Red Hat icon, select Preferences>More Preferences>Sessions.
a Click the Startup Programs Tab and Add and enter /usr/bin/ssh-add
in the Startup Command text area.
b Set the priority to a number higher than any existing commands to ensure
that it is executed last. A good priority number for ssh-add is 70 or
higher. The higher the priority number, the lower the priority. If you have
other programs listed, this one should have the lowest priority.
c Click OK to save your settings, and exit the GNOME Control Center.
4 Log out and then log back into GNOME; in other words, restart X.
After GNOME is started, a dialog box is displayed, prompting for your

passphrases. Enter the passphrase requested. If you have both DSA and RSA
key pairs configured, you are prompted for both. From this point on, you
should not be prompted for a password by ssh, scp, or sftp.
For more information, see the Linux Customization Guide.

a Select Run from the main menu or Gnome foot icon.
b Type this command:
xterm -C -fg white -bg black -sl 2000 &
This opens a console window with a white foreground, a black background,

and a scroll line buffer of 2000 lines.
From the RedHat icon, select System Tools>System Logs.

Lab 3 Solutions: Installing VCS
Lab 3 Solutions: Installing VCS C13

vcs1
Link 1:______
Link 1:______ Link 2:______
Link 2:______
train1 train2
4.x ## ./installer
4.x
location:_______________________________
Pre-4.0
./installvcs
Subnet:_______
In this lab, work with your lab partner to install VCS on both systems.
Lab 3 Synopsis: Installing VCS, page A-6
Lab 3: Installing VCS, page B-11

File System.

Cluster Definition These values define cluster properties and are required to
install VCS.
AIX: en2
HP-UX lan1
C
Linux: eth1
VA: bge2
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0

Cluster Definition These values define cluster properties and are required to
install VCS.
Web GUI IP Address:

Sol Mob: dfme0
AIX: en0
HP-UX lan0
Linux: eth0
VA: bge0
only)
location
install_location
License

Password password

Installation software directory:
_____________________________________________________________
install_dir
2 This step is to be performed from only one system in the cluster. The install
C
cd install_dir

directory specified above.
Notes:
For VCS 4.x, install Storage Foundation HA (which includes VCS,
Volume Manager, and File System).
Use the information in the previous table or design worksheet to
respond to the installation prompts.
Sample prompts and input are provided at the end of the lab.
For versions of VCS before 4.0, use installvcs.
here.
License Key: _________________________________
e Accept the default of Y to configure VCS.

3 If you did not install the Java GUI package as part of the installer (VPI)
instructor.
Solaris
pkgadd -d /install_dir/cluster_server/pkgs VRTScscm
HP
swinstall -s /install_dir/cluster_server/pkgs VRTScscm
AIX
installp -a -d /install_dir/cluster_server/pkgs/
VRTScscm.rte.bff VRTScscm.rte
Linux
rpm -ihv VRTScscm-4.1.00.0-GA_GENERIC.noarch.rpm


instructor.
_______________________________________
install_dir
operating system-specific command, as shown in the following examples.
Solaris
C
pkgadd -d /install_dir/pkgs VRTSxxxx
HP
swinstall -s /install_dir/pkgs VRTSxxxx
AIX
installp -a -d /install_dir/pkgs/VRTSxxxx.rte.bff
VRTSxxxx.rte
Linux
rpm -ihv VRTSxxxx-x.x.xx.xx-GA_RHEL.i686.rpm

hastatus -sum

ClusterService service group is online on one of your cluster systems, VCS
has been properly installed and configured.


Solaris
pkginfo | grep -i vrts
AIX
lslpp -L | grep -i vrts
HP-UX
swlist | grep -i vrts
Linux
rpm -qa | grep VRTS
lltconfig
gabconfig -a

a Verify that the cluster ID, system names, and network interfaces specified
during install are present in the /etc/llttab file.
cat /etc/llttab
b Verify the system names in the /etc/llthosts file.
cat /etc/llthosts
C
Verify that the number of systems in the cluster matches the value for the
-n flag set in the /etc/gabtab file.
cat /etc/gabtab

Verify the cluster name, system names, and IP address for the Cluster Manager
in the /etc/VRTSvcs/conf/config/main.cf file.
cat /etc/VRTSvcs/conf/config/main.cf

password.

The URL is http://ipaddress:8181/vcs.
The IP address is given in the design worksheet and was entered during
installation to configure the Cluster Manager.
Cluster alias: nameCluster
Host name: ip_address (used during installation)
Failover retries: 12 (retain default)
hagui &
Select File>New Cluster.

Sample Installation Answers
VERITAS Storage Foundation and High Availability Solutions

4.1
VERITAS Licensing utilities are not installed on this

system.
Product menu cannot be displayed until VERITAS Licensing
utilities are installed.
Selection Menu:
I) Install/Upgrade a Product C) Configure an Installed
C
Product
L) License a Product P) Perform a
Preinstallation Check
U) Uninstall a Product D) View a Product
Description
Q) Quit ?) Help
Enter a Selection: [I,C,L,P,U,D,Q,?] I
VERITAS Storage Foundation and High Availability Solutions

4.1
1) VERITAS Cluster Server

2) VERITAS File System
3) VERITAS Volume Manager
4) VERITAS Volume Replicator
5) VERITAS Storage Foundation, Storage Foundation for
Oracle, Storage
Foundation for DB2, and Storage Foundation for Sybase
6) VERITAS Storage Foundation Cluster File System
7) VERITAS Storage Foundation for Oracle RAC
B) Back to previous menu
Select a product to install: [1-7,b,q] 5

Enter the system names separated by spaces on which to
install SF: train1 train2
Checking system communication:
Checking OS version on train1

............................... SunOS 5.9
Verifying communication with train2
................... ping successful
Attempting rsh with train2
............................. rsh successful
Attempting rcp with train2
............................. rcp successful
Checking OS version on train2
............................... SunOS 5.9
Creating log directory on train2
................................. Done
Logs for installer are being created in /var/tmp/

installer131114423.
Using /usr/bin/rsh and /usr/bin/rcp to communicate with

remote systems.
Initial system check completed successfully.
VERITAS Infrastructure package installation:
Installing VERITAS Infrastructure packages on train1:
Checking VRTScpi package

................................ not installed
. . .
Each system requires a SF product license before

installation. License keys for additional product features
should also be added at this time.

Some license keys are node locked and are unique per
system. Other license keys, such as demo keys and site
license keys, are registered on all systems and must be
entered on the first system.
SF Licensing Verification:
Checking SF license key on train1

........................ not licensed
Enter a SF license key for train1: [?] RRPE-BDP6-DRME-
NRFS-O47X-CNNN-3C
Registering VERITAS Storage Foundation Enterprise HA DEMO
key on train1
C
Do you want to enter another license key for train1?
[y,n,q,?] (n) n
Registering RRPE-BDP6-DRME-NRFS-O47X-CNNN-3C on train2

Checking SF license key on train2 ... Storage
Foundation Enterprise HA Demo
Do you want to enter another license key for train2?

[y,n,q,?] (n) n
SF licensing completed successfully.
installer can install the following optional SF packages:
VRTSobgui VERITAS Enterprise Administrator

VRTSvmman VERITAS Volume Manager Manual Pages
. . .
1) Install all of the optional packages
2) Install none of the optional packages
3) View package descriptions and select optional
packages
Select the optional packages to be installed on all

systems? [1-3,q,?] (1) 1
. . .

Installation requirement checks completed successfully.
Press [Return] to continue:
It is possible to install SF packages without performing

configuration.
SF cannot be started without proper configuration.
It is optional to configure SF now. If you choose to

configure SF later, you can either do so manually or run
the installsf -configure command.
Are you ready to configure SF? [y,n,q] (y) y
installer will now ask sets of SF configuration-related

questions.
To configure VCS for SF the following is required:
A unique Cluster name

A unique Cluster ID number between 0-255
Two or more NIC cards per system used for heartbeat
links
One or more heartbeat links are configured as

private links
One heartbeat link may be configured as a low
priority link
All systems are being configured to create one cluster
Enter the unique cluster name: [?] vcs1

Enter the unique Cluster ID number between 0-255: [b,?] 1
Discovering NICs on train3 ........ discovered eri0

qfe0 qfe1 qfe2 qfe3

Enter the NIC for the first private heartbeat NIC on
train3: [b,?] qfe0
Enter the NIC for the second private heartbeat NIC on
train3: [b,?] qfe1
Would you like to configure a third private heartbeat
link? [y,n,q,b,?] (n) n
Do you want to configure an additional low priority
heartbeat link?[y,n,q,b,?] (n) n
Are you using the same NICs for private heartbeat links on
all systems?[y,n,q,b,?] (y) y
Cluster information verification:
Cluster Name: vcs1

Cluster ID Number: 1
C
Private Heartbeat NICs for train1: link1=qfe0
link2=qfe1
Private Heartbeat NICs for train2: link1=qfe0
link2=qfe1
Is this information correct? [y,n,q] (y) y
Storage Foundation can be configured to utilize VERITAS

Security Services.
Running VCS in Secure Mode guarantees that all inter-

system communication is
encrypted and that users are verified with security
credentials.
When running VCS in Secure Mode, NIS and system usernames

and passwords are used to verify identity. VCS usernames
and passwords are no longer utilized when a cluster is
running in Secure Mode.
Before configuring a cluster to operate using VERITAS

Security Services, another system must already have
VERITAS Security Services installed and be operating as a
Root Broker. Refer to the Cluster Server Installation
Guide for more information on configuring a VxSS Root
Broker.

Would you like to configure SF to use VERITAS Security
Services? [y,n,q] (n) n
VERITAS STORAGE FOUNDATION 4.1 INSTALLATION

PROGRAM
The following information is required to add VCS users:
A user name
A password for the user
User privileges (Administrator, Operator, or
Guest)
Do you want to set the username and/or password for the

Admin user
(default username = 'admin', password='password')?
[y,n,q] (n) n
Do you want to add another user to the cluster? [y,n,q]
(y) n
VCS User verification:
User: admin Privilege: Administrators
Passwords are not displayed
Is this information correct? [y,n,q] (y) Y

The following information is required to configure Cluster
Manager:
A public NIC used by each system in the cluster

A Virtual IP address and netmask for Cluster Manager
Do you want to configure Cluster Manager (Web Console)

[y,n,q] (y) y
Active NIC devices discovered on train3: eri0
Enter the NIC for Cluster Manager (Web Console) to use on
train3: [b,?] (qfe1) eri0
Is qfe1 to be the public NIC used by all systems
[y,n,q,b,?] (y) y
Enter the Virtual IP address for Cluster Manager: [b,?]
C
192.168.XXX.XXX
Enter the netmask for IP 192.168.XXX.XXX: [b,?]
(255.255.255.0) 255.255.255.0
Cluster Manager (Web Console) verification:
NIC: eri0
IP: 192.168.27.91
Netmask: 255.255.255.0
Is this information correct? [y,n,q] (y)
The following information is required to configure SMTP

notification:
The domain-based hostname of the SMTP server

The e-mail address of each SMTP recipient
A minimum severity level of messages to send to each
recipient
Do you want to configure SMTP notification? [y,n,q] (y) n
The following information is required to configure SNMP

notification:

System names of SNMP consoles to receive VCS trap
messages
SNMP trap daemon port numbers for each console
A minimum severity level of messages to send to each
console
Do you want to configure SNMP notification? [y,n,q] (y) n
SF can be installed on systems consecutively or

simultaneously. Installing on systems consecutively takes
more time but allows for better error handling.
Would you like to install Storage Foundation Enterprise HA

on all systems
simultaneously? [y,n,q,?] (y) y
Installing Storage Foundation Enterprise HA 4.1 on all

systems simultaneously:
Copying VRTSperl.tar.gz to train2 ................. Done 1

of 123 steps
Installing VRTSperl 4.0.12 on train1 .............. Done 2
of 123 steps
. . .
The enclosure-based naming scheme is a feature of Volume

Manager. It allows one to reference disks using a symbolic
name that is more meaningful than the operating system's
normal device access name. This symbolic name is typically
derived from the array name.
Do you want to set up the enclosure-based naming scheme?

[y,n,q,?] (n) n
Do you want to start Storage Foundation Enterprise HA

processes now? [y,n,q] (y) y
Note: The vxconfigd daemon will be started, which can take

a while depending
upon the hardware configuration.

Disabling enclosure-based naming on train1
....................... Done
Starting vxconfigd for VxVM on train1
......................... Started
Disabling enclosure-based naming on train2
....................... Done
Starting vxconfigd for VxVM on train2
......................... Started
Starting Cluster Server:
Starting LLT on train1

........................................ Started
Starting LLT on train2
C
........................................ Started
Starting GAB on train1
........................................ Started
Starting GAB on train2
........................................ Started
Starting Cluster Server on train1
............................. Started
Starting Cluster Server on train2
............................. Started
Confirming Cluster Server startup ...................
2 systems RUNNING
Volume Manager default disk group configuration:
Many Volume Manager commands affect the contents or

configuration of a disk group. Such commands require that
the user specify a disk group. This is accomplished by
using the -g option of a command or setting the
VXVM_DEFAULTDG environment variable. An alternative to
these two methods is to configure the default disk group
of a system.
Do you want to set up the default disk group for each

system? [y,n,q,?] (y) n
Volume Manager default disk group setup and daemon startup

You declined to set up the default disk group for
train1.
Starting vxcached on train1
................................... Started
You declined to set up the default disk group for
train2.
Starting vxcached on train2
................................... Started
Storage Foundation Enterprise HA was started successfully.
Press [Return] to continue:
Installation of Storage Foundation Enterprise HA 4.1 has

completed successfully.
The installation summary is saved at:
/opt/VRTS/install/logs/installer131114527.summary
The installer log is saved at:
/opt/VRTS/install/logs/installer131114527.log
The installation response file is saved at:
/opt/VRTS/install/logs/installer131114527.response
Reboot all systems on which VxFS was installed or

upgraded.
shutdown -y -i6 -g0
See the VERITAS File System Administrators Guide for

information on using VxFS.

include "types.cf"
cluster vcs (
CredRenewFrequency = 0
CounterInterval = 5
)
C
system train1 (
)
system train2 (
)
)
IP webip (
Device = eri0
Address = "192.168.27.42"
NetMask = "255.255.255.0"
)
NIC csgnic (
Device = eri0
)
VRTSWebApp VCSweb (

Critical = 0
AppName = vcs
TimeForOnline = 5
)

webip requires csgnic

Lab 4 Solutions: Using the VCS Simulator
Lab 4 Solutions: Using the VCS Simulator C35

hasimgui &
2. Add a cluster.
directory.
Simulator GUI.
Java Console
oper.
VCS.
Seenext
See nextslide
slidefor
forlab
labassignments.
assignments.
Lab 4 Synopsis: Using the VCS Simulator, page A-18
Lab 4: Using the VCS Simulator, page B-21


Port 15559
password

File Locations

Lab main.cf file:
cf_files_dir
Local Simulator
config directory:
sim_config_dir


PATH=$PATH:/opt/VRTScssim/bin
export PATH

VCS_SIMULATOR_HOME=/opt/VRTScssim
export VCS_SIMULATOR_HOME
hasimgui &

4 Add a cluster.
Click Add Cluster.

System Name: S1
Port: 15559
Platform: Solaris
WAC Port: -1
C
cd /opt/VRTScssim/vcs_operations/conf/config
Specify this directory in place of sim_config_dir variable elsewhere in

the lab.

your instructor into the vcs_operations simulation configuration
directory.
Source location of main.cf, types.cf, and OracleTypes.cf files:
___________________________________________
cf_files_dir
cp cf_files_dir/main.cf /opt/VRTScssim/vcs_operations/
conf/config
cp cf_files_dir/types.cf /opt/VRTScssim/vcs_operations/
conf/config
cp cf_files_dir/OracleTypes.cf /opt/VRTScssim/
vcs_operations/conf/config
8 From the Simulator GUI, start the vcs_operations cluster.
Select vcs_operations under Cluster Name.

Click Start Cluster.

9 Launch the VCS Java Console for the vcs_operations simulated cluster.
Select vcs_operations under Cluster Name.

Click Launch Console.
C
10 Log in as oper with password oper.

11 Notice the cluster name is now VCS. This is the cluster name specified in the
new main.cf file you copied into the config directory.

3
With the Cluster object name selected in the left-hand frame of the
Cluster Manager, click on the Status tab in the right-hand frame.
Notice the Systems-> indicator and count the number of named columns,
that is, one for each cluster member.
With the Cluster object name selected in the left-hand frame of the
Cluster Manager, click on the Status tab in the right frame. The service
C
groups with their names as labels are shown.

AppSG Online Offline Offline
OracleSG Offline Online Offline
ClusterService Online Offline Offline
account?
AppSG and OracleSG.

For each service group:
a Click on the service group name in the left-hand frame of the Cluster
Manager.
b Click on the Properties tab.
c Click on the Show all attributes button.
d Scroll down and observe the value of the Operators service group
attributes. For each of these attributes, you may have to click on the ->
symbol in the Value column of the display panel. The value should be
oper, which is the user name with which you are logged into Cluster
Manager.
e Close the Show all attributes panel.

attribute enabled?
AppNIC, AppIP, and AppMount.

Mouse over each resource to see status.
Alternately:
a Click on the AppSG service group name in the left-hand frame of the
Cluster Manager.
b Click on the Resources tab in the right-hand frame. For each resource
shown in the dependency tree, right-click on the resource and observe
whether the Critical menu item is checked or not. A checked Critical
menu item indicates that the resource is set to critical.
OraListener.
a Click on the OracleSG service group name in the left-hand frame of
the Cluster Manager.
b Click on the Resources tab in the right-hand frame.
c Observe the top-most parent resource in the resource dependency tree.
OraMount.
a Click on the OracleSG service group name in the left-hand frame of
the Cluster Manager.
c Observe the child resources in the resource dependency tree for the
dependent parent resource named Oracle.

Right-click on the ClusterService service group name in the left-hand

panel of the Cluster Manager.
What happens?
There is no offline menu selection.

You cannot take the service group offline because you do not have
privileges for this service group.
C
Right-click on the AppSG service group, select Offline, and click S1.
What happens?
The Offline selection is displayed for this service group and you can take
the group offline because you have privileges for this service group.
Right-click on the OracleSG service group, select Offline, and click S1.
What happens?
OracleSG is currently online on system S2, so it is already offline S1.
For each service group for which you have privileges (AppSG and
OracleSG) and that is not already offline everywhere:
a Right-click the service group.
b Select the Offline menu option and click All Systems.
Note: The Simulator attempts to represent a real-world cluster environment so

resources may take some time to change state (offline/online). Wait for
service groups to show as fully offline or online before attempting further
operations.

a Right-click on the AppSG service group.

b Select the Online menu option and click S2.
a Right-click on the OracleSG service group.

b Select the Online menu option and click S1.
a Right-click on the AppSG service group.

b Select Switch To and click S1.
a Right-click on the OracleSG service group.

b Select the Switch To menu option and click S2.
a Right-click on the AppSG service group

b Select the Switch To menu option and click S3.
c Right-click on the OracleSG service group.
d Select the Switch To menu option and click S3.

1 Attempt to take the OraListener resource in OracleSG offline on system S3.
a Click on the OracleSG service group.

c Right-click on the OraListener resource, select the Offline menu
option, and click on system name S3.
Status shows as partial online on system S3.

OracleSG does not fail over; taking a resource offline does not cause
C
failover.
2 Bring the OraListener resource online on system S3.
Right-click on the OraListener resource, select the Online menu option,

and click on system name S3.
a Click on the S3 system name in the lower portion of the right-hand

frame.
b Right-click on the OraMount resource, select the Offline menu option,
and click on system name S3.
What happens?
You cannot take OraMount offline because a dependent resource

(Oracle) is online.

Right-click on the OraListener resource, select the Online menu option,

and click on system S1.
What happens?
You cannot bring OraListener online on S1 because OracleSG is a failover

service group that is already online on S3.
5 Fault the Oracle resource in the OracleSG service group.
Right-click on the Oracle resource and select Fault resource.
The resource is marked faulted (red x).

The service group is shown as faulted on S3 in the bottom row showing
each system. The S3 icon is surrounded by a red box.
The service group is brought offline on S3.
The service group is failed over to another system and brought online.
Click the exclamation point icon in the toolbar.
What happens?
a Right-click the OracleSG service group.

b Choose Switch To from the menu.
S3 is not available. You cannot switch a group to a system where it is
faulted.
a Right-click on Oracle.
b Choose Clear Fault from the menu.
c Choose S3.
The fault is now cleared.

a Right-click on the OracleSG.

b Choose Switch To from the menu.
c Choose S3.
The group should return to S3.
11 Save and close the configuration.
Select File>Close configuration.
12 Log off from the GUI.
Select File>Log Out.
C
13 Stop the simulator from the Simulator Java Console.
Select the vcs_operations cluster and click Stop Cluster.

Lab 5 Solutions: Preparing Application
Services
Lab 5 Solutions: Preparing Application Services C51


NIC NIC
do do
done done
bobDG1 sueDG1
/bob1 bobVol1
disk1 sueVol1 /sue1
disk2
Disk/Lun Disk/Lun
See
Seenext
nextslide
slidefor
forclassroom
classroomvalues.
values.
Lab 5 Synopsis: Preparing Application Services, page A-24
Lab 5: Preparing Application Services, page B-29
Lab Assignments

your_sys
their_sys
objects

group: disk_dev AIX: hdisk##
HP-UX: c#t#d#
Linux: sd##

Mount point /name1
Public network interface: Solaris: eri0

interface Sol Mob dmfe0
AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0
ipaddress train2 192.168.xxx.52
train3 192.168.xxx.53
C
train4 192.168.xxx.54
train5 192.168.xxx.55
train6 192.168.xxx.56
train7 192.168.xxx.57
train8 192.168.xxx.58
train9 192.168.xxx.59
train10 192.168.xxx.60
train11 192.168.xxx.61
train12 192.168.xxx.62
class_sw_dir

vxdisk list
2 Determine whether any disks are already in use in disk groups.
vxdisk -o alldgs list
vxdisksetup -i disk_device
disk.
vxdg init nameDG1 nameDG101=disk_device
vxassist -g nameDG1 make nameVol1 2g
6 Create a file system on the volume.

Solaris
mkfs -F vxfs /dev/vx/rdsk/nameDG1/nameVol1
HP-UX
AIX
mkfs -V vxfs /dev/vx/rdsk/nameDG1/nameVol1
Linux
mkfs -t vxfs /dev/vx/rdsk/nameDG1/nameVol1

All
mkdir /name1
Solaris, AIX
rsh their_sys mkdir /name1
HP-UX
remsh their_sys mkdir /name1
Linux
ssh their_sys mkdir /name1

Solaris
mount -F vxfs /dev/vx/dsk/nameDG1/nameVol1 /name1
HP-UX
AIX
mount -V vxfs /dev/vx/dsk/nameDG1/nameVol1 /name1
Linux
mount -t vxfs /dev/vx/dsk/nameDG1/nameVol1 /name1
mount | grep name1

Complete the following steps to set up a virtual IP address for the application.
Solaris, AIX, Linux

ifconfig -a
HP-UX
netstat -i
2 Configure a virtual IP address on the public network interface. Use the IP

Solaris
ifconfig interface addif ipaddress up
AIX
ifconfig interface inet ipaddress netmask mask alias
HP-UX
ifconfig interface inet ipaddress
Linux
ifconfig interface add ipaddress
3 Verify that the virtual IP address is configured.
ifconfig -a

__________________________________________________________
class_sw_dir
2 Copy or type this code into a file named loopy on the file system you created
previously in this lab.
cp /class_sw_dir/loopy /name1/loopy
C
3 Verify that you have a console window open to see the display from the script.
/name1/loopy name 1 &
Solaris, AIX, HP-UX

View the console and verify that loopy is echoing nameSG1 in the
message.
Linux
Use the System Log Viewer.

1 Stop your loopy process by sending a kill signal. Verify that the process is
stopped.
ps -ef | grep "loopy name 1"

kill -9 pid
Solaris
ifconfig -a
ifconfig virtual_interface unplumb
ifconfig -a
AIX
ifconfig -a
ifconfig interface ipaddress delete
ifconfig -a
HP-UX
netstat -in
ifconfig interface inet 0.0.0.0
netstat -i
Linux
ifconfig -a
ifconfig interface:instance down
ifconfig -a
umount /name1
mount | grep name1
vxvol -g nameDG1 stop nameVol1

vxprint | grep nameVol1

vxdg deport nameDG1

6 Log in to the other system.

Solaris, AIX, HP-UX
rlogin their_sys
Linux
ssh their_sys
Virtual Academy
Use the Operations pull-down menu to connect to the other system.
7 Update VxVM so that the disk group is visible.
C
vxdctl enable
8 Import your disk group and verify that it imported.
vxdg import nameDG1

vxdisk list
9 Start your volume and verify that it is enabled.
vxvol -g nameDG1 start nameVol1

10 Verify that your mount point directory exists. Create it if it does not exist.
ls -d /name1
mkdir /name1
11 Mount your file system.

Solaris
HP
AIX
Linux

12 Verify that it is mounted.
mount | grep name
13 Configure your virtual IP address and verify that it is configured.

Solaris
ifconfig -a
AIX
ifconfig -a
HP-UX
netstat -in
Linux
ifconfig interface add ipaddress
ifconfig -a
14 Start the loopy application.
15 Verify that it is running.

Solaris, AIX, HP-UX
Watch the console on their system, and ensure that loopy is echoing your
name in the message.
Linux
Use the System Log Viewer.

Bringing the Services Offline
Complete the following steps to bring the application offline on the other system
so that it is ready to be placed under VCS control.
1 While still logged into the other system, stop your loopy process by sending a
kill signal. Verify that the process is stopped.

kill -9 pid
Solaris
C
ifconfig -a
ifconfig virtual_interface unplumb
ifconfig -a
AIX
ifconfig -a
ifconfig -a
HP-UX
netstat -in
netstat -in
Linux
ifconfig -a
ifconfig interface:instance down
ifconfig -a
umount /name1
mount | grep name1


vxdg deport nameDG1


Lab 6 Solutions: Starting and Stopping
VCS
Lab 6 Solutions: Starting and Stopping VCS C63

vcs1
train1 train2
## hastop
hastop all
all -force
-force
Lab 6 Synopsis: Starting and Stopping VCS, page A-29
Lab 6: Starting and Stopping VCS, page B-37
1 Change to the /etc/VRTSvcs/conf/config directory.
cd /etc/VRTSvcs/conf/config

directory. This file should not exist yet.
ls -al .
haconf -makerw

4 Verify that the .stale file has been created in the directory,
/etc/VRTSvcs/conf/config.
ls -al .
5 Try to stop VCS using the hastop -all command.
hastop -all
The command should return an error asking to close the configuration or

stop with the -force option.
6 Stop the cluster using the hastop -all -force command from one system
only to stop VCS forcibly and leave the applications running.
hastop -all -force
C
7 Start VCS on each system in the cluster.
hastart
hastatus -summary
9 Why are all systems in the STALE_ADMIN_WAIT state?
The cluster configuration was left open when VCS was stopped.
10 Verify that the .stale file is present in the /etc/VRTSvcs/conf/config

directory. This file should exist.
ls -al /etc/VRTSvcs/conf/config
11 Return all systems to a running state (from one system in the cluster).
hacf -verify /etc/VRTSvcs/conf/config

hasys -force your_sys
Lab 6 Solutions: Starting and Stopping VCS C65

system states.
Solaris, AIX, HP-UX
Watch the console to see the build process.
Linux
Use the System Log Viewer to watch the build process.
Virtual Academy
Use dmesg or tail /var/adm/messages to see the VCS build states.
13 Check the status of the cluster.
hastatus -summary
Any service groups that were online at the time that the hastop -all
-force command was run should still be online now that VCS has been
restarted.

directory. This file should have been removed.
ls -al /etc/VRTSvcs/conf/config

Lab 7 Solutions: Online Configuration of a
Service Group
Lab 7 Solutions: Online Configuration of a Service Group C67

Create a service
group.
Add resources to
the service group
from the bottom of
the dependency
tree.
Substitute the
name you used to
create the disk
group and volume.
Lab 7 Synopsis: Online Configuration of a Service Group, page A-31
Lab 7: Online Configuration of a Service Group, page B-41
Classroom-Specific Values
Fill in this table with the applicable values for your lab cluster.

Service group prefix name
name
your_sys
their_sys


Group nameSG1
Required Attributes
Optional Attributes
C
1 If you are using the GUI, start Cluster Manager and log in to the cluster.
hagui &
GUI: Select File>Open configuration.
CLI: haconf -makerw
GUI: Right-click your cluster name in the left panel and select Add
Service Group.
CLI: hagrp -add nameSG1
GUI: Select each system and click the right arrow button.
CLI: hagrp -modify nameSG1 SystemList your_sys 0

their_sys 1

system.
GUI: Click the Startup box for your system; then click OK to create the
service group.
CLI: hagrp -modify nameSG1 AutoStartList your_sys
GUI: Right click the service group, select Properties, and click Show all
attributes.
CLI: hagrp -display nameSG1
changes.
GUI: Select File>Save configuration.
CLI: haconf -dump
view /etc/VRTSvcs/conf/config/main.cf

Complete the following steps to add NIC, IP, DiskGroup, Volume, and Process
resources to the service group using the information from the design worksheet.
Adding an NIC Resource

Resource Type NIC
Required Attributes
C
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)
GUI:
a Right-click the service group and select Add Resource.
b Type the name from the table.
c Select the resource type from the list.
CLI: hares -add nameNIC1 NIC nameSG1
GUI: Clear Critical.
CLI: hares -modify nameNIC1 Critical 0

needed.
GUI: For each attribute in the table:

a Click Edit.
b Double-click in the Value field.
c Type the values you entered in your table.
CLI:
Solaris
hares -modify nameNIC1 Device interface
AIX
HP-UX
hares -modify nameNIC1 NetworkHosts other_system1
other_system2
Linux
GUI: Check Enabled and click OK to complete resource configuration.
CLI: hares -modify nameNIC1 Enabled 1
5 Verify that the resource is online. Because this is a persistent resource, you do
not need to bring it online.
GUI: Verify that the resource icon is blue.
CLI: hares -display nameNIC1
changes.
GUI: Select File>Close configuration.
CLI: haconf -dump -makero

Adding an IP Resource

Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
C
VA: bge0
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.51
train2 192.168.xx.52
train3 192.168.xx.53
train4 192.168.xx.54
train5 192.168.xx.55
train6 192.168.xx.56
train7 192.168.xx.57
train8 192.168.xx.58
train9 192.168.xx.59
train10 192.168.xx.60
train11 192.168.xx.61
train12 192.168.xx.62

GUI:
a Right-click the service group and select Add Resource.
b Type the name from the table.
c Select the resource type from the list.
CLI: hares -add nameIP1 IP nameSG1
GUI: Clear Critical.
CLI: hares -modify nameIP1 Critical 0
needed.
GUI: For each attribute in the table:

a Click Edit.
b Double-click in the Value field.
c Type the values you entered in your table.
CLI:
hares -modify nameIP1 Device interface
hares -modify nameIP1 Address xxx.xxx.xxx.xxx
GUI: Check Enabled and click OK to complete resource configuration.
hares -modify nameIP1 Enabled 1
GUI: Right-click the resource and select Online>your_sys.
CLI: hares -online nameIP1 -sys your_sys

GUI: Verify that the resource icon is blue.
CLI: hares -display nameIP1
Solaris, AIX, Linux

ifconfig -a
HP-UX
netstat -in
changes.
C
CLI: haconf -dump

Adding a DiskGroup Resource

Required Attributes
DiskGroup nameDG1
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)
hares -add nameDG1 DiskGroup nameSG1
hares -modify nameDG1 Critical 0
needed.
hares -modify nameDG1 DiskGroup nameDG1
hares -modify nameDG1 Enabled 1
hares -online nameDG1 -sys your_sys

6 Verify that the resource is online in VCS and at the O/S level.
hares -display nameDG1

vxprint -g nameDG1
changes.
haconf -dump

Adding a Volume Resource

Required Attributes
Volume nameVol1
DiskGroup nameDG1
Critical? No (0)
Enabled? Yes (1)
hares -add nameVol1 Volume nameSG1
hares -modify nameVol1 Critical 0
needed.
hares -modify nameVol1 Volume nameVol1

hares -modify nameVol1 DiskGroup nameDG1
hares -modify nameVol1 Enabled 1
hares -online nameVol1 -sys your_sys
hares -display nameVol1

vxprint -g nameDG1

changes.
haconf -dump

Adding a Mount Resource

Resource Type Mount
Required Attributes
MountPoint /name1
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)
hares -add nameMount1 Mount nameSG1
hares -modify nameMount1 Critical 0
needed.
hares -modify nameMount1 MountPoint /name1

hares -modify nameMount1 BlockDevice /dev/vx/dsk/
nameDG1/nameVol1
hares -modify nameMount1 FSType vxfs
hares -modify nameMount1 FsckOpt %-y
hares -modify nameMount1 Enabled 1
hares -online nameMount1 -sys your_sys

hares -display nameMount1

mount
changes.
haconf -dump

Adding a Process Resource

Required Attributes
PathName /bin/sh
Optional Attributes
Critical? No (0)
Enabled? Yes (1)
hares -add nameProcess1 Process nameSG1
hares -modify nameProcess1 Critical 0
needed.
hares -modify nameProcess1 PathName /bin/sh

hares -modify nameProcess1 Arguments "/name1/loopy name
1"
Note: If you are using the GUI to configure the resource, you do not need
to include the quotation marks.
hares -modify nameProcess1 Enabled 1
5 Ensure that you have the console or a terminal window open for loopy output.

hares -online nameProcess1 -sys your_sys
hares -display nameProcess1
changes.
haconf -dump


nameVol1 nameDG1
nameMount1 nameVol1
nameIP1 nameNIC1
1
1 Link resource pairs together based on the design worksheet.
hares -link nameIP1 nameNIC1

hares -link nameVol1 nameDG1
hares -link nameMount1 nameVol1
hares -link nameProcess1 nameIP1
hares -link nameProcess1 nameMount1
2 Verify that the resources are linked properly.
hares -dep
changes.
haconf -dump

Complete the following steps to test the service group on each system in the
service group SystemList.
1 Test the service group by switching away from your system in the cluster.
hagrp -switch nameSG1 -to their_sys
2 Verify that the service group came online properly on their system.
hastatus -summary
3 Test the service group by switching it back to your system in the cluster.
C
hagrp -switch nameSG1 -to your_sys
4 Verify that the service group came online properly on your system.
hastatus -summary

Setting Resources to Critical
hares -modify nameNIC1 Critical 1

hares -modify nameIP1 Critical 1
hares -modify nameDG1 Critical 1
hares -modify nameVol1 Critical 1
hares -modify nameMount1 Critical 1
changes.
haconf -dump
finished.
haconf -dump -makero

group nameSG1 (
)
DiskGroup nameDG1 (
DiskGroup = nameDG1
)
IP nameIP1 (
C
Device = eri0
Address = "192.168.27.51"
)
Mount nameMount1 (
nameVol1"
FSType = vxfs
FsckOpt = "-y"
)
)
NIC nameNIC1 (
Device = eri0
)
Volume nameVol1 (
Volume = nameVol1
DiskGroup = nameDG1
)


Lab 8 Solutions: Offline Configuration of
a Service Group
Lab 8 Solutions: Offline Configuration of a Service Group C89

nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

name name name name

AppVol
Vol1 NIC1 NIC2 Vol2
name name
App
DG1 Working DG2
Workingtogether,
together,follow
followthe
theoffline
offline DG
configuration
procedure.
Alternately,
Alternately,work
workalone
aloneand
anduse
usethe
the
GUI
GUIto
tocreate
createaanew
newservice
servicegroup.
group.
Lab 8 Synopsis: Offline Configuration of a Service Group, page A-38
Lab 8: Offline Configuration of a Service Group, page B-57


objects

group AIX: hdisk##
HP-UX: c#t#d#
Linux: sd##
C
Mount point /name2

Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA bge0
train2 192.168.xxx.72
train3 192.168.xxx.73
train4 192.168.xxx.74
train5 192.168.xxx.75
train6 192.168.xxx.76
train7 192.168.xxx.77
train8 192.168.xxx.78
train9 192.168.xxx.79
train10 192.168.xxx.80
train11 192.168.xxx.81
train12 192.168.xxx.82

Prepare Resources
vxdisk list
vxdisksetup -i disk_device
disk.
vxdg init nameDG2 nameDG201=disk_device
vxassist -g nameDG2 make nameVol2 2g
5 Create a VxFS file system on the volume.

Solaris
HP
AIX
mkfs -V vxfs /dev/vx/rdsk/nameDG2/nameVol2
Linux
mkfs -t vxfs /dev/vx/rdsk/nameDG2/nameVol2
All
mkdir /name2
Solaris, AIX
rsh their_sys mkdir /name2
HP-UX
remsh their_sys mkdir /name2
Linux
ssh their_sys mkdir /name2

Solaris
HP
AIX
Linux
mount
9 Copy the loopy script to your file system created in this lab.
C
cp /class_sw_dir/loopy /name2/loopy
10 Start the new loopy application.
11 Verify that the new loopy application is working correctly.
View the console and verify that the new loopy process is echoing
nameSG2 in the message.
section of the lab.
a Stop the loopy process by sending a kill signal. Verify that the process is
stopped.

kill -9 pid
b Unmount your file system and verify that it is no longer mounted.
umount /name2
mount

c Stop the volume and verify that it is disabled.

d Deport your disk group and verify that it is deported.
vxdg deport nameDG2


Record information needed to create a new service group in the design worksheet.

Group nameSG2
Required Attributes
Optional Attributes
C
Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
only)
Critical? No (0)
Enabled? Yes (1)

Resource Type IP
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Optional Attributes
Netmask 255.255.255.0
Critical? No (0)
Enabled? Yes (1)
System IP Address
train1 192.168.xx.71
train2 192.168.xx.72
train3 192.168.xx.73
train4 192.168.xx.74
train5 192.168.xx.75
train6 192.168.xx.76
train7 192.168.xx.77
train8 192.168.xx.78
train9 192.168.xx.79
train10 192.168.xx.80
train11 192.168.xx.81
train12 192.168.xx.82

Required Attributes
DiskGroup nameDG2
Optional Attributes
StartVolumes 1
StopVolumes 1
Critical? No (0)
Enabled? Yes (1)
C
Required Attributes
Volume nameVol2
DiskGroup nameDG2
Critical? No (0)
Enabled? Yes (1)

Resource Type Mount
Required Attributes
MountPoint /name2
FSType vxfs
FsckOpt -y
Critical? No (0)
Enabled? Yes (1)

Required Attributes
PathName /bin/sh
Optional Attributes
Critical? No (0)
Enabled? Yes (1)

nameVol2 nameDG2
nameMount2 nameVol2
nameIP2 nameNIC2

Note: You may choose to use the GUI to create the nameSG2 service group. If so,
skip this section and complete the Alternate Lab section instead.
and closed.
3 Make a subdirectory named test.
mkdir test
All
cp main.cf types.cf test
Linux
cd test
6 Edit the main.cf file in the test directory on one system in the cluster.
a For each students service group, copy the nameSG1 service group
structure to a nameSG2.
b Rename all of the resources within the nameSG1 service group to end with
2 instead of 1, as shown in the following table.

nameIP1 nameIP2

nameNIC1 nameNIC2
nameVol1 nameVol2
nameDG1 nameDG2
Partial Example:
# vi main.cf
group nameSG2 (
)
C
DiskGroup nameDG2 (
DiskGroup = nameDG2
)
.
.
.
c Copy and modify the dependency section.

.
.
.

hacf -verify /etc/VRTSvcs/conf/config/test
hastop -all -force
10 Verify that the loopy applications are still running.
View the console window. (loopy 2 is not running, it was stopped in an

earlier section.)
directory.
cp main.cf ../main.cf
hastart
hastart -stale
hastatus -summary

system states.
Solaris, AIX, HP-UX

Watch the console during the build process to see the system states.
Linux
Use the System Log Viewer to watch the build process.
Virtual Academy
Use dmesg or tail /var/adm/messages to see the VCS build states.

hagrp -online nameSG2 -sys your_sys
hastatus -summary

Alternate Lab: Using the GUI to Create the Service Group
Use the information in the design worksheet in the previous section to create a new
service group using the GUI to copy resources from the nameSG1 service group.
1 Start Cluster Manager and log in to the cluster.
hagui &
GUI: Select File>Open configuration.
GUI: Right-click your cluster name in the left panel and select Add
Service Group.
GUI: Select each system and click the right arrow button.
system.
GUI: Click the Startup box for your system then click OK to create the
service group.
GUI: Right-click the service group, select Properties, and click Show all
attributes.
changes.

8 Copy all resources from the nameSG1 service group to nameSG2.
a Click the nameSG1 service group in the left pane.
b Select the Resources tab to display the resource icons.
c Right-click the top-most resource in the dependency tree,

nameProcess1.
d Select Copy>Self and Child Nodes.
e Click the new nameSG2 service group in the left pane.
f Select the Resources tab to display the resource view. There are no
resources yet in nameSG2.
C
g Right-click anywhere in the right pane display area of the Resources
tab.
h Select Paste.
The Name Clashes window is displayed, which enables you to rename

each resource you are pasting.
i Change the resource names as follows:

nameIP1 nameIP2
nameNIC1 nameNIC2
nameVol1 nameVol2
nameDG1 nameDG2

j Click Apply.
k Click OK.
9 Set each resource to not critical.
Right-click the resource and clear Critical.
10 Modify each resource to set the attribute values as specified in the worksheet.
Right-click a resource and select View>Properties View.
changes.
Select File>Save configuration.
12 Enable each resource.
Right-click the resource and select Enabled.
13 Bring the nameSG2 resources online, starting from the bottom of the
dependency tree.
Right-click the resource, select Online, and choose your system.

Select File>Close configuration.
Note: In the GUI, the Close configuration action saves the configuration
automatically.

group nameSG2 (
)
DiskGroup nameDG2 (
DiskGroup = nameDG2
)
IP nameIP2 (
C
Device = eri0
Address = "192.168.27.71"
)
Mount nameMount2 (
nameVol2"
FSType = vxfs
FsckOpt = "-y"
)
)
NIC nameNIC2 (
Device = eri0
)

Volume nameVol2 (
Volume = nameVol2
DiskGroup = nameDG2
)


Lab 9 Solutions: Creating a Parallel
Service Group
Lab 9 Solutions: Creating a Parallel Service Group C109

nameSG1
nameSG1 nameSG2
nameSG2 name
name
Process1 Process2
name name name name

name name name name

DBVol
name name
DB
DG
NetworkSG
resources.
Lab 9 Synopsis: Creating a Parallel Service Group, page A-47
Lab 9: Creating a Parallel Service Group, page B-73


Group NetworkSG
Required Attributes
Parallel 1
Optional Attributes
C
haconf -makerw
hagrp -add NetworkSG
3 Modify the SystemList to allow the service group to run on the systems
hagrp -modify NetworkSG SystemList your_sys 0

their_sys 1
4 Modify the AutoStartList attribute to allow the service group to start on both
systems.
hagrp -modify NetworkSG AutoStartList your_sys

their_sys
5 Modify the Parallel attribute to allow the service group to run on both systems.
hagrp -modify NetworkSG Parallel 1
6 View the service group attribute settings.
hagrp -display NetworkSG

Adding Resources
Use the values in the following tables to create NIC and Phantom resources.

Resource Type NIC
Required Attributes
Sol Mob: dmfe0
AIX: en1
HP-UX: lan0
Linux: eth0
VA: bge0
Critical? No (0)
Enabled? Yes (1)

Required Attributes
Critical? No (0)
Enabled? Yes (1)
1 Add the NIC resource to the service group.
hares -add NetworkNIC NIC NetworkSG
hares -modify NetworkNIC Critical 0

needed.
All
hares -modify NetworkNIC Device interface
HP-UX
hares -modify NetworkNIC NetworkHosts other_system1
other_system2
hares -modify NetworkNIC Enabled 1
5 Verify that the resource is online. Because it is a persistent resource, you do not
need to bring it online.
C
hares -display NetworkNIC
6 Add the Phantom resource to the service group.
hares -add NetworkPhantom Phantom NetworkSG
hares -modify NetworkPhantom Critical 0
hares -modify NetworkPhantom Enabled 1
9 Verify that the status of the NetworkSG service group now shows as online.
hastatus -sum
10 Save the cluster configuration and view the configuration file.
haconf -dump

create new links.

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

Resource Type Proxy
Required Attributes
Critical? No (0)
Enabled? Yes (1)

1 Delete all NIC resources in the ClusterService, nameSG1, and nameSG2
service groups.
hares -delete nameNIC1

hares -delete nameNIC2
hares -delete csgnic
Note: Only one student can delete the ClusterService NIC resource.
2 Add a proxy resource to each failover service group using the service group
naming convention:
nameProxy1
nameProxy2
csgProxy
C
hares -add nameProxy1 Proxy nameSG1
hares -add nameProxy2 Proxy nameSG2
hares -add csgProxy Proxy ClusterService
3 Set the value for each Proxy TargetResName attribute to NetworkNIC.
hares -modify nameProxy1 TargetResName NetworkNIC

hares -modify nameProxy2 TargetResName NetworkNIC
hares -modify csgProxy TargetResName NetworkNIC
4 Set the resources to not critical.
hares -modify nameProxy1 Critical 0

hares -modify nameProxy2 Critical 0
hares -modify csgProxy Critical 0
5 Enable the resources.
hares -modify nameProxy1 Enabled 1

hares -modify nameProxy2 Enabled 1
hares -modify csgProxy Enabled 1
6 Verify that the Proxy resources are in an online state.
hares -display nameProxy1

hares -display nameProxy2
hares -display csgProxy

haconf -dump

Use the values in the following tables to replace the NIC resources with Proxy
resources and create new links.

nameIP1 nameProxy1
C
nameIP2 nameProxy2

webip csgProxy
1 Link the Proxy resources as children of the corresponding IP resources of each

service group.
hares -link nameIP1 nameProxy1

hares -link nameIP2 nameProxy2
hares -link webip csgProxy

hagrp -switch nameSG1 -to other_system

hagrp -switch nameSG2 -to other_system
hagrp -switch ClusterService -to other_system

hares -display | grep Critical | grep 0

haconf -makerw
hares -modify resource_name Critical 1
. . .

include "types.cf"
cluster vcs (
CounterInterval = 5
)
system train1 (
C
)
system train2 (
)
Tag = CSG
)
IP webip (
Device = eri0
Address = "192.168.27.42"
NetMask = "255.255.255.0"
)
Proxy csgProxy (
)

VRTSWebApp VCSweb (
Critical = 0
AppName = vcs
TimeForOnline = 5
)

webip requires csgProxy
group NetworkSG (
Parallel = 1
AutoStartList = ( train1, train2 }
)
NIC NetworkNIC (
Device = eri0
)
Phantom NetworkPhantom (
)
group nameSG1 (
)
DiskGroup nameDG1 (
DiskGroup = nameDG1
)

IP nameIP1 (
Device = eri0
Address = "192.168.27.51"
)
Mount nameMount1 (
nameVol1"
FSType = vxfs
FsckOpt = "-y"
)
C
PathName = "/bin/ksh"
)
Proxy nameProxy1 (
)
Volume nameVol1 (
Volume = nameVol1
DiskGroup = nameDG1
)


group nameSG2 (
)
DiskGroup nameDG2 (
DiskGroup = nameDG2
)
IP nameIP2 (
Device = eri0
Address = "192.168.27.71"
)
Mount nameMount2 (
nameVol2"
FSType = vxfs
FsckOpt = "-y"
)
PathName = "/bin/ksh"
)
Proxy nameProxy2 (
)
Volume nameVol2 (
Volume = nameVol2
DiskGroup = nameDG2
)


Lab 10 Solutions: Configuring
Notification
Lab 10 Solutions: Configuring Notification C125

nameSG1 nameSG2
ClusterService
NotifierMngr
Optional Lab
resfault
resfault
Triggers
Triggers nofailover
nofailover SMTP
SMTPServer:
Server:
resadminwait
resadminwait
___________________________________
___________________________________

Lab 10 Synopsis: Configuring Notification, page A-52
Lab 10: Configuring Notification, page B-85

Work with your lab partner to add a NotifierMngr type resource to the

Required Attributes
C
Critical? No (0)
Enabled? Yes (1)
haconf -makerw
hares -add notifier NotifierMngr ClusterService
hares -modify notifier Critical 0
4 Set the required attributes for this resource and any optional attributes, if
needed.
Solaris, HP-UX, Linux

hares -modify notifier SmtpServer localhost
hares -modify notifier SmtpRecipients -add root Warning
AIX
hares -modify notifier SmtpServer localhost
hares -modify notifier SmtpRecipients -add root Warning
hares -modify notifier PathName /xxx/xxx

hares -modify notifier Enabled 1
6 Link the notifier resource to csgproxy.
hares -link notifier csgproxy
7 Bring the resource online on the system running the ClusterService service
group.
hares -online notifier -sys your_system
hares -display notifier

ps -ef | grep notifier
haconf -dump

1 Test the service group by switching it to the other system in the cluster.
hagrp -switch ClusterService -to other_sys
2 Verify that the service group came online properly on the other system.
hastatus -sum
3 Test the service group by switching it back to the original system in the cluster.
hagrp -switch ClusterService -to original_sys
4 Verify that the service group came online properly on the original system.
C
hastatus -sum
hares -modify notifier Critical 1



#!/bin/sh
resfault.msg
resfault.msg

nofailover. Add the following lines to the file.
#!/bin/sh
echo `date` > /tmp/nofailover.msg
echo message from the nofailover trigger >> /tmp/
nofailover.msg
echo no failover for service group $2 >> /tmp/
nofailover.msg
echo Please check the problem. >> /tmp/nofailover.msg
/usr/lib/sendmail root </tmp/nofailover.msg
rm /tmp/nofailover.msg

resadminwait. Add the following lines to the file.
#!/bin/sh
echo `date` > /tmp/resadminwait.msg
echo message from the resadminwait trigger >> /tmp/
resadminwait.msg
echo Resource $2 on System $1 is in adminwait for
Reason $3 >> /tmp/resadminwait.msg
echo Please check the problem. >> /tmp/resadminwait.msg
/usr/lib/sendmail root </tmp/resadminwait.msg
rm /tmp/resadminwait.msg
chmod 744 resfault
C
chmod 744 nofailover
chmod 744 resadminwait
Solaris, AIX, HP-UX

rcp resfault their_sys:/opt/VRTSvcs/bin/triggers
rcp nofailover their_sys:/opt/VRTSvcs/bin/triggers
rcp resadminwait their_sys:/opt/VRTSvcs/bin/triggers
Linux
scp file their_sys:directory

Lab 11 Solutions: Configuring Resource
Fault Behavior
Lab 11 Solutions: Configuring Resource Fault Behavior C133

Critical=0
Critical=1
FaultPropagation=0
nameSG2
ManageFaults=NONE
ManageFaults=ALL
RestartLimit=1
Note:
Note:Network
Networkinterfaces
interfacesfor
forvirtual
virtualIP
IPaddresses
addresses
are
areunconfigured
forcethe
theIP
IPresource
resourcetotofault.
fault.
In
Inyour
yourclassroom,
classroom,the
theinterface
interfaceyou
youspecify
specifyis:______
is:______
Replace
Replacethe
thevariable
variableinterface
interfacein
inthe
thelab
labsteps
stepswith
withthis
this
value.
value.
scenarios.
Lab 11 Synopsis: Configuring Resource Fault Behavior, page A-55
Lab 11: Configuring Resource Fault Behavior, page B-93

This part of the lab exercise explores the default behavior of VCS.
haconf -makerw
hares -display -attribute Critical -group nameSG1

hares -modify nameResource1 Critical 1
haconf -dump
C
hatype -modify IP MonitorInterval 10

hatype -modify IP OfflineMonitorInterval 30
haconf -dump
it is not, bring it online or switch it to your system.
hastatus -sum

VCS.
Solaris
ifconfig interface removeif 192.168.xx.xx
HP
AIX
Linux
ifconfig interface down
hares -display -group nameSG1 | grep " State"
The nameIP1 resource should fault.

The nameProcess1 resource should go offline.
There should be no failover.
The notifier and the resfault trigger should send e-mail.
8 Clear any faults.
hares -clear nameIP1
hares -online nameIP1 -sys your_sys

hares -online nameProcess1 -sys your_sys

haconf -dump

critical.
configuration.

haconf -dump
C
hastatus -sum

VCS.
Solaris
HP
AIX
Linux
Note: The effects of stopping loopy may take up to 60 seconds to be

detected.

The nameIP1 resource should go offline.
All other resources come offline.

The group should fail over to the other system (their_sys).
The notifier sends two e-mail messagesone for the faulted resource
and one for the faulted service group. The resfault trigger should send
e-mail if configured.
Solaris
rsh their_sys ifconfig interface removeif 192.168.xx.xx
HP
rsh their_sys ifconfig interface inet 0.0.0.0
AIX
rsh their_sys ifconfig interface ipaddress delete
Linux
ssh -l root their_sys ifconfig interface down

The nameProcess1 resource should go offline.
All other resources are brought offline.
The group cannot fail over because there are no failover targets left.
The group stays offline.
and one for the faulted service group. The resfault and nofailover
triggers should send e-mail, if configured.


critical.
configuration.

haconf -dump
3 Verify that your nameSG1 Service group is currently online on your system. If
C
hastatus -sum
hagrp -freeze nameSG1

VCS.
Solaris
HP
AIX
Linux

The nameIP1 resource should fault and show the state as
PARTIAL|FAULTED.
The nameProcess1 resource should stay online.

There is no failover.

Solaris
HP
AIX
Linux
ifconfig interface add ipaddress up
What happens?
The resource fault should clear on its own, when the agent probes the
resource (after the offline monitor interval), which is now online. You can
probe the resource to manually check the state more quickly.

Solaris
HP
AIX
Linux
Wait for the resource to fault.
hagrp -unfreeze nameSG1

No. The failover decision is made at the time of the fault.


This section illustrates service group failover behavior using the ManageFaults
and FaultPropagation attributes.
critical.
configuration.

haconf -dump
hagrp -modify nameSG1 FaultPropagation 0

VCS.
Solaris
HP
AIX
Linux

The service group is in the PARTIAL|FAULTED state.
The nameProcess1 resource should stay online.


hagrp -modify nameSG1 ManageFaults NONE

hagrp -modify nameSG1 FaultPropagation 1
C
VCS.
Solaris
HP
AIX
Linux
The nameIP1 resource should be in the admin wait state.

The nameProcess1 should stay online.
The resadminwait trigger should send e-mail.

8 Recover the resource from the ADMIN_WAIT state by bringing up the IP
address outside of VCS and clearing the AdminWait attribute without a fault.
Note: The ADMIN_WAIT state can be cleared automatically if a monitor

interval has run.
Solaris
HP
AIX
Linux
ifconfig interface add ipaddress up
hagrp -clearadminwait nameSG1 -sys your_sys

VCS.
Solaris
HP
AIX
Linux
The nameIP1 resource should be in the ONLINE|ADMIN_WAIT state.

The nameProcess1 should stay online.
The resadminwait trigger should send e-mail.

group.
Note: The ADMIN_WAIT state can be cleared automatically if a monitor

interval has run.
hagrp -clearadminwait -fault nameSG1 -sys your_sys
The group should now fail over to their_sys.

C
hagrp -modify nameSG1 ManageFaults ALL

haconf -dump

critical.
configuration.

haconf -dump
hatype -modify Process RestartLimit 1

haconf -dump
kill signal.
ps -ef | grep /name1/loopy

kill pid
The loopy process should be restarted automatically on the same

system.
There is no notification of restart. However, there should be a log

entry.

kill signal.
ps -ef | grep /name1/loopy

kill pid
Note: It can take approximately 60 seconds to see the effects of stopping

the loopy process.
The resource is faulted because the RestartLimit has been exceeded.
C
The group fails over.
and one for the faulted service group. The resfault trigger should send
e-mail if configured.
system.
hares -clear nameProcess1


Lab 13 Solutions: Testing
Communication Failures
Lab 13 Solutions: Testing Communication Failures C149

3. Test failures.
trainxx
trainxx
O trainxx
trainxx
Optional Lab
Trigger
Trigger injeopardy
injeopardy
Lab 13 Synopsis: Testing Communication Failures, page A-60
Lab 13 Details: Testing Communication Failures, page B-101

lab, students create a local copy of the trigger script on their own systems. If you
are working alone in the cluster, copy your completed triggers to the other system.
#!/bin/sh
injeopardy.msg
C
chmod 744 injeopardy
Solaris, AIX
rcp injeopardy their_sys:/opt/VRTSvcs/bin/triggers/
injeopardy
HP-UX
remsh injeopardy their_sys:/opt/VRTSvcs/bin/triggers/
injeopardy
Linux
scp injeopardy their_sys:/opt/VRTSvcs/bin/triggers/
injeopardy
trigger.

Adding a Low Priority Link
Working with your lab partner, use the procedures to create a low-priority link
and then fault communication links and observe what occurs in a cluster
environment when fencing is not configured.

AIX: en1
HP-UX lan0
Linux: eth0
VA: bge0
Sol Mob: dmfe0
AIX: en2
HP-UX lan1
Linux: eth1
VA: bge2
Sol Mob: dmfe1
AIX: en3
HP-UX lan2
Linux: eth2
VA: bge3
for your_sys
for their_sys

hastop -all -force
3 Unconfigure GAB on each system in the cluster.
gabconfig -U
4 Unconfigure LLT on each system in the cluster.
lltconfig -U

Solaris Mobile
Solaris Example
set-cluster 1
set-node train1
link tag1 /dev/qfe:0 - ether - -
link tag2 /dev/qfe:1 - ether - -
link-lowpri tag3 /dev/eri:0 - ether - -
AIX Example
set-cluster 1
set-node train1
link tag1 /dev/en:2 - ether - -
link tag2 /dev/en:3 - ether - -
link-lowpri tag3 /dev/en:1 - ether - -
HP-UX Example
set-cluster 10
set-node train1
link tag1 /dev/lan:1 - ether - -
link tag2 /dev/lan:2 - ether - -
link-lowpri tag3 /dev/lan:0 - ether - -
Linux Example
set-cluster 1
set-node train1
link tag1 eth1 - ether - -
link tag2 eth2 - ether - -
link-lowpri tag3 eth0 - ether - -
Virtual Academy Example

set-cluster 1
set-node train1
link tag1 bge2 - ether - -
link tag2 bge3 - ether - -
link-lowpri tag3 bge0 - ether - -

6 Start LLT on each system.
lltconfig -c
7 Verify that LLT is running.
lltconfig
8 Start GAB on each system.
sh /etc/gabtab
Alternatively, you can start GAB using gabconfig. However, sourcing the
gabtab is preferred to ensure any changes to /etc/gabtab you may have
made are tested.
C
gabconfig -c -n 2
gabconfig -a
hastart
11 Verify that VCS is running.
hastatus -sum


_____________________________________________________________
cd /tmp

a Open the cluster configuration.
haconf -makerw
b Modify the MonitorInterval attribute.
hatype -modify NIC MonitorInterval 3600
c Save and close the cluster configuration.

Notes:
connectors.

Use the lltlink_disable utility to simulate failure of an LLT link

(private or low- priority). Type:
./lltlink_disable
Select a link from the displayed list.
6 Verify that the link is down.
lltstat -nvv
7 Restore communications using the lltlink_enable utility.
C
Replace the removed cable. To use the lltlink_disable utility, type:
./lltlink_enable
Select a link from the displayed list.
lltstat -nvv

gabconfig -a
All nodes should have regular membership.
2 Use lltlink_disable to remove all but one LLT link and watch for the
link to expire in the console.
Use lltlink_disable to remove all but one LLT links from operation
(private or low priority).
./lltlink_disable
Select the first LLT link from the list.
./lltlink_disable
Select the next LLT link from the list.
Solaris Mobile
3 Verify that the links are down.
lltstat -nvv
gabconfig -a
One node should have jeopardy membership.
Replace removed cables.
./lltlink_enable
Select the first LLT link to restore.
./lltlink_enable
Select the second LLT link to restore.

lltstat -nvv
gabconfig -a

gabconfig -a
2 Remove all but one LLT link and watch for the link to expire in the console or
system log.
Disable all but one LLT link (private or low priority). For each link, type:
./lltlink_disable
Solaris Mobile
Disable only the one high-priority LLT link (dmfe1).
3 Verify that the links are down from each system.
lltstat -nvv
gabconfig -a
One node should have jeopardy membership.
Disable the last LLT link using lltlink_disable.

./lltlink_disable
6 Verify that all links are down from each system.
lltstat -nvv
gabconfig -a
Each side of the cluster should only have membership for its node.

hastatus -sum
The jeopardy condition should have autodisabled the service groups on

the systems on which they are not running. A split brain situation has been
avoided.
a Stop HAD on one system but leave services running.
Note: If you have more than two systems in the cluster, you must stop
HAD on all systems on either side of the network partition.
hastop -local -force
C
b If you physically unplugged cables, restore communications reconnecting
the LLT link cables.
Note: If you used lltlink_disable to simulate link failure, skip this

step.
c Verify that the LLT connections are up.
lltstat -nvv
d Verify that GAB has proper membership.
gabconfig -a
e Start VCS on the system where you stopped it.
hastart
f Verify that each service group is autoenabled.
hastatus -sum

haconf -makerw

Lab 14 Solutions: Configuring I/O
Fencing
Lab 14 Solutions: Configuring I/O Fencing C163

trainxx trainxx
Disk 1:___________________
Disk 3:___________________
nameDG1, nameDG2
The purpose of this lab is to set up I/O fencing in a two-node cluster and simulate
node and communication failures.
Lab 14 Synopsis: Configuring I/O Fencing, page A-66
Lab 14: Configuring I/O Fencing, page B-111

Lab Assignments
Working with your lab partner, use the following procedure and the information
provided in the table to configure fencing for your cluster.

Disk assignments for cXtXdXsX
coordinator disk group cXtXdXsX
cXtXdXsX
Disk group name oddfendg
or
evenfendg
/etc/vxfendg oddfendg
or
evenfendg
C
UseFence cluster attribute SCSI3

Configuring Disks and Fencing Driver
1 Configure a disk group for the coordinator disks.
a Initialize three disks for use in the disk group.
vxdisksetup -i coor_disk1
b Display your cluster ID. Your cluster ID determines your coordinator disk
group name.
cat /etc/llttab
c Initialize the disk group.
If your cluster ID is odd, use oddfendg for the disk group name.
vxdg init oddfendg coor_disk1 coor_disk2 coor_disk3
If your cluster ID is even, use evenfendg for the disk group name.
vxdg init evenfendg coor_disk1 coor_disk2 coor_disk3
Note: Replace the placeholder string "______fendg" with the

appropriate odd or even coordinator disk name throughout the remainder of
this lab.
d Deport the disk group.
vxdg deport ______fendg

2 Optional for the classroom: Use the vxfentsthdw utility to verify that the
shared storage disks support SCSI-3 persistent reservations.
Notes:
For the purposes of this lab, you do not need to test the disks. The disks
used in this lab support SCSI-3 persistent reservations. The complete steps
are given here as a guide for real-world use.
To see how the command is used, you can run vxfentsthdw on a disk
not in use; this will enable you to continue with the lab while the
vxfentsthdw is running.
Create a test disk group with one disk and run vxfentsthdw on that test
disk group.
vxfentsthdw -g testdg
Use the -r option to perform read-only testing of data disks.
C
3 Enter the coordinator disk group name in the /etc/vxfendg fencing
configuration file on each system in the cluster.
echo "______fendg" > /etc/vxfendg
4 Start the fencing driver on each system using the vxfen init script.
/etc/init.d/vxfen start
5 Verify that the /etc/vxfentab file has been created on each system and it
contains a list of the coordinator disks.
cat /etc/vxfentab
6 Verify the setup of the coordinator disks.
a Verify that port b GAB membership is listed for both nodes.
gabconfig -a
GAB should show port a, b, and h membership for nodes 0 and 1.

b Verify that registrations are assigned to the coordinator disks.
vxfenadm -g all -f /etc/vxfentab
c How many keys are present for each disk and why?
There should be A------- keys for LLT node 0 and B------- keys for LLT
node 1 on each coordinator disk for each path to that coordinator disk.
Example:
Device Name: /dev/rdsk/c1t9d0s2
Total Number Of Keys: 2
key[0]:
Key Value [Numeric Format]: 65,45,45,45,45,45,45,45
Key Value [Character Format]: A-------
key[1]:
Key Value [Character Format]: B-------

key[0]:
key[1]:

key[0]:
key[1]:

Configuring VCS for Fencing
1 On each system, verify that you have a Storage Foundation Enterprise license
installed for fencing support using vxlicrep.
vxlicrep
Check for this output:

Product Name = VERITAS Storage Foundation Enterprise
PGR#VERITAS Volume Manager = Enabled
2 Working together, verify that the cluster configuration is saved and closed.
C
4 Make a subdirectory named test, if one does not already exist.
mkdir test
cp main.cf types.cf test
cd test

7 Edit the main.cf file in the test directory on that one system in the cluster
to set the value of UseFence to SCSI3.
Partial Example:
# vi main.cf
cluster vcs (
CounterInterval = 5
UseFence = SCSI3
. . .
)
9 Stop VCS and shut down the applications. The disk groups must be reimported
for fencing to take effect.
hastop -all
directory.
cp main.cf ../main.cf
hastart
hastart -stale
hastatus -summary
14 Verify that the UseFence cluster attribute is set.
haclus -value UseFence

Verifying Data Disks for I/O Fencing
1 If the service groups with disk groups did not come online at cluster startup,
bring them online now. This will import the disk groups, which initiate fencing
on the data disks. Each student can perform these steps on their service groups.
hastatus -sum
2 Verify registrations and reservations on the data disks
There should be AVCS keys on LLT node 0 imported disk groups and
BVCS on LLT node 1 imported disk groups.
C
# vxfenadm -g /dev/rdsk/data_disk1
Reading SCSI Registration Keys...

key[0]:
Key Value [Character Format]: AVCS
# vxfenadm -r /dev/rdsk/data_disk2
Reading SCSI Reservation Information...

Key[0]:
Reservation Type:
SCSI3_RESV_WRITEEXCLUSIVEREGISTRANTSONLY

Testing Communication Failures
In most cases, the following sections require that you work together with your lab
partner to observe how fencing protects data in a variety of failure situations.
Steps you can perform on your own are indicated within the procedure.
Scenario 1: Manual Concurrency Violation

Students can try this scenario on their own. Try to import a disk group imported on
one system to another system using vxdg with the -C option.
1 On the system where nameDG1 is not imported, attempt to manually import it

clearing the host locks.
vxdg -C import nameDG1
2 Were you successful? Describe why or why not.
This command should fail because the node where the disk group is not
imported does not have rights to write to the disk, and therefore cannot
import the disk group and update the private region header information.
The error message should say: VxVM vxdg ERROR V-5-1-587 Disk
group nameDG1: import failed: No valid disk found
containing disk group.
This indicates that data corruption from a possible concurrency violation
has been prevented.
Scenario 2: Response to System Failure

Work with your lab partner to observe how VCS responds to system failures.
requires that disk groups be imported on each system. Switch them, if
necessary.
hastatus -sum
There should be registrations for both systems.

# vxdisk list
# vxfenadm -g /dev/rdsk/data_disk

key[0]:
# vxfenadm -r /dev/rdsk/data_disk
C

Key[0]:
Reservation Type:
4 Fail one of the systems by removing power or hard booting the system.
Observe the failure.
LLT and GAB should time out heartbeats from the failed system. The
remaining system should fence off the drive.

5 Verify the registrations on the coordinator disks for the remaining system.
There should be registrations for only the remaining system.
# vxfenadm -g all -f /etc/vxfentab

key[0]:

key[0]:
key[0]:
6 Verify that the service groups that were running on the failed system have
failed over to the remaining system.
hastatus -sum
remaining system.
# vxdisk list
# vxfenadm -g /dev/rdsk/data_disk

key[0]:

# vxfenadm -r /dev/rdsk/data_disk

Key[0]:
Reservation Type:
8 Boot the failed system and observe it rejoin cluster membership. Verify cluster
membership and verify that the coordinator disks have registrations for both
systems again.
C
gabconfig -a
Scenario 3: Response to Interconnect Failures

Work with your lab partner to observe how VCS responds to cluster interconnect
failures.
1 If you did not already perform this step in the Testing Communication
Failures lab, copy the lltlink_enable and lltlink_disable
utilities from the location provided by your instructor into the /tmp directory.
_____________________________________________________________
cd /tmp

temporarily for the purposes of communications testing. This prevents the
NetworkNIC resource from faulting during this lab when the low-priority LLT
link is pulled.
haconf -makerw

requires that one disk group be imported on each system. Switch the service
groups, if necessary.
hastatus -sum
There should be registrations for both systems.
vxdisk list
vxfenadm -g /dev/rdsk/data_disk
. . .
7 Using the lltlink_disable utility, remove all cluster interconnect links

from one system. Watch for the link to expire in the console.
For each LLT link, type:
./lltlink_disable
Select an LLT link.
8 Observe LLT and GAB timeouts and membership change.
lltstat -nvv
gabconfig -a

9 What happens to the systems?
One side of the cluster should panic and reboot. When the rebooted
system is back up, VCS cannot start there because it cannot seed.
10 On one system, view the registrations for the coordinator disks.
Only one systems keys are displayed on the coordinator disks. The other
keys have been rejected.
11 What happens to the service groups?
hastatus -sum
C
The service groups that were running on the system that rebooted have
failed over to the running system.
remaining system.
vxdisk list
. . .
Only one systems keys are shown on the data disks. The other keys have
been rejected.
13 When the system that rebooted is running, check the status of GAB and HAD.
gabconfig -a
This system is not listed in the GAB, Fence, or HAD membership. It is

waiting to seed.
14 Verify that the coordinator disks have registrations for the remaining system
only.

15 Recover the system that rebooted.
a Shut down the system.
shutdown -y
b If you physically unplugged the Ethernet cables for the LLT links,
reconnect the cluster interconnects.
If you used lltlink_disable to simulate interconnect failure, skip

this step.
c Reboot the system.
16 Verify that cluster membership has been established for both systems and both
systems are now registered with the coordinator disks.
gabconfig -a
17 Set the monitor interval for the NIC resource type to back to 60.
haconf -makerw

Optional: Removing the Fencing Configuration
Note: Do not complete this section unless directed by your instructor.

1 Verify that the cluster configuration is saved and closed.
2 Stop VCS and all service groups.
hastop -all
3 Unconfigure the fencing driver.
/etc/init.d/vxfen stop
C
4 From one system, import and remove the coordinator disk group.
vxdg import ______fendg

vxdg destroy ______fendg
5 Use the offline configuration procedure to set the UseFence cluster attribute to
the value NONE in the main.cf file and restart the cluster with the new
configuration.
Note: You cannot set UseFence dynamically while VCS is running.
a Change to the configuration directory.
b Copy of the main.cf file into the test subdirectory.
cp main.cf test
c Edit the main.cf file in the test directory on one system in the cluster to
set the value of UseFence to NONE.

Partial Example:
# vi main.cf
cluster vcs (
CounterInterval = 5
UseFence=NONE
. . .
)
7 Copy the main.cf file back into the /etc/VRTSvcs/conf/config

directory.
cp main.cf ..
hastart
hastart -stale
hastatus -summary

Appendix D
Job Aids
Startup States and Transitions
UNKNOWN
hastart
INITING
Valid configuration on disk Stale configuration on disk
CURRENT_DISCOVER_WAIT STALE_DISCOVER_WAIT
Peer in Peer in Peer in Peer in Peer in
ADMIN_WAIT LOCAL_BUILD RUNNING LOCAL_BUILD ADMIN_WAIT
ADMIN_WAIT CURRENT_PEER_WAIT STALE_ADMIN_WAIT ADMIN_WAIT
Peer in Peer starts
No Peer
RUNNING LOCAL_BUILD
LOCAL_BUILD STALE_PEER_WAIT
Disk Peer in
Error RUNNING
REMOTE_BUILD
The only peer in

RUNNING RUNNING state crashes
Cluster System States

STALE States
STALE_ADMIN_WAIT: The system has a stale configuration and no other
system is in the RUNNING state.
STALE_DISCOVER_WAIT: The system joined the cluster with an invalid
configuration file and is waiting for information from peers.
STALE_PEER_WAIT: The system has no valid configuration file, but another
system is doing a build from disk.
WAIT States
ADMIN_WAIT: This state can occur under these circumstances:
A .stale flag exists and the main.cf file has a syntax problem.
The system is in local build and receives a disk error while reading
main.cf.
The system is in remote build and the last running system fails.
CURRENT_DISCOVER_WAIT: The system has joined a cluster and its
configuration file is valid.
CURRENT_PEER_WAIT: The system has a valid configuration file and
another system is building a configuration from disk.
BUILD States
LOCAL_BUILD: The system is building a configuration from disk.
REMOTE_BUILD: The system is building a configuration from a peer.
D2 VERITAS Cluster Server for UNIX, Fundamentals

Shutdown States and Transitions
RUNNING
Unexpected
exit hastop hastop local -force
FAULTED LEAVING EXITING_FORCIBLY

Resources are taken offlin;
agents are stopped
EXITING
EXITED
EXITING States
D
LEAVING: The system is leaving the cluster gracefully. When agents have
been stopped, the system transitions to the EXITING state.
EXITING: The system is leaving the cluster.
EXITED: The system has left the cluster.
EXITING_FORCIBLY: The hastop -local -force command has
caused the system to exit the cluster. Agents are stopped but applications
continue to run.
OTHER States
RUNNING: The system is an active member of the cluster.
FAULTED: The system is leaving the cluster unexpectedly (ungracefully).
INITING: The system has joined the cluster.
UNKNOWN: The system has no entry in the configuration and has not joined
the cluster.
Appendix D Job Aids D3

Resource States and Transitions
online UP
offline
clean
DOWN UNKNOWN
fault
Resource States and Transitions

The diagram shows resource states and the transitions between those states.

Configuring a Service Group
Add Service Group Test Failover Done
Set SystemList Set Critical Res
Y
Set Opt Attributes
N
Success? Check Logs/Fix
Add/Test Resource
Resource Flow Chart

Test Switching
Y
More? Link Resources
N
Service Group Configuration Procedure
D
Use this procedure to create a service group.
Note: When you switch a service group to another system, keep the service group
running on that system for the duration of the OfflineMonitorInterval (the default
is five minutes) to ensure that the agents properly report all resources offline on
other systems.

Configuring a Resource
Add Resource
Set Non-Critical
Disable Resource* Flush Group

Modify Attributes
Enable Resource* Clear Resource

Y
Bring Online N
Faulted?
Waiting to Go Online
N
Online?
Check Log
Y
Verify Offline (OS)
Done
Everywhere
Resource Configuration Procedure

Use this procedure to configure and test resources.
*Note: Some resources do not need to be disabled and reenabled. Only resources
whose agents have open and close entry points, such as MultiNICA, require you to
disable and enable them again after fixing the problem. By contrast, a Mount
resource does not need to be disabled if, for example, you incorrectly specify the
MountPoint attribute.

List of Notifier Events and Traps
The following tables specify which events generate traps, e-mail notification, or
both. Note that SevereError indicates the highest severity level, and Information,
the lowest. Traps specific to Global Cluster option are ranked from Critical, the
highest severity, to Normal, the lowest.
Clusters
Event Severity Level Description
Global service group is online/partial on Critical A concurrency violation has occurred

multiple clusters. for the global service group.
(Global Cluster option)
Attributes for global service groups are Major The attributes ClusterList,
mismatched. AutoFailover, and Parallel are
(Global Cluster option) mismatched for the same global
service group on different clusters.
Remote cluster has faulted. Major The trap for this event includes
(Global Cluster option) information on how to take over the
global service groups running on the
remote cluster before the cluster
faulted.
D
Heartbeat is down. Warning The connector on the local cluster has
lost its heartbeat connection to the
remote cluster.
Remote cluster is in RUNNING state. Normal The local cluster has a complete
(Global Cluster option) snapshot of the remote cluster,
indicating the remote cluster is in the
RUNNING state.
Heartbeat is alive. Normal Self-explanatory.

(Global Cluster option)
User has logged on to VCS Information A user log on has been recognized
because a user logged on via Cluster
Manager, or because a haxxx
command was invoked.
Agents
Agent is faulted. Warning The agent has faulted on one node in

the cluster.
Agent is restarting Information VCS is restarting the agent.

Resources
Resource state is unknown Warning VCS cannot identify the state of the
resource.
Resource monitoring has timed out Warning The monitoring mechanism for the resource
has timed out.
Resource is not going offline Warning VCS cannot take the resource offline.
Cluster resource health is declined Warning This is used by agents to give additional
information on the state of a resource.
Health of the resource declined while it was
online.
Resource went online by itself Warning (not for The resource was brought online on its
first probe) own.
Resource has faulted Error Self-explanatory.
Resource is being restarted by agent Information The resource is being restarted by its agent.
Cluster resource health is improved Information This is used by agents to give extra
information about state of resource. Health
of the resource improved while it was
online.
Systems
VCS is being restarted by hashadow. Warning Self-explanatory.
VCS is in jeopardy. Warning One node running VCS is in jeopardy.
VCS is up on the first node in the Information Self-explanatory.

cluster.
VCS has faulted. Information Self-explanatory.
A node running VCS has joined Information Self-explanatory.

cluster.
VCS has exited manually. Information VCS has exited gracefully from one node on
which it was previously running.
VCS is up but is not in the cluster. Information VCS is running on one node but the node is
not visible.

Service Groups
Service group has faulted Error Self-explanatory.
Service group has a concurrency SevereError A failover service group has come online on
violation more than one node in the cluster.
Service group has faulted and cannot SevereError The specified service group has faulted on all
be failed over anywhere nodes where the group could be brought
online, and there are no nodes to which the
group can fail over.
Service group is online Information Self-explanatory.
Service group is offline Information Self-explanatory.
Service group is autodisabled Information VCS has autodisabled the specified group
because one node exited the cluster.
Service group is restarting Information Self-explanatory.
Service group is being switched Information The service group is being taken offline on
one node and being brought online on another.
Service group is restarting in response Information Self-explanatory.

to a persistent resource going online

Example Bundled Agent Reference Guide Entries
NIC Agent
Description Monitors the configured NIC. If a network link fails, or if a problem arises with
the device card, the resource is marked OFFLINE. The NIC listed in the Device
attribute must have an administration IP address, which is the default IP address
assigned to the physical interface of a host on a network. This agent does not
configure network routes or administration IP addresses.
Entry Point MonitorTests the network card and network link. Pings the network hosts or
broadcast address of the interface to generate traffic on the network. Counts the
number of packets passing through the device before and after the address is
pinged. If the count decreases or remains the same, the resource is marked
OFFLINE.
State Definitions ONLINEIndicates that the NIC is working.

OFFLINEIndicates that the NIC has failed.
UNKNOWNIndicates that the device is not configured or is configured
incorrectly.
Required Attribute Type and Definition

Dimension
Device string-scalar Name of the NIC.
Optional Attributes Type and Definition

Dimension
NetworkHosts string-vector List of hosts on the network.

If network hosts are specified, the agent sends pings to
the hosts to determine if the network connection is
alive. Enter the IP address of the host instead of the
HostName to prevent the monitor from timing out
(DNS problems cause the ping to hang); for example,
166.96.15.22.
If network hosts are not specified, the monitor tests the
NIC by sending pings the broadcast address on the
NIC. If more than one network host is listed, the
monitor returns ONLINE if at least one of the hosts is
alive.
NetworkType string-scalar Type of network. VCS currently only supports Ethernet

(ether).
PingOptimize integer-scalar Number of monitor cycles to detect if configured

interface is inactive.
A value of 1 optimizes broadcast pings and requires
two monitor cycles.
A value of 0 performs a broadcast ping during each
monitor cycle and detects the inactive interface within
the cycle.
Default is 1.

Requirements for NIC
Verify that each NIC has the correct administrative IP address and subnet
mask.
Verify that each NIC does not have built-in failover support. If it does, disable
it. (If necessary, refer to the NIC documentation.)
Type Definition
type NIC (
static str ArgList[] = { Device, NetworkType,
NetworkHosts, PingOptimize }
NameRule = group.Name + "_" + resource.Device
static int OfflineMonitorInterval = 60
static str Operations = None
str Device
str NetworkType
int PingOptimize = 1
str NetworkHosts[]
)
D
Sample NIC Configurations
Sample 1: Without Network Hosts (Using Default Ping Mechanism)
NIC NIC_le0 (
Device = le0
PingOptimize = 1
)
Sample 2: With Network Hosts
NIC NIC_le0 (
Device = le0
NetworkHosts = { "166.93.2.1", "166.99.1.2" }
)

Mount Agent
Description Brings online, takes offline, and monitors a file system mount point.
Entry Points OnlineMounts a block device on the directory. If the mount process fails, the agent
attempts to run the fsck command on the raw device to remount the block device.
OfflineUnmounts the file system.
MonitorDetermines if the file system is mounted. Checks mount status using the stat
and statvfs commands.
CleanSee description on the following pages.
InfoSee description on the following pages.
State Definitions ONLINEIndicates that the block device is mounted on the specified mount point
OFFLINEIndicates that the block device is not mounted on the specified mount point
UNKNOWNIndicates that a problem exists with the configuration
Required Attributes Type and Description

Dimension
BlockDevice string-scalar Device for mount point.
FsckOpt string-scalar Options for fsck command. "-y" or "-n" must be included as arguments
to fsck; otherwise, the resource cannot come online. VxFS file systems
will perform a log replay before a full fsck operation (enabled by "-y")
takes place. Refer to the manual page on the fsck command for more
information.
FSType string-scalar Type of file system.

For example, vxfs or ufs.
MountPoint string-scalar Directory for mount point.
Optional Attributes Type and Description

Dimension
MountOpt string-scalar Options for mount command.
SnapUmount integer-scalar If set to 1, this attribute automatically unmounts VxFS snapshots when
the file system is unmounted.
Default is 0 (No).

Info Entry Point (4.x only)
The Mount info entry point executes the command:
df -k mount_point
The output displays Mount resource information:
Size Used Avail Use%
The following steps are necessary to initiate the info entry point by setting the
InfoInterval timing to a value greater than 0. For example,
haconf -makerw
hatype -modify Mount InfoInterval 60
In this case, the info entry point is executed every 60 seconds. The command to
retrieve information about the Mount resource is:
hares -value mountres ResourceInfo
Output includes the following information:
Size 2097152
Used 139484
Available 1835332
Used% 8%
D
Type Definition
type Mount (
static str ArgList[] = { MountPoint, BlockDevice, FSType,
MountOpt, FsckOpt, SnapUmount }
NameRule = resource.MountPoint
str MountPoint
str BlockDevice
str FSType
str MountOpt
str FsckOpt
)

Sample Configuration
Mount export1 (
MountPoint= "/export1"
BlockDevice = "/dev/dsk/c1t1d0s3"
FSType = "vxfs"
FsckOpt = "-n"
MountOpt = "ro"
)

Process Agent
Description Starts, stops, and monitors a process specified by the user.
Entry Points OnlineStarts the process with optional arguments.

OfflineTerminates the process with a SIGTERM. If the process does not exit, VCS sends a
SIGKILL.
MonitorChecks to see if the process is alive by scanning the process table for the name of the
executable pathname and argument list.
Required Attribute Type and Description

Dimension
PathName string-scalar Defines complete pathname to access an executable program. This path
includes the program name. If a process is controlled by a script, the
PathName defines the complete path to the shell.
Pathname must not exceed 80 characters.
Optional Attribute Type and Description

Dimension
Arguments string-scalar Passes arguments to the process. If a process is controlled by a script, the
script is passed as an argument. Multiple arguments must be separated by
a single space. A string cannot accommodate more than one space
between arguments, nor allow for leading or trailing whitespace
D
characters. Arguments must not exceed 80 characters (total).
Type Definition
type Process (
static str ArgList[] = { PathName, Arguments }
NameRule = resource.PathName
str PathName
str Arguments
)
Sample Process Configurations

Sample 1
Process usr_lib_sendmail (
PathName = "/usr/lib/sendmail"
Arguments = "bd q1h"
)

Sample 2
include "types.cf"
cluster ProcessCluster (
.
.
.
group ProcessGroup (
SystemList = { sysa, sysb }
AutoStartList = { sysa }
)
Process Process1 (
PathName = "/usr/local/bin/myprog"
Arguments = "arg1 arg2"
)
Process Process2 (
PathName = "/bin/csh"
Arguments = "/tmp/funscript/myscript"
)
// resource dependency tree

//
// group ProcessGroup
// {
// Process Process1
// Process Process2
// }

SCSI-3 Persistent Reservations
Disk registration and reservations are performed by the VERITAS fencing driver
using a relatively new technology known as SCSI-3 persistent reservations (SCSI-
3 PR, or just PR).
PR uses the concepts of registration and reservation. Participating systems register
a key with a device (controlling registration is discussed later). Registered systems
can then set a reservation mode on these devices. The VERITAS fencing
implementation uses a mode called Write Exclusive Registrants Only. This mode
ensures that only members registered with the device can write. Other nodes can
potentially read to allow for off-host backup schemes.
Current SCSI-3 PR specifications enable VCS to support 32 nodes with multiple
paths from each node.
SCSI-3 Persistent Reservation Blocking

With SCSI-3 PR technology, blocking write access is as simple as removing a
registration from a device. Only registered members can eject the registration of
another member. A member wanting to eject another member issues a preempt and
abort command that ejects another node from the membership. Nodes not in the
membership cannot issue this command.
Looking at this in another way, this means after a node is ejected, it cannot, in turn,
D
eject another; ejecting is final and atomic.
In the VCS implementation, a node registers the same key for all paths to the
device. A single preempt and abort command ejects a node from all paths to the
storage device.
Several important concepts are summarized below:
Only a registered node can eject another.
Because a node registers the same key down each path, ejecting a single key
blocks all I/O paths from the node.
After a node is ejected, it has no key registered, and it cannot eject others.
The SCSI-3 PR specification describes the method to control access to disks with
the registration and reservation mechanism. The method to determine who can
register with a disk and who is eligible to eject another node is implementation-
specific.

Best Practices
Cluster Interconnect
Using the sysname File

Use the sysname file to specify the local node name. This removes any
dependency on the UNIX host name given by the uname -a command. If the host
name is changed and no longer matches the llthosts, llttab, and main.cf
system name entries, VCS cannot start.
Redundant LLT Links

Two Ethernet LLT heartbeat links are the recommended minimum.
No single point of failure should be allowed anywhere in the cluster
interconnect, including hubs, NICs, and NIC position within the system.
No routers can be used in the path of the interconnect.
Configure the public network as an additional low-priority LLT link.
Shared Storage
Volume Resources
Volume resources are not required. They provide additional monitoring; however,
in environments with many volumes, the additional overhead of monitoring all the
volumes may be undesirable.
File Systems
Ensure that all file systems controlled by VCS resources are set to manual control
in the operating system configuration files. The operating system should not
perform any automatic mounts or unmounts.
SANs/Arrays
Shared disks on a SAN must reside in the same zone as all of the nodes in the
cluster.
Data residing on shared storage should be mirrored or protected by a hardware-
based RAID mechanism.
Use redundant storage and paths.
Use multiple single-port HBAs or SCSI controllers rather than multiport
interfaces to avoid single points of failure.
Include all cluster-controlled data in your backup planning and
implementation. Periodically test restoration of critical data to ensure that the
data can be restored.

Public Network
Allocate a dedicated administrative IP address to each node of the cluster. This
must not be failed over to any other node.
Allocate one or more virtual IP addresses for each service group requiring
access by way of the public network.
Map DNS entries to the service group IP addresses for the cluster.
Note the service group IP addresses s in the hosts file.
When specifying a NetworkHosts for the NIC resource, specify one or more
highly available IP addresses.
Critical Resources
During configuration, consider initially setting all resources to non-critical. This
prevents service groups from failing over if you make errors when setting up a new
resource. Then set all resources to critical, which should cause a service group to
fault and fail over in the event the resource faults.
Deleting a Service Group

Delete all resources before removing a service group. This prevents possible
resource faults and error log entries that can occur if a service group with online
resources is deleted.
D
Proxy Resources
If you have multiple service groups that use the same network interface, you can
reduce monitoring overhead by using Proxy resources instead of NIC resources. If
you have many NIC resources, consider using Proxy resources to minimize any
potential performance impacts of monitoring.
Outside Services
Minimize reliance on services that are not within control of the cluster to ensure
high availability for your applications. Consider:
Network name resolution services
NFS mounts
NIS
In addition, ensure that external resources, such as DNS and gateways, are highly
available.
Multiple Oracle Instance Configurations

The following list describes some best practices for configuring and managing
multiple Oracle instances in a VCS environment.
For each SID to be configured, create UNIX accounts with DBA privileges.
Ensure that each Oracle instance has a separate disk group and is configured as
a separate service group.

Define the /etc/system parameters such that the allocation of semaphore
and shared memory is appropriate on all systems.
Use a dedicated set of binaries for each Oracle instance, even if each instance
uses the same Oracle version.
If your configuration uses the same Oracle version for all instances, install a
version on the root disk or, preferably, on a secondary disk. Locate the pfiles in
the default location and define several listener processes to ensure clean
failover.
If your configuration has several 8.1.x instances and just one listener, set up a
parallel service group.
If your configuration has different versions of Oracle, create a separate
$ORACLE_HOME for each Oracle version.
Follow the Optimal Flexible Architecture (OFA) standard (/uxx/SID). In
cluster configurations, you can adapt the standard to make it more application-
specific, for example, /app/uxx/SID.
Listeners accompanying different versions of Oracle may not be backward
compatible. Therefore, if you want to create a single listener.ora file, you
must verify that the listener supports the other versions of Oracle in the cluster.
You must also create a separate Envfile for each version of Oracle.
Ensure that each listener listens to a different virtual address. Also, assign
different names to listeners and ensure that they do not listen to the same port.
If you create a single user named oracle and define the variables required by
Oracle, you must redefine at minimum the $ORACLE_HOME and
$ORACLE_SID variables every time you want to invoke svrmgrl.
VERITAS recommends that you define several Oracle users in the passwd
file, with each user having the appropriate environment variables, so you can
easily identify which level of Oracle code you are running.
The pfiles must be coordinated between systems. If you have two instances
using the Oracle version, keep a copy of both of the init SID.ora files in
the default directory, so that if one systems fails, $ORACLE_HOME is set on
each system.
Testing
Test services on each failover target system before putting them under VCS
control.
Create a test cluster for performing the initial implementation and testing any
changes.
Test all possible failure scenarios.
Create and execute an acceptance/solution test plan before deploying a
cluster in a production environment and when making any changes.

Training
Provide appropriate training, as follows:
VERITAS Cluster Server
System administrators
Database administrators
Developers
VERITAS File System: System administrators
VERITAS Volume Manager: System administrators
VERITAS NetBackup: System and Backup administrators

New Features in VCS 4.1
The following features are introduced in VCS version 4.1.
Solaris 10 Local Zone Support

Solaris 10 provides a means of virtualizing operating system services, allowing
one or more processes to run in isolation from other activity on the system. Such a
"sandbox" is called a local zone. Each zone can provide a rich and customized
set of services. There is also a "global zone" and processes running in this zone
have the same set of privileges available on a Solaris system today.
VCS provides high availability to applications running in local zones by extending
the failover capability to zones. VCS is installed in a global zone and all VCS
agents and engine components run in the global zone. For applications running
within local zones, agents run entry points inside the zones. If a zone configured
under VCS control faults, VCS fails over the entire service group containing the
zone.
VERITAS Security Services (VxSS)

VCS 4.1 is integrated with VERITAS Security Services (VxSS) to provide secure
communication between cluster nodes and clients, including the Java and the Web
consoles. VxSS uses digital certificates and uses SSL to encrypt communication
over the public network.
User Management in the Secure Mode

Change in behavior: If VCS is running in the secure mode, you can add system
users to VCS and assign them privileges. You must specify user names in the
format username@domain. You cannot assign or change passwords for users when
VCS is running in the secure mode.
NFS Lock Failover

VCS 4.1 adds support for failover of NFS 3.0 file locks with the addition of the
NFSLock bundled agent. For details, refer to the VERITAS Cluster Server 4.1
Bundled Agents Reference Guide.
JumpStart Compliance
VCS 4.1 is compliant with Solaris JumpStart technology.
Web Console Features

The Web console now includes support for:
Secure clusters
SystemList modification
Static resource type attribute overrides

Java Console Features
The Java console now includes support for:
Secure clusters
Static resource type attribute overrides
VCS Login Environment

When non-root users execute haxxx commands, they are prompted for their VCS
user name and password to authenticate themselves. In VCS 4.1, you can use the
halogin command to save the authentication information so that you do not have
to enter your credentials every time you run a VCS command. You must also set
the VCS_HOST environment variable or populate the /etc/.vcshosts file to
run commands remotely. Users must have proper cluster- and group-level
privileges to execute commands. You cannot remotely run ha commands that
require localhost root privileges. See Logging On to VCS in the VERITAS
Cluster Server Users Guide, or more information about the halogin command.

New Features in VCS 4.0
The following features are introduced in VCS version 4.0.
Global Cluster Option

The Global Cluster option to VCS enables a collection of VCS clusters to work
together to provide wide-area disaster recovery. Previously, the wide-area
functionality was available in a separate Global Cluster Manager product. The
functionality has now been incorporated into VCS 4.0.
VCS Simulator
VCS Simulator is a tool for simulating any cluster configuration and determining
how service groups will behave during cluster or system faults. With the simulator,
you can designate and fine-tune configuration parameters, view state transitions,
and evaluate complex, multinode configurations. The tool is especially valuable
because it enables you to design and evaluate a specific configuration without test
clusters or changes to existing production configurations.
I/O Fencing
VCS 4.0 provides a new capability, called I/O fencing, to arbitrate cluster
membership and ensure data integrity in the event of communication failure
among cluster members. The I/O fencing kernel module uses SCSI-III Persistent
Reservations and designated coordinator disks, as described in the I/O Fencing
chapter of the VERITAS Cluster Server 4.0 Users Guide.
Fire Drill
Fire drill is a procedure for testing the fault readiness of a configuration. A fire
drill on a VCS-controlled application uses a separate fire drill service group that
contains a copy of the live applications resources. See the VERITAS Cluster
Server 4.0 Users Guide for more information.
Steward
The Steward mechanism minimizes chances of a wide-area split-brain in two-node
clusters. The steward process can run on any system outside of the clusters in a
Global Cluster configuration. See the VERITAS Cluster Server 4.0 Users Guide
for more information.

Web Console Features
Support for global clustering
Home portal
User management
Java Console Features

Support for Global Clustering
VCS Simulator
Display of agent logs
cpuusage Event Trigger

The new cpuusage event trigger is invoked on systems where CPU usage exceeds
the configured threshold value. See the VCS 4.0 Users Guide for more
information.
multinicb Event Trigger

The new multinicb event trigger is invoked when a network device under
MultiNICB control changes its state. The trigger is also always called in the first
D
monitor cycle. See the VCS 4.0 Users Guide for more information.
Action Entry Point

The action entry point enables agents to perform actions that can be completed
within a few seconds and that are outside the scope of traditional actions, such as
being brought online and taken offline.
Info Entry Point

The info entry point enables agents to gather specific information for an online
resource.
New Bundled Agents

The DNS bundled agent was added in the VCS 4.0 Release. For details, refer to the
VERITAS Cluster Server Bundled Agents Reference Guide.
New Attributes
Resource Type Attributes
ActionTimeout
FireDrill
InfoInterval

InfoTimeout
LogDbg
MonitorStatsParam
SupportedActions
Resource Attributes
ComputeStats
MonitorTimeStats
ResourceInfo
Service Group Attributes
Authority
ClusterFailoverPolicy
ClusterList
System Attributes
CPUUsage
CPUUsageMonitoring
NoAutoDisable
Cluster Attributes
AutoStartTimeout
ClusState
ClusterAddress
ConnectorState
Stewards
UserFence
New Attribute Category

Heartbeat attributes are introduced to VCS 4.0 with the new global cluster
features.
AgentState
Arguments
AYAInterval
AYARetryLimit
AYATimeout
CleanTimeOut
ClusterList
InitTimeout
LogDbg
State
StartTimeout
StopTimeout

Appendix E
Design Worksheet: Template
Cluster Interconnect Configuration
First system:

set-node
(host name)
set-cluster
(number in host name of odd
system)
link
link
E28 VERITAS Cluster Server for UNIX, Fundamentals

Second system:

set-node
set-cluster
link
link
E
Cluster Configuration (main.cf)
Types Definition Sample Value Your Value

Include types.cf
Cluster Definition Sample Value Your Value

Cluster
Required Attributes
UserNames
Appendix E Design Worksheet: Template E29

ClusterAddress
Administrators
Optional Attributes
CounterInterval

System
System

Group
Required Attributes
FailoverPolicy
SystemList
Optional Attributes
AutoStartList
OnlineRetryLimit

Service Group
Resource Name
Resource Type
Required Attributes
Optional Attributes
Critical?
E
Enabled?

Service Group
Resource Name
Resource Type
Required Attributes

Optional Attributes
Critical?
Enabled?

Service Group
Resource Name
Resource Type
Required Attributes
Optional Attributes

Critical?
Enabled?

Service Group
Resource Name
Resource Type
Required Attributes
E
Optional Attributes
Critical?
Enabled?

Service Group


Ha Vcs 410 101a 2 10 srtpg2 130918134117 Phpapp01

Hochgeladen von

Dokumentinformationen

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Ha Vcs 410 101a 2 10 srtpg2 130918134117 Phpapp01

Hochgeladen von

Copyright:

Verfügbare Formate

VERITAS Cluster Server for

Appendix B: Lab Details

Appendix C: Lab Solutions

Appendix D: Job Aids

Appendix E: Design Worksheet: Template

ii VERITAS Cluster Server for UNIX, Fundamentals

Lab 2 Synopsis: Validating Site Preparation

A2 VERITAS Cluster Server for UNIX, Fundamentals

Object Sample Value Your Value

Partner system host name train2

name prefix for your bob

Interconnect link 1 Solaris: qfe0

Admin IP address for 192.168.xx.xxx

Appendix A Lab Synopses A3

Classroom LAN 192.168.XX, where XX=27, 28, or 29

A4 VERITAS Cluster Server for UNIX, Fundamentals

2 Check the VERITAS licenses to determine whether a VERITAS Cluster Server

Checking PackagesLinux Only

When installing any Storage Foundation product or VERITAS Volume Replicator,

Configuring Secure ShellLinux Only

Setting Up a Console WindowLinux Only

2 Open a System Log Display tool.

Appendix A Lab Synopses A5

Lab 3 Synopsis: Installing VCS

Obtaining Classroom Information

A6 VERITAS Cluster Server for UNIX, Fundamentals

Appendix A Lab Synopses A7

Subnet Mask 255.255.255.0

Administrator account Name admin

A8 VERITAS Cluster Server for UNIX, Fundamentals

1 Obtain the location of the installation software from your instructor.

a Change to the install directory.

b Run the installer script (VERITAS Product Installer) located in the

d Install all optional packages (including Web console and Simulator).

e Accept default of Y to configure VCS.

f Do not configure a third heartbeat link at this time.

g Do not configure a low-priority heartbeat link at this time.

h Do not configure VERITAS Security Services.

i Do not set any user names or passwords.

j Retain the default admin user account and password.

k Configure the Cluster Server Cluster Manager.

l Do not configure SMTP Notification.

Appendix A Lab Synopses A9

n Select the option to install all packages simultaneously on all systems.

p Start Storage Foundation Enterprise HA processes.

q Do not set up a default disk group.

Installing Other Software

1 If your instructor indicates that additional software, such as VCS patches or

Installation software directory:

A10 VERITAS Cluster Server for UNIX, Fundamentals

If hastatus -sum shows the cluster systems in a running state and a

2 Perform additional verification (generally only necessary if there is a problem

a Verify that all packages are loaded.

b Verify that LLT is running.

c Verify that GAB is running.

Exploring the Default VCS Configuration

View the configuration files set up by the VCS installation procedure.

1 Explore the LLT configuration.

2 Explore the GAB configuration.

3 Explore the VCS configuration files.

Appendix A Lab Synopses A11

1 Use a Web browser to connect to the Web GUI.

3 Browse the cluster configuration.