
PowerHA SystemMirror Common Tasks

for HA Administrators
Session ID: 41CO
Michael Herrera
PowerHA SystemMirror (HACMP) for AIX
ATS Certified IT Specialist
mherrera@us.ibm.com

2010 IBM Corporation

IBM Power Systems

Agenda
• Management
  - Start & Stop of cluster services
  - Moving Resources
  - Saving off Configuration
• Maintenance
  - Upgrading AIX & Cluster Software
  - CSPOC - LVM Changes (Adding / Removing Physical Volumes)
  - Network Changes (dynamic)
  - Setting up Pager Notification
  - Deploying File Collections
  - Adding Users
  - Password Changes
• Configuration Optimization
  - Hostname Changes
  - Naming requirements in V7.x
  - Auto start (or not) of cluster services
  - Dynamic Node Priority
  - Application Monitoring
  - DLPAR Integration
  - Resource Group Dependencies
• Tunables
  - Cluster Security
  - Failure Detection Rate (FDR)
  - Custom Cluster Verification Methods
  - Practical use of UDE events
• Common Commands (CLI)
  - Online Planning Worksheets
  - clmgr, lscluster


How do you check what version of code you are running?

• Historically we have run:
# lslpp -l cluster.es.server.rte
  Fileset                      Level    State      Description
  ------------------------------------------------------------------
Path: /usr/lib/objrepos
  cluster.es.server.rte        7.1.1.1  COMMITTED  Base Server Runtime
Path: /etc/objrepos
  cluster.es.server.rte        7.1.1.1  COMMITTED  Base Server Runtime

• Now you can also run:
# halevel -s
7.1.1 SP1
  ... even though the machine may actually be running SP2

• Also useful:
# lssrc -ls clstrmgrES | grep fix
cluster fix level is "3"

Attention:
Be aware that HA 7.1.1 SP2 or SP3 is not reported back properly. The halevel command
probes with the wrong option, and since the cluster.es.server.rte fileset is not updated it
will not catch the updates to the cluster.cspoc.rte filesets.
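As a sketch of scripting the same check, the fileset level can be parsed out of captured `lslpp` output. The here-string below stands in for the live command on an AIX node; the column layout is assumed from the sample above:

```shell
#!/bin/sh
# Sketch only: extract the fileset level from captured lslpp output.
# The variable stands in for: lslpp -l cluster.es.server.rte
lslpp_out='
  Fileset                      Level    State      Description
Path: /usr/lib/objrepos
  cluster.es.server.rte        7.1.1.1  COMMITTED  Base Server Runtime
Path: /etc/objrepos
  cluster.es.server.rte        7.1.1.1  COMMITTED  Base Server Runtime'

# The second whitespace-delimited field of the first fileset line is the level.
level=$(printf '%s\n' "$lslpp_out" | awk '/cluster\.es\.server\.rte/ {print $2; exit}')
echo "cluster.es.server.rte level: $level"
```

On a real node you would pipe `lslpp -l cluster.es.server.rte` straight into the same awk filter.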

Upgrade Considerations

There are two main areas that you need to consider: OS & HA software
• Change Controls: what is your ability to apply and test the updates?
• Consider things like interim fixes locking down the system
  - Will they need to be reapplied?
  - Will they need to be rebuilt?

Operating System:
• Should you do the AIX update first, or the HA code? Should you combine the upgrades?
  - New OS requirements for HA
• What is your back-out plan?
  - Alternate disk install
  - mksysb
• BOS updates will typically require a reboot (hence a disruption)

Cluster Software Code:
• What type of migration?
  - Snapshot Migration
  - Rolling Migration
  - Non-Disruptive Update
• Evaluate source to target level
  - Can you perform an NDU update?
  - New minimum OS requirements
  - New required settings (IP multicasting, hostname restrictions)
  - Required topology changes


AIX Upgrade Flow in a Clustered Environment

Hypothetical Example: 2-node cluster running AIX 7.1
Starting Point: active production system and standby system, both at AIX 7.1.0.0.
You can start the upgrade on either node, but obviously an update to the node hosting
the application would cause a disruption to operations.

Standby System:
- Stop Cluster Services
- OS update: TL1 & SPs
- Reboot
- Reintegrate into the cluster with AIX 7.1.1.5 (standby system now running the new level)

Production System:
- Stop with Takeover (standby acquires the Resource Group / Application)
- OS update: TL1 & SPs
- Reboot
- Reintegrate into the cluster with AIX 7.1.1.5
- Issue an rg_move back, or continue to run on the standby system

Common Question: Can the cluster run with the nodes running different levels?
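The flow above can be condensed into a dry-run runbook. The `clmgr` verbs and `clRGmove` flags below are real PowerHA syntax, but the node and resource group names are illustrative, and nothing is executed; each step is only printed so the order can be reviewed:

```shell
#!/bin/sh
# Dry-run runbook for the rolling AIX update described above.
# Nothing is executed; each step is echoed for review.
run() { echo "WOULD RUN: $*"; }

# 1. Standby node first: stop services, apply the update, reboot, rejoin.
run clmgr stop node standby_node WHEN=now MANAGE=offline
run smit update_all        # apply the AIX TL/SP updates (illustrative step)
run shutdown -Fr
run clmgr start node standby_node

# 2. Production node: stop with takeover so the standby acquires the RG.
run clmgr stop node prod_node WHEN=now MANAGE=move
run smit update_all
run shutdown -Fr
run clmgr start node prod_node

# 3. Optionally move the resource group back home.
run clRGmove -g app_rg -n prod_node -m
```

Replacing the `run` wrapper with direct execution (and real node names) turns the sketch into the actual procedure.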

Flow of a PowerHA Software Upgrade

Hypothetical Example: 2-node cluster, HA Version 5.5 to 6.1
Starting Point: both nodes at HA Version 5.5

First node (application still running):
- Stop cluster services with UNMANAGE resources
- smit update_all (HA level & patches; be mindful of new base filesets)
- smit clstart (start scripts will get reinvoked)
- Node now running at the new 6.1 version

Second node (application still active):
- Stop cluster services with UNMANAGE resources
- smit update_all
- smit clstart
- Node now running Version 6.1

We advise against stopping the cluster with the UNMANAGE option on more than one
node at a time. It can be done, but there are various factors to consider.

Common Question: How long can the cluster run in a mixed mode? What operations are supported?

Client Scenario: Database Binary Upgrade

Scenario:
- The client had an environment running independent Oracle databases in a mutual-takeover
cluster configuration. They wanted to update the Oracle binaries one node at a time and
wanted to avoid an unexpected fallover during the process. They wished to UNMANAGE
cluster resources on all nodes at the same time.

Lessons Learned:
• Do not do an upgrade of the cluster filesets while unmanaged on all nodes
  - This would recycle the clstrmgrES daemon and the cluster would lose its internal state
• Application monitors are not suspended when you UNMANAGE the resources
  - If you manually stop the application and forget about the monitors, existing application
    monitors could auto-restart it or initiate a takeover, depending on your configuration
• Application start scripts will get invoked again on restart of cluster services
  - Be aware of what happens when you invoke your start script while the application is
    already running, or comment out the scripts prior to restarting cluster services
• Leave the Manage Resources attribute set to Automatic
  - Otherwise it will continue to show the RG as UNMANAGED until you do an RG move ONLINE

PowerHA SystemMirror: Cluster Startup Behavior

• What is the best practice?
  - All currently supported releases perform a cluster verification on startup and will
    validate whether the node can enter the cluster
  - Cluster services are set to automatically start up on boot

PowerHA SystemMirror - Cluster Startup Behavior

• The cluster manager daemon is now running all of the time
# clshowsrv -v
Status of the RSCT subsystems used by HACMP:
Subsystem         Group            PID          Status
 cthags           cthags           4980948      active
 ctrmc            rsct             4063376      active

Status of the HACMP subsystems:
Subsystem         Group            PID          Status
 clstrmgrES       cluster          4915234      active
 clcomd           caa              6422738      active

Status of the optional HACMP subsystems:
Subsystem         Group            PID          Status
 clinfoES         cluster          8847544      active

# lssrc -ls clstrmgrES | grep state
Current state: ST_STABLE

- The default startup-on-boot behavior is false
- Verify Cluster should be left set to true
• Settings can be altered within the cluster panels

So how do you start up Cluster Services ?


• smitty sysmirror → System Management → PowerHA SystemMirror Services → Start / Stop
• smitty clstart (fastpath)
• clmgr start cluster
  - clmgr online node nodeA
  - clmgr start node nodeA
• IBM Systems Director Plug-In


PowerHA SystemMirror: Cluster Stop Options

• What is the purpose of each option?
  - For non-disruptive updates, stop services on only one node at a time, so that one
    node retains the status of the cluster resources

• You cannot non-disruptively upgrade from pre-7.X versions to newer releases; the
  upgrade from 7.1.0 to 7.1.1 is also disruptive

UNMANAGE Resource Group Feature in PowerHA

• Function used for Non-Disruptive Updates (one node at a time)
  - Previously known as the Forced Stop
• HA daemons will continue to run, but resources will not be managed by the cluster

Application monitors will continue to run. Depending on the implementation, it might
be wise to suspend monitors prior to this operation.

Moving Resources between Nodes

• clRGmove -g <RGname> -n <nodename> -m
• clmgr move rg <RGname> node=<nodename>
  - If multiple RGs are selected, the operation and resources will be processed sequentially
• IBM Systems Director Plug-In
• smitty cl_admin

Types of Available RG Dependencies

• Parent / Child Dependencies (made available in V5.2)
• Location Dependencies (made available in V5.3)
  - Online on Same Node
  - Online on Different Nodes
  - Online on Same Site
• Start After & Stop After (made available in V7.1.1)

Most of this is old news, but the use of dependencies can affect where and how the
resources get acquired. More importantly, it can affect the steps required to move
resource groups, so more familiarity with the configuration is required.

Moving Resource Groups with Dependencies


• Invoked:
clRGmove -g <RGname> -n <nodename> -m


Automatic Corrections on Verify & Sync


There are Verify & Sync options in the first two paths; however, note that they do not
include the Auto-Corrective option. You need to follow the Custom Cluster Configuration
path for that.

The custom path will allow corrective actions to be made only if ALL cluster nodes are
not running cluster services. By default it will not perform any corrective actions.


Automatic Nightly Cluster Verification


 By Default the cluster will run a nightly Verification check at midnight

Be aware of the
clcomd changes for
version 7 clusters

 The clutils.log file should show the results of the nightly check


Cluster Custom Verification Methods


 Cluster Verification is made up of a bunch of data collectors
 Checks will return PASSED or FAILED
Will often provide more details than what is reported in the smit.log output

 Custom Verification Methods may be defined to run during the Verify / Sync operations

Note: Automatic verify & sync on node start up does not include any custom verification methods


Adding Custom Verification Methods

Problem Determination Tools > PowerHA SystemMirror Verification > Configure Custom Verification Method
• Add a Custom Verification Method and press Enter

Output in the smit.log and clverify.log files:

Currently Loaded Interim Fixes:
NODE mutiny.dfw.ibm.com
PACKAGE                  INSTALLER  LABEL
======================== ========== ==========
bos.rte.security         installp   passwdLock
NODE munited.dfw.ibm.com
PACKAGE                  INSTALLER  LABEL
======================== ========== ==========
bos.rte.security         installp   passwdLock
Please Ensure that they are consistent between the nodes!

Custom Verification Methods

• Custom methods should be in a common path between the cluster members
  - e.g. /usr/local/hascripts/custom_ver_check.sh
• The methods are stored in the cluster ODM stanzas
• Script logic & return codes: how fancy do you want to get?

Sample method:
#!/bin/ksh
echo "Currently Loaded Interim Fixes:"
clcmd emgr -P
echo "Please Ensure that they are consistent between the nodes!"
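A slightly fancier method could enforce consistency and FAIL with a nonzero return code on a mismatch. This sketch parses captured `clcmd emgr -P` output (the variable stands in for the live command; node names and the efix label come from the sample slide):

```shell
#!/bin/sh
# Verification sketch: FAIL if interim-fix labels differ between nodes.
# Captured output standing in for: clcmd emgr -P
emgr_out='
NODE mutiny.dfw.ibm.com
bos.rte.security installp passwdLock
NODE munited.dfw.ibm.com
bos.rte.security installp passwdLock'

# Build "node label" pairs, then count the distinct labels seen.
labels=$(printf '%s\n' "$emgr_out" | awk '
  /^NODE/ {node=$2; next}
  NF==3  {print node, $3}')
distinct=$(printf '%s\n' "$labels" | awk '{print $2}' | sort -u | wc -l | tr -d ' ')

if [ "$distinct" -le 1 ]; then
  echo "PASSED: efix labels consistent across nodes"; rc=0
else
  echo "FAILED: efix labels differ between nodes";    rc=1
fi
```

The final `rc` is what the method would return to cluster verification.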


PowerHA SystemMirror: Cluster Snapshots

• Snapshots are written to /usr/es/sbin/cluster/snapshots/ as a pair of files:
  - <snapshotname>.odm: the cluster configuration, i.e. the cluster ODM stanzas
    (HACMPcluster, HACMPnode, HACMPadapter, ...)
  - <snapshotname>.info: the cluster report & CLI output (cllsnode, cllscf, cllsif,
    wrapped in HTML tags)
• Snapshots are saved off automatically any time a Verify / Sync operation is invoked
• The .info file is not necessary in order to be able to restore the configuration
• The snapshot menu will ask for a <name> and a <description> as the only required fields
• The snapshot upgrade migration path requires the entire cluster to be down

PowerHA SystemMirror: Changing the Hostname

• CAA does not currently support changing a system's hostname
  - Basically means: do not attempt to do this in a Version 7.X cluster
  - Only the service IP should be swapping between nodes; each node's hostname (inet0)
    stays fixed, and the # lscluster output tracks the node UUID as well
  - (Diagram: an Application Controller whose start.sh sets a new hostname and whose
    stop.sh unsets it, alongside the Service IP and Volume Group / filesystems in the
    resource group)

• The same is true for the cluster repository disk. The UUID is stored, hence you should
  not attempt to replicate the volume or create mirrors to the caa_private volume group.

* This restriction is currently under evaluation by the CAA development team and may be
lifted in a future update

Naming requirements in V7 clusters

• The COMMUNICATION_PATH has to resolve to the hostname IP
  - In prior releases the communication path could be any path to the node
• The node name can be different than the hostname
• The use of a "-" (hyphen) is not supported in the node name
  - We had clients further highlight this limitation by using clmgr to create the cluster:
    if a node name is not specified and the hostname has a "-", the default node name
    assigned will also try to use a "-"
  - ksh restrictions were removed to allow the use of a "-" in service IP labels, so both
    V6.1 and V7.X support their use in that name

Changes to Node outbound traffic

There were changes made to AIX & PowerHA alias processing:

- Cluster running HA V6.1 SP7 with AIX 6.1 TL2: the service IP alias is listed after the
  persistent & base addresses
- Cluster running HA V7.1 SP3 with AIX 7.1 TL1 SP4: the service IP alias is automatically
  listed before the base address. Note that no persistent IP is configured in this environment

Number of Resources & Fallover Times

Common Questions:
- Will the number of disks or volume groups affect my fallover time?
- Should I configure fewer larger LUNs or more smaller LUNs?

Versions 6.1 and earlier allowed Standard VGs or Enhanced Concurrent (ECM) VGs;
Version 7.X requires the use of ECM volume groups.

Your Answers:
• Standard VGs would require an openx call against each physical volume
  - Processing could take several seconds to minutes depending on the number of LUNs
• ECM VGs are varied on on all nodes (ACTIVE / PASSIVE)
  - It takes seconds per VG
• Parallel processing will attempt to vary on all VGs in parallel
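As back-of-the-envelope arithmetic only (the per-VG varyon cost below is an assumed figure, not a measurement), the sequential vs. parallel difference looks like:

```shell
#!/bin/sh
# Illustrative estimate: sequential vs. parallel varyon time.
vg_count=10
sec_per_vg=3            # assumed per-VG varyon cost for an ECM VG

sequential=$((vg_count * sec_per_vg))
parallel=$sec_per_vg    # idealized: all VGs varied on concurrently

echo "sequential estimate: ${sequential}s"
echo "parallel estimate:   ${parallel}s"
```

The gap widens linearly with the number of VGs, which is why parallel processing matters for configurations with many volume groups.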


Number of Resource Groups

• RG decisions beyond Startup, Fallover & Fallback behavior

(Diagram: NODE A hosts RG1 (NodeA, NodeB) and RG3 (NodeA, NodeB); NODE B hosts
RG2 (NodeB, NodeA) and RG4 (NodeB, NodeA); each RG contains a Service IP, a VG and
an App Server)

Further Options:
• 1 RG vs. multiple RGs
  - Selective fallover behavior (VG / IP)
• RG Processing
  - Parallel vs. Sequential
• Delayed Fallback Timer
  - When do you want to fail back?
• RG Dependencies
  - Parent / Child, Location
  - Start After / Stop After

Best Practice:
Always try to keep it simple, but stay current with new features and take advantage of
existing functionality to avoid added manual customization.

Filesystem Definitions in a Resource Group

• Should you explicitly define the filesystems in a Resource Group?
• PowerHA default behavior is to mount ALL

• Reasons to explicitly define:
  - Nested filesystems
  - Only mount the filesystems specified

• Scenario:
  - 10 filesystems in the volume group & only 1 defined in the RG
  - HA processing will only mount the one filesystem

What are the implications going forward if you add new filesystems via CSPOC and
forget to append them to the resource group definition?

Event Processing of resources

• Resource Groups are processed in parallel unless you implement RG dependencies or
  set a customized serial processing order (HA 4.5+)
• The newer process_resources event script is organized around job types: ACQUIRE,
  RELEASE, ONLINE, OFFLINE, DISKS, TAKEOVER_LABELS, APPLICATIONS and more
  - i.e. JOB_TYPE = VGS

Invoked during parallel processing:
• acquire_svc_addr
• acquire_takeover_addr
• node_down
• node_up
• release_svc_addr
• release_takeover_addr
• start_server
• stop_server

Not invoked:
• get_disk_vg_fs
• node_down_local
• node_down_remote
• node_down_local_complete
• node_down_remote_complete
• node_up_local
• node_up_remote
• node_up_local_complete
• node_up_remote_complete
• release_vg_fs

* Be mindful of this with the implementation of Pre/Post Events

Defining Pre / Post Events

• Pre/Post-Event Commands are NOT the same thing as User Defined Events
  - A custom event will never get invoked unless you explicitly define it as a Pre- or
    Post-event command to an existing cluster event
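A pre- or post-event command is simply a script that PowerHA invokes with the event's arguments. A minimal logging stub might look like the sketch below; the hascripts path named in the comment and the log location are illustrative, and the hook is written as a function so it can be exercised locally:

```shell
#!/bin/sh
# Minimal pre/post-event stub: record when it ran and what it was passed.
# In a real cluster the script would live at a common path on every node,
# e.g. /usr/local/hascripts/pre_node_up.sh (illustrative name).
LOG=/tmp/pre_event_demo.$$

pre_event() {
  # PowerHA passes the event name and its arguments on the command line.
  echo "pre-event invoked with args: $*" >> "$LOG"
  return 0   # a nonzero return would be treated as an event failure
}

# Local exercise of the hook with sample arguments.
pre_event node_up nodeA
entry=$(cat "$LOG"); rm -f "$LOG"
echo "$entry"
```

The important contract is the return code: 0 lets event processing continue, nonzero does not.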


User Defined Events (UDE)

• This option allows you to exploit RMC resource monitors to trigger events
• Familiarize yourself with the lsrsrc command
  - A Practical Guide for Resource Monitoring and Control, SG24-6615

Notes:
• Recycle cluster services after updating UDE events
• Scripts must exist on all cluster nodes (path, permissions)
• Logic in the recovery program can be configured to send notification, append more
  space, etc.
• Multiple values can be specified in the Selection String field
• Actions are logged in the clstrmgr.debug and hacmp.out files

# odmget HACMPude
HACMPude:
        name = "Herrera_UDE_event"
        state = 0
        recovery_prog_path = "/usr/local/hascripts/Herrera_UDE"
        recovery_type = 2
        recovery_level = 0
        res_var_name = "IBM.FileSystem"
        instance_vector = "Name = \"/\""
        predicate = "PercentTotUsed > 95"
        rearm_predicate = "PercentTotUsed < 70"
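The predicate / rearm pair in the stanza above (trigger when the filesystem is more than 95% full, rearm once it drops back below 70%) is plain threshold logic. This sketch walks through it with assumed utilization values; it only models the conditions, not RMC itself:

```shell
#!/bin/sh
# Illustrative threshold logic matching the UDE predicate/rearm pair above.
predicate_fires() {  # $1 = PercentTotUsed
  [ "$1" -gt 95 ]
}
rearm_fires() {      # $1 = PercentTotUsed
  [ "$1" -lt 70 ]
}

state=IDLE
used=97                                  # assumed: filesystem fills up
if predicate_fires "$used"; then state=EVENT; fi
used=65                                  # assumed: space reclaimed
if rearm_fires "$used"; then state=REARMED; fi
echo "final state: $state"
```

The rearm predicate is what prevents the event from re-firing on every monitoring cycle while the condition persists.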


PowerHA SystemMirror: File Collections


 Introduced in HA 5.2
Ability to automatically push files every 10 min from source node specified
Default collections created but not enabled by default

 Configuration_Files

/etc/hosts
/etc/services
/etc/snmpd.conf
/etc/snmpdv3.conf
/etc/rc.net
/etc/inetd.conf
/usr/es/sbin/cluster/netmon.cf
/usr/es/sbin/cluster/etc/clhosts
/usr/es/sbin/cluster/etc/rhosts
/usr/es/sbin/cluster/etc/clinfo.rc

 SystemMirror_Files

Pre, Post & Notification


Start & Stop scripts
Scripts specified in monitors
Custom pager text messages
SNA scripts
Scripts for tape support
Custom snapshot methods
User defined events

 Not intended to maintain users & passwords between cluster nodes
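If you want to confirm that a collected file really did end up identical on two nodes, comparing checksums is enough. A sketch using two local copies standing in for copies fetched from each node (the file names are illustrative):

```shell
#!/bin/sh
# Sketch: compare checksums of two copies of a collected file.
# In practice each copy would be fetched from a different cluster node.
tmpdir=$(mktemp -d)
printf 'application start logic v1\n' > "$tmpdir/nodeA.app_start"
printf 'application start logic v1\n' > "$tmpdir/nodeB.app_start"

sumA=$(cksum < "$tmpdir/nodeA.app_start" | awk '{print $1}')
sumB=$(cksum < "$tmpdir/nodeB.app_start" | awk '{print $1}')

if [ "$sumA" = "$sumB" ]; then same=yes; else same=no; fi
echo "copies identical: $same"
rm -rf "$tmpdir"
```

With differing copies the check reports "no", which is exactly the drift the 10-minute propagation is meant to prevent.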


File Collections Application Script Scenario

# smitty sysmirror → System Management → File Collections

• If the propagation option is set to yes, files will be propagated every 10 minutes

(Diagram: Node A and Node B each hold /usr/local/hascripts/app* start and stop scripts;
updates made to the scripts on Node A overwrite the older logic on Node B at the next
propagation)

PowerHA SystemMirror - User & Group Administration

# smitty sysmirror → System Management → Security and Users

• Can select Local (files) or LDAP
• Select nodes by Resource Group
  - No selection means all nodes
• Users will be propagated to all of the applicable cluster nodes
• The password command can be altered to ensure consistency across all nodes

PowerHA SystemMirror - User Passwords (clpasswd)

# smitty sysmirror → System Management → Security and Users → Passwords in a PowerHA SystemMirror cluster

• Optional list of users whose passwords will be propagated to all cluster nodes
  - The passwd command is aliased to clpasswd
• Functionality available since HACMP 5.2 (Fall 2004)

Repository Disk Failure


Pager Notification Events

• As long as sendmail is enabled you can easily receive EVENT notifications
  - smitty sysmirror → Custom Cluster Configuration → Events → Cluster Events →
    Remote Notification Methods → Add a Custom Remote Notification Method

Sample Email:
From: root 10/23/2012 Subject: HACMP
Node mhoracle1: Event acquire_takeover_addr occurred at Tue Oct 23 16:29:36 2012, object =

Pager Notification Methods

HACMPpager:
        methodname = "Herrera_notify"
        desc = "Lab Systems Pager Event"
        nodename = "connor kaitlyn"
        dialnum = "mherrera@us.ibm.com"
        filename = "/usr/es/sbin/cluster/samples/pager/sample.txt"
        eventname = "acquire_takeover_addr config_too_long event_error node_down_complete node_up_complete"
        retrycnt = 3
        timeout = 45

# cat /usr/es/sbin/cluster/samples/pager/sample.txt
Node %n: Event %e occurred at %d, object = %o

• Action Taken: Halted node connor

Sample Email:
From: root 09/01/2009 Subject: HACMP
Node kaitlyn: Event acquire_takeover_addr occurred at Tue Sep 1 16:29:36 2009, object =

Attention:
Sendmail must be working and accessible through the firewall to receive notifications
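The %n/%e/%d/%o placeholders in sample.txt behave like a simple substitution, which can be sketched with sed. The values below are taken from the sample email above; treating the expansion as plain string replacement is an illustration, not the product's exact implementation:

```shell
#!/bin/sh
# Illustrative expansion of the pager notification template above.
template='Node %n: Event %e occurred at %d, object = %o'

node=kaitlyn
event=acquire_takeover_addr
when='Tue Sep 1 16:29:36 2009'
object=''

msg=$(printf '%s\n' "$template" | sed \
  -e "s/%n/$node/" \
  -e "s/%e/$event/" \
  -e "s/%d/$when/" \
  -e "s/%o/$object/")
echo "$msg"
```

The result matches the body of the sample email line for line.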

Online Planning Worksheets: Discontinued in Version 7

• The fileset is still there, but the content is no longer there

There is a push to leverage IBM Systems Director, which will guide you through the
step-by-step configuration of the cluster.

PowerHA SystemMirror Deadman Switch (CAA)

• Version 7 cluster software changes the old behavior

Recent Client Failure Scenario:
- The repository disk LUN had been locked and had not been responsive for days. The
  client was unaware, and the standby node had a problem. The primary system was
  brought down when it was unable to write to the repository disk.

• The CAA DMS tunable (deadman_mode) allows two different actions:
  - Assert (crash) the system (default behavior)
  - Generate an AHAFS event

LVM Dynamic Updates

• The cluster is easy to set up, but what about changes going forward?
• ECM Volume Groups (required at HA V7)
  - New LVs will get pushed across; filesystems will not
  - LV updates get pushed across but do not update /etc/filesystems; a Lazy Update
    would resolve this issue
  - ECM limitations lifted for: reorgvg & chvg -g size changes
• Cluster Import Option
  - Correcting out-of-sync timestamps → auto-corrections or import
• Built-in Lazy Update

CSPOC allows for a multitude of DARE operations

• The Cluster Single Point of Control options facilitate dynamic operations
# smitty cl_admin

- Follow these panels to dynamically add or remove resources from the cluster, or
  perform resource group movements between nodes
- There are CSPOC-specific logs in the HA cluster that will provide details in the
  event of a problem

CSPOC: Storage & LVM Menus


Tunable Failure Detection Rate in 7.1.1

• Note that the SMIT menu to alter these values was missing prior to HA 7.1.1 SP1
  - Attributes are stored in the HACMPcluster object class

• Checking current settings:
root@mhoracle1 /> clctrl -tune -o node_down_delay
sapdemo71_cluster(07552a84-057b-11e1-b7cb-46a6ba546402).node_down_delay = 10000
root@mhoracle1 /> clctrl -tune -o node_timeout
sapdemo71_cluster(07552a84-057b-11e1-b7cb-46a6ba546402).node_timeout = 20000

• Modifying via the command line:
clmgr modify cluster HEARTBEAT_FREQUENCY=10000 GRACE_PERIOD=5000
*** The settings will take effect only after the next sync

FDR Comparison to Version 6.1 & Earlier

RSCT (topsvcs), HA 6.1 and earlier:
- Heartbeat settings can be defined for each network type (NIM)
- The settings for heartbeat are: Grace period, Failure cycle, Interval between heartbeats
- The combination of heartbeat rate and failure cycle determines how quickly a failure
  can be detected, and may be calculated using this formula:
  (heartbeat rate) * (failure cycle) * 2 seconds
- The grace period is the waiting time after detecting the failure before it is reported

CAA, HA 7.1:
- Heartbeat settings are the same for all networks in the cluster (one perspective is
  that only Ethernet networks are supported)
- The settings for heartbeat are: Grace period (5 - 30 seconds), Failure cycle (1 - 20 seconds)
- The failure cycle is the time after which another node may consider the adapter to be
  DOWN if it receives no incoming heartbeats; the actual heartbeat rate is calculated
  from the failure cycle
- The grace period is the waiting time after detecting the failure before it is reported

*** Note that HA 7.1.0 had a self-tuning failure detection rate



Application Monitoring within PowerHA SystemMirror

• Some monitors are provided in the Smart Assistants
  - e.g. cluster.es.assist.oracle → /usr/es/sbin/cluster/sa/oracle/sbin/DBInstanceMonitor
• A monitor is bound to the Application Controller (example: OracleDB)

Three monitor types:
- Startup Monitor: only invoked on application startup; confirms the startup of the
  application (the new Application Startup Mode in HA 7.1.1)
- Process Monitor: checks the process table (e.g. on a 60-second interval)
- Custom Monitor: invokes your custom logic (e.g. on a 60-second interval)

Long-running monitors will continue to run locally with the running application.

PowerHA SystemMirror: Application Startup in 7.1.1

• The cluster invokes the start script but doesn't confirm its success
• Consider at least an application startup monitor

Enhancement introduced in HA Version 7.1.1:
- The application start may be set to run in the foreground

(Diagram: Resource Group A contains the Service IP, Volume Group / filesystems, and
an Application Controller with start.sh and stop.sh, plus a Startup Monitor and a
Long-Running Monitor)

PowerHA SystemMirror HMC Definition

- There was no SDMC support (no longer much of an issue)
- Information is stored in HA ODM object classes
- Multiple HMC IPs may be defined, separated by a space

Food for Thought: How many DLPAR operations can be handled at once?

PowerHA SystemMirror Integrated DLPAR Menu

Add Dynamic LPAR and CoD Resources for Applications

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                     [Entry Fields]
* Application Controller Name                         Application_svr1
* Minimum number of processing units                 [0.00]
* Desired number of processing units                 [0.00]
* Minimum number of CPUs                             [0]      #
* Desired number of CPUs                             [0]      #
* Minimum amount of memory (in megabytes)            [0]      #
* Desired amount of memory (in megabytes)            [0]      #
* Use CoD if resources are insufficient?             [no]
* I agree to use CoD resources                       [no]
  (Using CoD may result in extra costs)

Note: HMC IPs are defined and stored in a different HA panel.

You must ensure that:
* CoD enablement keys are activated
* CoD resources are not used for any other purpose

The many uses of the clmgr utility

• V7 clustering introduces many applications for this command

Add a new cluster / node:
# clmgr add cluster clmgr_cluster REPOSITORY=hdisk2 CLUSTER_IP=228.1.1.36
# clmgr add node clmgr2

Add a network, interfaces, persistent & service IPs:
# clmgr add network net_ether_01 TYPE=ether
# clmgr add interface clmgr2b2 NETWORK=net_ether_02 NODE=clmgr2 INTERFACE=en1
# clmgr add persistent clmgr1p1 NETWORK=net_ether_01 NODE=clmgr1
# clmgr add service_ip clmgrsvc1 NETWORK=net_ether_01

Add an Application Controller:
# clmgr add application_controller test_app1 STARTSCRIPT="/home/apps/start1.sh" \
    STOPSCRIPT="/home/apps/stop1.sh" STARTUP_MODE=background

Add a volume group & a new Resource Group:
# clmgr add volume_group test_vg1 NODES="clmgr1,clmgr2" PHYSICAL_VOLUMES=hdisk3 \
    TYPE=original MAJOR_NUMBER=35 ACTIVATE_ON_RESTART=false
# clmgr add resource_group clmgr_RG1 NODES="clmgr1,clmgr2" STARTUP=OHN \
    FALLOVER=FNPN FALLBACK=NFB VOLUME_GROUP=test_vg \
    SERVICE_LABEL=clmgrsvc1 APPLICATIONS=test_app1

Verify / Sync the cluster:
# clmgr verify cluster CHANGES_ONLY=no FIX=yes LOGGING=standard
# clmgr sync cluster CHANGES_ONLY=no FIX=yes LOGGING=standard

Start Cluster Services:
# clmgr online cluster WHEN=now MANAGE=auto BROADCAST=true CLINFO=true

Change the cluster name:
# clmgr modify cluster NAME=my_new_cls_label

Suspend / Resume Application Monitors:
# clmgr manage application_controller suspend test_app1 RESOURCE_GROUP="clmgr_RG1"
# clmgr manage application_controller resume test_app1 RESOURCE_GROUP="clmgr_RG2"

Summary
• There are some notable differences between V7 and HA 6.1 and earlier
  - Pay careful attention to where some of the options are available
  - A summary chart of new features is appended to the presentation

• Version 7.1.2 is scheduled for GA on Nov 9th
  - Brings the Enterprise Edition to V7 clusters

• This session is an attempt to make you aware of available options in PowerHA
  - Take my recommendations with a grain of salt!

• Take advantage of integrated features & interfaces like:
  - Application monitoring infrastructure
  - File Collections
  - Pre/Post Events and User Defined Events
  - Pager Notification Methods
  - The new clmgr CLI

See also: SG24-8030

Summary Chart

New Functionality & Changes:
• New CAA Infrastructure (7.1.0)
  - Disk Fencing Enhancements
  - Rootvg System Event
  - Disk rename Function
  - Repository Disk Resilience
• IP Multicast based Heartbeat Protocol (7.1.0)
• HBA Based SAN Heartbeating (7.1.0)
• Private Network Support (7.1.1)
• Tunable Failure Detection Rate (7.1.1)
• New Service IP Distribution Policies
• Full IPv6 Support (7.1.2)
• Backup Repository Disks (7.1.2)
• New Application Startup Mode (7.1.1)
• Exploitation of JFS2 Mount Guard (7.1.1)
• Adaptive Fallover
• New RG Dependencies: Start After, Stop After (7.1.1)

Smart Assistants (Application Integration):
• SAP liveCache with DS or SVC (7.1.1)
• MQ Series (7.1.1)

DR Capabilities:
• Stretch & Linked Clusters (7.1.2)
• DS8000 HyperSwap (7.1.2)

Management:
• New Command Line Interface (7.1.0): clcmd, clmgr utility, lscluster
• IBM Systems Director Management (7.1.0)

Federated Security (7.1.1):
• RBAC, EFS & Security System Administration

Extended Distance Clusters:
• XIV Replication Integration (12/16/2011)
• XP12000, XP24000 (11/18/2011)
• HP9500 (8/19/2011)
• Storwize V7000 (9/30/2011)
• SVC 6.2 (9/30/2011)

Questions?

Thank you for your time!



Additional Resources
• PowerHA SystemMirror 7.1.1 Update, SG24-8030
  http://www.redbooks.ibm.com/redpieces/abstracts/sg248030.html?Open
• PowerHA SystemMirror 7.1 Redbook, SG24-7845 (removed from the download site)
  http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg247845.html?Open
• Disaster Recovery Redbook: SG24-7841, Exploiting PowerHA SystemMirror Enterprise Edition for AIX
  http://www.redbooks.ibm.com/abstracts/sg247841.html?Open
• RedGuide: High Availability and Disaster Recovery Planning: Next-Generation Solutions
  for Multiserver IBM Power Systems Environments
  http://www.redbooks.ibm.com/abstracts/redp4669.html?Open
• PowerHA SystemMirror Marketing Page
  http://www-03.ibm.com/systems/power/software/availability/aix/index.html
• PowerHA SystemMirror Wiki Page
  http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/High+Availability