
Sun Cluster Cheat Sheet


This cheat sheet contains common commands and information for both Sun Cluster 3.1 and 3.2. Some information is still
missing, and over time I hope to complete it (zones, NAS devices, etc).

Both versions of Sun Cluster also include a menu-driven, text-based configuration tool, so don't be afraid to use it, especially if the task is a simple one:

• scsetup (3.1)
• clsetup (3.2)

All of the version 3.1 commands also remain available in version 3.2.

Daemons and Processes

At the bottom of the installation guide I listed the daemons and processes running after a fresh install; now is the time
to explain what these processes do. I have managed to obtain information on most of them but am still looking into the others.

Versions 3.1 and 3.2


clexecd        Used by cluster kernel threads to execute userland commands (such as the run_reserve and dofsck
               commands). It is also used to run cluster commands remotely (like the cluster shutdown command).
               This daemon registers with failfastd so that a failfast device driver will panic the kernel if the
               daemon is killed and not restarted in 30 seconds.

cl_ccrad       Provides access from userland management applications to the CCR. It is automatically restarted if it
               is stopped.

cl_eventd      The cluster event daemon registers and forwards cluster events (such as nodes entering and leaving the
               cluster). There is also a protocol whereby user applications can register themselves to receive cluster
               events. The daemon is automatically respawned if it is killed.

cl_eventlogd   The cluster event log daemon logs cluster events into a binary log file. At the time of writing there
               is no published interface to this log. It is automatically restarted if it is stopped.

failfastd      The failfast proxy server. The failfast daemon allows the kernel to panic if certain essential daemons
               have failed.

rgmd           The resource group management daemon, which manages the state of all cluster-unaware applications. A
               failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds.

rpc.fed        The fork-and-exec daemon, which handles requests from rgmd to spawn methods for specific data services.
               A failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds.

rpc.pmfd       The process monitoring facility. It is used as a general mechanism to initiate restarts and failure
               action scripts for some cluster framework daemons (in Solaris 9 OS), and for most application daemons
               and application fault monitors (in Solaris 9 and 10 OS). A failfast driver panics the kernel if this
               daemon is stopped and not restarted in 30 seconds.


pnmd           The public network management service daemon manages network status information received from the
               local IPMP daemon running on each node and facilitates application failovers caused by complete public
               network failures on nodes. It is automatically restarted if it is stopped.

scdpmd         The disk path monitoring daemon monitors the status of disk paths so that they can be reported in the
               output of the cldev status command. This multi-threaded DPM daemon runs on each node and is started
               automatically by an rc script when a node boots. It monitors the availability of the logical paths
               visible through the various multipath drivers (MPxIO, HDLM, PowerPath, etc) and is automatically
               restarted by rpc.pmfd if it dies.
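
As a quick sanity check (an illustrative addition, not part of the original reference), you can confirm that the
framework daemons listed above are running on a node:

## List the cluster framework daemons currently running (names taken from the table above)
ps -ef | egrep 'clexecd|cl_ccrad|cl_eventd|failfastd|rgmd|rpc.fed|rpc.pmfd'

## The disk path monitor runs as a separate process
pgrep -l scdpmd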

Version 3.2 only


qd_userd                 Serves as a proxy whenever any quorum device activity requires execution of a command in
                         userland, e.g. a NAS quorum device.

cl_execd                 (no description available yet)

ifconfig_proxy_serverd   (no description available yet)

rtreg_proxy_serverd      (no description available yet)

cl_pnmd                  The daemon for the public network management (PNM) module. It is started at boot time and
                         starts the PNM service. It keeps track of the local host's IPMP state and facilitates
                         inter-node failover for all IPMP groups.

scprivipd                Provisions IP addresses on the clprivnet0 interface on behalf of zones.

sc_zonesd                Monitors the state of Solaris 10 non-global zones so that applications designed to fail over
                         between zones can react appropriately to zone booting failures.

cznetd                   Used for reconfiguring and plumbing the private IP address in a local zone after a virtual
                         cluster is created; see also the cznetd.xml file.

rpc.fed                  The "fork and exec" daemon, which handles requests from rgmd to spawn methods for specific
                         data services. Failfast will panic the node if this daemon is killed and not restarted in
                         30 seconds.

scqdmd                   The quorum server daemon (possibly previously called "scqsd").

pnm_mod_serverd          (no description available yet)

File locations

Both Versions (3.1 and 3.2)


man pages                                  /usr/cluster/man
log files                                  /var/cluster/logs
                                           /var/adm/messages
Configuration files (CCR, eventlog, etc)   /etc/cluster/
Cluster and other commands                 /usr/cluster/lib/sc

Version 3.1 Only

sccheck logs                               /var/cluster/sccheck/report.<date>
Cluster infrastructure file                /etc/cluster/ccr/infrastructure


Version 3.2 Only


sccheck logs                               /var/cluster/logs/cluster_check/remote.<date>
Cluster infrastructure file                /etc/cluster/ccr/global/infrastructure
Command log                                /var/cluster/logs/commandlog
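
For example (an illustrative addition), the 3.2 command log can be checked to see which cluster commands were run
recently:

## Show the most recent entries in the command log
tail -20 /var/cluster/logs/commandlog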

SCSI Reservations

Display reservation keys
  scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
  scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2

Determine the device owner
  scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
  scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
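
If you need to check several shared disks at once, a small loop works well (an illustrative sketch; d4, d5 and d6 are
placeholders for your own shared DID devices):

## Dump the SCSI-3 registration keys for a handful of DID devices
for d in d4 d5 d6; do
  echo "== ${d} =="
  /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/${d}s2
done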

Command shortcuts

In version 3.2 there are a number of shortcut command names, which are detailed below. I have left the full command
names in the rest of the document so it is obvious what we are performing. All of the commands are located in /usr/cluster/bin.

Full command           Shortcut
cldevice               cldev
cldevicegroup          cldg
clinterconnect         clintr
clnasdevice            clnas
clquorum               clq
clresource             clrs
clresourcegroup        clrg
clreslogicalhostname   clrslh
clresourcetype         clrt
clressharedaddress     clrssa
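
The shortcuts are simply alternative names for the same binaries, so for example (illustrative) the following two
commands produce identical output:

cldevice status
cldev status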

Shutting down and Booting a Cluster


Shut down the entire cluster
  3.1: ## Other nodes in the cluster
       scswitch -S -h <host>
       shutdown -i5 -g0 -y
       ## Last remaining node
       scshutdown -g0 -y
  3.2: cluster shutdown -g0 -y

Shut down a single node
  3.1: scswitch -S -h <host>
       shutdown -i5 -g0 -y
  3.2: clnode evacuate <node>
       shutdown -i5 -g0 -y

Reboot a node into non-cluster mode
  3.1: ok> boot -x
  3.2: ok> boot -x
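
For example (an illustrative 3.2 sequence, node2 is a placeholder), taking one node down cleanly for hardware
maintenance:

clnode evacuate node2    # move resource groups and device groups off the node
shutdown -i5 -g0 -y      # run on node2 itself to power it off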

Cluster information

Cluster
  3.1: scstat -pv
  3.2: cluster list -v
       cluster show
       cluster status

Nodes
  3.1: scstat -n
  3.2: clnode list -v
       clnode show
       clnode status

Devices
  3.1: scstat -D
  3.2: cldevice list
       cldevice show
       cldevice status

Quorum
  3.1: scstat -q
  3.2: clquorum list -v
       clquorum show
       clquorum status

Transport info
  3.1: scstat -W
  3.2: clinterconnect show
       clinterconnect status

Resources
  3.1: scstat -g
  3.2: clresource list -v
       clresource show
       clresource status

Resource Groups
  3.1: scstat -g
       scrgadm -pv
  3.2: clresourcegroup list -v
       clresourcegroup show
       clresourcegroup status

Resource Types
  3.2: clresourcetype list -v
       clresourcetype list-props -v
       clresourcetype show

IP Networking Multipathing
  3.1: scstat -i
  3.2: clnode status -m

Installation info (prints packages and version)
  3.1: scinstall -pv
  3.2: clnode show-rev -v
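
Putting a few of these together gives a quick 3.2 health check worth running after any maintenance (an illustrative
addition using only the commands above):

cluster status            # overall cluster state
clnode status             # node membership
clquorum status           # quorum votes present and needed
clresourcegroup status    # where each resource group is online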


Cluster Configuration

Release
  cat /etc/cluster/release

Integrity check
  3.1: sccheck
  3.2: cluster check

Configure the cluster (add nodes, add data services, etc)
  3.1: scinstall
  3.2: scinstall

Cluster configuration utility (quorum, data services, resource groups, etc)
  3.1: scsetup
  3.2: clsetup

Rename
  3.2: cluster rename -c <cluster_name>

Set a property
  3.2: cluster set -p <name>=<value>

List
  3.2: ## List cluster commands
       cluster list-cmds
       ## Display the name of the cluster
       cluster list
       ## List the checks
       cluster list-checks
       ## Detailed configuration
       cluster show -t global

Status
  3.2: cluster status

Reset the cluster private network settings
  3.2: cluster restore-netprops <cluster_name>

Place the cluster into install mode
  3.2: cluster set -p installmode=enabled

Add a node
  3.1: scconf -a -T node=<host>
  3.2: clnode add -c <clustername> -n <nodename> -e endpoint1,endpoint2

Remove a node
  3.1: scconf -r -T node=<host>
  3.2: clnode remove

Prevent new nodes from entering
  3.1: scconf -a -T node=.

Put a node into maintenance state
  3.1: scconf -c -q node=<node>,maintstate
       Note: use the scstat -q command to verify that the node is in maintenance state; the vote count should be
       zero for that node.

Get a node out of maintenance state
  3.1: scconf -c -q node=<node>,reset
       Note: use the scstat -q command to verify that the node is out of maintenance state; the vote count should be
       one for that node.
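
As a worked example (illustrative, 3.1 syntax, node2 is a placeholder), taking a node's quorum vote in and out of
maintenance looks like this:

scconf -c -q node=node2,maintstate   # vote count for node2 drops to 0
scstat -q                            # verify the node now has zero votes
scconf -c -q node=node2,reset        # restore the vote after maintenance
scstat -q                            # verify the vote count is back to 1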


Node Configuration

Add a node to the cluster
  3.2: clnode add [-c <cluster>] [-n <sponsornode>] -e <endpoint> -e <endpoint> <node>

Remove a node from the cluster
  3.2: ## Make sure you are on the node you wish to remove
       clnode remove

Evacuate a node from the cluster
  3.1: scswitch -S -h <node>
  3.2: clnode evacuate <node>

Clean up the cluster configuration (used after removing nodes)
  3.2: clnode clear <node>

List nodes
  3.2: ## Standard list
       clnode list [+|<node>]
       ## Detailed list
       clnode show [+|<node>]

Change a node's property
  3.2: clnode set -p <name>=<value> [+|<node>]

Status of nodes
  3.2: clnode status [+|<node>]

Admin Quorum Device

Quorum devices are nodes and disk devices, so the total quorum count is all nodes and devices added together. You can use
the scsetup (3.1) / clsetup (3.2) interface to add and remove quorum devices, or use the commands below.

Adding a SCSI device to the quorum
  3.1: scconf -a -q globaldev=d11
       Note: if you get the error message "unable to scrub device", use scgdevs to add the device to the global
       device namespace.
  3.2: clquorum add [-t <type>] [-p <name>=<value>] [+|<devicename>]

Adding a NAS device to the quorum
  3.1: n/a
  3.2: clquorum add -t netapp_nas -p filer=<nasdevice>

Adding a quorum server
  3.1: n/a
  3.2: clquorum add -t quorumserver -p qshost=<IPaddress>

Removing a device from the quorum
  3.1: scconf -r -q globaldev=d11
  3.2: clquorum remove [-t <type>] [+|<devicename>]

Remove the last quorum device
  3.1: ## Evacuate all nodes, then put the cluster into maintenance (install) mode
       scconf -c -q installmode
       ## Remove the quorum device
       scconf -r -q globaldev=d11
       ## Check the quorum devices
       scstat -q
  3.2: ## Place the cluster in install mode
       cluster set -p installmode=enabled
       ## Remove the quorum device
       clquorum remove <device>
       ## Verify the device has been removed
       clquorum list -v

List
  3.2: ## Standard list
       clquorum list -v [-t <type>] [-n <node>] [+|<devicename>]
       ## Detailed list
       clquorum show [-t <type>] [-n <node>] [+|<devicename>]
       ## Status
       clquorum status [-t <type>] [-n <node>] [+|<devicename>]

Resetting quorum info
  3.1: scconf -c -q reset
  3.2: clquorum reset
       Note: this will bring all offline quorum devices online

Bring a quorum device into maintenance mode (known as "disabled" in 3.2)
  3.1: ## Obtain the device number
       scdidadm -L
       scconf -c -q globaldev=<device>,maintstate
  3.2: clquorum disable [-t <type>] [+|<devicename>]

Bring a quorum device out of maintenance mode (known as "enabled" in 3.2)
  3.1: scconf -c -q globaldev=<device>,reset
  3.2: clquorum enable [-t <type>] [+|<devicename>]
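
For example (an illustrative 3.2 addition, d11 is a placeholder DID device borrowed from the 3.1 examples above),
adding a shared disk as a quorum device and confirming its votes:

clquorum add d11
clquorum status d11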

Device Configuration

Check device
  3.2: cldevice check [-n <node>] [+]

Remove all devices from a node
  3.2: cldevice clear [-n <node>]

Monitoring
  3.2: ## Turn on monitoring
       cldevice monitor [-n <node>] [+|<device>]
       ## Turn off monitoring
       cldevice unmonitor [-n <node>] [+|<device>]

Rename
  3.2: cldevice rename -d <destination_device_name> <device>

Replicate
  3.2: cldevice replicate [-S <source-node>]

Set properties of a device
  3.2: cldevice set -p default_fencing={global|pathcount|...} <device>

Status
  3.2: ## Standard display
       cldevice status [-s <state>] [-n <node>]
       ## Display failed disk paths
       cldevice status -s fail


List all the configured devices, including paths, across all nodes
  3.1: scdidadm -L
  3.2: ## Standard list
       cldevice list [-n <node>] [+|<device>]
       ## Detailed list
       cldevice show [-n <node>] [+|<device>]

List all the configured devices, including paths, on the local node only
  3.1: scdidadm -l
  3.2: see above, restricted with the -n option

Reconfigure the device database, creating new instance numbers if required
  3.1: scdidadm -r
  3.2: cldevice populate
       cldevice refresh [-n <node>] [+]

Perform the repair procedure for a particular path (use this when a disk has been replaced)
  3.1: scdidadm -R <c0t0d0s0>   (device)
       scdidadm -R 2            (device id)
  3.2: cldevice repair [-n <node>] [+|<device>]
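
A typical (illustrative) 3.2 sequence after physically replacing a failed shared disk, where d6 is a placeholder DID
instance:

cldevice populate      # create DID instances for any newly discovered devices
cldevice repair d6     # run the repair procedure for the replaced path
cldevice status d6     # confirm the path is reported as Ok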

Disk Groups

Create a device group
  3.1: n/a
  3.2: cldevicegroup create <devgrp>

Remove a device group
  3.1: n/a
  3.2: cldevicegroup delete <devgrp>

Adding
  3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
  3.2: cldevicegroup add ...

Removing
  3.1: scconf -r -D name=<disk group>
  3.2: cldevicegroup remove ...

Set a property
  3.2: cldevicegroup set -p <name>=<value> <devgrp>

List
  3.1: scstat
  3.2: ## Standard list
       cldevicegroup list
       ## Detailed configuration report
       cldevicegroup show

Status
  3.1: scstat
  3.2: cldevicegroup status

Adding a single node
  3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>
  3.2: cldevicegroup add-node -n <node> <devgrp>

Removing a single node
  3.1: scconf -r -D name=<disk group>,nodelist=<host>
  3.2: cldevicegroup remove-node -n <node> <devgrp>

Switch
  3.1: scswitch -z -D <disk group> -h <host>
  3.2: cldevicegroup switch -n <node> <devgrp>

Put into maintenance mode
  3.1: scswitch -m -D <disk group>
  3.2: n/a

Take out of maintenance mode
  3.1: scswitch -z -D <disk group> -h <host>
  3.2: n/a

Onlining a disk group
  3.1: scswitch -z -D <disk group> -h <host>
  3.2: cldevicegroup online <devgrp>

Offlining a disk group
  3.1: scswitch -F -D <disk group>
  3.2: cldevicegroup offline <devgrp>

Resync a disk group
  3.1: scconf -c -D name=appdg,sync
  3.2: cldevicegroup sync <devgrp>
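
For example (an illustrative 3.2 addition, reusing the appdg group name from the 3.1 examples and node2 as a
placeholder), moving a device group to another node and checking it:

cldevicegroup switch -n node2 appdg
cldevicegroup status appdg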

Transport Cable


Add
  3.2: clinterconnect add <endpoint>,<endpoint>

Remove
  3.2: clinterconnect remove <endpoint>,<endpoint>

Enable
  3.1: scconf -c -m endpoint=<host>:qfe1,state=enabled
  3.2: clinterconnect enable [-n <node>] [+|<endpoint>,<endpoint>]

Disable
  3.1: scconf -c -m endpoint=<host>:qfe1,state=disabled
       Note: it gets deleted
  3.2: clinterconnect disable [-n <node>] [+|<endpoint>,<endpoint>]

List
  3.1: scstat
  3.2: ## Standard and detailed list
       clinterconnect show [-n <node>] [+|<endpoint>,<endpoint>]

Status
  3.1: scstat
  3.2: clinterconnect status [-n <node>] [+|<endpoint>,<endpoint>]
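
For example (an illustrative addition, node1:qfe1 and node2:qfe1 are placeholder endpoints for a point-to-point
cable), checking the interconnect and re-enabling a disabled cable:

clinterconnect status
clinterconnect enable node1:qfe1,node2:qfe1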

Resource Groups

Adding (failover)
  3.1: scrgadm -a -g <res_group> -h <host>,<host>
  3.2: clresourcegroup create <res_group>

Adding (scalable)
  3.2: clresourcegroup create -S <res_group>

Adding a node to a resource group
  3.2: clresourcegroup add-node -n <node> <res_group>

Removing
  3.1: scrgadm -r -g <group>
  3.2: ## Remove a resource group
       clresourcegroup delete <res_group>
       ## Remove a resource group and all its resources
       clresourcegroup delete -F <res_group>

Removing a node from a resource group
  3.2: clresourcegroup remove-node -n <node> <res_group>

Changing properties
  3.1: scrgadm -c -g <resource group> -y <property>=<value>
  3.2: clresourcegroup set -p <name>=<value> [+|<res_group>]   e.g. -p Failback=true

Status
  3.1: scstat -g
  3.2: clresourcegroup status [-n <node>]

Listing
  3.1: scstat -g
  3.2: clresourcegroup list [-n <node>] [-r <resource>]

Detailed list
  3.1: scrgadm -pv -g <res_group>
  3.2: clresourcegroup show [-n <node>] [-r <resource>]

Display mode type (failover or scalable)
  3.1: scrgadm -pv -g <res_group> | grep 'Res Group mode'

Offlining
  3.1: scswitch -F -g <res_group>
  3.2: ## All resource groups
       clresourcegroup offline +
       ## Individual group
       clresourcegroup offline [-n <node>] <res_group>

Onlining
  3.1: scswitch -Z -g <res_group>
  3.2: ## All resource groups
       clresourcegroup online +
       ## Individual group
       clresourcegroup online [-n <node>] <res_group>

Evacuate all resource groups from a node (used when shutting down a node)
  3.2: clresourcegroup evacuate [+|-n <node>]

Unmanaging
  3.1: scswitch -u -g <res_group>
       Note: all resources in the group must be disabled
  3.2: clresourcegroup unmanage <res_group>

Managing
  3.1: scswitch -o -g <res_group>
  3.2: clresourcegroup manage <res_group>

Switching
  3.1: scswitch -z -g <res_group> -h <host>
  3.2: clresourcegroup switch -n <node> <res_group>

Suspend
  3.2: clresourcegroup suspend [+|<res_group>]

Resume
  3.2: clresourcegroup resume [+|<res_group>]

Remaster (move resource groups back to their preferred node)
  3.2: clresourcegroup remaster [+|<res_group>]

Restart a resource group (bring it offline, then online)
  3.2: clresourcegroup restart [-n <node>] [+|<res_group>]
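
Putting these together, a typical (illustrative) 3.2 lifecycle for a failover group might look like this, with
apache-rg and node2 as placeholder names:

clresourcegroup create apache-rg            # create the group
clresourcegroup manage apache-rg            # place it under RGM control
clresourcegroup online apache-rg            # bring it online on its primary node
clresourcegroup switch -n node2 apache-rg   # later, move it to node2
clresourcegroup status apache-rg            # confirm where it is running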

Resources

Adding a failover network resource
  3.1: scrgadm -a -L -g <res_group> -l <logicalhost>
  3.2: clreslogicalhostname create -g <res_group> <resource>

Adding a shared network resource
  3.1: scrgadm -a -S -g <res_group> -l <logicalhost>
  3.2: clressharedaddress create -g <res_group> <resource>

Adding a failover apache application and attaching the network resource
  3.1: scrgadm -a -j apache_res -g <res_group> \
         -t SUNW.apache -y Network_resources_used=<logicalhost> \
         -y Scalable=False -y Port_list=80/tcp \
         -x Bin_dir=/usr/apache/bin

Adding a shared apache application and attaching the network resource
  3.1: scrgadm -a -j apache_res -g <res_group> \
         -t SUNW.apache -y Network_resources_used=<logicalhost> \
         -y Scalable=True -y Port_list=80/tcp \
         -x Bin_dir=/usr/apache/bin

Create an HAStoragePlus failover resource
  3.1: scrgadm -a -g rg_oracle -j hasp_data01 -t SUNW.HAStoragePlus \
         -x FileSystemMountPoints=/oracle/data01 \
         -x Affinityon=true
  3.2: clresource create -t SUNW.HAStoragePlus \
         -p FilesystemMountPoints=<mountpoint> \
         -p Affinityon=true <rs-hasp>

Removing
  3.1: scrgadm -r -j res-ip
       Note: you must disable the resource first
  3.2: clresource delete [-g <res_group>] [+|<resource>]


Changing or adding properties
  3.1: scrgadm -c -j <resource> -y <property>=<value>
  3.2: ## Changing
       clresource set [-t <type>] -p <name>=<value> <resource>
       ## Adding a value to a list property
       clresource set -p <name>+=<value> <resource>

List
  3.1: scstat -g
  3.2: clresource list [-g <res_group>] [+|<resource>]
       ## List properties
       clresource list-props [-g <res_group>]

Detailed list
  3.1: scrgadm -pv -j res-ip
       scrgadm -pvv -j res-ip
  3.2: clresource show [-n <node>] [+|<resource>]

Status
  3.1: scstat -g
  3.2: clresource status [-s <state>] [+|<resource>]

Disable a resource monitor
  3.1: scrgadm -n -M -j res-ip
  3.2: clresource unmonitor [-n <node>] [+|<resource>]

Enable a resource monitor
  3.1: scrgadm -e -M -j res-ip
  3.2: clresource monitor [-n <node>] [+|<resource>]

Disabling
  3.1: scswitch -n -j res-ip
  3.2: clresource disable <resource>

Enabling
  3.1: scswitch -e -j res-ip
  3.2: clresource enable <resource>

Clearing a failed resource
  3.1: scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED
  3.2: clresource clear -f STOP_FAILED <resource>

Find the network of a resource
  3.1: scrgadm -pvv -j <resource> | grep -i network

Removing a resource and resource group
  3.1: ## offline the group
       scswitch -F -g rgroup-1
       ## remove the resource
       scrgadm -r -j res-ip
       ## remove the resource group
       scrgadm -r -g rgroup-1
  3.2: ## offline the group
       clresourcegroup offline <res_group>
       ## remove the resource
       clresource delete [-g <res_group>] <resource>
       ## remove the resource group
       clresourcegroup delete <res_group>
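
For example (an illustrative 3.2 sequence, with oracle-rg, oracle-hasp-rs and the mount point as placeholder names),
creating and enabling an HAStoragePlus resource:

clresourcetype register SUNW.HAStoragePlus      # register the type first if not already done
clresource create -t SUNW.HAStoragePlus -g oracle-rg \
    -p FilesystemMountPoints=/oracle/data01 \
    -p Affinityon=true oracle-hasp-rs
clresource enable oracle-hasp-rs
clresource status oracle-hasp-rs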

Resource Types

Adding (register in 3.2)
  3.1: scrgadm -a -t <resource type>   e.g. SUNW.HAStoragePlus
  3.2: clresourcetype register <type>

Register a resource type to a node
  3.1: n/a
  3.2: clresourcetype add-node -n <node> <type>

Deleting (remove in 3.2)
  3.1: scrgadm -r -t <resource type>
  3.2: clresourcetype unregister <type>

Deregister a resource type from a node
  3.1: n/a
  3.2: clresourcetype remove-node -n <node> <type>

Listing
  3.1: scrgadm -pv | grep 'Res Type name'
  3.2: clresourcetype list [<type>]

Listing resource type properties
  3.2: clresourcetype list-props [<type>]

Show resource types
  3.2: clresourcetype show [<type>]

Set properties of a resource type
  3.2: clresourcetype set [-p <name>=<value>] <type>
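
For example (an illustrative addition), registering the SUNW.apache type used earlier and confirming it is known to
the cluster:

clresourcetype register SUNW.apache
clresourcetype show SUNW.apache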

