SeisSpace System Administration
Configuring SeisSpace
Configuring some General properties
SeisSpace has a master configuration file similar to the ProMAX config_file where an administrator can set certain properties for the site installation.
The master configuration file for SeisSpace is the PROWESS_HOME/etc/prowess.properties file.
The administrator may want to edit this file to set some installation defaults and then make it writable only by that administrative user.
It may also be useful to copy the PROWESS_HOME/etc directory out of the install to a directory such as /apps/SSetc, similar to how you would copy the PROMAX_HOME/etc directory out of the install, so that your configuration settings do not get deleted if you reinstall the product.
You can point to the external etc directory using the environment variable PROWESS_ETC_HOME set in the client startup environment or script. See PROWESS_HOME/etc/SSclient for an example.
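For example, a line like the following could be added to the SSclient startup script (sh-style syntax is shown, and the /apps/SSetc path is only an illustration of a site-managed location):

PROWESS_ETC_HOME=/apps/SSetc
export PROWESS_ETC_HOME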
Product lists
The administrator can set up the list of Products that are available. The list includes ProMAX 2D, ProMAX 3D, ProMAX 4D, ProMAX VSP, ProMAX Field, ProMAX DEV and DepthCharge. There is a stanza in the /apps/SeisSpace/etc/prowess.properties file that you can use to control the list of available products that is presented to the users.
# Define the comma-separated list of products that are available from
# the Navigator. Whether the user is actually able to switch to a product
# depends upon whether a license for it is available. If the product name is
# preceded by the negation symbol (!), then that product will not be
# shown in the Navigator. You may use the Navigator preferences to
# change the displayed products on a per-user basis.
#
# ALL - all product levels
# 2D, 3D, 4D, VSP, FIELD, DEPTHCHARGE, DEV
#
# The default is to show all products, except for ProMAX Dev
#
Navigator.availableProductLevels=ALL,!DEV
The default is to show all products except ProMAX Dev, which is not shown in the Product pull-down list in the Navigator.
Saving Settings
There is the concept of "shared" information that is the same for all users
and managed by an administrator.
There is also the concept of "private" information that is only available to
an individual user.
"Shared" information is stored in the /apps/logs/netdir.xml le and
"private" information is stored by default in the users
/home/user/SeisSpace/.seisspace le, or you can specify a different
directory to store the .seisspace le in the SSclient startup script with the
PROWESS_PREFS_DIR environment variable.
There is a stanza in the /apps/SeisSpace/etc/prowess.properties le that
you can use to control how much "private" information the users can have.
################ ADMINISTRATION ################
# These options allow a system administrator to restrict access
# of non-administrative users to administrative features. When set
# to true, an administrative feature can only be used if a user
# has logged in as admin. Note that these options are here in
# response to a customer request.
onlyAdminCanAddArchiveDataHomes=false
onlyAdminCanAddDataHomes=false
onlyAdminCanEditHostsList=false
onlyAdminCanDefineClusters=false
onlyAdminCanInitializeDatabase=true
If the administrator wants to restrict the users' ability to add their own data homes, hosts, or cluster lists, these can be set to true. The options will then be grayed out in the users' pull-down menus and rendered inoperative. NOTE that the administrator will also need to restrict write access to this file. The sitemanager and all clients will need to be restarted for a change to this file to take effect.
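A minimal sketch of restricting write access, assuming the install path used elsewhere in this guide and an administrative account named admin (both are site-specific assumptions):

# run as root; the path and owner are examples only
chown admin /apps/SeisSpace/etc/prowess.properties
chmod 644 /apps/SeisSpace/etc/prowess.properties   # readable by all, writable only by the owner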
Note: The first time you log in as administrator, you may elect to set the administrator password. Only users that know the password can perform administrative functions.
The password is stored in the netdir.xml file. If you forget the password and need to reset it, you can stop the sitemanager, edit the netdir.xml file and remove the password line, then start the sitemanager again. The administrator password will be "blank" again and you can reset it.
Job Submission Node/Joblet Specification
The default Queued job submission protocols combine the lisp menu que_res_pbs.menu and some dialogs on the submit GUI for selecting the number of nodes and number of joblets per node to use for the job. In some installations, customers may implement their own queued job submission menus and scripts. For some of these cases the customer site may wish to disable the nodes and joblets/node count entries in the SeisSpace Job Submit GUI and use their own menu parameters to build up the qsh files. In this case it is possible to hide the GUI-based entries by editing the $PROWESS_HOME/etc/prowess.properties file and setting the showJobletsSpec property to false in the following stanza:
# Whether or not to show the joblet specification (number of nodes, number of
# execs per node, number of joblets) in the Submit dialog; the default
# is to show it. If you customize your queue menu and submit scripts in such
# a way that you take responsibility for how to obtain the number of joblets,
# you may want to hide these parameters in the Submit dialog.
com.lgc.prodesk.jobviewer.showJobletsSpec=true
Job Submission - Force use of Queued submit
A property is available in the PROWESS_HOME/etc/prowess.properties file to force all jobs to be submitted using a Queued submit. This property deactivates the direct job submit icon and removes the host and cluster options from the job submit dialog. To implement this behavior, uncomment the line in the prowess.properties file:
# Whether or not to restrict job submission to only queues, i.e. local and
# cluster direct submit jobs are disabled from the UI. Default is false,
# meaning local, cluster, and queue submits are allowed.
# com.lgc.prodesk.jobviewer.showQueuesOnly=true
Job Submission - RMI port count - for more than 1000 simultaneous jobs
A property is available in the PROWESS_HOME/etc/prowess.properties file to increase the number of RMI ports to look for if you plan to run more than 1000 jobs simultaneously.
# Number of successive ports when creating an RMI registry for communicating
# between the Exec and SuperExec. Default is 1000. The minimum is 1000.
# Increase if you are running more than 1000 jobs simultaneously.
#com.lgc.prowess.exec.NumberRegistryPorts=1000
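For example, to allow roughly 2000 simultaneous jobs you would uncomment the property and raise the value (2000 is an illustrative number, not a recommendation):

com.lgc.prowess.exec.NumberRegistryPorts=2000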
Changing the default in JavaSeis Data Output to use primary storage
The default in the JavaSeis Data Output menu is to use secondary storage for the trace data. You must use the parameter default flow method to set this default. This method is documented in the SeisSpace User Guide - Working with the Navigator section.
Changing the default location to store the .user_headers files for user-defined header lists and setting options for user header hierarchy
You can set the default location where the .user_headers file (the file of user-defined headers) is stored to be Data_Home, Area, or Line. The installation default is Data_Home, on the reasoning that in general, as users add headers, they tend to reuse the same ones across all lines. You may choose to set the default storage to be at the Line level instead.
# The default location of the .user_headers file. The .user_headers
# file contains user-defined headers. Set to "DataHome", "Area", or "Line" (case insensitive).
# Default is DataHome.
# TODO this should be set from the user preferences [dialog], and whether it
# can be changed should be controlled by another property:
# onlyAdminCanChangeDefaultUserHeaderLocation.
com.lgc.prodesk.navigator.defaultUserHeaderLocation=DataHome
You can also set a property to prevent users from having the option to select an alternate location.
# Switch to determine if users can store the user headers at a location
# other than the above default location. The default is to allow.
# If you are logged in as Admin, this will always be true.
com.lgc.prodesk.navigator.canChangeUserHeaderLocation=true
Changing the location of the user parameter default flows
You can set the default location where the user parameter default override flows are stored by setting a property in the prowess.properties file.
# You can define a common location under which to create a hierarchy for
# user parameter defaulting flows. For example, you can set this to
# the value of $PROWESS_HOME/etc or $PROWESS_ETC_HOME if this variable is set.
# The default is $HOME/SeisSpace or $PROWESS_PREFS_DIR if this variable is set.
# The directory structure under this is .../defaultdatahome/defaultarea/defaultline
#com.lgc.prodesk.navigator.userDefaultsDir=$PROWESS_ETC_HOME
Overview
Once you have started the SeisSpace Navigator, you can use the Administration dialogs from the Edit > Administration pull-down menu to continue setting up SeisSpace.
Logging in as Administrator
1. Select Edit > Administration > Login As Administrator.
2. Click Set Password.
3. Leave the Old Password line blank and enter your new password
twice. Then click OK.
All users will now need to use the new password to gain administrative
privileges.
Defining Hosts
There are two possible hosts lists: the shared host list set up by the administrator and the personal host list set up by the user.
The Hosts list is the list of the machines on your network that can be used to run remote jobs and define clusters for parallel processing.
If you define your hosts lists when you are logged on as Administrator, you will define hosts for all of the users (a shared hosts list). Otherwise, you will be defining a personal, or "private", list of hosts.
"Shared" information is stored in the /apps/logs/netdir.xml le and
"private" information is stored in the users
/home/user/SeisSpace/.seisspace le. (or
PROWESS_PREFS_DIR/.seisspace le)
To begin, select Edit > Administrator > Dene Hosts. One of the
following dialog boxes will appear depending on if you are logged in as the
administrator or not:
Administrator Shared host list dialog
User personal host list dialog:
Enter the name of the host you'd like to add into the large text box. If you'd like to add a range of hosts that differ by a number prepended or appended to the name (for example: xyz1, xyz2, xyz3, xyz4, xyz5, etc.), enter the starting host name in the Generate hosts from text field (xyz1) and the ending hostname in the Generate hosts to field (xyz5). When you click Add, all the host names within the range will be generated and added.
You can also define hosts with a number embedded in the name. For example: x1yz to x5yz.
Remove hosts by deleting their names from the editable list.
Click Save and Close to update your hosts list or Cancel to exit without saving. The /apps/logs/netdir.xml file will be updated for a shared list, or the user's homedir/SeisSpace/.seisspace file for a private host list.
This is also the list of hosts that will be shown in the job submit user interface.
Defining Clusters
There are two possible cluster lists: the shared cluster list set up by the administrator and the personal cluster list set up by the user.
A cluster is a logical name for a group of hosts to which you can submit distributed jobs. If you define the clusters when you are logged on as Administrator, you will define cluster definitions for all of the users. Otherwise, you will be defining a personal, or "private", cluster definition.
"Shared" information is stored in the /apps/logs/netdir.xml file and "private" information is stored in the user's /home/user/SeisSpace/.seisspace file (or PROWESS_PREFS_DIR/.seisspace file).
Duplicate names are not managed by SeisSpace. The cluster list to choose from in the job submit user interface is a concatenation of the shared and the personal lists. The shared cluster definitions are indicated as shared by the check box.
To begin, select Edit > Administration > Define Clusters. One of the following dialog boxes will appear depending on whether you are logged in as the administrator or not:
The general steps for adding a cluster are:
1. Enter the cluster name in the New: text box.
2. Click Add.
3. Enter starting and ending hosts information and click Add to generate
the list of hosts.
4. Click Save.
Below is an example after defining clusters for a shared clusters list:
Below is an example after defining clusters for a personal clusters list:
Enter the name of the cluster you'd like to add in the New text field and click Add. It will be added to the pulldown list of clusters.
Then create a list of hosts for the cluster by editing directly in the large text box. If you'd like to add a range of hosts that differ by a number prepended or appended to the name (for example: xyz1, xyz2, xyz3, xyz4, xyz5, etc.), enter the starting host name in the Generate hosts from text field (xyz1) and the ending hostname in the Generate hosts to field (xyz5). When you click Add, all the host names within the range will be added.
You can also define hosts with a number embedded in the name. For example: x1yz to x5yz.
Remove hosts by deleting their names from the editable list.
To edit or remove an existing cluster, select it from the pulldown list of
clusters.
Click Save and Close to update your hosts list or Cancel to exit without
saving.
Adding a Data Home
A Data Home directory is the equivalent of a ProMAX Primary Storage directory.
There are two possible Data_Home lists: the shared list set up by the administrator, with details stored in the netdir.xml file, and the personal list set up by the user, with details stored in the user's .seisspace file.
CAUTION: Avoid declaring the same DATA_HOME in both the shared (logs/netdir.xml) file and your personal .seisspace file. A DATA_HOME should only be specified in one location. If you do end up with duplicates, you will be prompted with some options for how to resolve the duplication.
1. Begin by selecting Edit>Administration>Add Data Home.
The Add new Data Home dialog box appears.
Enter or select a pathname to the project you are adding. This path must
be accessible from all nodes in the cluster by exactly the same
pathname.
The pathname is equivalent to ProMAX primary storage where the
project/subproject hierarchy that is shown is equivalent to the ProMAX
Area/Line hierarchy in PROMAX_DATA_HOME.
2. If you wish, enter a name which SeisSpace will use to label the Data Home. The idea here is that you may want to address a project by a logical name instead of by a directory name. Your actual disk directory may have a name similar to /filesystem/disk/primarydir1/secondarydir2/group1/person2/promax_data/marine_data. It may be easier to address this data as simply "marine data" from the navigator. (MB3 > Properties can be used to show the entire path to the aliased name for reference.)
3. If you wish, enter a character string that will be used as an additional directory prefix for JavaSeis datasets in secondary storage. Do not use blanks, slashes, or other special characters. If JavaSeis secondary storage is specified as /a/b/c, the datasets for this data_home will use directory /a/b/c/this_prefix/area/line/dataset. If you leave this entry blank, the datasets for this data_home will use directory /a/b/c/area/line/dataset. This feature is designed to prevent potential dataset overwriting in the case where you have the same area/line in multiple data_homes using the same secondary storage directories.
4. ProMAX Environment Variable Editor.
At a minimum it is recommended that you specify values for
PROMAX_SCRATCH_HOME and PROMAX_ETC_HOME. Select
the variable and then click Edit to modify the settings.
You may add other variables here. Typical entries may be
PROMAX_MAP_COMPRESSION, or extended scratch partitions.
You can consult the ProMAX system administration guide for the list
of environment variables.
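As an illustration only, the values entered in the editor might look like the following; the paths are hypothetical and entirely site-specific:

PROMAX_SCRATCH_HOME=/scratch/promax
PROMAX_ETC_HOME=/apps/PMetc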
It is generally recommended to avoid having the same data home specified both as a shared project and as a personal project by individual users. It is possible to do this, but you will get into situations where the projects are not updated concurrently, which leads to confusion. There are also some dialogs in place to help resolve the duplicate entries.
5. Click the checkbox for This data home should be visible to all users if you'd like all users to be able to access this data home. Note: this option is only visible if you are logged in as the administrator.
A completed Data Home dialog should look similar to the following
example:
JavaSeis Secondary Storage
This option is used to set up a list of file systems to use for JavaSeis dataset secondary storage.
If you don't do anything, JavaSeis datasets will use the same file systems as ProMAX datasets for secondary storage, as defined in the etc/config_file. If this is the desired behavior then you do not need to pursue the JavaSeis Secondary Storage configuration. You will need to make sure that you don't have a dir.dat file in any of the standard search paths. A dir.dat file with lines with the #SS#/directory syntax will take precedence over the etc/config_file.
When the JavaSeis Secondary Storage dialog is first started, all of the text boxes may be blank. For the top text box to be populated you must have a dir.dat file with #SS# lines in it in one of the following possible locations:
PROWESS_DDF (direct path to dir.dat file)
OW_DDF (direct path to dir.dat file)
OW_PMPATH/dir.dat
OWHOME/conf/dir.dat
If you use the default for OWHOME, SeisSpace will use $PROMAX_HOME/port/OpenWorks in a standard non-OpenWorks ProMAX/SeisSpace installation. If you want to specify a different location you can set one of OW_PMPATH, OW_DDF, or PROWESS_DDF in your SSclient startup script.
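For example, the SSclient startup script could point directly to a site-managed dir.dat (sh-style syntax shown; the path is hypothetical):

PROWESS_DDF=/apps/SSetc/dir.dat
export PROWESS_DDF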
An example of a dir.dat file can be found in your SeisSpace installation's $PROWESS_HOME/etc/conf directory.
This file is shown below for reference:
# Example of lines in a dir.dat file that SeisSpace understands for specifying optional
# secondary storage for JavaSeis datasets.
#
##########################################################
##########################################################
#SS#/d1/SeisSpace/js_virtual,READ_WRITE
#SS#/d2/SeisSpace/js_virtual,READ_WRITE
#SS#/d3/SeisSpace/js_virtual,READ_WRITE
#SS#/d4/SeisSpace/js_virtual,READ_WRITE
#SS#GlobalMinSpace=209715200
#SS#MinRequiredFreeSpace=209715200
#SS#MaxRequiredFreeSpace=107374182400
#
#
##########################################################
################ Documentation below #####################
##########################################################
#
# The SeisSpace navigator will optionally search for a file in the data_home directory
# defined by the environment variable:
# JAVASEIS_DOT_SECONDARY_OVERRIDE
# This is a method where an administrator can change the secondary storage
# specification for a DATA_HOME for testing, to do things like test new disk partitions
# before putting them into production without affecting the users.
#
# In production mode, the SeisSpace navigator will first search for a .secondary file
# in the data_home directory:
#
# first: $DATA_HOME/.secondary (which is managed as part of the data_home properties)
#
# if no .secondary file exists, the next search will be for a dir.dat file using the following hierarchy:
#
# second: $PROWESS_DDF (direct path to dir.dat file)
# third: $OW_DDF (direct path to dir.dat file)
# fourth: $OW_PMPATH/dir.dat
# fifth: $OWHOME/conf/dir.dat (Note that OWHOME=PROMAX_HOME/port/OpenWorks in
# a standard non-OpenWorks ProMAX/SeisSpace installation)
#
# if no dir.dat file is found, JavaSeis secondary storage will use the secondary storage definition
# in the PROMAX_ETC_HOME/config_file
#
# sixth: ProMAX secondary storage listed in the config_file for the project
#
# In the first dir.dat file that is found, the file is checked to see if
# SeisSpace secondary storage has been defined. The expected format is:
#
##SS#/d1/SeisSpace/js_virtual,READ_WRITE
##SS#/d2/SeisSpace/js_virtual,READ_WRITE
##SS#/d3/SeisSpace/js_virtual,READ_WRITE
##SS#/d4/SeisSpace/js_virtual,READ_WRITE
#
# GlobalMinSpace --> A global setting used by the Random option; do not use this
# folder if there is less than GlobalMinSpace available
# (value specified in bytes -- 209715200 bytes = 200Mb)
##SS#GlobalMinSpace=209715200
#
# In this example 4 secondary storage locations are specified, all with RW status,
# with a global minimum disk space requirement of 200 Mb.
#
# Other attributes can be associated with the different directories:
# READ_WRITE --> available for reading existing data and writing new data
# READ_ONLY --> available for reading existing data
# OVERFLOW_ONLY --> available as emergency backup disk space that is only used
# if all file systems with READ_WRITE status are full
#
# The data_home properties dialog can be used to make a .secondary file
# at the Data_Home level which will be used first.
#
# There are two different policies that can be used to distribute the data over the file
# systems (folders) specified above: RANDOM and MIN_MAX.
#
#PolicyRandom
# Retrieve the up-to-date list of potential folders for secondary.
#
# From the list of potential folders get those that have the READ_WRITE
# attribute.
#
# If the list contains more than 0, generate a random number from 1 to N
# (where N=the number of folders) and return that folder index to be used.
#
# If the list of READ_WRITE folders is 0 then get the list of "OVERFLOW_ONLY" folders.
# If the list contains more than 0, generate a random selection of the folder index
# and return that folder index to be used.
#
# If there are 0 READ_WRITE folders and 0 OVERFLOW_ONLY folders then the job will fail.
#
#PolicyMinMax
# Uses the following values:
#
# MinRequiredFreeSpace --> Do not use this folder in the MIN_MAX policy if there is
# less than MinRequiredFreeSpace available
# (value specified in bytes -- 209715200 bytes = 200Mb)
# MaxRequiredFreeSpace --> Use this folder multiple times in the MIN_MAX policy if
# there is more than MaxRequiredFreeSpace available.
# (value specified in bytes -- 107374182400 bytes = 100Gb)
#
##SS#MinRequiredFreeSpace=209715200
##SS#MaxRequiredFreeSpace=107374182400
#
# Get the list of potential folders and compute the free space on each folder.
#
# For each folder in the list that has a READ_WRITE attribute check the free space.
# - If the free space is less than MinRequiredFreeSpace exclude it.
# (Not enough free space on this disk)
# - If the free space is greater than MinRequiredFreeSpace and less than MaxRequiredFreeSpace
# add it to the list of candidates
# - If the free space is also greater than MaxRequiredFreeSpace add it as a candidate again.
# This will weight the allocation to disks with the most free space.
#
# From the list of candidates use the same random number technique as above.
#
# If there are no folders in the list of candidates then check for any possible overflow folders.
# If folders are found use the random number technique to return an overflow folder.
# If we don't have anything in overflow we fail.
If you use the example file located in the $PROMAX_HOME/port/OpenWorks/conf directory, and don't explicitly set OWHOME in the startup script, this dialog would look as shown below:
The top section shows the directories found in the dir.dat file. Note: the attributes (RO vs. RW) are not editable or shown here.
The second section shows the directory and attribute contents of a .secondary file in this Data_Home directory, if one exists.
The bottom section shows the Min/Max disk space usage policy settings in the .secondary file, if one exists. The defaults are shown if the .secondary file does not exist. The defaults are 200 Mb for the minimums and 100 Gb for the maximum. (More detail on these is given in the example dir.dat file shown above.)
If you do nothing else, JavaSeis datasets will use all of the file systems listed in the dir.dat for secondary storage and distribution of the file extents. You can also select a subset of these file systems on a "per Data Home" basis so that different Data Homes can use different subsets of the file systems listed in the dir.dat file. To do this, click MB1 on the file systems you want to use for this Data Home in the top window and they will show up in the lower window. You can choose to add attributes to the file systems. Multiple directories can be chosen with the standard MB1, CNTRL-MB1 and SHFT-MB1 mouse and keyboard bindings.
Read Write:
The default configuration allows for datasets to be both read and written.
Read Only:
Do not write any new data to the selected file system(s).
Overflow:
Only write data as an emergency backup if all of the other file systems are full. This is designed to be used as an emergency backup so that jobs don't fail when the main secondary disks fill up.
Remove:
Remove the selected file system(s) from the list.
You can choose to set the min and max disk usage policy settings in the .secondary file. The policy is chosen in the JavaSeis Data Output menu.
Min/Max Policy - Minimum (Mb)
There must be at least this much disk space available before any extents are written to this folder.
Min/Max Policy - Maximum (Mb)
If there is more disk space available than this value, this
folder is added to the list of available folders twice.
Min free space required for Random policy (Mb)
There must be at least this much disk space available
before any extents are written to this folder.
Click on the Update button(s) to set the configuration for this Data Home.
This configuration for the Data Home is stored in two files in the Data Home directory. The .properties file will store all the properties from the main properties dialog and the .secondary file will store the list of file systems and the min/max policies to use for JavaSeis secondary storage.
If you delete all of the directories in the lower window and update, the .secondary file will be deleted.
.secondary file OVERRIDE
For testing a new filer or a new secondary storage disk configuration, an administrator may want to temporarily override the production .secondary file and use a test version.
The administrator can do this by making a copy of the .secondary file in the data_home directory and pointing to the temporary copy with an environment variable.
In a shell, cd to the data_home directory, copy the .secondary file, and manually edit it.
cp .secondary .secondary_test
vi .secondary_test
IF you have set the environment variable JAVASEIS_DOT_SECONDARY_OVERRIDE in the navigator startup script, or your user environment, then the file it points to MUST EXIST in the data home that you are working in. If not, the IO will refer back to the original .secondary or the highest-level dir.dat file that it finds.
IF the file defined by the environment variable JAVASEIS_DOT_SECONDARY_OVERRIDE exists,
THEN when you open the JavaSeis secondary folder configuration dialog for a data home, it will show the contents of that file and allow you to edit it by adding directories in from the dir.dat list. You cannot add in other lines manually from the GUI.
ELSE IF the file does not exist,
THEN you will be shown a blank area in the .secondary edit part of the GUI, where you can repopulate it from the directories listed in the dir.dat file that have the #SS# prefix. When you update, the file will be created.
IF the variable is NOT set, the system will use the standard .secondary file preferentially.
The IO code is updated so that if the variable is set, it will use that file for the secondary specification.
In a data home you may see, for example, a .secondary plus a .secondary_test file.
IF the .secondary_test file does not exist, then the IO will use the standard .secondary even if the env variable is set to .secondary_test.
If you want to use the test file, you will need to set JAVASEIS_DOT_SECONDARY_OVERRIDE to .secondary_test.
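For example, in the navigator startup script or your user environment (sh-style syntax shown; the filename follows the example above):

JAVASEIS_DOT_SECONDARY_OVERRIDE=.secondary_test
export JAVASEIS_DOT_SECONDARY_OVERRIDE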
Verifying Projects in the Navigator
In the Navigator, click on a data home folder and then navigate the tree to see projects (AREAS), subprojects (LINES), and Flows, Tables and Datasets.
Removing Data Homes
To remove a data home from your SeisSpace Navigator, first select the Project folder in the tree view and then select Edit > Administration > Remove Data Home from the Navigator pulldown menu. Removing a data home does not delete the data; it only removes it from the list of data homes defined in SeisSpace.
Configuring the Users' MPI environment
A .mpd.conf file must exist in each user's home directory.
If one does not exist, the software will automatically generate it. If you want to set this up manually you can do the following:
Create a $HOME/.mpd.conf file for each user. This file can be identical for all users, or each user can have their own "secret word". Note: This is "dot" mpd "dot" conf.
The requirements for the .mpd.conf file are that:
the file exists in the user's home directory,
is owned by the user, and
has permissions of 600 (rw for user only).
The file can be created with the following two commands:
$ echo "MPD_SECRETWORD=xyzzy" > $HOME/.mpd.conf
$ chmod 600 $HOME/.mpd.conf
After the file is created with the line of text and the permissions set, you should see the following in the user's home directory:
[user1@a1 user1]$ ls -al .mpd.conf
-rw------- 1 user1 users 19 Jun 23 13:38 .mpd.conf
[user1@a1 user1]$ cat .mpd.conf
MPD_SECRETWORD=xyzzy
Note: There are some cases where you may have rsh problems related to a mismatch between a kerberos rsh and the normal system rsh. The system searches to see if /usr/kerberos/bin is in your path. If the /etc/profile.d/krb5.csh script does not find this in your path, the script will prepend it to your path. To avoid this, add /usr/kerberos/bin to the end of your path.
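A minimal sketch for sh/bash login scripts (csh users would use set path or setenv instead); this simply ensures /usr/kerberos/bin is already present, but searched last so the system rsh is found first:

PATH=$PATH:/usr/kerberos/bin
export PATH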
Routing Issues
A special routing problem can occur if a Linux cluster mayor or manager node has two ethernet interfaces: one for an external address and one for an internal cluster IP address. If the mayor's hostname corresponds to the external address, then the machine misidentifies itself to other cluster nodes. Those nodes will try to route through the external interface.
Quick Fix
On the cluster nodes, you can route traffic for the external address of the mayor node through its internal address:
% route add -net 146.27.172.254 netmask 255.255.255.255 gw 172.16.0.1
where 172.16.0.1 is the internal IP address of the mayor node and 146.27.172.254 is the external address of the mayor node.
Better Fix
Set the route on all cluster nodes to use the internal address of the mayor
for any unknown external address:
% route add -net 0.0.0.0 netmask 0.0.0.0 gw 172.16.0.1
This fix makes the previous fix unnecessary.
Adding Routes
Outside machines might not have a route to the cluster nodes. To add a route on a PC that needs to reach a cluster node, route all cluster node addresses through the external address of the mayor node:
% route add 172.16.0.0 mask 255.255.0.0 146.27.172.254
where 172.16.0.0 with a mask of 255.255.0.0 specifies the address range of the cluster nodes and 146.27.172.254 is the external address of the cluster mayor node.
Diagnosing routing problems
To diagnose problems with routing on a cluster, check the following
information on the mayor node and on a worker node. You must have direct
routes to all other nodes:
% route
% route -n
% netstat -r
% netstat -rn
Make sure your node's IP address is associated with the ethernet interface.
% ifconfig -a
Hardwire the correct association of IP addresses with hostnames. Use the same file for all nodes, including the mayor.
% cat /etc/hosts
See how hostnames are looked up:
% cat /etc/nsswitch.conf
% cat /etc/resolv.conf
Use the lookup order hosts: files nis dns.
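That is, the hosts entry in /etc/nsswitch.conf should read:

hosts: files nis dns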
If you are not using DNS, then /etc/resolv.conf must be empty. If you are
using DNS, then the following lines must be present:
nameserver <ip address of name server>
search domain.com domain2.company.com
Cluster configuration considerations
When you get ready to set up a cluster you need to consider what
application components will be running on which components of the
cluster. For a cluster that is meant to primarily run ProMAX and SeisSpace
you can use the following recommendations. For other uses, you will have
to adapt these recommendations appropriately.
The main consideration is to not overload any particular component of the
cluster. For example, it is very easy to overload the Manager node with a
variety of cluster administration daemons as well as a variety of user
processes. For a ProMAX and SeisSpace installation you may want to
segregate the work as follows:
You may decide to run the following on the Manager:
the PBS/Torque server and scheduler
the FlexLM license manager
the SeisSpace sitemanager
You may decide to use a couple of the nodes as user "login" nodes to run:
the SeisSpace User Interface / Flow Builders
Interactive/direct submit ProMAX, Hybrid and SeisSpace jobs
You should only run the following on the "invisible" cluster nodes:
PBS - mom
ProMAX, Hybrid and SeisSpace jobs released from the queue or jobs directed to run on those nodes.
Additional Considerations
In addition to the above, you will need to ensure that the manager node and the "login" nodes are set up with multiple IP addresses so that they are visible on both networks: the internal cluster network and the external user network.
Running jobs on the manager should generally be avoided so that this node
can be available to do the system management work that it is intended to
do.
You want to avoid having a PBS mom running on the "login" node(s) to prevent jobs from the queue from running on these nodes. The "login" node should be reserved for interactive display jobs and small direct submit test jobs.
Managing Batch Jobs using Queues
Managing batch jobs for seismic data processing via queues provides the
following benets:
sequential release of serially dependent jobs
parallel release of groups of independent jobs
optimized system performance by controlling resource allocation
centralized management of system workload
Introduction to Batch Job Queues
Seismic data processing using SeisSpace or ProMAX on an individual workstation or a Linux cluster can benefit from using a flexible batch queuing and resource management software package. Batch queueing software generally has three components: a server, a scheduler, and some sort of executor (mom). A generic diagram showing the relationship between the various components of the Torque queuing software is illustrated below.
[Diagram: ProMAX, the SeisSpace UI, and qmgr commands interacting with the Torque Server, Torque Scheduler, and Torque Mom]
Generic Queued Job Workflow
1. A job is submitted to the queuing system server via a command like
"qsub".
2. The server communicates with the scheduler and requests the number
of nodes the job needs.
3. The scheduler gathers current node or workstation resource utilization
and reports back to the server which nodes to use.
4. The server communicates with the mom(s) to start the job on the
node(s) allocated.
Note that a single Linux workstation has one mom daemon, as the diagram
above shows, but the diagram for a Linux cluster can have hundreds to
thousands of compute nodes with one mom on each.
Torque and SGE (Sun Grid Engine) are typical of the available queuing
packages. For this release we tested and documented batch job queuing
using Torque. This package can be freely downloaded from
http://www.clusterresources.com/downloads/torque.
Torque Installation and Configuration Steps
1. Download and install Torque source code
2. Set Torque configuration parameters
3. Compile and link the Torque source code
4. Install the Torque executables and libraries
5. Configure the Torque server and mom
6. Test Torque Queue Submission
7. Start Torque server, scheduler, and mom at boot
8. Build the Torque packages for use in installing Torque on cluster
compute nodes, then install these packages
9. Integrate ProMAX and SeisSpace with Torque
10. Recommendations for Torque queues
Download and Install Torque Source Code
Landmark does not distribute Torque, so you will have to download the latest source tar bundle, which looks similar to torque-xx.yy.zz, from the following URL:
http://www.clusterresources.com/downloads/torque
The latest version of Torque we tested is 2.3.3 on a RedHat 4 Update 5 system.
Note: PBS and Torque are used interchangeably throughout this document.
As the root user, untar the source code for building the Torque server,
scheduler, and mom applications.
> mkdir <some path>/apps/torque
> cd <some path>/apps/torque
> tar -zxvf <downloaded tar file location>/torque-xx.yy.zz.tar.gz
> cd torque-xx.yy.zz
If you decide you want to build the Torque graphical queue monitoring
utilities (recommended) xpbs and xpbsmon, there are some requirements.
Make sure tcl, tclx, tk, and their devel rpms are installed for the architecture type of your system, such as i386 or x86_64. Since the tcl-devel-8.*.rpm and tk-devel-8.*.rpm files may not be included with several of the RHEL distributions, you may need to download them. There may be other versions that work as well. Any missing RPMs will need to be installed.
Here is an example of required RPMs from a RHEL 4.5 x86_64
installation:
[root@sch1 prouser]# rpm -qa | grep tcl-8
> tcl-8.4.7-2
[root@sch1 prouser]# rpm -qa | grep tcl-devel-8
> tcl-devel-8.4.7-2
[root@sch1 prouser]# rpm -qa | grep tclx-8
> tclx-8.3.5-4
[root@sch1 prouser]# rpm -qa | grep tk-8
> tk-8.4.7-2
[root@sch1 prouser]# rpm -qa | grep tk-devel-8
> tk-devel-8.4.7-2
Here is an example of required RPMs from a RHEL 5.2 x86_64
installation:
> rpm -qa | grep libXau-dev
libXau-devel-1.0.1-3.1
> rpm -qa | grep tcl-devel-8
tcl-devel-8.4.13-3.fc6
> rpm -qa | grep xorg-x11-proto
xorg-x11-proto-devel-7.1-9.fc6
> rpm -qa | grep libX11-devel
libX11-devel-1.0.3-8.el5
> rpm -qa | grep tk-devel
tk-devel-8.4.13-3.fc6
> rpm -qa | grep libXdmcp-devel
libXdmcp-devel-1.0.1-2.1
> rpm -qa | grep mesa-libGL-devel
mesa-libGL-devel-6.5.1-7.2.el5
> rpm -qa | grep tclx-devel
tclx-devel-8.4.0-5.fc6
Set Torque Configuration Parameters
We will now compile and link the server, scheduler, and mom all at the same time, then later generate specific Torque "packages" to install on all compute nodes, which run just the moms. There are many ways to install and configure Torque queues, and here we are presenting just one.
Torque queue setup for a single workstation is largely the same as for the master node of a cluster, except for some changes discussed later. You should be logged into the master node as root if you are installing on a Linux cluster, or logged into your workstation as root.
Here is RHEL 4.5 x86_64:
> ./configure --enable-mom --enable-server --with-scp --with-server-default=<hostname of server> --enable-gui --enable-docs --with-tclx=/usr/lib64
Here is RHEL 5.2 x86_64:
> ./configure --enable-mom --enable-server --with-scp --with-server-default=<hostname of server> --enable-gui --enable-docs --with-tcl=/usr/lib64 --without-tclx
Note that we pointed to /usr/lib64 for the 64-bit tclx libraries. This would be /usr/lib on 32-bit systems.
With the use of "--with-scp" we are selecting ssh for file transfers between the server and moms. This means that ssh needs to be set up such that no passwords are required in both directions between the server and moms for all users.
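A minimal sketch of enabling passwordless ssh for a user, assuming OpenSSH and home directories that are shared (or copied) across the server and compute nodes; adapt this to your site's security policy:

$ ssh-keygen -t rsa                                  # accept the defaults and an empty passphrase
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # authorize the user's own key
$ chmod 600 ~/.ssh/authorized_keys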
Compile and Link the Torque Source Code
We will now compile and link the Torque binaries.
> make
Install the Torque Executables and Libraries
We will now install the Torque executables and libraries.
> make install
Configure the Torque Server and Mom
Instructions for installing and configuring Torque in this document treat a single workstation and the master node of a cluster the same, then discuss where the configuration of a cluster is different.
Let's go ahead and set up two example queues for our workstation or cluster. The first thing we will do is configure our master node or single workstation for the Torque server and mom daemons.
> cd /var/spool/torque/server_priv
Now let's define which nodes our queues will be communicating with. The first thing to do is to build the /var/spool/torque/server_priv/nodes file. This file states the nodes that are to be monitored and submitted jobs to, the type of node, the number of CPUs the node has, and any special node properties. Here is an example nodes file:
master np=2 ntype=cluster promax
n1 np=2 ntype=cluster promax seisspace
n2 np=2 ntype=cluster promax seisspace
n3 np=2 ntype=cluster seisspace
.
.
nxx np=2 ntype=cluster seisspace
The promax and seisspace entries are called properties. It is possible to assign queue properties so that a queue only submits jobs to nodes with that same property. Instead of the entries n1, n2, etc., you would enter your workstation's hostname or the hostnames of your compute nodes.
Now let's initialize the pbs mom /var/spool/torque/mom_priv/config file. Here is an example of what one would look like:
# Log all but debug events; 127 is good for normal logging.
$logevent 127
# Set log size and deletion parameters so we don't fill /var
$log_file_max_size 1000
$log_file_roll_depth 5
# Make node unschedulable if load >4.0; continue when load drops <3.0
$ideal_load 3.0
$max_load 4.0
# Define server node
$pbsserver <server hostname>
# Use cp rather than scp or rcp for local (nfs) file delivery
$usecp *:/export /export
The $max_load and $ideal_load parameters will have to be tuned for your system over time, and are gauged against the current entry in the /proc/loadavg file. You can also use the "uptime" command to see what the current load average of the system is.
How many and what type of processes can the node handle before it is overloaded? For example, if you have a quad-core machine then a $max_load of 4.0 and an $ideal_load of 3.0 would be just fine. For the $pbsserver be sure to put the hostname of your Torque server.
After a job is finished, the stdout and stderr files are copied back to the server so they can be viewed. The $usecp entry indicates for which file systems a simple "cp" command can be used rather than "scp" or "rcp". The output of the "df" command shows what should go into the $usecp entry. For example:
df
Filesystem 1K-blocks Used Available Use% Mounted on
sch1:/data 480721640 327473640 148364136 69% /data
The $usecp entry would be "$usecp *:/data /data"
Now let's start the Torque server so we can load its database with our new queue configuration.
> /usr/local/sbin/pbs_server -t create
Warning: if you have an existing set of Torque queues, the "-t create" option will erase the existing configuration.
Now we need to add and configure some queues. We have documented a simple script which should help automate this process. You can type these instructions in by hand, or build a script to run. Here is what this script looks like:
#!/bin/ksh
/usr/local/bin/qmgr -e <server name> << "EOF"
c q serial queue_type=execution
c q parallel queue_type=execution
s q serial enabled=true, started=true, max_user_run=1
s q parallel enabled=true, started=true
set server scheduling=true
s s scheduler_iteration=30
s s default_queue=serial
s s managers="<username>@*"
s s node_pack=false
s s query_other_jobs=true
print server
EOF
When creating and configuring queues you typically are doing the following:
Creating a queue and specifying its type: execution or route.
Enabling and starting the queue.
Defining any resource limitation, such as job runtime, or other properties for a queue.
Defining properties of the server, such as who can manage queues.
To type these in by hand start the Torque queue manager by typing:
> /usr/local/bin/qmgr
Now let's restart the Torque server and start the Torque scheduler and mom on the master node or single workstation, and test our installation.
> /usr/local/bin/qterm -t quick
> /usr/local/sbin/pbs_server
> /usr/local/sbin/pbs_sched
> /usr/local/sbin/pbs_mom
Now let's start the Torque GUIs xpbs and xpbsmon to see the status of our queues and the Torque mom.
> /usr/local/bin/xpbs &
You should see a GUI similar to the following, if you built it.
> /usr/local/bin/xpbsmon &
You should see a GUI similar to the following, if you built it.
Testing Torque Queue Submission
Before integrating ProMAX with Torque it is a good idea to test the Torque setup by submitting a job (script) to Torque from the command line. Here is an example script called pbs_queue_test:
#!/bin/ksh
#PBS -S /bin/ksh
#PBS -N pbs_queue_test
#PBS -j oe
#PBS -r y
#PBS -o <NFS mounted filesystem>/pbs_queue_output
#PBS -l nodes=1
######### End of Job ##########
hostname
echo ""
env
echo ""
cat $PBS_NODEFILE
You will need to modify the #PBS -o line of the script to direct the output to an NFS mounted filesystem which can be seen by the master node or single workstation. Submit the job to Torque as follows using a non-root user:
> /usr/local/bin/qsub -q serial -m n <script path>/pbs_queue_test
If the job ran successfully, there should be a file called <NFS mounted filesystem>/pbs_queue_output containing the results of the script.
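While the test job is queued or running, you can also check its status with the standard Torque client tools (the path assumes the default install location used in this guide):

> /usr/local/bin/qstat -a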
Starting the Torque Server, Scheduler, and Mom at boot
To start the Torque daemons when the machines boot up, use the following scripts for the master node and single workstation: pbs_server, pbs_sched, and pbs_mom.
The following /etc/init.d/pbs_server script starts pbs_server for Linux:
#!/bin/sh
#
# pbs_server   This script will start and stop the PBS Server
#
# chkconfig: 345 85 85
# description: PBS is a versatile batch system for SMPs and clusters
#
# Source the library functions
. /etc/rc.d/init.d/functions

BASE_PBS_PREFIX=/usr/local
ARCH=$(uname -m)
AARCH="/$ARCH"
# Use the architecture-specific install prefix if it exists
if [ -d "$BASE_PBS_PREFIX$AARCH" ]
then
    PBS_PREFIX=$BASE_PBS_PREFIX$AARCH
else
    PBS_PREFIX=$BASE_PBS_PREFIX
fi

PBS_HOME=/var/spool/torque

# let's see how we were called
case "$1" in
  start)
    echo -n "Starting PBS Server: "
    # Only create a new server database if one does not already exist
    if [ -r $PBS_HOME/server_priv/serverdb ]
    then
        daemon $PBS_PREFIX/sbin/pbs_server
    else
        daemon $PBS_PREFIX/sbin/pbs_server -t create
    fi
    echo
    ;;
  stop)
    echo -n "Shutting down PBS Server: "
    killproc pbs_server
    echo
    ;;
  status)
    status pbs_server
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
    echo "Usage: pbs_server {start|stop|restart|status}"
    exit 1
esac
The following /etc/init.d/pbs_sched script starts pbs_sched for Linux:
#!/bin/sh
#
# pbs_sched   This script will start and stop the PBS Scheduler
#
# chkconfig: 345 85 85
# description: PBS is a versatile batch system for SMPs and clusters
#
# Source the library functions
. /etc/rc.d/init.d/functions

BASE_PBS_PREFIX=/usr/local
ARCH=$(uname -m)
AARCH="/$ARCH"
# Use the architecture-specific install prefix if it exists
if [ -d "$BASE_PBS_PREFIX$AARCH" ]
then
    PBS_PREFIX=$BASE_PBS_PREFIX$AARCH
else
    PBS_PREFIX=$BASE_PBS_PREFIX
fi

# let's see how we were called
case "$1" in
  start)
    echo -n "Starting PBS Scheduler: "
    daemon $PBS_PREFIX/sbin/pbs_sched
    echo
    ;;
  stop)
    echo -n "Shutting down PBS Scheduler: "
    killproc pbs_sched
    echo
    ;;
  status)
    status pbs_sched
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
    echo "Usage: pbs_sched {start|stop|restart|status}"
    exit 1
esac
The following /etc/init.d/pbs_mom script starts pbs_mom for Linux:
#!/bin/sh
#
# pbs_mom   This script will start and stop the PBS Mom
#
# chkconfig: 345 85 85
# description: PBS is a versatile batch system for SMPs and clusters
#
# Source the library functions
. /etc/rc.d/init.d/functions

BASE_PBS_PREFIX=/usr/local
ARCH=$(uname -m)
AARCH="/$ARCH"
# Use the architecture-specific install prefix if it exists
if [ -d "$BASE_PBS_PREFIX$AARCH" ]
then
    PBS_PREFIX=$BASE_PBS_PREFIX$AARCH
else
    PBS_PREFIX=$BASE_PBS_PREFIX
fi

# let's see how we were called
case "$1" in
  start)
    # Restore the boot-time access.conf if a saved copy exists
    if [ -r /etc/security/access.conf.BOOT ]
    then
        cp -f /etc/security/access.conf.BOOT /etc/security/access.conf
    fi
    echo -n "Starting PBS Mom: "
    daemon $PBS_PREFIX/sbin/pbs_mom -r
    echo
    ;;
  stop)
    echo -n "Shutting down PBS Mom: "
    killproc pbs_mom
    echo
    ;;
  status)
    status pbs_mom
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
    echo "Usage: pbs_mom {start|stop|restart|status}"
    exit 1
esac
The following commands actually set up the scripts so the O/S will start them at boot:
> /sbin/chkconfig pbs_server on
> /sbin/chkconfig pbs_sched on
> /sbin/chkconfig pbs_mom on
Installing Torque On The Compute Nodes
Now that Torque seems to be working, let's install it on the compute nodes. To perform this we need to generate some Torque self-extracting scripts called "packages". In these packages we need to also include the Torque mom system startup (init.d) scripts, as well as the mom configuration information. Note that this step is not necessary for the single workstation.
> cd <some path>/apps/torque-xx.yy.zz
> mkdir pkgoverride;cd pkgoverride
> mkdir mom;cd mom
> tar -cvpf - /var/spool/torque/mom_priv/config | tar -xvpf -
> tar -cvpf - /etc/rc.d/init.d/pbs_mom | tar -xvpf -
> cd <some path>/apps/torque-xx.yy.zz;make packages
Now that all the packages are generated, we only need to install some of them on the compute nodes. Here is a list of all the packages:
torque-package-clients-linux-x86_64.sh
torque-package-devel-linux-x86_64.sh
torque-package-mom-linux-x86_64.sh
To install these packages you need to copy them to an NFS mounted filesystem if the directory where they are stored is not visible to all compute nodes. For example:
> cp *.sh <NFS mounted filesystem>
Note that if you are using cluster management software such as XCAT, Warewulf, or RocksClusters, you are better off integrating the Torque mom files and configuration into the compute node imaging scheme.
Install the packages by hand on each node, or if you have some type of
cluster management software such as XCAT, use that to install onto each
node.
> psh compute <NFS mounted filesystem>/torque-package-clients-linux-x86_64.sh --install
> psh compute <NFS mounted filesystem>/torque-package-devel-linux-x86_64.sh --install
> psh compute <NFS mounted filesystem>/torque-package-mom-linux-x86_64.sh --install
> psh compute /sbin/chkconfig pbs_mom on
> psh compute /sbin/service pbs_mom start
The xpbsmon application should refresh shortly showing the status of the
compute nodes, which should be "green" if the nodes are ready to accept
scheduled jobs.
Connecting ProMAX and Torque
ProMAX by default is set to use Torque (PBS) queues. The $PROMAX_HOME/etc/qconfig_pbs file defines which Torque queues are available for use, the name associations, the function to be called in building a job execution script, and any variables which get passed to the function script. You should modify this file to conform with the Torque queues that you have created.
#
# PBS batch queues
#
name = serial
type = batch
description = "Serial Execution Batch Jobs"
function = pbs_submit
menu = que_res_pbs.menu
properties = local
machine = <torque_batch_server>
#
name = parallel
type = batch
description = "Parallel Execution Batch Jobs"
function = pbs_submit
properties = local
menu = que_res_pbs.menu
machine = <torque_batch_server>
The following is what the SeisSpace job submit window might resemble with the configuration above:
If you have configured your queues for a cluster, and have confirmed that they are working properly, you need to do a couple of things to disable the master node from being used as a compute node.
1. Turn off the pbs_mom.
> /sbin/service pbs_mom stop
2. Disable the pbs_mom from starting at boot.
> /sbin/chkconfig pbs_mom off
3. Remove the master node from the /var/spool/torque/server_priv/nodes file.
Recommendations for Torque queues
Based on our batch job queue testing efforts, we offer the following guidelines for configuring your Torque batch job queues.
It is important that the queue does not release too many jobs at the same time. You specify the number of available nodes and CPUs per node in the /var/spool/torque/server_priv/nodes file. Each job is submitted to the queue with a request for a number of CPU units. The default for ProMAX jobs is 1 node and 1 CPU, or 1 CPU unit. That is, to release a job, there must be at least one node that has 1 CPU unallocated.
There can be instances when jobs do not quickly release from the queue although resources are available. It can take a few minutes for the jobs to release. You can change the scheduler_iteration setting with the Torque qmgr command. The default is 600 seconds (or 10 minutes). We suggest a value of 30 seconds. Even with this setting, dead times of up to 2 minutes have been observed. It can take some time before the loadavg begins to fall after the machine has been loaded.
By default, Torque installs itself into the /var/spool/torque, /usr/local/bin and /usr/local/sbin directories. Always address qmgr by its full name of /usr/local/bin/qmgr. The directory path /usr/local/bin is added to the PATH statement inside the queue management scripts by setting the PBS_BIN environment variable. If you are going to alter the PBS makefiles and have PBS installed in a location other than /usr/local, make sure you change the PBS_BIN environment setting in the ProMAX sys/exe/pbs/* files, and in the SeisSpace etc/SSclient script example.
Run the xpbs and xpbsmon programs, located generally in the /usr/local/bin directory, to monitor how jobs are being released and how the CPUs are monitored for availability. Black boxes in the xpbsmon user interface indicate that the node CPU load is greater than what has been configured, and no jobs can be spawned there until the load average drops. It is normal for nodes to show as different colored boxes in the xpbsmon display. This means that the nodes are busy and not accepting any work. You can also modify the automatic update time in the xpbsmon display. However, testing has shown that the automatic updating of the xpbs display may not be functioning.
Landmark suggests that you read the documentation for Torque. These documents include more information about the system and ways to customize the configuration, and can be found on the Torque website.
Torque requires that you have the hostnames and IP addresses in the hosts files of all the nodes.
Note: hostname is the name of your machine; hostname.domainname
can be found in /etc/hosts, and commonly ends with .com:
ip address hostname.domain.com hostname
For DHCP users, ensure that all of the processing and manager nodes
always get the same ip address.
We present one method of installing and conguring Torque job queues.
There are many alternative methods that will be successful so long as the
following conditions exist:
Install Torque for all nodes of the cluster. The installation can be done on each machine independently, or you can use a common NFS-mounted file system, or your cluster management software may contain a preconfigured image.
Install all components, including the server and scheduler, on one node. This is known as the server node and serves the other main processing nodes. Normally this will be the cluster manager node. On a single workstation the server, scheduler, and mom daemons are all installed.
The following files must be the same on all installations on all machines (see the sketch after this list):
/var/spool/torque/server_name
/var/spool/torque/mom_priv/config
This file is only used by the server and scheduler on the manager machine:
/var/spool/torque/server_priv/nodes
The UID and GID for users must be consistent across the master and compute nodes.
All application, data, and home directories must be mounted the same on the master and compute nodes.
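As a minimal sketch, assuming the server/manager node is named master01.domain.com (a placeholder hostname), the two shared files might contain:

/var/spool/torque/server_name:
master01.domain.com

/var/spool/torque/mom_priv/config:
$pbsserver master01.domain.com
$logevent 255

The $pbsserver line tells each pbs_mom which machine runs the server; $logevent controls mom logging and is optional.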
Flat File-based Flow Replication
This section discusses how flow replication is implemented in SeisSpace. It also discusses where and when the flat files are created and how they are stored and managed.
For more information about using the Flow Replication tools, please refer to the chapter titled Replicating Flows in the Using SeisSpace guide.
In Flat File-based Flow Replication, all flow replication data are stored in flat files in the $PROMAX_DATA_HOME/AREA/LINE and $PROMAX_DATA_HOME/AREA/LINE/FLOW directories.
LINE/
replicaParms.txt  tab-delimited file with replica parameters, editable in Excel; can be a symbolic link
replicaParms.txt~  a backup that is generated whenever a replicaParms.txt file is successfully loaded
replicaPrefs.xml  some constants stored about the replica table, such as column width and display order
Starting with the 5000.0.1.0 release in early 2009, a file locking mechanism was added to manage the replicaParms.txt file. When a user opens the replica table and adds columns or changes values, that user will write a lock file. Other users will not be able to make edits to the replica table until the user who owns the lock file saves his/her work and releases the lock.
LINE/FLOW (template flow)
exec.#.pwflow  ---
exec.#.log      |  These are the files associated with the template.
exec.#.qsh      |  There may be multiple versions depending on
exec.#.qerr     |  how many times the template was run to test it.
packet.#.job   ---
jobs.stat       --- This file is not used at this time but contains
                    the status of the main flow.
exec.r#.#.pwflow --- These are the files associated with each
exec.r#.#.log     |  replica flow. The first # after the r is the
exec.r#.#.qsh    --- sequence number; the second # is the replica
                     instance number (more detail on replica
                     instances later).
replicas.#.stat  --- a binary file that contains all of the job status
                     information for replicas of a particular version
replicasInfo.xml --- a simple xml file that indicates that the flow
                     is a template
The general methodology follows the idea that it is possible that some of the replicas will need to be rerun, or rebuilt and then rerun, for a variety of different reasons. After you either rerun or rebuild/rerun some of the replicas, you will see multiple versions of the flow, printout, and qsh files for each instance of the replica, and multiple replicas.#.stat files. In the following example, replicas 1 and 2 have been built and run 4 times, replicas 3 and 4 have been run 3 times, and so on:
$ ls exec.r1.*
exec.r1.2.log exec.r1.2.pwflow exec.r1.2.qsh
exec.r1.3.log exec.r1.3.pwflow exec.r1.3.qsh
$ ls exec.r2.*
exec.r2.2.log exec.r2.2.pwflow exec.r2.2.qsh
exec.r2.3.log exec.r2.3.pwflow exec.r2.3.qsh
$ ls exec.r3.*
exec.r3.1.log exec.r3.1.pwflow exec.r3.1.qsh
exec.r3.2.log exec.r3.2.pwflow exec.r3.2.qsh
$ ls exec.r4.*
exec.r4.1.log exec.r4.1.pwflow exec.r4.1.qsh
exec.r4.2.log exec.r4.2.pwflow exec.r4.2.qsh
$ ls exec.r5.*
exec.r5.0.log exec.r5.0.pwflow exec.r5.0.qsh
exec.r5.1.log exec.r5.1.pwflow exec.r5.1.qsh
$ ls exec.r6.*
exec.r6.0.log exec.r6.0.pwflow exec.r6.0.qsh
exec.r6.1.log exec.r6.1.pwflow exec.r6.1.qsh
$ ls exec.r7.*
exec.r7.0.log exec.r7.0.pwflow exec.r7.0.qsh
$ ls exec.r8.*
exec.r8.0.log exec.r8.0.pwflow exec.r8.0.qsh
$ ls exec.r9.*
exec.r9.0.log exec.r9.0.pwflow exec.r9.0.qsh
$ ls exec.r10.*
exec.r10.0.log exec.r10.0.pwflow exec.r10.0.qsh
$ ls -al *stat*
jobs.stat
replicas.0.stat
replicas.1.stat
replicas.2.stat
replicas.3.stat
Notice that the earlier numbered replicas, such as 1 and 2, have instance numbers 2 and 3, whereas replicas 3 and 4 have instance numbers 1 and 2. There is a preference setting that can be used to put a limit on the number of versions of replicas to keep. In this case the preference was set to keep 2 versions of the replica flows and automatically purge older ones, so only the two most recent instances are retained.
The job status information for all of these versions is stored in the different replicas.#.stat files. The status that is shown in the Replica Job Table (RJT) will be the status of the flow in the matching numbered stat file. The replicas.3.stat file will only have information for those flows that had a 3rd instance. The stat files contain the job status, such as Complete, Failed, or User Terminated. The "Built" and "Unknown" status values are not stored. A flow is marked as Built if there is no known status for it in the matching stat file and the flow files exist on disk. If multiple versions of replicated flows exist, the status that is shown is the status from the highest numbered stat file.
sequence number
                  1 1 1 1 1
1 2 3 4 5 6 7 8 9 0 1 2 3 4
---------------------------
. . . . x . . . x . . . . .   2.stat
. . . x x . . . x x x . . .   1.stat
x x x x x x x x x x x x x x   0.stat
If you delete a replica using the delete function in the RJT, all existing instances of the replicated flows will be deleted and the job status will be removed from all of the stat files. For example, if the replica flows for sequence 5 are deleted, the status will be removed from all existing stat files. The status of the flow will be set to "Unknown" until the replica is rebuilt.
sequence number
                  1 1 1 1 1
1 2 3 4 5 6 7 8 9 0 1 2 3 4
---------------------------
. . . . . . . . x . . . . .   2.stat
. . . x . . . . x x x . . .   1.stat
x x x x . x x x x x x x x x   0.stat
If you delete all of the replicas, all of the replica flow folders will be deleted, but the replicas.#.stat files themselves will not be deleted. All of the status values in all of the stat files will be cleared.
sequence number
                  1 1 1 1 1
1 2 3 4 5 6 7 8 9 0 1 2 3 4
---------------------------
. . . . . . . . . . . . . .   2.stat
. . . . . . . . . . . . . .   1.stat
. . . . . . . . . . . . . .   0.stat
If you make a new set of replicas, the instance numbering will start at 0 again.
If there are no replica flows left in the template flow, you can safely delete all of the stat files.
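For example, a sketch of cleaning up the replica stat files by hand from the template flow directory; the AREA/LINE/FLOW names are placeholders for your own project paths:

$ cd $PROMAX_DATA_HOME/AREA/LINE/FLOW
$ rm replicas.*.stat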
Adding User Defined Table Types
Some of the larger external development facilities have added table types for use with their ProMAX tools. In ProMAX you would add these table types to the parm_headers file. To add these user defined tables to the table list in the SeisSpace user interface, these table type definitions must be added to the $PROWESS_HOME/etc/flowbuilder/TableTypes.txt file.
An example is delivered with the installation.
; User-defined table types. Each line is of the form:
;
; Unique_3_Letter_Extension Description Primary Secondary Z1 ... Zn
;
; Use NULL to specify a primary or secondary key that must be queried
; Use semicolon or # to start a comment line; blank lines are ok too
; eg:
; FKP "FK Filter Polygon" NULL PLYINDEX F K
; gat "Miscellaneous Time Gate" NULL NULL START END
;FKP "FK Filter Polygon" NULL PLYINDEX F K
;eig "WAAVO Eigenvector Constraints 12 columns" CDP TIME R01 RSH1
RP1 R02 RSH2 RP2 R03 RSH3 RP3 EV1 EV2 EV3
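For example, to expose a simple two-column time-gate table in the SeisSpace table list, you could uncomment or add a line such as the gat example from the delivered file. Here gat is the unique three-letter extension, the quoted text is the description shown in the interface, the two NULL entries indicate that the primary and secondary keys must be queried, and START and END are the value columns:

gat "Miscellaneous Time Gate" NULL NULL START END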
You must restart the Navigator after you edit this file to see the changes.
Process Promotion and Demotion
Dev/Alpha/Beta/Prod
SeisSpace supports the capability of having several versions of the same module or process in different stages of development. In this case, you may want to switch between the versions in a flow without having to build and parameterize a new menu.
This capability only extends to SeisSpace module development. It does not apply to ProMAX tools.
An example scenario may be that you have a program that you are working on in your development environment, and periodically you want to release a version to production, but you want the development version to be available as well so that a tester can easily test a new development version against the current production version.
In this case a user can insert a process into a flow by choosing the process from either the production processes list or the development processes list from the developer's tree. Then the user can switch back and forth and have the menus update with like parameters and execute the different versions of the program.
The examples below use several typical scenarios to illustrate the process promotion/demotion capability.
Note: These examples assume that your SeisSpace development environment has been configured using the PROWESS_HOME/port/bin/Makeseisspace script.
First example - Simple single developer environment with two versions of a module
A simple example for an external development site might look something
like this:
There are two "systems" that the users need access to simultaneously.
The customer's standard Landmark-provided installation in a common shared directory.
The customer developer's development system in the developer's home directory.
The standard Landmark system has no knowledge of the customer's tool.
The developer has two versions, a "production" version and a "dev"
version.
The user will need to add a MY_PROWESS_HOME environment variable to the SSclient startup script set of variables, where MY_PROWESS_HOME is set to the developer's home prowess development directory.
In this case the development user is user1 and MY_PROWESS_HOME would be set to /home/user1/prowess.
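A minimal sketch of the corresponding SSclient addition, assuming user1's development tree is /home/user1/prowess and sh/bash syntax (use setenv instead if your SSclient is a csh-style script):

MY_PROWESS_HOME=/home/user1/prowess
export MY_PROWESS_HOME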
You can look at the Example0AddAmplitude example program to see how
you might want to set this up.
This will be the first example.
The developer would have the following directory structure in his/her home
directory:
[ssuser@nuthatch example0addamplitude]$ pwd
/home/ssuser/prowess/port/com/djg/prowess/tool/example0addamplitude
[ssuser@nuthatch example0addamplitude]$ ls -alR *
Example0AddAmplitudeProc.java
Example0AddAmplitudeTool.java
Makefile
dev:
Example0AddAmplitudeProc.java
Example0AddAmplitudeTool.java
Makefile
There are two different versions of the menu and tool code; one version in
the main directory and another version in the "dev" subdirectory.
There are also two different makefiles.
The one major change in the menu between the two versions is the package
name.
The production version of the *Proc.java file would have the first line:
package com.djg.prowess.tool.example0addamplitude;
The dev version of the *Proc.java file would have the first line:
package com.djg.prowess.tool.example0addamplitude.dev;
Note: You want to make sure that getToolName is commented out.
// public String getToolName() {
// return "com.djg.prow-
ess.tool.example0addamplitude.Example0AddAmplitudeTool";
// }
In the production version of the *Tool.java file you would have the first lines:
// Each tool is in a java package with its proc (menu) file
package com.djg.prowess.tool.example0addamplitude;
and in the dev version of the *Tool.java file you would have the first line:
// Each tool is in a java package with its proc (menu) file
package com.djg.prowess.tool.example0addamplitude.dev;
The production version of the Makefile would have the PACKAGE line:
PACKAGE := com/djg/prowess/tool/example0addamplitude
The dev version of the Makefile would have the PACKAGE line:
PACKAGE := com/djg/prowess/tool/example0addamplitude/dev
OPTION 1 - make the two versions of the module available in the developer's Processes List:
Edit the PROWESS.xml file in the developer's home prowess/etc/flowbuilder directory:
[ssuser@nuthatch flowbuilder]$ pwd
/home/ssuser/prowess/etc/flowbuilder
[ssuser@nuthatch flowbuilder]$ more PROWESS.xml
<parset name="SeisSpace">
<parset name="Developer Examples">
<parset name="Example 0 (Add Amplitude)">
<par name="name" type="string"> com.djg.prow-
ess.tool.example0addamplitude.Example0AddAmplitudeProc </par>
</parset>
<parset name="Example 0 (Add Amplitude)|dev">
<par name="name" type="string"> com.djg.prow-
ess.tool.example0addamplitude.dev.Example0AddAmplitudeProc
</par>
</parset>
</parset>
</parset>
Note: The "|dev" designation and the addition of "dev" to the PROC le
path name for the development version of the tool.
In this case there are two versions of the process in the processes list. The user would be required to specifically choose both of the processes, have both menus in the flow, and swap back and forth between them.
OPTION 2 - For the case where you want to swap between the menus but only have one occurrence of the process in the flow, the developer would add a *Proc.xml file to his/her prowess/etc/flowbuilder directory (the same directory where the developer's Processes list is). This file will have a copy of the individual tool stanza from the PROWESS.xml processes list in it:
[ssuser@nuthatch flowbuilder]$ pwd
/home/ssuser/prowess/etc/flowbuilder
[ssuser@nuthatch flowbuilder]$ ls -al
Example0AddAmplitudeProc.xml
PROWESS.xml
[ssuser@nuthatch flowbuilder]$ more Example0AddAmplitudeProc.xml
<parset name="Example 0">
<parset name="Example 0 (Add Amplitude)">
<par name="name" type="string"> com.djg.prow-
ess.tool.example0addamplitude.Example0AddAmplitudeProc </par>
</parset>
<parset name="Example 0 (Add Amplitude) | dev">
<par name="name" type="string"> com.ano.prow-
ess.tool.example0addamplitude.dev.Example0AddAmplitudeProc
</par>
</parset>
</parset>
Now with one process in the menu, the user can swap between the
production and development versions using the MB3>Versions menu on
that process. Note: This may be Ctrl-MB3>Versions if using the default
mouse bindings where MB3 toggles processes active to inactive.
The color code of the process will change, and the icon designation will also change, showing the version of the tool that was selected.
Four versions of each module are supported: dev, alpha, beta, and the default un-designated production version.
A fifth "obsolete" designation can be used to flag a process or a version as obsolete.
Second example - Multiple developer environment with two versions of a module
This is an extension of the first example where there is more than one developer, but each developer is working in the same mode of having a couple of different versions of a module available:
There are three "systems" that the users need access to simultaneously.
The standard LGC-provided installation in a common shared directory.
The first customer developer's development system in the first developer's home directory.
The second customer developer's development system in the second developer's home directory.
The standard Landmark system has no knowledge of the customer's tools.
The developers' tools have two versions, a "production" version and a "dev" version.
A user within the customer's system will need to add a MY_PROWESS_HOME environment variable to the SSclient startup script set of variables, where MY_PROWESS_HOME is set to both developers' home prowess development directories in a colon (:) separated list.
In this case, the development users are user1 and user2, and MY_PROWESS_HOME would be set to /home/user1/prowess:/home/user2/prowess.
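A sketch of the corresponding setting, again assuming sh/bash syntax in SSclient and the example user names above:

MY_PROWESS_HOME=/home/user1/prowess:/home/user2/prowess
export MY_PROWESS_HOME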
In this model, if the two developers have the same module name and each has a *Proc.xml file in their development trees, these files will be concatenated and all versions in both files will be shown in the MB3>Versions menu.
Third example - Multiple developer environment with two versions of a module and a customer "production environment"
There are at least three "systems" that the users need access to simultaneously.
The standard LGC-provided installation in a common shared directory.
The customer's primary production system in a common shared directory.
The customer developer's development system in the developer's home directory (possibly more than one).
The standard LGC system has no knowledge of the customer's tool. The customer's tool has two versions, a "production" version and a developer's "dev" version.
There is very little difference between this scenario and the previous one. The only difference is that the production versions are in a "public" place and will probably have a package definition related to the company name. The user will just need to add the appropriate paths to the MY_PROWESS_HOME variable to see the public locations.
In this model, if the two developers have the same module name, each has a *Proc.xml file in their development trees, and there is a third file of the same name in the customer "production" tree, these files will be concatenated and all versions in all of the files will be shown in the MB3>Versions menu.
Summary
Each developer can support up to 4 versions of a module: dev, alpha, beta
and a production version (the production version has no special
designation).
Processes in the Processes list can be listed in an obsolete status by using the "obsolete" designation.
The example that is copied in from the Makeseisspace script can be used as an example for setting the package names and the makefile paths for each version.
If multiple versions are put into the PROWESS.xml file, which stores the processes list in the ../etc/flowbuilder directory, then the user can directly select from the different versions of the module in the processes list.
If multiple versions are put into the toolProc.xml file in the ../etc/flowbuilder directory, then the user can swap between versions using the MB3>Versions menu. Note: This may be Ctrl-MB3>Versions if using the default mouse bindings where MB3 toggles processes active to inactive.
The users need to add the MY_PROWESS_HOME environment variable to the SSclient script, where this can be a single path or a colon (:) separated list of paths. Each directory in the path will show up as a separate list of processes in the processes list panel of the Flow Builder.
The user can choose the process from any of the processes lists, and if there is a toolProc.xml file in one or all of the etc/flowbuilder directories, then the user can choose the version to use from the Ctrl-MB3>Versions options. If there are multiple toolProc.xml files in all of the directories, these will be concatenated (on a per-tool basis) and all options in all the files will be presented in the version options.
It is important to keep the hierarchy of the directories in mind when working with multiple versions of processes. If there are multiple directories listed in MY_PROWESS_HOME, the first instance wins. Therefore, it is desirable that multiple developers not use the same tool names. In addition, all toolProc.xml files in all etc/flowbuilder directories for all directories in the multi-pathed MY_PROWESS_HOME will be concatenated.
Managing multiple help files for the different versions.
This is a copy of the self-documented toolProc.xml file
PROWESS_HOME/etc/flowbuilder/Example0AddAmplitudeProc.xml
<parset name="Example 0">
<parset name="Example 0 (Add Amplitude)|dev">
<par name="name" type="string">
examples.example0addamplitude.dev.Example0AddAmplitudeProc </par>
<!-- An example of using an absolute path for the location of the
help file.
Note the presence of a leading file separator character / and
file suffix. -->
<par name="help" type="string"> /lair/gwong/add_amplitude_dev.pdf
</par>
</parset>
<parset name="Example 0 (Add Amplitude)|alpha">
<par name="name" type="string">
examples.example0addamplitude.alpha.Example0AddAmplitudeProc </par>
<!-- An example of using the multi-pathed MY_PROWESS_HOME,
MY_PROMAX_HOME to locate the help file.
Note the absence of file separator character / and file suffix.
-->
<par name="help" type="string"> add_amplitude_alpha </par>
</parset>
<parset name="Example 0 (Add Amplitude)|beta">
<par name="name" type="string">
examples.example0addamplitude.beta.Example0AddAmplitudeProc </par>
<par name="help" type="string"> add_amplitude_beta </par>
</parset>
<parset name="Example 0 (Add Amplitude)">
<par name="name" type="string">
examples.example0addamplitude.Example0AddAmplitudeProc </par>
<!-- An example of using the multi-pathed MY_PROWESS_HOME,
MY_PROMAX_HOME to locate the help file
with the default name derived from the Proc i.e.
Example0AddAmplitude. Note the absence of
the help parameter.
-->
</parset>
</parset>