This reference was originally created for my own use as a systems programmer's
"survival tool", to accumulate essential information and references that I knew
I would have to refer to again and would want to find quickly. In participating
in the ADSM-L mailing list, it became apparent that others had a similar need,
and so it made sense to share the information. The information herein derives
from many sources, including submissions from other TSM customers. Thus, the
information is that which everyone involved with TSM has contributed to a common
knowledge base, and this reference serves as an accumulation of that knowledge,
largely reflective of the reality of working with the TSM product as an
administrator.
I serve as a compiler and contributor. This informal, "real-world" reference is
intended to augment the formal, authoritative documentation provided by Tivoli
and allied vendors, as frequently referenced herein. See the REFERENCES area at
the bottom of this document for pointers to salient publications.
In dealing with the product, one essential principle must be kept in mind, which
governs the way the product operates and restricts the server administrator's
control of the data: the data which the client sends to a server storage pool
will always belong to the client - not the server. There is no provision on the
server for inspecting or manipulating file system objects sent by the client.
Filespaces are the property of the client, and if the client decides not to do
another backup, that is the client's business: the server shall take no action
on the Active, non-expiring files therein. It is incumbent upon the server
administrator, therefore, to maintain a relationship with client administrators
for information to be passed when a filespace is obsolete and discardable, when
it has fallen into disuse.
Macintosh, shut down after backups Put into the ADSM prefs file:
"SCHEDCOMpleteaction Shutdown"
Macintosh backup file names Macintosh has traditionally used the
colon character (:) rather than slash
(/) or backslash (\) as its directory
designation character. Interestingly,
this persists into OS X, where the user
interface makes the directory character
seem to be the usual Unix slash (/); but
OS X invisibly translates that to and
from its usual colon (:). So, if you do
Query CONtent or the like at the TSM
server, you will see the actual colons
separating file path components.
Macintosh client components The following components are in the
Macintosh client package:
Backup: The interactive GUI for
backup, restore, archive, retrieve.
~2.8MB
Scheduler daemon: A background
application that operates in sleep mode
until it is time to run a schedule, then
starts the Scheduler program. ~120KB
Scheduler program: Communicates with
the server for the next schedule to
run, and performs the scheduled
action, such as a backup or restore,
at the scheduled time. ~1.5MB
Macintosh disaster recovery Simply take some kind of removable disk
(Syquest, ZIP, ...) with enough capacity
and put a minimal version of MacOS (with
TCP/IP support) and ADSM on it.
Macintosh files, back up from NT Yes, ADSM can do this, via NT
"Services for Macintosh". NT can access
Macintosh file systems, and from NT you
can then back them up. BUT: ADSM
version 2 cannot handle the resource
fork portion of the files (ADSM v3 can).
V.2 restorals thus bring the files
back as "flat files".
See: Services for Macintosh;
USEUNICODEFilenames
Macintosh files, restore to NT The Mac files must be restored to a
directory managed by "Services for
Macintosh". Also make sure that
Services for Macintosh is up and
running.
Macintosh icons, effects of moving In the Mac client V3 manual, Chapter 3,
page 13, it says: "Simply moving an
icon makes the file appear changed. ADSM
records the change in icon position to
minimize the problem of multiple icons
occupying the same space after the files
are restored. If only the attributes of
a file or folder have changed, and not
the data, only the attributes are backed
up. You may have multiple versions of
the same file with the only difference
between them being the icon position or
color."
Macintosh OS X scheduler Via dsmcad. It's started from the script
/Library/StartupItems/dsmcad/dsmcad when
Mac OS X boots. You should see a
/usr/bin/dsmcad running. If checking
with the GUI client, you'll need to use
'TSM Backup for Administrators' rather
than the plain 'TSM Backup': the latter
will only show other users' backed up
directories, not their files.
MACRO TSM server command used to invoke a
user-programmed set of TSM commands, as
a package, with variable substitution.
Syntax:
'MACRO MacroName [Substitutionvalues]'
where the macro file name is
case-sensitive and Substitutionvalues
fill in percent-signed numbers, in
numerical order by invocation order.
Example of Substitution Variables: %1,
%2, %3.
Note that the %variables will be filled
in only if they are "exposed": if they
are in quotes, the macro will not
perform substitution, as quoted values
are taken as literals. This is
inconvenient as we often want to employ
an SQL Select in a macro, and in SQL, a
string must be in single quotes. The one
way around this is to feed the value
itself as a quoted string.
Redirection: Works. Note that the
facility does not perform variable
substitution on a redirection output
destination name, so the following will
*not* work: q n %1 > /tmp/qn.%1
Note that you cannot run a macro via an
Administrative Schedule - but you can
via a Client Schedule, via ACTion=Macro
with OBJects naming the macro...which
means that the schedule must be
associated with a node and that its dsmc
sched process causes the macro to run.
(Consider instead using Server Scripts.)
The TSM manuals are obscure as to where
macro files are supposed to be located.
In actuality, they can be:
- In the directory where the dsmadmc
command was invoked, whereby you can
invoke the macro simply by its base
name, as in:
MACRO mymacro
- In any system directory, whereby you
need to invoke the macro by full
path name, as in:
MACRO /usr/local/adsm/mymacro
One convenient practice would be to
create a standard macros directory, and
then 'cd' there before invoking
'dsmadmc', thus allowing you to invoke
the macros with short names.
Note that you do not need eXecute
permission to be set on macro files, in
that *SM will load and interpret them.
An unusual factor is that TSM keeps
going back to the macro as it performs
it, even if the macro is simple and
certainly involves no looping: changing
the content of the macro during a
"more..." screen transition, for
example, will result in an "ANR2000E
Unknown command" error message.
Ref: Admin Guide chapter "Automating
Server Operations", Using Macros
See also: /* */; Server scripts
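The exposed-versus-quoted substitution rule described in this entry can be modeled in a few lines of Python. This is only an illustrative sketch of the behavior as described here, not TSM's actual parser: the function name substitute, the sample Select statements, and the handling of a %number with no matching value (left untouched) are all assumptions.

```python
import re

# Matches a single-quoted or double-quoted run; the capturing group makes
# re.split() return the quoted runs as their own list elements.
QUOTED = re.compile(r"""('[^']*'|"[^"]*")""")

def substitute(line, values):
    """Fill in %1, %2, ... from values, but only outside of quotes."""
    def fill(match):
        index = int(match.group(1))
        # Assumption: a %N with no supplied value is left as-is.
        if 1 <= index <= len(values):
            return values[index - 1]
        return match.group(0)

    pieces = QUOTED.split(line)
    result = []
    for position, piece in enumerate(pieces):
        if position % 2 == 1:
            # Odd positions are the quoted runs: taken as literals.
            result.append(piece)
        else:
            result.append(re.sub(r"%(\d+)", fill, piece))
    return "".join(result)

# An exposed variable is substituted.
print(substitute("q node %1", ["ACCT1"]))
# A quoted variable is not.
print(substitute("select * from nodes where node_name='%1'", ["ACCT1"]))
# The workaround: leave %1 exposed and pass the quotes in the value.
print(substitute("select * from nodes where node_name=%1", ["'ACCT1'"]))
```

The third call mirrors the workaround noted above: the quotes travel with the substitution value rather than living in the macro text.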
Magic Number You will run into occasional TSM server
messages referring to "magic number".
This amounts to a checksum number which
TSM generated and stored in the database
at the time it put the file object into
its storage pool (wrote it to media),
to assure data integrity. When at some
time in the future TSM may be called
upon to retrieve the object from that
media, it generates a checksum from the
retrieved file data and checks that it
matches what it originally had for the
object. An error indicates that the
data could be read from the media
without hardware/OS detection of an
error, but nevertheless there is a
discrepancy. The data is thus deemed
corrupted and hopeless: you need to
perform a Restore Volume or the like to
get a usable copy of the object.
How did the data go bad? The most likely
cause is between TSM and the tape head:
Faulty hardware, erroneous firmware, bad
SCSI cables, network infrastructure
problems, and the like can all result in
bad data ending up on the media.
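The store-then-verify idea behind the magic number can be sketched as follows. The checksum algorithm TSM actually uses is not stated here, so CRC-32 from Python's zlib module is used purely as a stand-in, and the function names are hypothetical.

```python
import zlib

def store_object(data: bytes):
    """Return the data as written plus the checksum recorded for it."""
    # Stand-in for "generated and stored in the database at write time".
    return data, zlib.crc32(data)

def verify_object(data_read_back: bytes, recorded_checksum: int) -> bool:
    """Recompute the checksum on retrieval and compare with the record."""
    return zlib.crc32(data_read_back) == recorded_checksum

written, recorded = store_object(b"file object contents")

# A clean read matches the recorded value.
print(verify_object(written, recorded))

# A read that succeeded at the hardware level but returned altered bytes
# is caught by the mismatch.
print(verify_object(b"file object c0ntents", recorded))
```

The point of the sketch is that the mismatch is detected even though no I/O error was raised, which is the situation the message reports.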
Magstar Product line acronym: Magnetic storage
and retrieval.
Name supplanted in 2002 by TotalStorage.
See also: IBM TotalStorage; TotalStorage
Magstar MP IBM's name for its 3570 and 3575
technology.
MAILprog Client System Options file (dsm.sys)
option to specify who gets mail, and via
what mailer program, when a password
expires and a new password is generated.
Can be used when PASSWORDAccess Generate
is in effect. Code within the
SErvername section of definitions.
Format: "Mailprog /mail/pgmname User_Id"
See also: PASSWORDAccess; PASSWORDDIR
MAKesparsefile See: Sparse files, handling of
MAM Medium Auxiliary Memory: An Auxiliary
Memory residing on a medium, for
example, a tape cartridge.
Some tape technologies - e.g., AIT and
LTO (Ultrium) - use cartridges equipped
with Medium Auxiliary Memory (MAM), a
non-volatile memory used to record
medium identification and usage info.
This is typically accessed via an RF
interface and does not require reading
the tape itself. In a library not
equipped with a mobile MAM reader, it is
necessary to load the cartridge into the
drive to read the MAM via the drive's
MAM reader.
Ref: http://www.t10.org/ftp/t10/
document.99/99-347r0.pdf
Mammoth tape drive Exabyte Corp. 8mm (helical scan) tape
drive with SCSI-2 fast interface, wide
or narrow, with SE or differential as an
option. Introduced in 1996, aimed at the
midrange server market.
Capacity: 20 GB, native/uncompressed;
40 GB compressed.
Transfer Rate: 10.5 GB per hour,
native/uncompressed; 360 MB/min
compressed rate.
Technology is similar to AIT-1.
Mammoth-2 tape drive Exabyte 8mm tape drive (helical scan),
with multiple channels, with error
correction and ALDC compression.
Form factor: half-height, 5.25"
Capacity: 60 GB native; up to 150 GB
compressed.
Transfer rate: 12 MB/s native; up to 30
MB/s with compression.
Cartridge tape contains a section of
cleaning fabric which the drive uses as
needed.
Technology is similar to AIT-2.
Managed Server See: Enterprise Configuration and Policy
Management
MANAGEDServices Windows client option for having CAD
cause the client scheduler, and web
client, to run rather than have them
hang around as memory-holding processes.
Syntax: MANAGEDServices
{[schedule] [webclient]}
See also: CAD
Management class A policy object that contains a
collection of (HSM) space management
attributes and backup and archive Copy
Groups. The space management attributes
contained in a Management Class
determine whether HSM-managed
files are eligible for automatic or
selective migration. The attributes in
the backup and archive Copy Groups
determine whether a file is eligible for
incremental backup and specify how ADSM
manages backup versions of files and
archived copies of files.
The management class is typically chosen
for users by the node root administrator
(via 'ASsign DEFMGmtclass') but can
alternately be selected as the third
token on the INCLUDE line in the
include-exclude options file, or via the
DIRMc Client Systems Option File option,
or the ARCHMc 'dsmc archive' command
line option. However, automatic
migration occurs *only* for the default
management class; for the incl-excl
named management class you have to
manually incite migration.
Management class, choose Is accomplished by specifying the
management class as the third token on a
client Include option.
Format: Include FileSpec MgmtClassName
To have all backups use the management
class, code:
Include * MgmtClassName
To have specific file systems use the
management class, code something like:
Include /fsname/.../* MgmtClassName
Ref: Client B/A manual
Management class, copy See: COPy MGmtclass
Management class, default As the name implies, this is the
management class which will be used by
default. Can be overridden via the third
token on the INCLUDE line in the
include-exclude options file. However,
automatic migration occurs *only* for
the default management class; for the
incl-excl named management class you
have to manually incite migration.
Management class, default, establish 'ASsign DEFMGmtclass DomainName SetName
ClassName'
To make this change effective you then
need to do:
'ACTivate POlicyset DomainName SetName'
Management class, define 'DEFine MGmtclass DomainName SetName
ClassName
[SPACEMGTECH=AUTOmatic|
SELective|NONE]
[AUTOMIGNOnuse=Ndays]
[MIGREQUIRESBkup=Yes|No]
[MIGDESTination=poolname]
[DESCription="___"]'
Note that except for DESCription, all of
the optional parameters are Space
Management Attributes for HSM.
Management class, delete 'DELete MGmtclass DomainName SetName
ClassName'
Management class, query 'Query MGmtclass [[[DomainName]
[SetName] [ClassName]]] [f=d]'
See also: Management classes, query
Management class, SQL queries The column name is: CLASS_NAME
Management class, update See: UPDate MGmtclass
Management class for HSM, select HSM uses the Default Management Class
which is in force for the Policy Domain,
which can be queried from the client via
the dsmc command 'Query MGmtclass'.
You may override the Default Management
Class and select another by coding an
Include-Exclude file, with the third
operand on an Include line specifying
the Management Class to be used for the
file(s) named in the second operand.
Management class used by a client 'dsmc query mgmtclass' or
'dsmc query options' in ADSM ('dsmc show
options' in TSM).
Management class used in backup Shows up in 'dsmc query backup', whether
via command line or GUI.
Management classes, display in detail 'dsmmigquery -M -D'
Management classes, query from client 'dsmc Query Mgmtclass [-DETail]'
Reports the default management class
and any management classes specified
on INCLude statements in the
Include/Exclude file.
Management classes, unused, identify You can perform queries like the
following, for Archives and Backups:
SELECT DOMAIN_NAME, CLASS_NAME FROM
MGMTCLASSES WHERE CLASS_NAME NOT IN
(SELECT DISTINCT(CLASS_NAME) FROM
ARCHIVES)
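To see how the NOT IN subquery behaves, here is a small self-contained sketch using Python's sqlite3 module with toy stand-ins for the TSM tables. The table contents and the domain and class names are made up, the tables are reduced to just the columns needed, and extending the same test to a BACKUPS table is an assumption based on the "Archives and Backups" wording above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy, heavily reduced stand-ins for the real TSM tables.
    CREATE TABLE MGMTCLASSES (DOMAIN_NAME TEXT, CLASS_NAME TEXT);
    CREATE TABLE ARCHIVES (CLASS_NAME TEXT);
    CREATE TABLE BACKUPS (CLASS_NAME TEXT);
    INSERT INTO MGMTCLASSES VALUES
        ('STANDARD', 'STANDARD'),
        ('STANDARD', 'LONGTERM'),
        ('STANDARD', 'UNUSED1');
    INSERT INTO ARCHIVES VALUES ('LONGTERM');
    INSERT INTO BACKUPS VALUES ('STANDARD'), ('STANDARD');
""")

def unused_classes(connection):
    """Classes referenced by neither archived nor backed-up objects."""
    sql = """
        SELECT DOMAIN_NAME, CLASS_NAME FROM MGMTCLASSES
        WHERE CLASS_NAME NOT IN (SELECT DISTINCT CLASS_NAME FROM ARCHIVES)
          AND CLASS_NAME NOT IN (SELECT DISTINCT CLASS_NAME FROM BACKUPS)
    """
    return connection.execute(sql).fetchall()

print(unused_classes(conn))
```

With the toy data, only the class that appears in neither table is reported.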
MANUAL (libtype) See: Manual library
Manually Ejected category 3494 Library Manager category code FFFA
for a tape volume which was in the
inventory but in a re-inventory was
not found in the 3494. Thus, the 3494
thinks that someone reached in and
removed it. This category is typically
induced by having to extricate a damaged
tape from the robot. See "Purge Volume"
category to eliminate such an entry.
Manual library No, it's not a library full of manuals;
it's a library whose volumes are to be
mounted manually, by people responding
to mount messages. It is distinguished
by LIBType=MANUAL in DEFine LIBRary; and
the tape device will be of "mt" type,
rather than "rmt" (*SM driver).
A shop running this type of operation
will usually have an operations terminal
running the *SM administrative client in
Mount Mode (dsmadmc -mountmode), simply
for the operators to see and respond to
mount requests. Outstanding mount
requests can be checked via Query
REQuest. Such requests are answered with
the REPLY command acknowledging a
specific request number, to signify that
the action requested has been performed
by the operator such that *SM can
proceed.
Manuals See: TSM manuals
"Many small files" problem The name of the challenge where backups
involve a large number of small files,
which stresses the TSM database due to
the heavy updating and number of
database entries, and the client's
memory and processing power in
performing an Incremental backup.
See "Database performance" for ways to
mitigate the impact on the TSM database
and optimize performance.
Other possible approaches:
- To somewhat reduce Backup time,
consider using -INCRBYDate backup,
which eliminates getting a long list
of files from the server, massaging it
in client memory, and then comparing
as the file system is traversed. (But
see the INCRBYDate entry for side
effects.)
- Another Backup time reduction scheme:
With some client file systems it may
be known in what area updating occurs,
as in the case of a company doing
product testing which creates
thousands of results files in
subdirectories named by product and
date. Here you can tailor your backup
to go directly at those directories
and skip the rest of the file system,
where you know that little or nothing
has changed.
- Journal-Based Backups may be a good
alternative on Windows.
- Consider 'dsmc Backup Image' (q.v.),
to back up the physical image of a
volume (raw logical volume) rather
than individually backing up the files
within it.
- Some customers pre-combine many small
files on the client system, as with
the Unix 'tar' command or personal
computer file bundling packages, thus
reducing the quantity to a single
bundle file.
- If regulations require you to keep
files for a certain period, consider
using Backup Sets rather than doing
full backups.
- Consider a "divide and conquer"
approach, using parallel backup
processes to operate on separate areas
of a file system housing many small
files, to reduce the overall time to
perform the backup. You may employ a
'dsmc i' for each major top-level
directory, to back up into the same
TSM server filespace, or use the
VIRTUALMountpoint option to cause the
file system to be treated as multiple
filespaces. (Naturally, this can be
effective only if your disk and I/O
path can meet the demands.)
Your retention policies need to be
reasonable: don't arbitrarily retain a
year's worth of versions, but rather
keep as much as is really needed to
recover files.
Make sure you are running regular,
unlimited expirations, else your TSM
database will balloon.
The backup of small files is also
problematic with tape drives with poor
start-stop characteristics (see
Backhitch).
The condition of the directory in which
the small files exist can also slow
things down: see "Backup performance".
Consider turning on client tracing to
identify the specific problem area.
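The "divide and conquer" approach above can be sketched as a helper that builds one 'dsmc i' command per top-level directory. This is a sketch only: the directory names are hypothetical, the use of -subdir=yes to descend into each directory is an assumption to be checked against your client manual, and the commands are only constructed here, not run.

```python
def incremental_commands(top_level_dirs):
    """Build one 'dsmc i' command line per major top-level directory."""
    commands = []
    for directory in top_level_dirs:
        # Normalize to a single trailing slash so the directory itself
        # is named as the backup target.
        target = directory.rstrip("/") + "/"
        # Assumption: -subdir=yes is wanted so that subdirectories are
        # traversed; verify against the B/A client documentation.
        commands.append(["dsmc", "i", target, "-subdir=yes"])
    return commands

# Hypothetical product-testing layout with one directory per product.
for command in incremental_commands(["/results/prodA", "/results/prodB/"]):
    print(" ".join(command))
```

Each command list could then be handed to something like subprocess.Popen to run the backups in parallel, within whatever your disk and I/O path can sustain.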
Master Drive An informal name for the first, SMC
drive in a SCSI library, such as the
3584. (Remove that drive and you suffer
ANR8840E trying to interact with the
library.)
MATCHAllchar Client option to specify a character to
be used as a match-all wildcard
character. The default is an asterisk
(*).
MATCHOnechar Client option to specify a character to
be used as a match-one wildcard
character. The default is a question
mark (?).
MAX SQL statement to yield the largest
number from all the rows of a given
numeric column.
See also: AVG; COUNT; MIN; SUM
MAXCAPacity Devclass keyword for some devices
(principally, File) to specify the
maximum size of any data storage files
defined to a storage pool categorized by
this device class.
MAXCAPACITY, if set to other than 0,
determines the maximum amount of data
ADSM will put to a tape. ESTCAPACITY, if
MAXCAPACITY is not set, is an estimate
used for some calculations for
reclamation and display, but does not
determine when a tape is full.
On VM and MVS servers MAXCAPACITY is the
maximum amount of data that ADSM will
put on a tape, but if the tape becomes
physically full, or has certain errors,
it will be marked full before it reaches
that capacity. The capacity reported by
ADSM does not consider compression. If
client compression is used, or if the
data is not very compressible (backups
of zip files, for example) then ADSM
will report a full tape with a smaller
capacity. Most tape manufacturers give
their tape capacity assuming compression
(I think normally around 3:1), so if you
are sending already compressed data, you
will not be able to reach the stated
capacities.
MAXCMDRetries Client System Options file (dsm.sys)
option to specify the maximum number of
times you want the client scheduler to
attempt to process a scheduled command
which fails.
Default: 2
Do not confuse with the Copy Group
SERialization parameter, which governs
attempts on a busy file, not session
reattempts.
Maximum command retries 'Query STatus'
Maximum mounts See: MOUNTLimit
Maximum Scheduled Sessions 'Query STatus' output reflecting the
number of schedule sessions possible, as
controlled by the 'Set MAXSCHedsessions'
command, which takes a percentage of the
Maximum Sessions value seen in
'Query STatus'.
Default: 50% of Maximum Sessions.
MAXMIGRATORS HSM: New in 4.1.2 HSM client, per the
IP22148.README.HSM.JFS.AIX43 file:
Starting with this release, dsmautomig
starts parallel sessions to the TSM
server, which allows it to migrate more
than one file at a time. The number of
parallel migration sessions is set by a
dsmautomig-specific option that can be
configured in the dsm.sys file:
MAXMIGRATORS <number of parallel
migration sessions>
(default = 1, min = 1, max = 20)
Make sure that sufficient resources are
available on the TSM server for
parallel migration. Avoid setting the
MAXMIGRATORS option higher than the
number of sessions on the TSM server
that can be used for storing data.
maxmountpoint You mean MAXNUMMP (q.v.)
MAXNUMMP TSM 3.7+ server REGister Node, UPDate
Node parameter to limit the number of
concurrent mount points, per node, for
Archive and Backup operations. Prevents
a client from taking too many tape
drives at one time. Affects
parallelization.
This option is ignored for Restore and
Retrieve operations.
Code 0 - 999. Default: 1
Warning: A value of 0 will result in
an ANS1312E message and immediate
termination of a backup/archive session;
but restore/retrieve will not be
impeded.
Warning: Upgrading to 3.7, with its
attendant database conversion, results
in the MAXNUMMP value being 0!
The RESOURceutilization should not
exceed MAXNUMMP.
Ref: TSM 3.7 Technical Guide, 6.1.2.3
See also: KEEPMP; MOUNTLimit;
Multi-session client; REGister Node
MAXPRocess Operand in 'BAckup STGpool',
'MOVe NODEdata', 'RESTORE STGpool', and
'RESTORE Volume' to parallelize the
operation - tempered by the number of
tape drives. Note that the "process"
implication in the name harks back to
the days when server tasks were performed
by individual processes: in these modern
times, MAXPRocess is figurative and
actually governs the number of threads.
MAXRecalldaemons Client System Options file (dsm.sys)
option to specify the maximum number of
dsmrecalld daemons which may run at one
time to service HSM recall requests.
Default: 20
MAXRECOncileproc Client System Options file (dsm.sys)
option to specify the maximum number of
reconciliation processes which HSM can
start automatically at one time.
Default: 3
MAXSCRatch Operand in 'DEFine STGpool' to govern
the use of scratch tapes in the storage
pool. Specifies the maximum number of
scratch volumes that may be taken for
the storage pool, cumulatively. That is,
each volume taken from the scratch pool
is still known as a scratch volume, as
reflected in the Query Volume "Scratch
Volume?" value, and will return to the
scratch pool when emptied. The
MAXSCRatch value is thus the storage
pool's quota limit.
Setting MAXSCRatch=0 prevents use of
scratch volumes, an intentional special
case when you want to have the storage
pool use only volumes specifically
assigned to it, via 'DEFine Volume'. If
MAXSCRatch is greater than 0 and you
have also DEFine'd volumes into the
storage pool, the DEFine'd volumes will
be used first, then scratches.
Msgs: ANR1221E
MAXSCRatch, query 'Query STGpool ... Format=Detailed';
look for the value associated with
"Maximum Scratch Volumes Allowed".
MAXSCRatch and collocation ADSM will never allocate more than
'MAXSCRatch' volumes for the storage
pool: collocation becomes defeated when
the scratch pool is exhausted as ADSM
will then mingle clients. When a new
client's data is to be moved to the
storage pool, ADSM will first try to
select a scratch tape, but if the
storage pool already has 'MAXSCRatch'
volumes then it will select the tape
with the lowest utilization in the
storage pool.
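The volume selection described in this entry can be modeled roughly as below. This is a simplified sketch of the stated behavior, not the server's actual algorithm: the function and parameter names are hypothetical, and the only inputs considered are the volumes already in the pool, the MAXSCRatch value, and whether a scratch tape is available.

```python
def select_output_volume(pool_volumes, maxscratch, scratch_available):
    """Pick where a new client's data goes.

    pool_volumes maps volume name -> percent utilized, for the volumes
    already taken into the storage pool.
    """
    # First choice: a fresh scratch tape, if the pool is under its
    # MAXSCRatch quota and the scratch pool is not exhausted.
    if len(pool_volumes) < maxscratch and scratch_available:
        return "SCRATCH"
    if not pool_volumes:
        return None
    # Otherwise collocation is defeated: the new client's data is
    # mingled onto the tape with the lowest utilization.
    return min(pool_volumes, key=pool_volumes.get)

volumes = {"VOL001": 80.0, "VOL002": 35.0}
print(select_output_volume(volumes, maxscratch=3, scratch_available=True))
print(select_output_volume(volumes, maxscratch=2, scratch_available=True))
print(select_output_volume(volumes, maxscratch=3, scratch_available=False))
```

The second and third calls show the two ways collocation gets defeated: the pool has reached its MAXSCRatch quota, or the scratch pool has run dry.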
MAXSessions Server options definition (dsmserv.opt).
Specifies the number of simultaneous
client sessions. The MAXSessions value
is incremented by prompted sessions,
polling sessions, and admin sessions.
When an attempt is made to prompt a
client there is a 1 minute delay for
response from that client. The next
client to be prompted is not prompted
until either the first client responds
or the 1 minute delay elapses. So if you
have many prompted clients, be sure your
schedule starttime duration is large
enough to accommodate 1 minute delays.
Typically the client will start as soon
as prompted, so you may have prompted
clients that are not "loaded" and
consequently the entire delay is used
waiting for a client that is not going
to respond. Even if you are maxed out
on the MAXSessions value, you can always
start more administrative clients.
Default: 25 client sessions
Ref: Installing the Server...
See also: Multi-session Client;
"Set MAXSCHedsessions <%Sched>", whereby
part of this total MAXSessions value is
devoted to Schedule sessions; SETOPT
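The arithmetic behind the prompting delay is simple, but worth spelling out when sizing a schedule window. A minimal sketch, assuming the worst case in which every unresponsive client consumes the full 1 minute delay in turn; the function names and sample numbers are hypothetical.

```python
def worst_case_prompt_minutes(unresponsive_clients, delay_minutes=1):
    """Minutes spent waiting if each unresponsive client uses the full delay."""
    return unresponsive_clients * delay_minutes

def window_is_large_enough(duration_minutes, unresponsive_clients):
    """True if the schedule duration covers the worst-case prompting wait."""
    return duration_minutes >= worst_case_prompt_minutes(unresponsive_clients)

# 45 prompted clients that are not running would tie up 45 minutes.
print(worst_case_prompt_minutes(45))
print(window_is_large_enough(60, 45))
print(window_is_large_enough(30, 45))
```

In practice responsive clients answer almost immediately, so the wait is dominated by the clients that are not "loaded".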
MAXSessions server option, query 'Query OPTion', see "Maximum Scheduled
Sessions".
MAXSize STGpool operand to define the maximum
size of a Physical file which may be
stored in this pool. (Remember that
Physical size refers to the size of an
Aggregate, not the size of a Logical
file from the client file system. See
"Aggregates".)
Limiting the size of a file eligible
for a given pool in a hierarchy causes
larger files to skip that storage pool
and try the next one down in the
hierarchy. If the file is too big for
any pool in the hierarchy, it will not
be stored.
The file's size, as reported by the
operating system, is compared to the
storage pool's MAXSize value PRIOR TO
compression.
Value can be specified as "NOLIMIT"
(which is the default), or a number
followed by a unit type: K for
kilobytes, M for megabytes, G for
gigabytes, T for terabytes.
Examine current values via server
command 'Query STGpool Format=Detailed'.
Msgs: ANS1310E
See also: Storage pool space and
transactions
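The way MAXSize steers a file down the storage pool hierarchy can be sketched as follows. The pool names and sizes are made up, units are taken as powers of 1024 per the MB entry, and treating a file exactly equal to MAXSize as acceptable is an assumption.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

def parse_maxsize(value):
    """Turn 'NOLIMIT' into None, or a value like '500M' into bytes."""
    if value.upper() == "NOLIMIT":
        return None
    return int(value[:-1]) * UNITS[value[-1].upper()]

def choose_pool(file_size, hierarchy):
    """Walk the hierarchy top-down; return the first pool that accepts.

    hierarchy is a list of (pool_name, maxsize_string), top level first.
    file_size is the size prior to compression.
    """
    for pool_name, maxsize in hierarchy:
        limit = parse_maxsize(maxsize)
        # Assumption: a file exactly at the limit is accepted.
        if limit is None or file_size <= limit:
            return pool_name
    # Too big for every pool in the hierarchy: not stored.
    return None

hierarchy = [("DISKPOOL", "500M"), ("TAPEPOOL", "NOLIMIT")]
print(choose_pool(10 * 1024 ** 2, hierarchy))   # small file
print(choose_pool(2 * 1024 ** 3, hierarchy))    # skips the disk pool
```

A None result corresponds to the case where the file is too big for any pool in the hierarchy and is not stored.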
MAXThresholdproc Client System Options file (dsm.sys)
option to specify the maximum number of
HSM threshold migration processes which
can start automatically at one time.
Default: 3
Maximum sessions, define "MAXSessions" definition in the server
options file.
Maximum sessions, get 'Query STatus'
MB Megabyte: To be considered equal to
1024 x 1024 = 1,048,576 in TSM.
(Note that disk makers base their
sizings on 1000, not 1024...to make
their offerings seem more capacious.)
See also: Kilobyte
MBps Megabytes per second, a data rate
typically used with tape drives.
Mbps Megabits per second, a data rate
typically associated with data
communication lines.
Media Access Status Element of Query SEssion F=D report.
"Waiting for access to output volume
______ (___ seconds)" may reflect the
volume name that the session was waiting
for when it started - but that may no
longer be the actual volume needed. For
example: an Archive session fills the
disk storage pool in a hierarchy where
tape is the next level, and so a
migration process is incited...and so
the client is waiting on the tape which
the migration process is migrating to.
Then that tape fills. Migration goes on
to a fresh tape, but the archive session
still shows waiting for access to the
original tape.
When neither Query Process nor Query
Session F=D show the volume identified
in "Waiting for access...", it can be
due to a backup of HSM-managed space
where that volume is feeding the backup
directly from the storage pool rather
than the client, as HSM backups operate
where the HSM space is on the *SM
server. Query Session F=D shows only the
output volume, not the implicit input.
"Current output volume(s): ______,(470
Seconds)" is an undocumented form, which
seems to reflect how long the tape has
been idle, as for example when the
client is looking for the next candidate
file to back up. This impression is
reinforced by the Seconds value dropping
back to zero periodically. If that HSM
backup cannot mount either the input or
output volumes for lack of drives, the
field will report two "Waiting for mount
point..." instances, which looks odd but
makes perfect sense.
Media fault message ANR8359E Media fault ... (q.v.)
Media Type IBM 34xx tape cartridges have an
external one-character ID, as follows:
'1' Cartridge System Tape (CST): 3490
'E' Enhanced Capacity Cartridge System
Tape (ECCST): 3490E
'J' Magstar 3590 tape cartridge (HPCT)
'K' Magstar 3590 tape cartridge (EHPCT)
See also: CST; ECCST; HPCT
Media TSM db table intended to report
volumes managed via the MOVe MEDia cmd.
Columns: VOLUME_NAME, STATE
(MOUNTABLEINLIB, MOUNTABLENOTINLIB),
UPD_DATE (YYYY-MM-DD HH:MM:SS.000000),
LOCATION, STGPOOL_NAME, LIB_NAME, STATUS
(EMPTY, FILLING, FULL), ACCESS (READONLY,
etc.), LRD (YYYY-MM-DD HH:MM:SS.000000).
(LRD is Last Reference Date.)
MEDIA1 A less-used designation for 3490 base
cartridge technology. See CST.
MEDIA2 A less-used designation for 3490E
cartridge technology. See ECCST.
MEDIA3 A less-used designation for 3590
cartridge technology.
mediaStorehouse 199901 product from Manage Data Inc.
which functions as an ADSM proxy client
to service backup and restore of
network-client data via CORBA wherever
the user currently happens to be (based
upon userid). www.managedata.com
Media Wait (MediaW) "Sess State" value in 'Query SEssion'
for when a sequential volume (tape) is
to be mounted to serve the needs of that
session with a client and the session
awaits completion of that mount. This
could mean waiting either for a mount
point or a volume in use by another
session or process. Another cause is
the tape library being unavailable, as
in a 3494 in Pause mode.
When using a TDP, refer to its User Guide
regarding multi-session (Stripes), where
you will probably need to enable
collocation by filespace.
Recorded in the 24th field of the
accounting record, and the
"Pct. Media Wait Last Session" field of
the 'Query Node Format=Detailed' server
command.
See also: Communications Wait;
Idle Wait; SendW; Run; Start
Medium changer, list contents Unix: 'tapeutil -f /dev/____ inventory'
Windows: 'ntutil -t tape_ inventory'
See: ntutil; tapeutil
Medium Mover (SCSI commands) 3590 tape drive: Allows the host to
control the movement of tape cartridges
from cell to cell within the ACF
magazine, treating it like a mini
library of volumes.
Megabyte See: MB
Memory limits See: Unix Limits
Memory-mapped I/O You mean Shared Memory (q.v.)
MEMORYEFficientbackup ADSMv3+ Client User Options file
(dsm.opt) option to specify a more
memory-conserving algorithm for
processing incremental backups, backing
up one directory at a time, and using
less memory. This obviously occurs at
the (great) expense of backup
performance.
Choices:
No Your client node uses the faster,
more memory-intensive method when
it processes incremental backups.
Yes Your client node uses the method
that uses less memory when
processing incremental backups -
BUT WITH A BIG PERFORMANCE PENALTY.
Note: This option can also be defined on
the server.
Msgs: ANS1030E
See also: LARGECOMmbuffers
Message explanation You can do 'help MsgNumber' to get info
about a message. For example: with
message ANR8776W, you can simply do
'help 8776'.
Message filesets (TSM AIX server) tivoli.tsm.msg.en_US.devices
tivoli.tsm.msg.en_US.server
tivoli.tsm.msg.en_US.webhelp
Message interval "MSGINTerval" definition in the server
options file.
MessageFormat Definition in the server options file.
Specifies the message headers in all
lines of a multi-line message. Possible
option numbers:
1 - Only the first line of a multi-line
message contains the header.
2 - All lines of a multi-line message
contain headers.
Default: 1
Ref: Installing the Server...
MessageFormat server option, query 'Query OPTion'
Messages, suppress Use the Client System Options file
(dsm.sys) option "Quiet".
See also: VERBOSE
MGMTCLASSES SQL Table for Management Classes.
Columns: DOMAIN_NAME, SET_NAME,
CLASS_NAME, DEFAULT, DESCRIPTION,
SPACEMGTECHNIQUE, AUTOMIGNONUSE,
MIGREQUIRESBKUP, MIGDESTINATION,
CHG_TIME, CHG_ADMIN, PROFILE
MGSYSLAN Managed System for LAN license.
MIC Memory-in-Cassette: Sony's non-volatile
memory chip in their AIT cartridge.
See: AIT; MAM
Microcode, acquire Call 1-800-IBM-SERV and request the
latest microcode for your device.
Microcode, install Can use tapeutil or ntutil (Tape Drive
Service Aids): select "Microcode
Load"...
- position to equivalent /dev/rmtx and
hit Enter;
- at "Enter Filename" enter the
filename of your new firmware;
- press F7
- download of firmware to the drive
begins; successful download will be
displayed (message "Operation
completed successfully!")
- press F10 and enter q to exit
tapeutil/ntutil.
Microcode in tape drive Run /usr/lpp/adsmserv/bin/mttest...
select 1: manual test
select 1: set device special file
e.g.: /dev/rmt0
select 20: open
select 46: device information or select
37: inquiry
MICROSECONDS See: DAYS
Microsoft Cluster Server Environment See IBM article swg21109932
scheduled backups, verify
Microsoft Exchange See: Exchange; TDP for Exchange
MIGContinue ADSMv3+ Stgpool keyword to specify
whether *SM is allowed to migrate files
that have not exceeded the MIGDelay
value. Default: Yes.
Because of the MIGDelay parameter, it is
now possible for *SM to complete a
migration process and not meet the low
migration threshold. This can occur if
the MIGDelay parameter value prevents
*SM from migrating enough files to
satisfy the low migration threshold. The
MIGContinue parameter allows system
administrators to specify whether ADSM
is allowed to migrate additional files.
Exploitation note: This setting allows a
very nice archival scheme to be
implemented. Say you run a time sharing
system, and when users leave you archive
their home directories as a tar file in
a storage pool. But you only want to
keep the most recent year's worth of
data there, and want anything older to
be written to separate tapes that can be
ejected from the tape library when they
fill. You can set MIGDelay=365 and
MIGContinue=No. This will keep recent
files in the "current" storage pool and,
when you drop the HIghmig value to cause
migration to the "oldies" storage pool
below it, files more than a year old
will go there. Neat.
See also: MIGDelay; Migration
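The interaction of MIGDelay, MIGContinue and the low migration threshold can be pictured with a toy model. This is only an illustrative sketch, not the server's algorithm: the function name, the tuple layout, and the oldest-first ordering are assumptions made for the example.

```python
from typing import List, Tuple

# Each file is (name, size_mb, days_in_pool). Layout is hypothetical.
PoolFile = Tuple[str, int, int]


def select_for_migration(files: List[PoolFile], capacity_mb: int,
                         lowmig_pct: int, migdelay: int,
                         migcontinue: bool) -> List[str]:
    """Toy model: choose files to migrate until occupancy reaches LOwmig."""
    used = sum(size for _, size, _ in files)
    target = capacity_mb * lowmig_pct / 100
    chosen: List[str] = []
    # Oldest-first ordering is an assumption made for this sketch.
    by_age = sorted(files, key=lambda f: f[2], reverse=True)
    # Pass 1: only files that have satisfied MIGDelay are taken.
    for name, size, days in by_age:
        if used <= target:
            return chosen
        if days >= migdelay:
            chosen.append(name)
            used -= size
    # Pass 2: with MIGContinue=Yes, younger files may be taken as well.
    if migcontinue:
        for name, size, days in by_age:
            if used <= target:
                break
            if name not in chosen:
                chosen.append(name)
                used -= size
    return chosen
```

With MIGDelay=365 and MIGContinue=No, only the file older than a year is selected in this model, even though the pool stays above its low threshold; that is the behavior the archival scheme above relies on.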
MIGDelay ADSMv3+ Stgpool keyword to specify the
minimum number of days that a file must
remain in a storage pool before the file
becomes eligible for migration from the
storage pool. The number of days is
counted from the day that the file was
stored in the storage pool or retrieved
by a client, whichever is more recent.
(The NORETRIEVEDATE server option
prevents retrieval date recording.)
This parameter is optional.
Allowable values: 0 to 9999 (27.39 yrs)
Default: 0, which means migration is
not delayed, which causes migration to
be determined purely in terms of
occupancy level.
See also: MIGContinue; NORETRIEVEDATE
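The day-counting rule above (later of store date or retrieve date, unless NORETRIEVEDATE is in effect) can be sketched as follows. The function and parameter names are hypothetical, and the sketch only restates the rule as described in this entry.

```python
from datetime import date, timedelta
from typing import Optional


def migration_eligible_on(stored: date, retrieved: Optional[date],
                          migdelay_days: int,
                          noretrievedate: bool = False) -> date:
    """Return the first day a file satisfies MIGDelay (illustrative only)."""
    start = stored
    # A retrieval restarts the clock, unless NORETRIEVEDATE has
    # suppressed the recording of retrieval dates.
    if retrieved is not None and not noretrievedate:
        start = max(stored, retrieved)
    return start + timedelta(days=migdelay_days)
```

For example, a file stored on 2004-01-01 and retrieved on 2004-03-01 with MIGDelay=30 becomes eligible on 2004-03-31 in this sketch, or on 2004-01-31 if retrieval dates are not recorded.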
MIGFILEEXPiration Client System Options file (dsm.sys)
HSM option to specify the number of days
that copies of migrated/premigrated
files are kept on the server after they
are modified on or deleted from the
client file system. That is, the
no-longer-viable migrated copy of the
file in the HSM server is removed while
the original remains intact on the
client and a new, migrated copy of a
modified file may now be present on the
ADSM server. Note that the expiration
clock starts ticking after
reconciliation is run on the file
system; and that HSM takes care of its
own expiration, rather than it being
done in EXPIre Inventory.
Default: 7 (days)
MIGPRocess Operand of 'DEFine STGpool' and
'UPDate STGpool' to specify the number
of processes to be used for migrating
files from the (disk) storage pool to a
lower storage pool in the hierarchy of
storage pools. (You cannot specify this
operand on sequential (tape) storage
pools, in that tape is traditionally a
final destination.) Default: 1 process.
Note that it pertains to migrating from
a disk storage pool down to tape: you
cannot specify migration *from* tape.
Migration occurs with one process per
node, moving *all* of the data for one
node before going on to the data for
another node. The order of nodes
processed is per largest amount of data
in the disk storage pool. See APAR
IX77884. This means that if only one
node session is active, you will get
just one migration process, regardless
of the MIGPRocess value.
%Migr (ADSMv2 server) See: Pct Migr
Migrate files (HSM) 'dsmmigrate Filename(s)'
Migrate Install Usually refers to an upgrade of the TSM
server, in place, installing new TSM
server software on a system which had
been running an earlier TSM.
Ref: Quick Start manual
See also: dsmserv UPGRADEDB
migrate-on-close recall mode A mode that causes HSM to recall a
migrated file back to its originating
file system only temporarily. If the
file is not modified, HSM returns the
file to a migrated state when it is
closed. However, if the file is
modified, it becomes a resident file.
You can set the recall mode for a
migrated file to migrate-on-close by
using the dsmattr command, or set the
recall mode for a specific execution of
a command or series of commands to
migrate-on-close by using the dsmmode
command. Contrast with normal recall
mode and read-without-recall recall
mode.
Migrated file A file that has been copied from a local
file system to ADSM storage and replaced
with a stub file on the local file
system. Contrast with resident file and
premigrated file.
See also: Leader data; Stub file
Migrated file, accessibility 'dsmmode -dataACCess=n' (normal) makes
migrated files appear resident, and
allows them to be retrieved.
'dsmmode -dataACCess=z' makes migrated
files appear to be zero-length, and
prevents them from being retrieved.
Migrated file, display its recall 'dsmattr Filename'
mode
Migrated file, set its recall mode 'dsmattr -recallmode=n|m|r Filename'
(HSM) where recall mode is one of:
- n, for Normal
- m, for migrate-on-close
- r, for read-without-recall
Migrated files, HSM, list from client 'dsmls'
'dsmmigquery -SORTEDMigrated'
(this takes some time)
Migrated files, HSM, list from server 'Query CONtent VolName ...
Type=SPacemanaged'
Migrated files, HSM, count In dsmreconcile log.
MIgrateserver HSM: Client System Options file
(dsm.sys) option to specify the name of
the ADSM server to be used for HSM
services (file migration - space
management). Code at the head of the
dsm.sys file, not in the server stanzas.
Cannot be overridden in dsm.opt or via
command line. Using -SErvername on the
command line does not cause
MIgrateserver to use that server.
Default: server named on DEFAULTServer
option.
Migration A concept which occurs in several places
in ADSM:
Storage pools: Refers to migrating files
from one level to a lower level in a
storage pool hierarchy when the Pct Migr
value (Query STGpool report) reaches the
specified threshold percentage
(HIghmig), mitigated by other control
values such as MIGDelay and
NORETRIEVEDATE.
Occurs with one process per node
(regardless of the MIGPRocess value),
moving *all* of the data for one node
before going on to the data for another
node - or before again checking the
LOwmig value. The order of nodes
processed is per largest amount of data
in the disk storage pool.
Priority: Will wait for a Move Data
process to complete, and then take a
tape drive before any additional waiting
Move Data processes start.
By using the ADSMv3 Virtual Volumes
capability, the output may be stored on
another ADSM server (electronic
vaulting).
HSM: The process of copying a file from
a local file system to ADSM storage and
replacing the file with a stub file on
the local file system.
See also: threshold migration; demand
migration; selective migration
See: DEFine STGpool; HIghmig; LOwmig;
MIGDelay; NORETRIEVEDATE
Migration, Auto, manually perform for HSM: 'dsmautomig [FSname]'
file system
Migration, prevent at start-up To prevent migration from occurring
during a problematic TSM server restart,
add the following (undocumented) option
to the server options file:
NOMIGRRECL
Migration, prevent over time To prevent migration from occurring
during normal TSM operation, do
'UPDate STGpool <PoolName>
MIGContinue=No MIGDelay=9999'
This says that the server is not to
migrate files unless the files satisfy
the migration delay time, and that delay
time is maximized (27.39 years), which
in combination prevents migration.
Migration, storage pool files General ADSM concept of migrating a
storage pool's files down to the next
storage pool in a hierarchy when a given
pool exceeds its high threshold value.
Migration, storage pool files, query 'Query STGpool [STGpoolName]'
Migration, storage pool files, set The high migration threshold is
specified via the "HIghmig=N" operand of
'DEFine STGpool' and 'UPDate STGpool'.
The low migration threshold is specified
via the "LOwmig=N" operand.
Note that LOwmig is effectively
overridden to 0 when CAChe=Yes is in
effect for the storage pool, because
ADSM wants to cache everything once
migration is triggered.
Migration and reclamation As a TSM server pool receives data, the
server checks to see if migration is
needed. This migration causes cascading
checks as the next stgpool in the
hierarchy receives data. When the bottom
of the storage pool hierarchy is
reached, the migration checking thread
will initiate reclamation checking
against this lowest level stgpool if it
is a sequential stgpool. If there are
multiple sequential storage pools within
the storage pool hierarchy, reclamation
processing will start on the lowest
hierarchy position and proceed to the
next level storage pool in the
hierarchy.
Migration candidate considerations Too small? A file will not be a
(HSM) candidate for migration if its size is
smaller than the stub file size (as
revealed in 'dsmmigfs query').
Management class proper? As installed,
HSM will not migrate files unless they
have been backed up.
'dsmmigquery FSname'
Migration candidates, list (HSM) 'dsmmigquery FSname'
Migration candidates list (HSM) A prioritized list of files that are
eligible for automatic migration at the
time the list is built. Files are
prioritized for migration based on the
number of days since they were last
accessed (atime), their size, and the
age and size factors specified for a
file system. Note that time of last
access is a measure of demand for the
file, so is used as a basis rather than
modification time.
Can be rebuilt by the client root user
command:
'dsmreconcile [-Candidatelist]
[-Fileinfo]'
See: candidates
Migration in progress? 'Query STGpool ____ Format=Detailed'
"Migration in Progress?" value.
Migration not happening That is, migration from a higher level
storage pool to a lower one in a
storage pool hierarchy is not
happening.
- The presence of server option
NOMIGRRECL will prevent it.
Migration not happening (HSM problem) See: HSM migration not happening
Migration performance The migration of data from one storage
pool to a lower one - particularly to
tape - is limited by:
- Your collocation specification, which
can cause many tapes to be mounted as
files are "delivered" to their
appropriate places in the next storage
pool.
- The *SM database is in the middle of
the action, so its cache hit ratio
performance is important with many
small files.
- Long mount retention periods can
prolong processing in having to wait
for an idle tape to be dismounted
before the next one can be mounted.
- The MOVEBatchsize and MOVESizethresh
server option values will govern how
much data moves in each server
transaction.
- The performance of your tape
technology is also a factor.
- In moving from disk to tape, realize
that the conflicting characteristics
of the two media can hamper
performance... Disk is a bit-serial
medium which has to perform seeks to
get to data. Tape is a byte-parallel
medium which is always ready to write
when in streaming mode, where its
transfer rate is typically much faster
than disk. If the tape has to wait for
the disk to provide data, the tape drive
is forced into start/stop mode, which
particularly worsens throughput in
some tape technologies.
- With caching in effect, there will be
more disk seek time to step over older
cached files in migrating new files,
while the receiving tape drive waits.
See: MOVEBatchsize, MOVESizethresh
Migration Priority A number assigned to a file in the
Migration Candidates list (candidates
file), computed by:
- multiplying the number of days since
the file was last accessed by the age
factor;
- multiplying the size of the file in
1-KB blocks times the size factor;
- adding those two products to produce the
priority score (Migration Priority).
This ends up in the first field of the
candidates file line.
See: candidates
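The computation just described amounts to a single weighted sum. A minimal sketch follows; the function and argument names are hypothetical, and only the arithmetic comes from the description above.

```python
def migration_priority(days_since_access: int, size_kb: int,
                       age_factor: int, size_factor: int) -> int:
    """Priority score = (days * age factor) + (size in 1-KB blocks * size factor)."""
    return days_since_access * age_factor + size_kb * size_factor
```

A 2048 KB file last accessed 30 days ago scores 2078 with both factors at 1. With an age factor of 2 and a size factor of 0 it scores 60, so size no longer influences the ordering.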
Migration processes, number of Code on "MIGPRocess=N" keyword of
'DEFine STGpool' and 'UPDate STGpool'.
Default: 1.
See: MIGPRocess
Migration storage pool (HSM) Specified via
'DEFine MGmtclass MIGDESTination=StgPl'
or
'UPDate MGmtclass MIGDESTination=StgPl'.
Default destination: SPACEMGPOOL.
Migration vs. Backup, priorities Backups have priority over migration.
MIGREQUIRESBkup (HSM) Mgmtclass parameter specifying that a
backup version of a file must exist
before the file can be migrated.
Default: Yes
Query: 'Query MGmtclass' and look for
"Backup Required Before Migration".
See also: Backup Required Before
Migration; RESToremigstate
MIM (3590) Media Information Message. Sent to
the host system. AIX: appears in Error
Log.
Severity 1 indicates high temporary
read/write errors were detected
(moderate severity).
Severity 2 indicates permanent
read/write errors were detected (serious
severity).
Severity 3 indicates tape directory
errors were detected (acute severity).
Ref: "3590 Operator Guide" manual
(GA32-0330-06) esp. Appendix B
"Statistical Analysis and Reporting
System User Guide"
See also: SARS; SIM
MIN SQL statement to yield the smallest
number from all the rows of a given
numeric column.
See also: AVG; COUNT; MAX; SUM
MINRecalldaemons Client System Options file (dsm.sys)
option to specify the minimum number of
dsmrecalld daemons which may run at one
time to service HSM recall requests.
Default: 3
See also: MAXRecalldaemons
MINUTE(timestamp) SQL function to return the minutes value
from a timestamp.
See also: HOUR(), SECOND()
MINUTES See: DAYS
Mirror database Define a volume copy via:
'DEFine DBCopy Db_VolName Copy_VolName'
MIRRORRead DB server option, query 'Query OPTion'
MIRRORRead LOG|DB Normal|Verify Definition in the server options file.
Specifies the mode used for reading
recovery log pages or database pages.
Possibilities:
Normal: read one mirrored volume to
obtain the desired page;
Verify: read all mirror volumes for a
page every time a recovery log
or database page is read and,
if an invalid page is
encountered, resync it from
the valid page on another
volume (decreases performance
but assures readability).
This should be in effect when a
(standalone) dsmserv auditdb is
run.
Default: Normal
Ref: Installing the Server...
MIRRORRead LOG server option, query 'Query OPTion'
MIRRORWrite DB server option, query 'Query OPTion'
MIRRORWrite LOG|DB Sequential|Parallel Definition in the server options file.
Specifies how mirrored volumes are
accessed when the server writes pages to
the recovery log or database during
normal processing. "Sequential" is
"conditional mirroring" such that data
won't be written to a mirror copy until
successfully written to the primary.
Default: Sequential for DB;
Parallel for LOG
Comments: *SM Sequential mirroring *is*
better than RAID because of the danger
of partial page writes - which *do*
occur in the real world as hardware and
human defects evidence themselves. RAID
will perform the partial writing in
parallel, thus resulting in a corrupted
database if the writing is interrupted,
whereas *SM Sequential mirroring will
leave you with a recoverable database -
by simple resync, not "recovery". That
is, RAID is just as problematic as *SM
Parallel mirroring.
Mirroring of the *SM database is much
debated. You could let the hardware or
operating system perform mirroring
instead, but you lose the advantages of
the *SM application mirroring - which
also include being able to put the
mirrors on any arbitrary volume, not in
a single Volume Group as AIX insists.
Ref: Installing the Server...
MIRRORWrite LOG server option, query 'Query OPTion'
Missed Status in Query EVent output indicating
that the scheduled startup window for
the event has passed and the schedule
did not begin. When you have SCHEDMODe
PRompted and have a client schedule set
up for the node, then it is missed if
the server couldn't contact the client
within the time window.
The dsmsched.log will typically show
"Scheduler has been stopped."
One mundane cause of Missed is that the
client scheduler process already has a
(long-running) session underway, as in
the case of a backup which runs much
longer than expected because of a lot of
new data in the file system, which runs
well past the start time for the next
session.
See also: Failed; Schedule, missed
Mobile Backups See: Adaptive differencing; SUBFILE*
MODE A TSM server Copy Group attribute that
specifies whether a backup should be
performed for an object that was not
modified since the last time it was
backed up.
Choices:
MODified The default, almost always
used. Causes a file to be backed up
only if it has changed since the last
backup. In general, TSM considers a
file changed if any of the following is
true:
- The date last modified is different
- The file size is different
- The file owner is different
- The file permissions are different
Criteria may vary by platform,
particularly in Windows.
ABSolute Specifies that file system
objects are to be backed up regardless
of whether they have been modified.
Putting this choice into effect for one
backup is a technique for performing a
full backup of a file system.
See also: ABSolute; MODified;
Backup, which files are backed up
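The MODified criteria listed above can be sketched as an attribute comparison. This is an assumption-laden illustration and not the client's actual logic: the FileAttrs record, the snapshot() helper and the needs_backup() function are all hypothetical names, and platform-specific criteria (notably Windows) are ignored.

```python
import os
import stat
from typing import NamedTuple


class FileAttrs(NamedTuple):
    """The four attributes named in the MODified criteria (hypothetical record)."""
    mtime: int
    size: int
    owner: int
    perms: int


def snapshot(path: str) -> FileAttrs:
    """Collect the compared attributes for one file from the local file system."""
    st = os.stat(path)
    return FileAttrs(int(st.st_mtime), st.st_size, st.st_uid,
                     stat.S_IMODE(st.st_mode))


def needs_backup(current: FileAttrs, last: FileAttrs,
                 mode: str = "MODified") -> bool:
    """ABSolute always selects the object; MODified only if an attribute differs."""
    if mode.upper().startswith("ABS"):
        return True
    return current != last
```

The sketch only answers whether an object is considered changed; it does not model what is then sent to the server.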
MODE (-MODE) Client option used in conjunction with
Backup Image to specify the type of file
system style backup that should be used
to supplement the last image backup.
Choices:
Selective The default. Causes the
usual image backup to be performed, to
distinguish from the Incremental
choice.
(The name of this choice is unfortunate
in that it invites confusion with the
standard TSM Selective backup, which
this choice has nothing to do with. The
name of this choice should have been
"Image".)
Incremental Only back up files whose
modification timestamp is later than
that of the last image backup. This is
accomplished via an -INCRBYDate backup,
whose nature means that deleted files
cannot be detected, and so do not head
toward expiration on the server; nor
can files whose attributes have changed
be detected for backup. If there was no
prior image backup, this Incremental
choice will be ignored as an erroneous
specification, and a full image backup
will be performed, as if Selective had
instead been the choice.
See also: dsmc Backup Image
MODified A backup Copy Group attribute that
indicates that an object is considered
for backup only if it has been changed
since the last backup. An object is
considered changed if the date, size,
owner, or permissions have changed.
(Note that the file will be physically
backed up again only if TSM deems the
content of the file to have been
changed: if only the attributes (e.g.,
Unix permissions) have been changed,
then TSM will simply update the
attributes of the object on the server.)
See also: MODE
Contrast with: ABSolute
See also: Backup, which files are backed
up; SERialization (another Copy Group
parameter)
Monitoring products See: TSM monitoring products
MONTHS See: DAYS
Mount in progress Server command: 'SHow ASM'
Mount limit See: MOUNTLimit
Mount message See: TAPEPrompt
Mount point, keep over whole session? The 'REGister Node' operand KEEPMP
controls this.
Mount point queue Server command: 'SHow ASQ'
Mount point wait queue IBM internal term for how ADSM
prioritizes server tasks needing tapes.
MOVe Datas have a higher priority than
some other tasks.
Mount points Defined globally in DEVclass MOUNTLimit
Restricted thereunder via REGister Node
parameters KEEPMP and MAXNUMMP,
governing the number of mount points
available for other sessions.
See: KEEPMP; MAXNUMMP; MOUNTLimit
Mount points, maximum See: MOUNTLimit
Mount points, report active 'SHow MP'
Mount request timeout message ANR8426E on a CHECKIn LIBVolume.
Mount requests, pending 'Query REQuest' (q.v.).
Via Unix command:
'mtlib -l /dev/lmcp0 -qS'
Mount requests, service console See: -MOUNTmode
Mount Retention Output field in report from
'Query DEVclass Format=Detailed'.
Value is defined via MOUNTRetention
operand of 'DEFine DEVclass' command.
See also: KEEPMP; MAXNUMMP; MOUNTLimit;
MOUNTRetention
Mount retention period, change See: MOUNTRetention
Mount tape Via Unix command:
'mtlib -l /dev/lmcp0 -m -f /dev/rmt?
-V VolName' # Absolute drive name
'mtlib -l /dev/lmcp0 -m -x Rel_Drive#
-V VolName' # Relative drive#
(but note that the relative drive
method is unreliable).
Note that there is no ADSM command to
explicitly mount a tape: mounts are
implicit by need.
Once mounted, it takes 20 seconds for
the tape to settle and become ready for
processing.
See also: Dismount tape
Mount tape, time required For a 3590 tape drive:
If a drive is free, it takes a nominal
32 seconds for the 3494 robot to move to
the storage cell containing the tape,
carry the tape to the drive, load the
tape, and have it wind within the drive.
Wind-on time itself is about 20 seconds.
Note that if you have two tape drives
and your mount request is behind another
which is just starting to be processed,
you should expect your mount to take
twice as long, or about 64 seconds.
To rewind, dismount, mount a new tape in
that drive, and position it can take 120
seconds.
If a mount is taking an unusually long
time, it could mean that the library has
a cleaning tape mounted, cleaning the
drive. Or the tape could be defective,
giving the drive a hard time as it tries
to mount the tape.
MOuntable DRM media state for volumes containing
valid data and available for onsite
processing.
See also: COUrier; COURIERRetrieve;
NOTMOuntable; VAult; VAULTRetrieve
MOUNTABLEInlib State for a volume that had been
processed by the MOVe MEDia command: the
volume contains valid data, is
mountable, and is in the library.
See also: MOVe DRMedia
MOUNTABLENotinlib State for a volume that had been
processed by the MOVe MEDia command: the
volume may contain valid data, is
mountable, but is not in the library (is
in its external, overflow location).
See msg ANR1425W.
See also: MOVe DRMedia
Mounted, is a tape mounted in a drive? The 3494 Database "Device" column will
show a drive number if the tape is
mounted, and a Cell number of "_ K 6",
where '_' is the wall number. If the
Cell number says "Gripper", the tape is
in the process of being mounted.
Mounted volumes Server command: 'SHow ASM'
MOUNTLimit (mount limit) Operand in 'DEFine DEVclass', to specify
the maximum number of concurrent mounts
within that device class (which is the
same as the maximum for the library
definition associated with that device
class). This is the maximum number of
tape drives which can be used at one
time, among all the tape drives you
have. Usually, you would have your
MOUNTLimit value be equal to the number
of drives you have, so that all of them
may be used at the same time, to fully
service all your clients.
Affects BAckup STGpool, etc.
It should be set no higher than the
number of physical drives you have
available. In ADSMv3+, you can specify
"MOUNTLimit=DRIVES", and ADSM will then
dynamically adjust the MOUNTLimit.
However, IBM recommends (as in the 5.2.2
AIX Admin Guide) that you explicitly
specify the mount limit instead of using
MOUNTLimit=DRIVES.
Default: 1.
Note that MOUNTLimit is an absolute
limit, which sets an upper bound for
related configuration parameters
RESOURceutilization and MAXNUMMP.
-MOUNTmode Command-line option for *SM
administrative client commands
('dsmadmc', etc.) to have all mount
messages displayed at that terminal.
No administrative commands are accepted.
See also: -CONsolemode; dsmadmc
Ref: Administrator's Reference
MOUNTRetention Devclass operand, to specify how long,
in minutes (0-9999), to retain an idle
sequential access volume before
dismounting it. Default: 60 (minutes).
The value should be long enough to allow
for re-use of the same mounted tape within
a reasonable time, but not so long that
the tape could end up trapped in the
drive upon an operating system shutdown
which does not give *SM the opportunity
to dismount it. (Always shut *SM down
cleanly if possible.) Another reason to
keep mount retention fairly short is
that having a tape left in a drive only
delays a mount for a new request, in
that the stale tape must be dismounted
first: this is a big consideration in
restorals, particularly of a large
quantity of data as for a whole file
system, in which case it would be worth
minimizing the MOUNTRetention when such
a job runs. Also, the drive mechanism
stays on while tape is mounted, so adds
wear.
Keep mount retention short when
collocation is employed, to prevent
waiting for dismounts, given the
elevated number of mounts involved.
But keep the retention value sufficient
to cover client think time during file
system backups.
Msgs: ANR8325I for dismount when
MOUNTRetention expires.
See also: KEEPMP; MAXNUMMP; MOUNTLimit
MOUNTRetention, query 'Query DEVclass Format=Detailed' and
look for "Mount Retention" value.
Mounts, current 'SHow MP'.
Or via Unix command:
'mtlib -l /dev/lmcp0 -qS'
for the number of mounted drives;
'mtlib -l /dev/lmcp0 -vqM'
for details on mounted drives.
Mounts, maximum See: MOUNTLimit
Mounts, monitor Start an "administrative client session"
to control and monitor the server from a
remote workstation, via the command:
'dsmadmc -MOUNTmode'. If having a human
operator perform mounts, consider
setting up a "mounts operator" admin ID
and a shell script which would invoke
something to the effect of:
'dsmadmc -ID=mountop -MOUNTmode
-OUTfile=/var/log/ADSM-mounts.YYYYMMDD'
and thus log all mounts.
Ref: Administrator's Reference
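The date-stamped invocation suggested above could be assembled along these lines. This is a sketch only: the helper name and its defaults are hypothetical, the "mountop" ID and log path are simply the examples from this entry, and since the entry does not say how the admin password is supplied, the sketch leaves that out.

```python
from datetime import date
from typing import List, Optional


def mount_monitor_command(admin_id: str = "mountop",
                          log_dir: str = "/var/log",
                          today: Optional[date] = None) -> List[str]:
    """Build the dsmadmc argument list with a YYYYMMDD-stamped output file."""
    day = today or date.today()
    outfile = f"{log_dir}/ADSM-mounts.{day:%Y%m%d}"
    return ["dsmadmc", f"-ID={admin_id}", "-MOUNTmode", f"-OUTfile={outfile}"]
```

The returned list is suitable for passing to a process launcher; building it as a list rather than a single string avoids shell quoting problems if the log directory contains spaces.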
Mounts, pending Via ADSM: 'Query REQuest' (q.v.).
Via Unix command:
'mtlib -l /dev/lmcp0 -qS'
Mounts, historical SELECT * FROM SUMMARY WHERE
ACTIVITY='TAPE MOUNT'
Mounts count, by drive See: 3590 tape mounts, by drive
MOUNTWait DEVclass and CHECKIn LIBVolume command
operand specifying the number of minutes
to wait for a tape to mount, on an
allocated drive.
Note that this pertains only to the time
taken for a tape to be mounted by tape
robot or operator once a tape mount
request has been issued, and has been
honored by the library. Example: a task
requires a tape volume which is not in
the library. It does not pertain to a
wait for a tape *drive* when for example
one incremental backup is taking up all
tape drives and another incremental
backup comes along needing a tape drive.
Default: 60 min.
Advice: The MOUNTWait value should be
larger than the MOUNTRetention to assure
that idle volumes have a chance to
dismount and free drives before the
MOUNTWait time expires.
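The advice above reduces to a simple comparison that can be applied when reviewing device class settings. A minimal sketch, with a hypothetical function name; both arguments are in minutes, as for the operands themselves.

```python
def mountwait_allows_dismount(mountwait_min: int,
                              mountretention_min: int) -> bool:
    """True when MOUNTWait is larger than MOUNTRetention, per the advice above."""
    return mountwait_min > mountretention_min
```

Note that with both values left at the 60-minute defaults quoted in these entries, the check does not pass, so one of the two would need adjusting to follow the advice.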
MOVe Data Server command to move a volume's viable
data to volume(s) within the same
sequential access volume storage pool
(default) or a specified sequential
access volume storage pool. (MOVe Data
cannot be used on DISK devtype (Random
Access) storage pools.) The source
storage pool may be a disk pool, with
the target being the defined
NEXTstgpool, whereby MOVe Data
essentially will accomplish what
migration does, but physically rather
than logically.
Copy storage pool volume contents can
only be moved to other volumes in the
same copy storage pool: you cannot move
copy storage pool data across copy
storage pools.
MOVe Data can effectively reclaim a tape
by compacting the data onto another
volume. Syntax:
'MOVe Data VolName [STGpool=PoolName]
[RECONStruct=No|Yes] [Wait=No|Yes]'
RECONStruct is new with TSM 5.1, and
allows the vacated space within
aggregates to be reclaimed, thus
allowing Move Data to be the equivalent
of Reclamation. The reconstruction does
incur more time. And, again, this can be
done only on sequential access storage
pools.
The "from" volume gets mounted R/O.
By default, data is moved by copying
Aggregates as-is: unlike Reclamation,
MOVe Data does not reclaim space where
logical files expired and were logically
deleted from *within* an Aggregate. (Per
1998 APAR IX82232: RECONSTRUCTION DOES
NOT OCCUR DURING MOVE DATA: "MOVe Data
by design does not perform
reconstruction of aggregates with empty
space. Although this was discussed
during design, it was decided to only
perform reconstruction during
reclamation. A major reason for this
decision was performance as
reconstruction of aggregates requires
additional overhead that MOVe Data does
not; hence requires additional time to
complete.")
Like Reclamation, MOVe Data brings
together all the pieces of each
filespace, which means it has to skip
down the tape to get to each piece. (The
portion of a filespace that is on a
volume is called a Cluster.)
In addition, if the target storage pool
is collocated, each cluster may ask for
a new output tape, and TSM isn't smart
enough to find all the clusters that are
bound for a particular output tape and
reclaim them together. Instead it is
driven by the order of filespaces on the
input tape, so the same output tape may
be mounted many times.
In doing a MOVe Data, *SM attempts to
fill volumes, so it will select the most
full available volume in the storage
pool. Note that the data on the volume
will be inaccessible to users until the
operation completes.
During the move, the 'Query PRocess'
"Moved Bytes" reflects the data in
uncompressed form.
Ends with message ANR1141I (which fails
to report byte count).
May be preempted by higher priority
operation - see message ANR1143W - but
may not preempt the lower priority
reclamation process (msg ANR2420E).
(Move Data has a higher priority on what
IBM internally refers to as the Mount
point wait queue.)
See also: AUDit Volume; NOPREEMPT;
Pct Util; Reclamation
Move Data, find required volumes Move Data would obviously involve the
subject volume itself, and any volumes
containing files that spanned into (the
front of) or out of (the back of) the
volume. This would be identifiable by
the Segment number in Query CONtent
_volname_, or the corresponding Select,
being other than 1/1. For spanning
files, you would then have to perform a
Content table search on the related
segment. (A tape in Filling status would
obviously have no span-out-of segment on
another volume.)
Move Data, offsite volumes When (copy storage pool) volumes are
marked "ACCess=OFfsite", TSM knows not
to use those volumes, to instead use
onsite copy storage pool volumes
containing the same data (from the same
primary storage pool). Naturally, the
files on one offsite volume may be found
on any number of onsite volumes, so
multiple mounts may be expected,
accompanied by a bunch of TSM "think
time" between volumes.
See also: ANR1173E
MOVe Data and caching disk volumes Doing a Move Data on a cached disk pool
volume has the effect of clearing the
cache. This is obvious, when you think
about it, as the cache represents data
that is already in the lower storage
pool in the hierarchy...that data has
been "pre-moved".
MOVe Data performance Move Data operations can be expected to
involve considerable repositioning as
the source tape is processed, to skip
over fully expired Aggregates. Whether
your tape technology is good at
start-stop operations will affect your
throughput.
See also: BUFPoolsize; MOVEBatchsize;
MOVESizethresh
MOVe DRMedia DRM server command to move disaster
recovery media offsite and back onsite.
Will eject the volumes out of the
library before transitioning the volumes
to the destination state. Syntax:
'MOVe DRMedia VolName
[WHERESTate=MOuntable|
NOTMOuntable|COUrier|
VAULTRetrieve|COURIERRetrieve]
[BEGINDate=date] [ENDDate=date]
[BEGINTime=time] [ENDTime=time]
[COPYstgpool=StgpoolName]
[DBBackup=Yes|No]
[REMove=Yes|No|Bulk]
[TOSTate=NOTMOuntable|
COUrier|VAult|COURIERRetrieve|
ONSITERetrieve]
[WHERELOcation=location]
[TOLOcation=location]
[CMd=________]
[CMDFilename=file_name]
[APPend=No|Yes]
[Wait=No|Yes]'
Do not do a MOVe DRMedia where a
MOVe MEDia is called for.
REMove=BUlk is not supposed to result in
a Reply required on SCSI libraries, but
may: the workaround is Wait=Yes.
MOVe MEDia ADSMv3 command to deal with a full
library by moving storage pool volumes
to an external "overflow" location,
typically named on the OVFLOcation
operand of Primary and Copy Storage
Pools. (Think "poor man's DRM".) Unlike
with Checkout, the volume remains
requestable and ultimately mountable,
via an outstanding mount request. (Note
that, internally, MOVe MEDia actually
performs a Checkout Libvolume, as
indicated in its ANR6696I message.)
Syntax:
'MOVe MEDia VolName STGpool=PoolName
[Days=NdaysSinceLastUsage]
[WHERESTate=MOUNTABLEInlib|
MOUNTABLENotinlib]
[WHERESTATUs=FULl,FILling,EMPty]
[ACCess=READWrite|READOnly]
[OVFLOcation=________]
[REMove=Yes|No|Bulk]
[CMd="command"]
[CMDFilename=file_name]
[APPend=No|Yes]
[CHECKLabel=Yes|No]'
By default, moving a volume out of the
library causes it to be made ReadOnly,
and moving it back into the library
causes it to be made ReadWrite.
If you are moving a volume back into a
library (MOUNTABLENotinlib) and it is
not empty, you must specify
WHERESTATUs=FULl for the command to
work, else get ANR6691E error.
OVFLOcation can be used to override that
specification had by the storage pool.
Do not do a MOVe MEDia where a
MOVe DRMedia is called for.
This command moves whole volumes, not
the data within them.
Note that a MOVe MEDia will hang if a
LABEl LIBVolume is running.
After doing MOVe MEDia to move the
volume back into the library:
- The volume will be READWrite, rather
than the READOnly that is conventional
for a moved-out volume;
- Query MEDia no longer shows the volume
(Query Volume does), until CHECKIn is
done;
- You must do a CHECKIn LIBVolume to get
the volume back into play.
What happens when there are more than 10
tapes to go to the 3494 Convenience I/O
Station? TSM moves one at a time, then
an Intervention Required shows up ("The
convenience I/O station is full"): when
you empty the I/O station, the Int Req
goes away, and TSM resumes ejecting
tapes. No indication of the condition
shows up in the Activity Log.
Watch out for ANR8824E message condition
where the request to the library is
lost: the volume will probably have
actually been ejected from the library,
but the MOVe MEDia updating of its
status to MOUNTABLENotinlib would not
have occurred, leaving it in an
in-between state.
Msgs: ANR8762I; ANR2017I; ANR0984I;
ANR0609I; ANR0610I; ANR6696I; ANR8766I;
ANR6683I; ANR6682I; ANR0611I;
ANR0987I (completion)
See also: Overflow Storage Pool;
OVFLOcation; Query REQuest
Ref: Admin Guide, "Managing a Full
Library"
MOVe NODEdata TSM 5.1+ server command to move data for
all filespaces for one or more nodes.
As with the 'MOVe Data' command, when
the source storage pool is a primary
pool, you can move data to other volumes
within the same pool or to another
primary pool; but when the source
storage pool is a copy pool, data can
only be moved to other volumes within
that copy pool (so the TOstgpool
parameter is not usable).
This command can operate upon data in a
storage pool whose data format is NATIVE
or NONBLOCK.
As of 2003/11 the Reference Manual fails
to advise what the Tech Guide does: that
the Access mode of the volumes must be
READWRITE or READONLY, which precludes
OFFSITE and any possibility of onsite
volumes standing in for the offsite
vols.
Cautions: As of 2003/05, the command may
report success even when it did not
succeed, as when a nonexistent filespace
is specified.
Ref: TSM 5.1 Technical Guide
MOVEBatchsize Definition in the server options file.
Specifies the maximum number of client
files that can be grouped together in a
batch within the same server transaction
for storage pool backup/restore,
migration, reclamation, or MOVe Data
operations. Specify 1-1000 (files).
Default: 40 (files).
TSM: If the SELFTUNETXNsize server
option is set to Yes, the server sets
the MOVEBatchsize option to its maximum
value to optimize server throughput.
Beware: A high value can cause severe
performance problems in some server
architectures when doing 'BAckup DB'.
MOVEBatchsize, query 'Query OPTion'; look for
"MoveBatchSize".
MOVESizethresh Definition in the server options file.
Specifies a threshold, in megabytes, for
the amount of data moved as a batch
within the same server transaction for
storage pool backup/restore, migration,
reclamation, or MOVe Data operations.
Specify 1-500 (MB)
Default: 500 (megabytes).
TSM: If the SELFTUNETXNsize server
option is set to Yes, the server sets
the MOVESizethresh option to its maximum
value to optimize server throughput.
MOVESizethresh and MOVEBatchsize Server data is moved in transaction
units whose capacity is controlled by
the MOVEBatchsize and MOVESizethresh
server options. MOVEBatchsize specifies
the number of files that are to be moved
within the same server transaction, and
MOVESizethresh specifies, in megabytes,
the amount of data to be moved within
the same server transaction. When either
threshold is reached, a new transaction
is started.
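The interplay of the two thresholds can be sketched in Python as follows. This
is only an illustration of the rule stated above (a new transaction starts when
either limit is reached); the function name, the (name, size) tuple layout, and
the exact boundary handling are assumptions, not a description of TSM internals.

```python
from typing import Iterable, List, Tuple

MB = 1024 * 1024


def split_into_transactions(
    files: Iterable[Tuple[str, int]],
    move_batch_size: int = 40,        # MOVEBatchsize default (files)
    move_size_thresh_mb: int = 500,   # MOVESizethresh default (MB)
) -> List[List[str]]:
    """Group (name, size_in_bytes) pairs into transaction-sized batches.

    Assumption: a batch is closed as soon as either the file count or the
    accumulated size reaches its threshold.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in files:
        current.append(name)
        current_bytes += size
        if (len(current) >= move_batch_size
                or current_bytes >= move_size_thresh_mb * MB):
            batches.append(current)
            current = []
            current_bytes = 0
    if current:
        # Whatever is left over forms the final, partial transaction.
        batches.append(current)
    return batches
```

With the defaults, 100 small files would be split by the file-count limit,
whereas a handful of very large files would be split by the size limit first.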
MOVESizethresh, query 'Query OPTion'; seek "MoveSizeThresh".
MP1 Metal Particle 1 tape oxide formulation
type, as used in the 3590.
Lifetime: According to Imation studies
(http://www.thic.org/pdf/Oct00/
imation.jgoins.001003.pdf)
"All Studies Conclude that Advanced
Metal Particle (MP1) Magnetic Coatings
Will Achieve a Projected Magnetic Life
of 15-30 Years. Media will lose 5% -
10% of its magnetic moment after 15
years. Media resists chemical
degradation even after direct exposure
to extreme environments."
MPTIMEOUT TSM4.1 server option for 3494 sharing.
Specifies the maximum time in seconds
the server will retry before failing the
request. The minimum and maximum values
allowed are 30 seconds and 9999 seconds.
Default: 30 seconds
See also: 3494SHARED; DRIVEACQUIRERETRY
MSCS Microsoft Cluster Server.
MSGINTerval Definition in the server options file.
Specifies the number of minutes that the
ADSM server waits before sending a
subsequent message to a tape operator
requesting a tape mount, as identified
by the MOUNTOP option.
Default: 1 (minute)
Ref: Installing the Server...
MSGINTerval server option, query 'Query OPTion'
MSI (.msi file suffix) Designates the Microsoft Software
Installer.
Note that such files are on the CD-ROM,
not in the online download area (which
has .exe, .TXT, and .FTP files).
If you copy the files from the CD for
alternate processing, be aware that
Microsoft does not support running an
MSI from a mapped network drive when you
are connected to a server via Remote
Desktop to a terminal server.
MSI (Microsoft Installer) return codes See item 21050782 on the IBM web site
("Microsoft Installer (MSI) Return Codes
for Tivoli Storage Manager Client &
Server").
msiexec command Invokes the Microsoft Software Installer
as for example
msiexec /i "Z:\tsm_images\TSM_BA_Client
\IBM Tivoli Storage Manager Client.msi"
to install from the CD-ROM or network
drive containing the installation image.
See: Windows client manual
mt See: /dev/mt
MT0, MT1 Tape drive identifiers on Windows 2000.
Example: MT0.0.0.2 for a 3590E drive in
a 3494 library.
mt_._._._ Designation for a tape drive in a
Windows configuration, using Fibre
Channel, as in mt0.0.0.5, where the
encoding means "magnetic tape device,
Target ID 0, Lun 0, Bus 0", with the
final digit being auto-assigned by
Windows based on the time of first
detection.
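As an illustration of that encoding, a name of this form can be split into its
parts with a few lines of Python. The class and function names are hypothetical,
and the field order (Target ID, Lun, Bus, Windows-assigned index) simply follows
the description above.

```python
import re
from typing import NamedTuple


class WindowsTapeName(NamedTuple):
    target_id: int
    lun: int
    bus: int
    index: int  # final digit, auto-assigned by Windows


_PATTERN = re.compile(r"^mt(\d+)\.(\d+)\.(\d+)\.(\d+)$", re.IGNORECASE)


def parse_tape_name(name: str) -> WindowsTapeName:
    """Split a name such as 'mt0.0.0.5' into its four numeric parts."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not an mtA.B.C.D device name: {name!r}")
    return WindowsTapeName(*(int(part) for part in match.groups()))
```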
mtadata Exchange server: Message Transfer Agent
data, as in \exchsrvr\mtadata
mtevent Command provided with 3494 Tape Library
Device Driver, being an interface to the
MTIOCLEW function, to wait for library
events and display them.
Usage: mtevent -[ltv?]
-l[filename] Library special filename,
i.e. "/dev/lmcp0".
-t[timeout] Wait for asynchronous
library event, for the
specified # of seconds.
If omitted, the program
will wait indefinitely.
-? this help text.
NOTE: The -l argument is required.
mtlib Command provided with 3494 Tape Library
Device Driver to manually interact with
the Library Manager. For environments:
AIX, SGI, Sun, HP-UX, Windows NT/2000.
Do 'mtlib -\?' to get usage info - but
beware that its output fails to show the
legal combinations of options as the
Device Drivers manual does.
-L is used to specify the name of a
file containing the volsers to be
processed - and only with the -a and
-C operands. This is handy for
resetting Category Code values in a
3494 library, as in: 'mtlib -l
/dev/lmcp0 -C -L filename -t"012C"'
-v (verbose) will identify each element
of the output, which makes things
clearer than the "quick" output
which is produced in the absence of
the -v option.
Specify category codes as hex numbers.
(Remember that this is a library
physical command: it knows nothing about
TSM or what is defined in your TSM
system.)
If the command fails because "the library is
offline to the host", it indicates
either that the host is not defined in
the 3494's LAN Hosts allowance list, or
that the host is not on the same subnet
as the 3494 in the unusual case that the
subnet is defined as Not Routed.
A mount (-m) may take a considerable
time and then yield:
"Mount operation Error - Internal error"
due to the tape being problematic, but
the mount will probably work.
Ref: "IBM SCSI Tape Drive, Medium
Changer, and Library Device Drivers:
Installation and User's Guide"
(GC35-0154)
mttest Undocumented command for performing
ioctl operations and sets on a tape
drive.
/usr/lpp/adsmserv/bin/mttest. Syntax:
'mttest <-f batch-input-file> <-o
batch-output-file> <-d special-file>'
MTU Maximum Transmission Unit: the hardware
buffer size of an Ethernet card, as
revealed by 'netstat -i'. This is the
maximum size of the frame/packet that
can be transmitted by the adapter.
(Larger packets need to be subdivided to
be transmitted.)
The standard Ethernet MTU size is 1500.
Note that this maximum packet size is a
constraining factor for processes which
use ethernet. For example, a single
process can max out a 10Mb ethernet
card, but it can only drive a 100Mb card
about 2.5x faster because the measly
packet size is so constraining. To make
full use of higher-speed ethernets,
then, one must have multiple processes
feeding them. (10Mb, 100Mb, and gigabit
ethernet all use the same format and
frame size.)
See: TCPNodelay
Multi-homed client See: TCPCLIENTAddress
Multi-session Client TSM 3.7 facility which multi-threads, to
(Multi session client) start multiple sessions, in order to
transfer data more quickly. This will
work for the following program
components: Backup-archive client
(including Enterprise Management Agent,
formerly Web client) Backup and Archive
functions. This new functionality is
completely transparent: there is no
need to switch it on or off. The TSM
client will decide if a performance
improvement can be gained by starting an
additional session to the server. This
can result in as many as five sessions
running at one time to read files and
send them to the server. (So says the
B/A client manual, under "Performing
Backups Using a GUI", "Displaying Backup
Processing Status".)
Types of threads:
- Compare: For generating the list of
backup or archive candidate files,
which is handed over to the Data
Transfer thread. There can be one or
more simultaneous Compare threads.
- Data Transfer: Interacts with the
client file system to read or write
files in the TSM operation, performs
compression/decompression, handles
data transfer with the server, and
awaits commitment of data sent to the
server. There can be one or more
simultaneous Data Transfer threads.
- Monitor: The multi-session governor.
Decides if multiple sessions would be
beneficial and initiates them.
The number of sessions possible is
governed by the RESOURceutilization
client option setting and server option
MAXSessions.
Mitigating factors: Using collocation,
only one data transfer session per file
space will write to tape at one time:
all other data transfer sessions for the
file space will be in Media Wait state.
Under TSM 3.7 Unix, with "PASSWORDAccess
Generate" in effect, a non-root session
is single-threaded because the TCA does
not support multiple sessions.
Multi-session Client is supported with
any server version; but if the server is
below 3.7, the limit is 2 sessions.
Considerations: Multiple accounting
records for multiple simultaneous
sessions from one command invocation.
Ref: TSM 3.7 Technical Guide, 6.1
See also: MAXNUMMP; MAXSessions;
RESOURceutilization; TCA; Threads,
client
Multi-Session Restore TSM 5.1 facility which allows the
backup-archive clients to perform
multiple restore sessions for No Query
Restore operations, increasing the speed
of restores. (Both server and client
must be at least 5.1.) This is similar
to the multiple backup session feature.
Elements:
- RESOURceutilization parameter in
dsm.sys
- MAXNUMMP setting for the node
definition in the server
- MAXSessions parameter in dsmserv.opt
The efficacy of MSR is obviously limited
by the number of volumes which can be
used in parallel.
From an IBM Systems Journal article:
"During a large-scale restore operation
(e.g., entire file space or host), the
TSM server notifies the client whether
additional sessions may be started to
restore data through parallel transfer.
The notification is subject to
configuration settings that can limit
the number of mount points (e.g., tape
drives) that are consumed by a client
node, the number of mount points
available in a particular storage pool,
the number of volumes on which the
client data are stored, and a parameter
on the client that can be used to
control the resource utilization for TSM
operations. The server prepares for a
large-scale restore operation by
scanning database tables to retrieve
information on the volumes that contain
the client's data. Every distinct volume
found represents an opportunity for a
separate session to restore the data.
The client automatically starts new
sessions, subject to the afore-mentioned
constraints, in an attempt to maximize
throughput."
Additional info:
http://www.ibm.com/support/
docview.wss?uid=swg21109935
See also: DISK; Storage pool, disk,
performance
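One way to reason about the constraints listed in the quoted article is to treat
the tightest of them as an upper bound on parallel restore sessions. The Python
sketch below does only that; it is an assumption-laden illustration, not the
server's actual algorithm. All parameter names are hypothetical, and
client_session_cap merely stands in for whatever session count the
RESOURceutilization setting allows, since that mapping is not given here.

```python
def restore_session_ceiling(distinct_volumes: int,
                            node_mount_limit: int,
                            pool_mount_points: int,
                            client_session_cap: int) -> int:
    """Upper bound on parallel restore sessions under the stated assumptions.

    distinct_volumes   -- volumes holding the client's data
    node_mount_limit   -- mount points the node may consume (cf. MAXNUMMP)
    pool_mount_points  -- mount points available in the storage pool
    client_session_cap -- hypothetical session cap from the client side
    """
    limits = (distinct_volumes, node_mount_limit,
              pool_mount_points, client_session_cap)
    # At least one session is always needed to restore anything at all.
    return max(1, min(limits))
```

For example, data spread over 12 volumes with a node limit of 2 mount points
would be bounded at 2 sessions regardless of the other settings.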
Multi-threaded session See: Multi-session Client
Multiple servers See: Servers, multiple
Multiple sessions See: MAXNUMMP; Multi-session Client;
RESOURceutilization
Multiprocessor usage TSM uses all the processors available to
it, in a multi-processor environment.
One customer cited having a 12-processor
system, and TSM used all of them.
MVS Multiple Virtual Storage: IBM's
mainframe operating system, descended
from OS/MFT and OS/MVT (multiple fixed
or variable number of tasks). Because
the operating system was so tailored to
a specific hardware platform, MVS was a
software product produced by the IBM
hardware division. MVS evolved into
OS/390, for the 390 hardware series.
MVS server performance Turn accounting off and you will likely
see a dramatic improvement in
performance. Especially boost the
TAPEIOBUFS server option.
See also: Server performance
MySQL database, back up to TSM See Redpaper "Backing Up Linux Databases
with the TSM API".
There are 31 fields (29 before ADSM 3.1 added fields 30 and 31), which are
delimited by commas (,) - intended to facilitate importing the records into a
spreadsheet. Each record ends with a new-line character. The following describes
the record format, based upon information in the Admin Guide, with supplementary
information based upon observations. (The accounting record format changes
little over the years.)
Field Contents
1 Server product version. Through ADSMv2, this integer was all there
was to distinguish servers, and was called "Product level". As of
ADSM 3.1, this field's purpose was changed to be "Product version",
and the field pair 30 and 31 was added (q.v.).
Example: "3", as in TSM 3.7.2.
2 Server product sublevel. Example: "15".
3 Product name, "ADSM". (This has not changed, though the product has
transitioned to "TSM" and then "ITSM".)
4 Date of accounting (mm/dd/yyyy) - which is to say, when the session
ended. Corresponds to session-end ANR0403I message date in the TSM
server Activity Log. Has leading zeroes (e.g., 06/23/2004). Note
that the format of this field is immutable, and not affected by
locale settings, such as DATEformat. See also field 21.
5 Time of accounting (hh:mm:ss) - which is to say, when the session
ended. Corresponds to session-end ANR0403I message time in the TSM
server Activity Log. Has leading zeroes (e.g., 06:44:17).
See also field 21.
6 Node name of *SM client. Always upper case. Example: "SERVER1".
7 Client owner name (populated in Unix). Will contain a Unix username
where the session is associated with a user, most commonly with
Archive/Retrieve operations. Otherwise is null.
See: Trusted Communication Agent
8 Client Platform (operating system). Examples: "AIX", "IRIX",
"Linux", "Linux86", "SUN SOLARIS", "WinNT".
Will also reflect name of API program rather than operating system.
9 Authentication method used. Example: "1".
10 Communication method used for the session. Example: "Tcp/Ip".
(There is no further definition of this field provided anywhere.)
11 Normal server termination indicator (Normal="1", Abnormal="0").
Is Abnormal (0) when: the session is terminated by the client, as
via Ctrl-C (ANR0480W); or the session is terminated by the server
due to exceeding IDLETimeout, as when a user just leaves a dsm or
dsmc session idle (ANR0482W); or the client exceeded the COMMTimeout
value (ANR0481W). A schedule involving a failed status does not
seem to cause an Abnormal termination.
12 Archive: Number of archive database objects inserted during the
session. Example: "341".
13 Archive: Amount of data (size), in kilobytes, sent by the client to
the server. Example: "1944135".
14 Retrieve: Number of objects retrieved during the session.
15 Retrieve: Amount of data (size), in kilobytes, retrieved.
16 Backup: Number of backup database objects inserted during the
session. Example: "14408".
17 Backup: Amount of backup file data (size), in kilobytes, sent by the
client to the server, destined for a server storage pool.
This field is also known as the Backup Thread field.
Example: "32177666". See also field 20 comments.
18 Restore: Number of backup database objects retrieved during the
session.
19 Restore: Amount of data (size), in kilobytes, retrieved.
20 Session KB: Amount of data, in kilobytes, communicated between the
client node and the server, in both directions, during the session.
Includes overhead as well as storage pool data. The number
corresponds to message ANE4961I total bytes transferred.
Example: "32229930".
Notes: If, in a Backup, the value in this field is much greater than
the value in field 17, and field 17 is not zero, then the record
reflects a Consumer session which probably involved a lot of retries
on busy files. If field 17 is zero and this field 20 has a high
number, the value largely reflects an unqualified Incremental Backup
where there was a large inventory list of Active files which the
server sent to the client at the beginning of the session.
21 Duration of the session, in seconds. Example: "18838".
You might subtract this from the field 4,5 values to determine when
the session started - which should correspond to the ANR0406I
session started message in the TSM Activity Log.
22 Amount of idle wait time during the session, in seconds.
See quickfact on: Idle wait
23 Amount of communications wait time during the session, in seconds.
See quickfact on: Communications Wait
24 Amount of media wait time during the session, in seconds.
25 Client session type indicator character: (per IC18252)
1 - General Backup/Archive client session (same as 5)
2 - Open registration
3 - Password update session
4 - General Backup/Archive client session (same as 1, but commonly
recorded rather than "1").
Note that, in Backup, there is no indicator value to
distinguish Consumer sessions from Producer sessions: you can
only infer a Producer session by it having a "4" indicator,
and fields 16 and 17 being zero.
5 - Client scheduled session
6 - Admin console session
7 - Admin general session
8 - Admin password update session
9 - Export/import session (used internally)
10 - Admin -MOUNTmode session ('dsmadmc -MOUNTmode')
16 - Server-to-server library sharing session.
19 - Session proxied through storage agent.
26 HSM: Number of space-managed database objects inserted during the
session.
27 HSM: Amount of space-managed data (size), in kilobytes, sent by the
client to the server.
28 HSM: Number of space-managed database objects retrieved during the
session.
29 HSM: Amount of space-managed data (size), in kilobytes, recalled.
30 Product release (new with ADSM 3.1). See also fields 1, 31.
Example: "7", as in TSM 3.7.2
31 Product level (new with ADSM 3.1). See also fields 1, 30.
Example: "2", as in TSM 3.7.2
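A minimal Python sketch of reading one such record, using the field numbering
above, follows. The sample record in the usage note is made up for illustration
(it merely reuses the example values quoted in the field descriptions), and the
function and key names are hypothetical. The session start time is derived as
suggested under field 21, by subtracting the duration from the fields 4 and 5
end time.

```python
import csv
from datetime import datetime, timedelta
from typing import Dict, List


def _num(text: str) -> int:
    """Treat an empty numeric field as zero (an assumption)."""
    return int(text) if text.strip() else 0


def parse_accounting_record(line: str) -> Dict[str, object]:
    fields: List[str] = next(csv.reader([line.rstrip("\n")]))
    if len(fields) < 29:
        raise ValueError(f"expected at least 29 fields, got {len(fields)}")
    # Fields 4 and 5: session end, always mm/dd/yyyy and hh:mm:ss.
    ended = datetime.strptime(f"{fields[3]} {fields[4]}",
                              "%m/%d/%Y %H:%M:%S")
    duration = timedelta(seconds=_num(fields[20]))   # field 21
    return {
        "node": fields[5],                   # field 6
        "platform": fields[7],               # field 8
        "normal_end": fields[10] == "1",     # field 11
        "backup_objects": _num(fields[15]),  # field 16
        "backup_kb": _num(fields[16]),       # field 17
        "session_kb": _num(fields[19]),      # field 20
        "session_type": fields[24],          # field 25
        "ended": ended,
        "started": ended - duration,
    }
```

Usage, with a fabricated 31-field record:
parse_accounting_record("3,15,ADSM,06/23/2004,06:44:17,SERVER1,,AIX,1,Tcp/Ip,"
"1,0,0,0,0,14408,32177666,0,0,32229930,18838,120,45,0,5,0,0,0,0,7,2")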
Notes: - Accounting is by node transaction. The filespace is not recorded, so
you cannot produce reports by filespace.
- The session number is not recorded! (It is available in the server
database SUMMARY table.)
- The amount of data includes retries, as when the client's sending of
data to the server is interrupted because a tape has to be mounted.
- The server's view of session activity has proven to be more consistent
with reality than the client's view. Thus, accounting records tend to
be a better source of session timings than client session summary
statistics (reflected in the client log and ANE messages given to the
server for its Activity Log).
- The session overall data rates can be computed from the field 21
duration and the session type KB values, to yield values like the
"Aggregate data transfer rate" from session summary statistics. But
the equivalent of "Network data transfer rate" cannot be, as there is
no network transfer time directly recorded.
- There is no explicit record of Delete ARchive activity.
- In MVS (OS/390) the recording occurs in SMF records: SMF record type,
subtype 14.
- In TSM backups, there will be a "control session" and a "data
session", where the control session envelops the data session. The
two are recorded separately, with the data session obviously recorded
just before the control session ends and is recorded.
See "Consumer session" and "Producer session".
- As of TSM 3.7, clients perform backups in a multi-threaded manner
such that a single backup job will be recorded across multiple
accounting records. See: Multi-session Client
- The server does not provide any means for cutting off this file, which
will grow endlessly unless you do something.
- Servergraph.com sells software which allows viewing accounting
information in graphical form.
- A pertinent Technote: "Why ITSM accounting record can differ from
Summary records in "Activity.Summary" table" (swg21155024), which
notes that the Summary table is exactly that - a summary of each
session - whereas there may be multiple accounting records for the
same session, one record per thread in a multi-threaded session.
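The overall-rate computation mentioned in the notes, and the Producer-session
inference described under field 25, can be sketched in Python as below. The
function names are hypothetical; the Producer test is only the inference the
text describes (type "4" with fields 16 and 17 both zero), not a definitive
classification.

```python
def aggregate_rate_kb_per_sec(kilobytes: int, duration_seconds: int) -> float:
    """Session-type KB value divided by the field 21 duration.

    Returns 0.0 for a zero-length session rather than dividing by zero.
    """
    if duration_seconds <= 0:
        return 0.0
    return kilobytes / duration_seconds


def looks_like_producer(session_type: str,
                        backup_objects: int,
                        backup_kb: int) -> bool:
    """Inference only: field 25 is "4" and fields 16 and 17 are zero."""
    return session_type == "4" and backup_objects == 0 and backup_kb == 0
```

Using the example values from the field descriptions, 32177666 KB of backup data
over an 18838-second session works out to a little over 1708 KB/s.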
API NOTES:
The API is said not to work with TCPNODELAY in the client system options
file.
ADSM/TSM UNNUMBERED MESSAGES (SEEN IN ACTIVITY LOG, ETC.):
Out of Memory
Error seen in the TSM 32-bit Server. Consider using the 64-bit version;
See: http://www.ibm.com/support/docview.wss?uid=swg21154955
ANR-----(server messages)-----------------------------------------
A number in parentheses alongside a module name, such as "asalloc.c(5006)", is
not significant: it can be expected to be the sourcecode line number, as via the
ANSI C __LINE__ definition; so it will vary from one maintenance level to
another.
Module names beginning with "sm" (e.g., smnode.c) seem to reflect software
involved in TSM server commands.
ANR0000W Unable to open default locale message catalog, /usr/lib/nls/msg/C/.
Ostensibly, your environment variable LANG is set to C rather than
en_US.
ANR0000W Message repository for language AMENG does not match program level.
The server message repository file is built for a certain level of
dsmserv (the "program") and the two must be matched to work properly.
"Loose" management of TSM filesets or casual copying of application
files at a site can result in inconsistencies. This one does not keep
the server from coming up, but is a big clue that things are out of
whack, and should not be ignored.
ANR0000E Unable to open language AMENG for message formatting.
Usually happens when you try to execute the server from a directory
other than where the server code is located. Try "cd"ing to
/usr/lpp/adsmserv/bin and issuing it there; or under Csh do
'setenv DSM_DIR /usr/lpp/adsmserv/bin'. If you ARE running it from
there, check to make sure the file exists (dsmameng.txt) and that you
have sufficient permission to it.
ANR0102E asalloc.c(5006): Error 1 inserting row in table "AS.Segments".
A problem within the database, involving other than a disk storage pool,
perhaps precipitated by a shortage of space in the database volumes or
an abrupt shutdown of the server while it had a problem, such as a
failing tape dismount. You may have to do a 'dsmserv Auditdb' offline,
or an 'AUDit DB' online. Some customers have found that restarting the
server helps. Or you may be able to locate an involved tape and remove
it from the system to clear that info from the database. Also found to
be caused in Reclamation: the currently open copypool volume, upon being
reclaimed, went "empty" rather than "pending". It never went into
scratch status, but rather, was reused, and marked as "STGREUSE" in the
volhistory. A resolution to this is to mark the "STGREUSE" volume
ACCess=READOnly, which may allow writing to the copypool again, so you
can run a Move Data against the R/O volume, making sure it goes to
scratch status.
You might avoid such messes by utilizing *SM MIRRORWrite on your *SM
Database and Recovery Log, with Sequential writing.
May be caused by APAR IC36975, involving the COPYSTGPOOL= simultaneous
write feature. See also: REPAIR STGVOL
ANR0102E dsalloc.c(980): Error 1 inserting row in table "DS.Segments".
A problem within the database, involving a disk storage pool, seen in an
abrupt shutdown of the server while it had a problem, such as a failing
tape dismount. Fixing it with least disruption requires working with
TSM Support. They may advise migrating all disk storage pool data to
tape, shutting down the server, and then doing 'DSMSERV AUDITDB
DISKSTORAGE FIX=YES'.
You might avoid such messes by utilizing *SM MIRRORWrite on your *SM
Database and Recovery Log, with Sequential writing.
May be caused by APAR IC36975, involving the COPYSTGPOOL= simultaneous
write feature. See also: REPAIR STGVOL
ANR0104E astxn.c(1159): Error 2 deleting row from table "AS.Volume.Assignment"
followed by ANR0865E expiration processing failed - internal server
error.
Solution: Recover the database. Use RESTOREDB if you have recent
BAckup DB tapes; else you will have to perform a salvage operation
using DUMPDB/LOADDB.
But you may be able to employ the following instead:
1. Create a listing with command 'Q VOL * STA=Pending F=D'
2. Search for the string "Date Became Pending".
3. Compare the date with the "Reuse delay" parameter of your storage
pool(s).
E.g.: if today is 11/09/98 and the reuse delay parm is 9 days, then look
at tapes dated 10/31/98 and earlier.
4. Execute --> MOVe Data volser
Proceed once the MOVe Data command(s) have completed successfully or
ended with message "no data on volume".
5. Execute --> DELete Volume volser
to remove the volume(s) from the storage pool.
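The date comparison in step 3 can be sketched in Python as follows. It assumes
the "Date Became Pending" values have already been extracted from the query
output into a dictionary; the function name, the dictionary layout, and the
volume names in the usage note are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def volumes_past_reuse_delay(pending_since: Dict[str, date],
                             reuse_delay_days: int,
                             today: date) -> List[str]:
    """Return volumes whose pending date is at or before today minus the delay."""
    cutoff = today - timedelta(days=reuse_delay_days)
    return sorted(volume
                  for volume, became_pending in pending_since.items()
                  if became_pending <= cutoff)
```

With today as 11/09/98 and a 9-day reuse delay, the cutoff is 10/31/98, so a
volume that became pending on 10/31/98 or 10/15/98 is listed, while one that
became pending on 11/01/98 is not.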
ANR0104E ASVOLUT(2202): Error 2 deleting row from table AS.Volume.Assignment
You have a tape volume in your storage pool that incorrectly has the
reuse delay flag set in the data base. To correct this problem issue
the following commands:
1. Q STG F=D The purpose of this command is to find the reuse delay for
the storage pool.
2. Q VOL F=D STATUS=PENDING With this command you will get a list of
all of the storage pool volumes that currently have the reuse delay
flag set.
The volume that has the error will be the volume on that list for which
the reuse delay has expired. Perform a MOVe Data on that volume.
Thereafter, the volume should be in the correct state and you can issue
the command DELete Volume if needed.
ANR0104E imaudit.c(3797): Error 2 deleting row from table "Expiring.Objects".
As seen after message "ANR4206I AUDITDB: Object entry for expiring
object 0.0 not found - expiring object entry will be deleted.".
The "object 0.0" indicates that the Expiring.Objects table contains an
entry which does not have a corresponding Object Ids entry; in consequence
the audit tries to delete this entry but fails with 'key not found'. That
is, there is an inconsistency in the database which cannot be corrected
by dump/load/audit.
ANR0106E imarqry.c(4481): Unexpected error 2 fetching row in table
"Archive.Objects".
The table may be one of several (Archive.Descriptions, etc.).
This was a defect in TSM 5.2, per APAR IC39132, caused by a
timing/locking issue within the expiration process (hence, results may
vary in each run).
ANR0110E An unexpected system date has been detected; the server is disabled.
Use the ACCEPT DATE command to establish the current date as valid.
So you need to check out your system clock, and perhaps your NTP
service. In smaller systems, the problem may be a depleted battery on
the motherboard. Because this message typically occurs at TSM server
start-up, and causes start-up to fail, there is no opportunity to
perform the ACCEPT. The way around this is to create a little TSM macro
file in the server directory, named something like "accept_date",
containing the lines ACCEPT DATE and COMMIT, and then do
'dsmserv runfile accept_date', which just performs that task and halts
the server, whereafter a normal restart should work.
ANR0207E Page address mismatch detected on database volume _____,logical page
xxxxxx (physical page xxxxxx); actual: 0.
ANR9999D iccopy.c(nnnn): ThreadId<nn> Unable to read from db volume.
You have a corrupted database. Were you configured with MIRRORWrite DB
Sequential as the product recommends, to guard against such problems?
If not, you'll probably have to restore your DB - after investigating
the cause of the problem (overfull db? opsys crash? disk problem?) so
as to keep it from occurring again. Refer to IBM Technote
http://www.ibm.com/support/entdocview.wss?uid=swg21155009
ANR0212E Unable to read disk definition file: dsmserv.dsk
It may be absent, or its permissions may be wrong. One user reported
finding this happening when the Recovery Log was full.
ANR0252E Error writing logical page 141410 (physical page 141666) to
database volume adsm-db1.mirror.
See: ANR7838S
ANR0259E Unable to read complete restart/checkpoint information from any
database or recovery log volume.
Typically seen in an install, where the recovery log and database
volumes have been only formatted with dsmfmt and then the server is
started by just invoking 'dsmserv'. There needs to be information in
the db about the environment as the server last saw it. In a fresh
install, though, the db is blank. You need to instead run
'dsmserv format' to initialize the database. See the Admin Guide.
Seen during a server restart: Suggests disk damage.
If experienced during a 'dsmserv restore', it would seem to be the case
that a fresh database and/or recovery log is being supplied for the
restoral - but the fresh areas have not been formatted. Do 'dsmserv
format' first. Watch out for your dsmserv.dsk not reflecting the
reality of the db and recovlog disks you set up: if necessary, move
dsmserv.dsk out of the way and manually define your disks when the
server comes up. Try using command option TODATE=, to do a point in
time restore with a volume history file, to keep dsmserv from trying to
read the Recovery Log to do a roll forward restore.
ANR0355I Recovery log undo pass in progress.
As seen in TSM server start-up. The larger and more full your Recovery
Log, the longer this will take, so be patient. If concerned whether
it's running, use your opsys monitoring tools to check for lots of I/O
activity on the Recovery Log and Database.
ANR0361E Database initialization failed: error initializing database page
allocator.
Explanation: During server initialization, the server database
initialization fails because the page allocator cannot be started.
One thing to check is that the ADSM server files are as the dsmserv.dsk
file thinks they are. Part of the ADSM database may be corrupted.
Consider running an AUDITDB.
ANR0362W Database usage exceeds 86 % of its assigned capacity.
Periodic warning message when the database is getting full, added by
APAR IC08768. Messages begin when database utilization exceeds 80 % of
its capacity, and the message is then reissued each time utilization
increases by a further 2 %.
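As a rough illustration of that reissue schedule, the following Python
sketch lists the utilization levels at which the warning would be
expected. It assumes the first message appears at 80 % and each repeat
needs a further 2 points; the function name and the exact boundary
handling are assumptions for illustration, not server internals.

```python
def warning_levels(start=80, step=2, limit=100):
    """Return the utilization percentages at which a warning would be issued."""
    return list(range(start, limit + 1, step))


levels = warning_levels()
# The 86 % shown in the sample message above is the fourth level in the series.
print(levels[:4])
```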
ANR0406I Session NNNN started for node <NodeName> (Opsys_Type) (Tcp/Ip
<Client_IP_address>(Client_Port_Number)).
Appears in the Activity Log when a client session is initiated, as when
the 'dsmc' command is issued, with a subcommand. (Note that the
originating process will differ, depending upon the issuer: if the
superuser invokes 'dsmc', the client port origin of the session will be
the dsmc command itself; but if 'dsmc' is invoked by an unprivileged
user, the client port origin of the session will be the dsmtca process.)
Note that with client schedules v3 and above, there will be two ANR0406I
session messages: the first (outer) is the data movement session itself;
the second (inner) is one by which the client sends ANE client event log
messages to the server resulting from the session.
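When trawling an Activity Log extract for session origins, a small
parser can help. The following Python sketch is modelled only on the
message layout shown above; the sample line, node name and address are
made up, and real messages may differ by server level or be wrapped
across lines.

```python
import re

# Pattern follows the layout shown in this entry; treat it as an assumption.
ANR0406I = re.compile(
    r"ANR0406I Session (?P<session>\d+) started for node (?P<node>\S+) "
    r"\((?P<platform>[^)]+)\)\s+\(Tcp/Ip (?P<ip>[^()\s]+)\((?P<port>\d+)\)\)"
)


def parse_session_start(line):
    """Return a dict of the message fields, or None if the line does not match."""
    m = ANR0406I.search(line)
    return m.groupdict() if m else None


# Hypothetical sample line, for illustration only.
sample = "ANR0406I Session 1234 started for node NODE_A (AIX) (Tcp/Ip 192.0.2.10(32801))."
print(parse_session_start(sample))
```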
ANR0423W Session ____ for administrator ________ (<PlatformType>) refused -
administrator name not registered.
The administrator name position may contain strange characters, which
should make you suspect Unicode. They look wacky to you because your
dsmadmc session to query the Activity Log is not using the same
character code page as the platform which is initiating the timed
sessions, so your admin session can't make sense of the name. Probably a
Windows client, but could be other, now that Unicode support has
extended. You might try dsmadmc from such a Unicoded client system.
See if there are accompanying ANR messages indicating the source of the
sessions, and get in touch with the client owner. If no ANR msgs and
the server is AIX, you can employ the AIX iptrace/ipreport command set
to relatively easily see where this traffic is coming from.
ANR0428W Session ____ for node ________ (<PlatformType>) refused - client is
down-level with this server version.
The node had been accessed with a higher level client than you are now
trying to use, and that higher-level access caused the server to "latch"
the requirement that the client always access its filespaces using at
least that level, due to data formatting and/or control information
established by that higher level - which lower level clients cannot
understand, and so sessions from them must be prevented.
This is very insidious where a session was conducted for a nodename from
a platform type other than that of the true, owning client. This bogus
access can cause not just a platform reattribution, but also latching of
the level peculiarities possessed by that interloper client; so when you
next try to conduct a session from the true client, you are locked out
with this error message. This is often due to a bug involving Unicode
vs. non-unicode clients: the rogue session probably resulted in the
client being marked as unicode-enabled. In the IBM/TSM support site,
search on ANR0428W and there view the instructions for patching the
database entries for the damaged client. By all means, feel free to
call vendor support for assistance in this process.
See also: ANS1357S
ANR0440W Protocol error on session 70919 for node _Name_ (_OSType_) invalid verb
header received.
The invalid protocol stream and lack of node identifiability suggest
that some random system on the net is trying to connect to your *SM
server port (default: 1500), but not in *SM terms. Someone out there may
have erroneously set up some kind of mail client or the like to
periodically poll what it thinks is a mail server for new mail, using
your port number. First assure that you have not configured your server
to use a port number well-known for some other service. Then you'll
have to use some kind of network/packet trace or the like to determine
where it's coming from, as in using the Unix 'tcpdump' command or AIX
'iptrace'. (You should be prepared to do this in any case: any site on
the net could be subject to Denial Of Service attacks, and needs to be
ready to find out where they are coming from so as to have their router
filter out traffic from that IP address.)
A closer-to-home possibility could be some spud in your company fumbling
to set up a *SM client to connect to your *SM server.
If an established client, may be mangled communication from it. Try
conducting other kinds of sessions from that client (telnet, ftp) to see if
it can do those without manglement, then try again with ADSM, first with
queries like 'dsmc q fi', to try to narrow down where the anomaly is.
ANR0444W Protocol error on session NNNNN for node NODENAME (CLIENT_TYPE) -
out-of-sequence verb (type Confirm) received.
^^^^^^^^^^^^
Has been seen with client type TDP Domino NT when the Domino server was
short on memory.
ANR0444W Protocol error on session NNNNN for node NODENAME (CLIENT_TYPE) -
out-of-sequence verb (type SignOff) received.
^^^^^^^^^^^^
The specified client is attempting to establish a session, but no
password has been established. (PASSWORDAccess Generate)
Accompanied by message ANR0484W.
In Windows, can be caused by invalid Password stored in Registry.
Also look in dsmerror.log for "ANS1838E Error opening user specified
options file C:\program files\utilities\adsm\baclient\dsm.opt".
To fix: As root/Administrator, simply run a client-server command like
'dsmc q sch' to re-establish the password.
ANR0444W Protocol error on session NNNNN for node NODENAME (CLIENT_TYPE) -
out-of-sequence verb (type (Unknown)) received.
^^^^^^^^^^^^^
Seen with defective TCP/IP software, as within MVS, caused by a buffer
overflow or other problem.
ANR0480W Session 232 for node _Name_ (_OSType_) terminated - connection with
client severed.
The client dsmerror.log typically contains:
ANS1809W Session is lost; initializing session reopen procedure.
ANS1810E TSM session has been reestablished.
In combination, this would indicate that neither the server nor client
knows why the connection was severed - which says that something in
between the two did it.
May be that firewall software is in between, and it may have its own
"idle timeout" value: if it believes that a session is doing nothing,
the firewall terminates the session. (The TSM client may well be busy
trawling through a file system seeking the next backup candidate, or a
TDP may be waiting on a response from Oracle, etc.)
A network cause is where a network switch was rebooted and
autonegotiation failed to agree on link characteristics (like half-
vs. full-duplex). In this case, avoid using autonegotiation. Otherwise,
look for evidence on the client (dsmerror.log, OS logs).
In API coding, the programmer omitted the session termination step, or
the API program failed and exited prematurely.
Has also been reported where a defective storage pool was put offline
and its replacement was defined - but the administrator neglected to do
ACTivate POlicyset. May be accompanied on client by dsmerror.log entry
"Txn Producer thread, fatal error, signal 11".
ANR0481W Session NNN for node _Name_ (_OSType_) terminated - client did not
respond within 60 seconds.
The COMMTimeout value is too low. The AIX server default is a puny 60
seconds. Boost it.
ANR0482W Session <SessionNumber> for node <NodeName> (<ClientPlatform>)
terminated - idle for more than N minutes.
If the message reflects 15 minutes, your server still has the product
default IDLETimeout of 15 minutes, which is much too small. Boost it to
60: you need a large value to accommodate clients rummaging around in a
large, relatively inactive file system looking for changed files to back
up. If you have a good-sized value already, investigate why your client
is idle so long...which could occur if you need a password on the
command line and neglected to specify one, which causes the *SM client
to prompt and wait indefinitely.
ANR0484W Session 123 for node _Name_ (_OSType_) terminated - protocol violation
detected.
Usual cause: The specified client is attempting to establish a session,
but no password has been established. When you registered the client on
the server you established a password, which must be used when the
client session is invoked, either implicitly with
"PASSWORDAccess Generate", or explicitly with "PASSWORDAccess Prompt".
If the IP address in the message is not that of your workstation, it
might be that some other machine is using that name, or possibly that an
old server has been reactivated, which has old info about client IP
addresses.
Do 'dsmc q sch', a basic client-server query which goes through all the
password and network stuff that backup does, and will prompt for and
establish the password in the client area if "PASSWORDAccess Generate"
is in effect. If such a password is in effect, a good response will
verify the client-server interaction.
Another cause: A Client Schedule is defined with ACTion=Macro and the
macro whose file name is coded in OBJects= contains administrative
commands instead of client commands. (Use Administrative Schedules for
executing administrative commands.)
Another cause: An ADSMv2 HSM defect in which it caused the password
entry in /etc/security/adsm to be obliterated.
Accompanied by message ANR0444W.
ANR0487W Session ____ for node ____ (<OpsysType>) terminated - preempted by
another operation.
As in performing a Backup, and someone initiated a Restore or Retrieve,
which has a higher priority. A Backup will not necessarily be
terminated by such an event: a v3+ client will stick around and try to
pick up where it left off, as is evidenced by the following msg in its
backup log: ANS1809E Session is lost; initializing session reopen
procedure. (However, a v2 client will suffer "ANS4017E Session
rejected: TCP/IP connection failure" and start its backup all over
again.)
ANR0492I All drives in use. Session NNNN for node NODENAME (AIX) being
preempted by higher priority operation.
Has been seen happen to a Backup operation, as when another Backup
needs to get its data from HSM space. Because HSM retrieval is a
higher priority operation, it unfortunately shoves the important Backup
out of the way.
If the Backup interrupted was performed via ADSM scheduling, it will
resume *if* you coded a good Duration (window) value such that it has
an opportunity to restart itself thereafter.
ANR0511I Session ____ opened output volume ______.
Later followed by: ANR0514I Session ____ closed volume ______. These
messages may not be long apart, reflecting the current design of ITSM,
that the volume is closed after each transaction.
ANR0520W Transaction failed for session ____ for node ____ (<Platform>) -
storage pool ____ is not defined.
Is this a Lanfree backup through the TSM storage agent (dsmsta), and the
storage pool had been deleted and redefined? Try recycling the storage
agent, to cause the agent to re-find the stgpool. (Realize that TSM uses
internal identifiers for storage pools, which are reset when deleting
and redefining the storage pool. Stopping and restarting dsmsta resynchs
the storage agent with the TSM server, allowing the storage agent to
find the storage pool. This prevents the ANR0520W message, and clears up
the ANS1329S message which is the result of not finding the storage
pool.)
ANR0521W Transaction failed for session NN for node ____ (<OpSys>) - object
excluded by size in storage pool ________ and all successor pools.
Most likely: The stgpool MAXSize value was configured to not allow a
file that big to be stored - and no successor storage pools to the one
specified can accept the large file. Or: You have a disk storage pool
with a single volume or multiple volumes whose combined size is too
small to accommodate the huge incoming file. Normally, however, you
would have a tape storage pool below the disk pool, and the tape pool
would have "infinite" capacity, so incoming file size would not be an
issue. But perhaps migration to the tape pool is defeated, or that pool
is read-only or is depleted of tapes. Check it out. See also: ANS1310E
Note that the message unhelpfully fails to identify the too-big object,
which thwarts communication with the client administrator.
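The size check described above can be pictured as a walk down the
storage pool hierarchy. The following Python sketch is a simplified
model, not server logic: the pool names, the dictionary layout and the
"size no larger than the limit" comparison are all assumptions made for
illustration.

```python
def first_accepting_pool(pools, start, file_size):
    """Walk the hierarchy from `start`; return the first pool whose limit admits the file.

    `pools` maps pool name -> (max_size_or_None, next_pool_or_None).
    A limit of None stands for "no limit".
    """
    name = start
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)  # guard against a looping hierarchy definition
        max_size, next_pool = pools[name]
        if max_size is None or file_size <= max_size:
            return name
        name = next_pool
    return None  # no pool accepts the object: the ANR0521W situation


# Hypothetical two-level hierarchy, sizes in MB.
pools = {"DISKPOOL": (500, "TAPEPOOL"), "TAPEPOOL": (None, None)}
print(first_accepting_pool(pools, "DISKPOOL", 2000))
```

With a limit also set on the last pool in the chain, the same call can
come back empty, which is the condition this message reports.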
ANR0522W Transaction failed for session ____ for node ____ - no space available
in storage pool ______ and all successor pools.
Make sure that your policy set is activated.
Are your storage pools read/write?
Is the object being stored larger than the space available in the
storage pools?
Does your Storage Pool MAXSize value prohibit the store operation?
Are your storage pool volumes full and your Stgpool MAXSCRatch value is
insufficient?
ADSM wants to store associated directories in either the management
class specified by DIRMc or the class with the longest retention period:
is there space there?
Are your tape library and drives working? (You might see a lot of mount
requests denied.) Do 'SHow LIBrary' to check their status.
In general, there should be associated error messages in your server
Activity Log which would indicate the true problem.
Note that a large file that was spanning from the end of the last volume
available in the storage pool, which cannot be completed for lack of
further volumes, has to be logically expunged from the volume where its
writing left off.
ANR0534W Transaction failed for session nnnn for node xxxx - size estimate
exceeded and server is unable to obtain storage
The delta between the low Pct Migr and the high Pct Util represents the
amount of space that has been reserved for other clients that are
concurrently being backed up: This is bitfile (Aggregate) space that has
been allocated in the storage pool for transactions that are currently
in flight.
Has been seen where client compression is turned on, and client has
large or many compressed files: *SM is fooled as compression increases
the size of an already-compressed file. (Remember that compression may
be turned on in the client or via a Client Option Set or mandated on the
server node definition.)
Prior to a client sending a file, the space (same as allocated on
client) is allocated in the TSM server's disk storage pool. If caching
is active in the disk storage pool, and files need to be removed to make
space, they are - up to the limit indicated by the incoming storage
space request. But if the file grows in compression (client has
COMPRESSIon=Yes and COMPRESSAlways=Yes), the cleared space is
insufficient to contain the incoming data.
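The arithmetic behind that failure can be sketched in a few lines of
Python. This is only a model of the reasoning above: the function name,
the 5 % growth figure and the notion of "extra obtainable" space are
assumptions for illustration.

```python
def transaction_fits(estimated_bytes, actual_bytes, extra_obtainable_bytes=0):
    """True if the data actually sent fits in the reserved space plus any
    extra space the server is able to obtain."""
    return actual_bytes <= estimated_bytes + extra_obtainable_bytes


# A 100 MB already-compressed file that grows by 5 % when compressed again.
estimate = 100 * 1024 * 1024
actual = estimate + estimate // 20

print(transaction_fits(estimate, actual))                 # nothing more to be had
print(transaction_fits(estimate, actual, estimate // 20)) # server finds the difference
```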
Look also for a filled storage pool.
Watch out when using TDPs: Their size calculation may be inaccurate, and
thus cause problems when the stgpool uses caching.
Some customers report this error occurring when Migration is running and
a backup is performed.
See also IBM site TechNote 1156827.
ANR0535W Transaction failed for session nnnn for node xxxx (OS_Type) -
insufficient mount points available to satisfy the request.
As when a client backup session has been running, then along comes a
BAckup STGpool that needs a drive, which denies it to the client
session. The client session, however, does not terminate: only the
current data-sending transaction failed...the client will emit message
ANS4312E Server media mount not possible, then wait for the mount with
message ANS4118I Waiting for mount of offline media.
See: Drives, not all in library being used
ANR0538I A resource waiter has been aborted.
Resource refers to a TSM server lock or synchronization object. The
server terminated the wait for such a resource because it has been too
long (relative to the server RESOURCETimeout value). This could cause a
process or session to fail. This situation may be due to a server
deadlock. May be accompanied by messages which illuminate the
situation, such as ANR4513E.
ANR0540W Retrieve or restore failed for session NNN for node ____ (AIX) - data
integrity error detected.
Accompanied by: ANR9999D smnqr.c(1132): Bitfile 11975366 not found for
retrieval. Has been seen when the volume containing the data is in a
Destroyed state.
ANR0548W Retrieve or restore failed for session session number for node ____
(<Platform>) processing filespace ____ for file ____ stored as
Backup|Archive - data integrity error detected.
Formerly ANR9999D SMNQR(1132): BITFILE XXXXX NOT FOUND FOR RETRIEVAL,
replaced with more descriptive message per APAR IY09212, 2000/03/21.
The server ends a file retrieval operation for the specified session
because an internal database integrity error has been encountered on the
server. May be accompanied by msg ANR1424W, telling of a volume
unavailable because its access mode is "destroyed". Otherwise, you
should re-try the restore or retrieve operation and if the file is also
backed up in a copy storage pool, the operation will attempt to read the
file from the alternate location. If you don't have a copy storage pool
(shame!) then you could try Move Data several times over several drives
to see if it might finally copy the bad file. You can use the Query
CONtent command with the COPied= operand to check for files also being
in a copy storage pool.
Nearby this message in the Activity Log you should see some indications
of I/O errors or other problems, naming a specific volume, which is the
one containing the file in trouble. If that volume is not Destroyed, do
'Query CONtent VolName ... DAmaged=Yes' and see if any Damaged files on
it. See "Damaged" for handling info.
ANR0566W Retrieve or restore failed for session ____ for node _____ (OpSys) -
file was deleted from data storage during retrieval.
This can be due to the file being in the process of migration to a
lower level storage pool in a hierarchy at the time that its retrieval
is being requested by a client. Until TSM can be redesigned to
accommodate this circumstance, compensate by avoiding a migration or by
activating caching on the migrating-from storage pool.
ANR0567W Retrieve or restore failed for session ____ for node ____ (<Platform>)
insufficient mount points available to satisfy the request.
See: "Drives, not all in library being used"
ANR0670W EXPORT SERVER: Transaction failed - storage media inaccessible.
This is a conclusion message: There should be an accompanying message,
such as ANR1420W, explaining the problem.
ANR0692E EXPORT NODE: Out of space on sequential media, scratch media could
not be mounted.
Seen when Exporting a filespace whose size is such that multiple tapes
will be required to contain it, but you specified too few explicit
VOLumenames instead of using Scratch=Yes.
ANR0812I Inventory file expiration process 330 completed: examined 125455
objects, deleting 1 backup objects, 0 archive objects,
0 DB backup volumes, and 0 recovery plan files.
0 errors were encountered.
The "examined ___ objects" refers to the number of Inactive filespace
objects that were examined. The "deleting ___ backup objects" number
is typically much less than the number examined, thus indicating that
expiration is not looking at just the Inactive objects older than the
expiration periods of their respective management classes. The "DB
backup volumes" and "recovery plan files" values pertain to DRM, the
former value according to the 'Set DRMDBBackupexpiredays __' spec and
the latter per the 'Set DRMRPFEXpiredays __' spec.
If errors were encountered, examine preceding messages in your Activity
Log (do not invoke Expire Inventory with the Quiet option for this).
Typically accompanied by: ANR9999D imexp.c(1350): Error 8 deleting
expired object (011531735) - deletion will be re-tried.
This message with this return code generally means that ADSM could not
find the inventory information for this file. When you get this
message, inventory expiration will take longer while we search for any
other references in this data base to this object and remove them. You
should not see the same message for this same object number the next
time you run inventory expiration unless we are unable to remove the
other references to this object.
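For trending the examined and deleted counts from one expiration run to
the next, the completion message can be picked apart mechanically. The
following Python sketch assumes the message wording shown at the top of
this entry; other server levels may word it differently, so treat the
pattern as an assumption.

```python
import re

# Pattern follows the sample message in this entry.
ANR0812I = re.compile(
    r"expiration process (?P<process>\d+) completed: examined (?P<examined>\d+) objects, "
    r"deleting (?P<backup>\d+) backup objects, (?P<archive>\d+) archive objects, "
    r"(?P<db_volumes>\d+) DB backup volumes, and (?P<plan_files>\d+) recovery plan files\. "
    r"(?P<errors>\d+) errors were encountered"
)


def parse_expiration_summary(text):
    """Return the counts as integers, or None if the text does not match."""
    flat = " ".join(text.split())  # undo line wrapping in the log extract
    m = ANR0812I.search(flat)
    return {k: int(v) for k, v in m.groupdict().items()} if m else None


sample = """ANR0812I Inventory file expiration process 330 completed: examined 125455
objects, deleting 1 backup objects, 0 archive objects,
0 DB backup volumes, and 0 recovery plan files.
0 errors were encountered."""
print(parse_expiration_summary(sample))
```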
ANR0836W No query restore processing session Session_ID for node Node_Name and
Filespace_Name failed to retrieve file High_Level_File_Name
Low_Level_File_Name - file being skipped.
No Query Restore processing failed to retrieve the specified file
because of an error: the file will be skipped.
This may simply be that you were doing a wildcard restore, which is
considered a "no query restore", and TSM is simply reporting files that
are unavailable. However, there may be more serious, underlying causes.
Were there accompanying messages in the Activity Log to explain where
the server expected to find the file, or what problem it had? This could
be a situation like a volume in Destroyed or Unavailable status, or a
similar unavailability problem. If your database is not huge, a 'Select
Volume_Name From Contents Where File_Name=______' may be feasible to
determine where the file backup lives; or you could more basically do a
Query Volume looking for anomalous status values.
ANR0874E Backup object 0.43293636 not found during inventory processing.
Occurs during EXPIRE INVENTORY. Surrounding Activity Log messages may
explain the problem. You may do 'SHow BFObject <ObjectID>' and 'SHow
INVObject <ObjectID>' to identify the object. You may be able to audit
the volume on which the file occurs to relatively simply resolve the
problem; or you may have to contact vendor support (and possibly run
an Audit DB per their specs).
ANR0905Q Options file dsmserv.opt not found
Did you start the server from the server directory? Is the file in the
server directory? What is different about the way that you are starting
the server this time from all the preceding times that it was
successfully started?
ANR0981E The server database must be restored before the server can be started.
The server believes that the RESTORE DB operation was incomplete. If
the Restore operation reported success, then suspect your disk
subsystem, which may have gone defective: Get your operating system and
hardware people involved. Have them look for error logs. They can use
utilities and/or diagnostics to test the integrity of the disk subsystem.
If necessary, and possible, try using a different type of disk
subsystem.
ANR0985I Process NNN for AUDIT LIBRARY running in the BACKGROUND completed with
completion state FAILURE
If you're lucky, those are new volumes that were inserted without having
been labeled. Otherwise they may be old volumes which, in the classic
shared library scenario, were overwritten by the ogre you're sharing the
library with. Might also occur if the SCSI address or Element address
of the drive was changed. Some customers report upgrading the tape
device driver (e.g., Atape) and the problem went away.
ANR0985I Process NNN for LABEL LIBVOLUME running in the BACKGROUND completed
with completion state SUCCESS at 10:19:40.
Beware that though it says Success, some volumes may have failed to
initialize. Look for ANR8806E and accompanying failure messages in the
Activity Log.
ANR0986I Process 61 for SPACE RECLAMATION running in the BACKGROUND processed
136 items for a total of 535,063,780 bytes with a completion state of
FAILURE at 14:14:36.
Simply means that the task was interrupted and stopped: data transferred
up to that point is on its new volumes and is just fine. The failure can
be caused by the Reclamation process being preempted by a higher
priority process, such as an HSM Recall: check your Activity Log.
Accompanied by messages ANR1080W, ANR1440I.
ANR1025W Migration process ___ terminated for storage pool ______ - insufficient
space in subordinate storage pool.
Migration is trying to move data into the next level down in your
storage pool hierarchy, but there isn't enough space in that next level
for the data movement to occur. A classic cause of this is an inadequate
MAXSCRatch value on the destination stgpool, or insufficient scratches
in the library. A less common cause is the lower stgpools not being
read/write.
ANR1081W Space reclamation terminated for volume ______ - storage media
inaccessible.
Seen on a 3494 when the robot actuator hand is losing its gripping
power, failing to pull tapes out of cells. On a 3494 you should also be
seeing an Intervention Required on its operator station. You'll have to
move the problem tape from that cell to cell 1 for the robot to clear
the bad tape status condition, then change the *SM Unavailable status.
ANR1082W Space reclamation terminated for volume ____ - insufficient number of
mount points available for removable media.
See handling under similar ANR1134W.
ANR1086W Space reclamation terminated for volume ______ - insufficient space in
storage pool.
Seen when Stgpool MAXSCRatch is inadequate. See the "MAXSCRatch" topic
to fully understand the requirements.
ANR1117W Error initiating migration for storage pool RECLAIMPOOL - internal
server error detected.
Seen accompanied by:
ANR9999D asutil.c(220): Pool id 4 not sequential-archival strategy.
ANR9999D afmigr.c(644): Error locating pool descriptor for pool id 4.
You have defined a stg pool of DEVCLASS DISK for a RECLAIMSTGPOOL of a
primary tape pool (presumably because that tape pool has only one tape
drive available to it). It won't work. ADSM insists that
RECLAIMSTGPOOLs must be from the FILE device class.
ANR1134W Migration terminated for storage pool ________ - insufficient number of
mount points available for removable media.
You did actually create the tape device in the operating system and then
define it to *SM, yes? Remember that with "rm" drivers the operating
system would already have its own, usual driver in place to handle the
drive as an rmt device: you have to dissociate that so that you can have
the "mt" device driver control the drive.
In a manual library, you like need to mount the tape, dude.
Otherwise do Query DRive to assure that your drives are online and
available and not already in use. Check your storage pool definitions
to assure that hierarchical migration actually has somewhere to go.
Assure that your MAXSCRatch value is appropriately high.
See also "Insufficient mount points, 3590" in the CONDITIONS section,
further down in this document.
ANR1142I Moving data for collocation cluster 3 of 10 on volume ______.
A tape reclamation is in progress.
ANR1144W Move data process terminated for volume ______ - storage media
inaccessible.
Seen on 3590E drives where the tape was being mounted but, because the
3590E has two springs on the mouth flap instead of one on the 3590B
drives, the gripper could not push the tape into the drive mouth with
sufficient force to get the drive to take it in. So the tape remains
trapped in limbo.
See also: ANR8447E
ANR1149W Move data process terminated for volume ______ - insufficient space in
target storage pool.
Usually, because you are down to your last tape, either in the scratch
pool or defined to a storage pool.
ANR1163W Offsite volume ______ still contains files which could not be moved.
Indicates that a reclamation or MOVe Data was attempted on an offsite
volume, but it still contained data. When MOVe Data or reclamation is
performed for an offsite volume, files are obtained from a primary
storage pool or possibly from another copy storage pool. Message
ANR1163W is issued when residual files are left on the offsite volume
after this move is completed. This typically occurs when the server
cannot copy files from another storage pool because they reside on
volumes that are unavailable or offline. Another possibility is that
files in the source storage pool are marked as damaged and therefore do
not get moved. Check your activity log for messages indicating the
reason why the files were not moved. I would not expect an
'AUDit Volume ... Fix=Yes' to correct the problem, as you are indirectly
using proxy volumes; but you might give it a shot. It might also be the
case that no onsite copy was made of some of the data that went offsite.
Do a Query CONtent on the subject volume and research from there.
ANR1171W Unable to move files associated with node ____, filespace ____ fsId _
on volume ______ due to restore in progress.
Look for an active restore via 'Query SEssion', or a held-off
restartable restore via 'Query RESTore'.
ANR1173E Space reclamation for offsite volume(s) cannot copy file in storage
pool storage ______: Node ____, Type ____, File space ____, fsId ____,
File name _____.
Running expiration or migration at the same time as your offsite volume
reclamation can cause this condition.
Can also occur if you recently switched your offsite pool from
non-collocated to collocated. During offsite reclamation for a
collocated pool, the server checks the clustering information for the
objects on the volume being reclaimed. Clustering information is
typically based upon node name, and filespace. Non-collocation means
that multiple nodes can be mixed on a given storage pool volume. During
offsite reclamation processing, the server moves only one node's data at
a time: the objects for other nodes will be skipped and the ANR1173E
message will appear. Multiple passes are required to fully clear the
offsite volume. To deal with this you can do one of:
- Increase the MOVEBatchsize server option value.
- Issue MOVe Data or MOVe NODEdata, depending upon how many nodes are
on that volume and/or the nature of the storage pool.
When a primary disk pool is involved, an additional step is to have a
primary tape pool to which it can migrate. The logic for offsite volume
reclamation is slightly different when the copy resides in a tape
(sequential) pool instead of on disk: the server retries the reclamation
(reprocesses the volume) more when the copies are on tape.
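The "one node per pass" behavior described above can be pictured with a
toy model. The following Python sketch is a deliberate simplification
and not how the server is implemented: the data layout and the choice
of which node goes first are assumptions for illustration.

```python
def reclamation_passes(volume_objects):
    """Count the passes needed to empty a volume when each pass moves the
    objects of a single node and skips everything else.

    `volume_objects` is a list of (node, object_name) pairs on the volume.
    """
    remaining = list(volume_objects)
    passes = 0
    while remaining:
        node = remaining[0][0]  # the node handled on this pass
        remaining = [obj for obj in remaining if obj[0] != node]
        passes += 1
    return passes


# Hypothetical non-collocated volume holding data from three nodes.
mixed_volume = [("NODE_A", "f1"), ("NODE_B", "f2"), ("NODE_A", "f3"), ("NODE_C", "f4")]
print(reclamation_passes(mixed_volume))
```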
ANR1216E BACKUP STGPOOL: Process ___ terminated - storage media
inaccessible. (SESSION: ____, PROCESS: ___)
Seen where a 3590 tape drive problem (which causes an Int Req condition
on a 3494 library) incites the library, unto itself, to eject the tape,
leaving TSM to think the tape is still in there. The 3494 display panel
will have an Int Req message like: "Damaged volser (001018) ejected to
the convenience I/O stations (03-01-2005 22:12:55)". In this instance,
the tape leader block was missing - snapped off in the drive. You need
to at least put the drive offline, get it repaired, and then check the
storage pool tape out (without eject), then check it in as Private.
ANR1221E command: Process <process_id> terminated - insufficient space in target
copy storage pool.
Assure that there are scratch volumes available in the copy storage
pool, or at least enough space left on pre-existing tapes that the
BAckup STGpool can proceed. (Remember that though there may be space
left on existing volumes, if the next file to be backed up is too large
to fit, insufficient space results. Msg ANR1405W should appear if no
scratch volume available.)
Check your MAXSCRatch value to assure that you are not artificially
limiting how many tapes may be used. Beware not fully understanding
what MAXSCRatch really means (see the definition in this doc), and
setting the value too low: if in doubt, boost it - which may cause the
problem to disappear. You may intentionally have MAXSCRatch=0 so that
operations will use only volumes specifically Defined to that storage
pool, which is fine.
If using a tape library, assure that your scratches have the proper
category codes to be used by the Devclass.
Unlikely, but: Do 'Query STGpool <PoolName> F=D' and assure "Access:
Read/Write".
The destination storage pool may numerically have enough space, but it
is necessary that its volumes are on-site for the backup to occur.
If the empty volumes are Defined to the storage pool (rather than using
Scratches), assure that your free volumes are read-write and not
Unavailable or Offsite.
As always, make sure you are doing regular Expirations to assure free
tapes.
ANR1315E Vary-on failed for disk volume ...... invalid label block
Could occur, for example, if something failed to mount in your Unix
system, or the volume or file system or file is missing.
Can also occur if you rename the volume in ADSM or the operating system,
but not both, making for inconsistency.
See "Raw logical volume" entry for notes on how *SM uses such a volume.
ANR1339W Session SESSION_ID underestimated size allocation request - _NN_ MB
more space was allocated to allow the operation to continue.
New msg with APAR IC37437 to deal with performance degradation involved
in recomputing server DISK storage pool space estimation as files move
from client to server or within server. The server will now use
subsequently larger allocation sizes to more efficiently and optimally
store the file in the DISK type storage pool. This was supposed to be
fully incorporated into 5.2.2.0; but swg21165187 cites a subsequent
"fudge" - which customers report doesn't help on clients < 5.2.2.0.
ANR1341I Scratch volume _______ has been deleted from storage pool ________.
The corollary of message ANR1340I.
Because the REUsedelay period has expired, or a DELete Volume was
performed, or Reclamation emptied and returned it to scratch status.
Note that DELete VOLHistory does not cause this message, because such
volumes are not in storage pools. A less common reason for this message
is that a scratch volume is mounted for an operation, but the operation
does not proceed: nothing was written to the unused tape, and so it goes
right back to scratch.
ANR1342I Scratch volume 000122 is now pending - volume will be deleted from
storage pool ________ after the reuse delay period for this storage
pool has elapsed.
Will be followed after that many days by "ANR1341I Scratch volume
______ has been deleted from storage pool ________."
ANR1343I Unable to delete scratch volume
Reclamation is still running: volumes which it has emptied remain locked
until the reclamation ends.
ANR1401W Mount request denied for volume ______ - mount failed.
This is a summary message: the actual problem should be spelled out in
preceding messages, such as ANR1144W storage media inaccessible.
Note: In a shared library environment, beware restarting the TSM server
application on the library manager TSM server system but not restarting
the TSM server application on the other system. In a shared environment,
both TSM server applications have to be restarted if one is.
LTO2 had an early microcode problem wherein a tape volume would fill and
the drive would not reset its EOT (end-of-tape) flag. Subsequently, when
a scratch volume was mounted in the same drive and TSM tried to write
the BOT (beginning-of-tape) information, the drive reported to the
driver (and thence to TSM) that it was at EOT, and so TSM was unable to
write to the tape. This LTO2 drive microcode problem is fixed at
microcode level 37E1 and above.
ANR1402W Mount request denied for volume 000003 - volume unavailable.
As when doing 'DSMSERV DISPlay DBBackupvolumes
DEVclass=OURLIBR.DEVC_3590 VOL=000003'. The volume probably has a
category code which is not one defined as belonging to this library.
ANR1405W Scratch volume mount request denied - no scratch volume available.
Could mean exactly that. Had you checked in tapes of the right type,
as Scratch? Is your MAXSCRatch value realistic?
For 3995 (optical storage): You checked in your scratch volumes as type
OPTICAL whereas your Devclass Device Type is WORM, or vice-versa.
ANR1410W Access mode for volume 000081 now set to "unavailable".
Seen when a tape was to be mounted on a library drive, but the Load
failed.
ANR1411W Access mode for volume ______ now set to "read-only" due to write
error.
Often accompanied by server msg "ANR8359E Media fault detected ..." and
client msg "ANS4301E Server detected system error".
Most usually the result of dirty tape heads...which can occur if a
manual library has not been manually cleaned or in an automatic library
the automatic cleaning has been disabled or cleaning cartridges have
been exhausted. Could also be a dirty or defective tape.
If using DLT8000 drives, you must use DLT type IV or better cartridges.
Make sure you use the TSM driver software to control the drive, rather
than the driver from the operating system.
Seen particularly in Compaq servers with DLT8000 drives due to problems
with the standard NT SCSI adapter drivers, when the Compaq SCSI
Controller drivers from the SSD were NOT used. The standard NT SCSI
drivers seem to have problems communicating with DLT8000 drives; but
DLT4000 and DLT7000 worked fine.
Note that 'Query Volume' will show "In Error State?: Yes" for such a
volume, and that a Copy Storage Pool volume whose data has all been
removed will remain in Pending state indefinitely because of the error:
only a 'DELete Volume' will release it.
ANR1420W Read access denied for volume ______ - volume access mode =
"unavailable".
Perhaps seen in an Export operation (with msg ANR0670W). Volumes which
are to participate in such operations need to have a Status value which
allows them to participate.
ANR1423W Scratch volume 123456 is empty but will not be deleted - volume access
mode is "offsite".
The volume's contents have evaporated through expirations and the like
such that the volume is finally empty. But because it is offsite, it
cannot be re-used. It has to be brought back onsite, and then its
status can be changed to read-only so that it will be deleted.
Or you could strong-arm the process by performing:
UPD VOL * ACC=READW WHERESTG=____ WHERESTATUS=EMPTY WHEREACC=OFfsite
ANR1425W Scratch volume 000002 is empty but will not be deleted - volume
state is "mountablenotinlib".
Disaster Recovery state. Do 'MOVe MEDia WHEREState=MOUNTABLENotinlib'
to delete the scratch empty volume.
ANR1440I All drives in use. Process 61 being preempted by higher priority
operation.
As when you are doing a Reclamation and an HSM Recall, Retrieve, or
Restore process has arisen.
ANR1469E DEFINE SCRIPT: Command script _____, Line ____ is an INVALID command
Commonly, a server script employs a GOTO, and the target label does not
have a trailing colon, or the label is more than 30 chars long.
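The two label checks named above can be run on a script before it is defined. This is only a minimal sketch in Python, assuming that a label is a single word on its own line followed by a colon, that GOTO targets are matched without regard to case, and that the function name check_goto_labels is hypothetical (it is not part of any TSM tooling):

```python
import re

MAX_LABEL_LEN = 30  # limit cited in the ANR1469E entry


def check_goto_labels(script_lines):
    """Return a list of problems found with labels and GOTO targets."""
    labels = set()
    problems = []
    # First pass: collect labels (assumed syntax: one word, then a colon).
    for line in script_lines:
        m = re.fullmatch(r"(\w+):", line.strip())
        if m:
            labels.add(m.group(1).lower())
            if len(m.group(1)) > MAX_LABEL_LEN:
                problems.append(
                    f"label '{m.group(1)}' exceeds {MAX_LABEL_LEN} characters")
    # Second pass: every GOTO target must match a collected label.
    for line in script_lines:
        m = re.search(r"\bgoto\s+(\w+)", line, re.IGNORECASE)
        if m and m.group(1).lower() not in labels:
            problems.append(
                f"GOTO target '{m.group(1)}' has no matching 'label:' line")
    return problems
```

An empty list means neither of the two common causes was found; the script may still be rejected for other reasons.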
ANR1639I Attributes changed for node <TSM_Nodename>: TCP Address from to
<IP_Address>. (SESSION: ____)
Possibly because the client nodename or IP address differs from their
values in the prior session with the client (which may happen in a
cluster failover). Or maybe the client is employing a Nodename spec
which does not match a reverse lookup on its IP address. See: GUID
ANR2020E UPDATE SCHEDULE: Invalid parameter - <Whatever>'
Typically because you flubbed the quoting of the OBjects parameter.
See the guidelines in the Admin Ref manual under DEFine SCHedule.
ANR2034E QUERY FILESPACE: No match found using this criteria.
And you expected a match for the filespace name you typed. It is
probably the case that you are unwittingly dealing with a Unicode
filespace, where the name created by the PC is itself Unicoded. You can
deal with this in one of two ways:
1. Use "NAMEType=UNIcode" and enter the filespace name accordingly.
2. Simply do 'Query FIlespace' with no operands and find the filespace
in the full output.
ANR2099I Administrative userid __________ defined for OWNER access to node ____
It is a new feature in server level 3.1.2.1, part of the support for the
web B/A client, which is an administrator client. The new admin id lets
the client owner be a limited power admin to run the web B/A client for
his own machine. You can suppress the new admin id's by adding
'USER=NONE' to the 'REGister Node' command.
ANR2100I Activity log process has started.
This is the entry in the activity log written when the server starts
logging to it (at server restart).
ANR2102I Activity log pruning started: removing entries prior to MM/DD/YYYY
HH:MM:SS
ANR2103I Activity log pruning completed: NNN records removed.
The above 2 messages result from Activity Log pruning, as controlled
by the 'Set ACTlogretention N_Days' value. Changing the value
downward tends to kick off the space reclamation in the ADSM server
database, where the Activity Log lives.
ANR2111W BACKUP STGPOOL: No data to process.
When attempting to do a 'BAckup STGpool' from a primary disk pool to a
copy storage pool. This is not abnormal, and typically occurs when the
data that had been in the disk pool had already been migrated to the next
pool in its defined hierarchy such that there is no longer any new data
to be backed up from the disk pool. Do a 'Query STGpool' to see - and
don't get thrown off by cached data in the disk pool.
ANR2152E REMOVE NODE: Inventory references still exist for node ____.
You attempted REMove Node <NodeName>, but that could not be fulfilled
because filespaces still exist for the node: they must be removed,
first, as via DELete FIlespace.
ANR8212W Unable to resolve address for <Hostname>. (SESSION: ___)
Seeming DNS issue. If no one has changed hostnames in the client options
files, then suspect the DNS service which provides lookup service to
that TSM server system. Make sure no one has changed the
/etc/resolv.conf on that system. Use the 'host' and/or 'dig' commands to
check things out. If no cause can be found, try putting the full
hostname into the options file - including the domain. If still
nothing, try putting the IP address rather than host network name into
the options file. Only as a final resort would I recommend employing the
DNSLOOKUP NO server option choice. That is, TSM is calling attention to
systemic problems in your shop, which should be fixed rather than
circumvented.
ANR2321W Audit volume process terminated for volume ______ - storage media
inaccessible.
Seen in libraries where the tape is not in a place where the library can
obtain and mount it. Expect TSM to mark the volume Unavailable upon
finding it inaccessible. Problem can be a defective drive.
ANR2361E BACKUP DB A full database backup is required.
1) You are running BAckup DB for the first time, and don't have a full
backup to start with.
2) You have already taken 32 incremental backups. The maximum number
of incremental backups that ADSM allows between full backups is 32.
ANR2362E command: DATABASE BACKUP IS NOT CURRENTLY POSSIBLE - COMPRESSED LOG
RECORDS EXIST IN THE CURRENT TRANSACTION CHECKPOINT.
A BAckup DB command was issued but a database backup cannot be started.
Log compression has recently taken place, and the compressed log records
are still part of the current transaction checkpoint. After these log
records are no longer part of the current checkpoint a backup can take
place. Reissue the command at a later time.
What this is saying is that there is something running which has
database work-in-progress tied up. You can wait, or see what process or
session is causing this, and possibly cancel it if it persists in
blocking server db backups. A server restart will certainly clear
things.
ANR2391E BACKUP DEVCONFIG: Server could not write device configuration
information to <Windows_Networked_Drive>.
The TSM "service" process, which runs under a "service context", cannot
see networked drives - a Microsoft issue. Drives are networked and
visible in a "user" context.
ANR2404E DEFINE DBVOLUME: Volume /dev/rdsk/c0t4d0s0 is not available. or:
Anr2404e volume /dev/rdsk/c0t1d0s1 not available return code 14.
See "Raw Logical volume in Sun/Solaris".
ANR2404E - DEFINE VOLUME: Volume [volname] is not available.
Can occur in AIX 4.2 after upgrading from AIX 4.1 such that server
modules "dsmserv.42" and "dsmfmt.42" must be put in place of "dsmserv"
and "dsmfmt" to allow the use of files greater than 2GB in size.
Ref: APAR IX75955.
ANR2411E MOVE DATA: Unable to access associated volume ______ - access mode is
set to "unavailable".
Move Data was invoked on a volume; but the first file on the volume is
spanned from a prior volume whose access mode prevents access to it.
ANR2420E DELETE VOLUME: Space reclamation operation already in progress for
volume ______.
The Reclamation process that reclaimed this volume that you are trying
to delete is still running. Wait for it to finish, or cancel it.
Example: You kicked off a reclamation at 08:00. It tried to reclaim
volume 001931, but that volume had some bad files on it, so the volume
could not be emptied. You notice this at 11:00 and at that time do a
Move Data 001931. That yields the ANR2420E message, as the reclamation
process from 08:00, which worked on 001931, is still running.
ANR2434E DELETE DBVOLUME: Insufficient space on other database volumes to delete
volume /var/adsmserv/db.dsm.
Occurs when 'Query DB' Maximum Extension shows no further space left for
'EXTend DB': the 'DELete DBVolume' wants that space. So do a 'REDuce DB'
to placate ADSM, then 'EXTend DB' after the DELete DBVolume.
ANR2438E <Command>: Insufficient database space would be available following a
reduction by the requested amount.
You are attempting to perform REDuce DB; but the database does not have
enough free space to reduce by the amount specified. See the "REDuce
DB" entry.
ANR2445E DELETE LOGVOLUME: Insufficient space on other recovery log volumes to
delete volume __________
To delete a log volume, Query LOG needs to show a Maximum Extension
value at least as large as the volume being deleted. Do a REDuce LOG
as needed.
ANR2452E DEFINE LOGVOLUME: Maximum recovery log capacity exceeded.
Per APAR IC15376, the recovery log should not exceed 5 GB (5440 MB).
ANR2561I Schedule prompter contacting ____ (session NNNN) to start a scheduled
operation.
As seen with client option "SCHEDMODe PRompted" for ordinary client
schedules, or per the server 'DEFine CLIENTAction' command, for the
server to communicate with the designated client to start a schedule.
ANR2576W An attempt was made to update an event record for a scheduled operation
which has already been executed - multiple client schedulers may be
active for node ________.
Check to see if you have any restartable restores going on for the
client in question: 'Query RESTore'. Cancel the session number if you
do, and see if that makes the problem go away.
ANR2579E Schedule A in domain B for node C failed (return code __).
Seen when a POSTSchedule or PRESchedule command failed: The only
indication that the command wasn't successful was a non zero return code
in the schedule log. (In Unix, return code 1 typically accompanies
"Command not found".)
Or could be a problem with one file system.
Or a stale NFS file handle (as in mount no longer available).
Examine the dsmsched.log and dsmerror.log files.
May be accompanied by msg ANR1512E (q.v.).
ANR2622E DEFINE ASSOCIATION: No new node associations added
Are the nodes you attempted to associate in the same domain?
ANR2716E SCHEDULE PROMPTER WAS NOT ABLE TO CONTACT CLIENT node_name USING TYPE
address_type (high_address low_address)
The address_type value is usually 1, indicating TCP/IP.
The *SM server cannot connect to the client, possibly because:
- The scheduler process is not running on the client.
- The scheduler process cannot run because it is in a stopped state or
has terrible dispatching values relative to other processes which
demand service from the operating system.
- Maybe there is a time shift between the client and server. (Remember
that the client schedule process only opens and listens to port 1501
during the schedule period.)
- Port 1501 is not open on a firewall (if any).
- The client network address is mistyped in dsm.opt, so the server
tries to contact the client through the wrong address.
- If the scheduler is running under a specific NT account, check that
the account is not locked.
- The scheduler appears to be running, but it actually hangs.
- The client was booting at the scheduled time.
- There is a firewall between the server and the client, and port
numbers cannot be inherited.
That the web-connection works is no guarantee that the scheduler works:
those services operate independently. If the client can connect to the
server, but not vice-versa, that mostly points to one of the
above-mentioned reasons. Try excluding each possibility above, and find
the source of the problem.
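Several of the causes above (scheduler not listening, firewall, wrong address) can be narrowed down by trying a plain TCP connection from the server system to the client's scheduler port during the schedule window. A minimal sketch in Python; the function name can_connect is hypothetical, and port 1501 is simply the port cited in this entry:

```python
import socket


def can_connect(host, port=1501, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusal, timeout, unreachable network and failed name lookup.
        return False
```

A False result says only that the port could not be reached from where the probe ran; it does not say which of the listed causes is responsible.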
ANR2812W License Audit completed - WARNING: Server is NOT in compliance with
license terms
You have more client demand for server sessions than you have licenses.
You have to (possibly acquire and) register more licenses.
ANR2841W Server is not in compliance with license terms.
As the vendor description says, begin by doing Query LICense to see
what's amiss. It may simply be that an "in use" number is greater than
the corresponding "licensed" value. For example, you defined a library,
overlooking the need to register a license for library use. You can
also do AUDit LICenses to reveal what's wrong. Do not just do 'REGister
LICense FILE=*lic*', as that may result in excessive licensing: find out
what the problem is, and treat that. A simple cause is that the named
FILE may not exist in the server directory.
May be caused by having attempted a 'MOVe DRMedia' where DRM is not
licensed: the attempt causes conflicting Query LICense state:
Is disaster recovery manager in use ?: Yes
Is disaster recovery manager licensed ?: No
That can be cleared by doing:
- Issue a Query DRMedia.
- For all the volumes that have a NotMountable state, perform the
following:
1. Update volume's access mode to offsite:
UPD VOL volumename ACCESS=OFFSITE
2. Update volume's ORM state to NULL:
UPD VOL volumename ORMSTATE=""
3. Update volume's access mode to readwrite:
UPD VOL volumename ACCESS=READWRITE
- After all volumes have been updated, perform the following:
1. Issue QUERY DRMEDIA to make sure all volumes are now in
"Mountable" state.
2. Issue AUDIT LICense
3. Issue Query LICense
The output of the Query LICense should now show that DRM is NOT in use.
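Where many volumes are in the NotMountable state, the three per-volume updates above can be generated rather than typed. A minimal sketch in Python that only builds the command text shown in the steps; the function name drm_reset_commands is hypothetical, and the volume names would come from your own Query DRMedia output:

```python
def drm_reset_commands(volumes):
    """Build the three UPD VOL commands from the steps above, per volume."""
    commands = []
    for vol in volumes:
        commands.append(f"UPD VOL {vol} ACCESS=OFFSITE")
        commands.append(f'UPD VOL {vol} ORMSTATE=""')
        commands.append(f"UPD VOL {vol} ACCESS=READWRITE")
    return commands


# Example with a made-up volume name: print the commands for review.
for cmd in drm_reset_commands(["000123"]):
    print(cmd)
```

The output is meant to be reviewed and then pasted into an administrative session or macro; nothing here talks to the server.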
An insidious cause is your computer's clock being wrong. See "REGister
LICense".
See also: ANR2812W
ANR2909E The SQL statement is incomplete; additional tokens are required.
Commonly occurs where you were so absorbed in coding the handling of SQL
column values that you forgot to add the " FROM <TableName>" at the end
of the SQL query.
ANR2914E SQL identifier token '<Whatever>' is too long; name or component
exceeds 18 characters.
*SM's SQL processing places this limit on identifiers. This error is
most commonly encountered where you entered a string - such as a storage
pool name - into a Select, neglecting to enclose it in single quotes.
ANR2938E The column '_____' is not allowed in this context; it must either be
named in the GROUP BY clause or be nested within an aggregate function.
Usually because you coded a SELECT with a bare column name, plus a Sum()
of another column: the two intentions conflict in that Sum() asks for a
single total across all rows, whereas the bare column name asks for
every row to be reported.
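The usual cure is to name the bare column in a GROUP BY clause, so that one total is reported per value of that column. A minimal sketch using Python's sqlite3 module purely as a stand-in for the *SM SQL processor; the table occ and its columns are hypothetical, and SQLite itself happens to tolerate the problem form, so only the corrected form is executed here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occ (node_name TEXT, physical_mb REAL)")
conn.executemany(
    "INSERT INTO occ VALUES (?, ?)",
    [("ALPHA", 100.0), ("ALPHA", 50.0), ("BETA", 25.0)],
)

# Problem form (the kind of Select that draws ANR2938E):
#   SELECT node_name, SUM(physical_mb) FROM occ
# Corrected form: the bare column is named in GROUP BY.
rows = conn.execute(
    "SELECT node_name, SUM(physical_mb) FROM occ "
    "GROUP BY node_name ORDER BY node_name"
).fetchall()
print(rows)
```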
ANR2940E The reference 'FSERV.STGP_COPY' is an unknown SQL column name.
In a Select, you probably provided an object name (volume name, stgpool
name) without enclosing single quotes such that the SQL processor
thought it to be a column name.
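When Selects are generated by a script, the quoting can be done in one place so that it is never forgotten. A minimal sketch in Python; the helper names are hypothetical, the VOLUMES table with its VOLUME_NAME and STGPOOL_NAME columns is assumed from common usage rather than taken from this entry, and doubling an embedded single quote is standard SQL practice that has not been verified here against the *SM SQL processor:

```python
def sql_literal(value):
    """Wrap a string in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def volumes_in_pool_query(pool_name):
    """Build a Select in which the pool name is a quoted literal."""
    # Assumed table and column names; adjust to the table you are querying.
    return ("SELECT VOLUME_NAME FROM VOLUMES WHERE STGPOOL_NAME="
            + sql_literal(pool_name))


print(volumes_in_pool_query("FSERV.STGP_COPY"))
```

Without the quotes, a name such as FSERV.STGP_COPY is parsed as a column reference, which is what yields ANR2940E (or ANR2914E when the name is long).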
ANR2958E SQL temporary table storage has been exhausted.
A Select has invoked the SQL processor, which for this query needs to
use work space within the ADSM database...but it needs contiguous space,
probably at the end, and there isn't enough.
Per APAR IY08737:
Documentation in the Admin Reference, Admin Guide, and the Message
manual need to be updated to indicate the temporary table is created in
the Maximum Reduction location of the DB and large Query's, Select and
Web Administrative Interface commands will fail with ANR2958E SQL table
storage exhausted if the Maximum Reduction is 0 or is not large enough
to fulfill the Query or Select command.
Otherwise, consider adding a volume to the database for the duration of
the Select processing, thereafter Delete it.
ANR3354W Locally defined administrative schedule XXXX is active and cannot be
replaced with a definition from the configuration manager.
An active admin schedule was not recorded as a managed schedule (i.e.,
was locally defined) and therefore could not be replaced during refresh
processing. To verify that the schedule is locally defined, issue the
Query Schedule T=A F=D command and look at the "Managing Profile" field;
if this field is empty, the schedule is locally defined rather than
managed. Why is the schedule treated as locally defined even though it
was created during refresh processing? Perhaps the refresh processing
failed after creating the schedule; in this situation the new schedule
will not be marked as managed and therefore will be treated as locally
defined. Another possibility is that after the schedule was created,
you deleted the subscription to the managing profile, leaving behind the
administrative schedule which would now be treated as locally defined.
ANR4306I AUDITDB: Processed NNNN database entries (cumulative).
Progress message during a 'dsmserv auditdb'. Customers have reported
that this message may repeat with the same number of entries - but will
ultimately go on and finish. Be reasonably patient. There are some
circumstances where this will go on indefinitely, however.
ANR4513E A database lock conflict was encountered
May be accompanied by ANR0538I.
A lock conflict indicates that multiple processes and/or sessions are
contending for the same TSM db area. Do Query SEssion and Query PRocess
to look for such. Some things, like Expiration and Delete Volume, are
heavy db updaters, so it is necessary to keep too much else from
happening while they run.
ANR4556W Warning: the database backup operation did not free sufficient recovery
log space to lower utilization below the database backup trigger. The
recovery log size may need to be increased.
Well, the DBBackuptrigger event occurred because something was putting a
lot of demand on the Recovery Log; and you would expect it to still be
running as the triggered DB backup was running, making Recovery Log
relief problematic.
ANR4571E Database backup/restore terminated - insufficient number of mount
points available for removable media.
See: "Drives, not all in library being used"
ANR4639I Restored nnnnn of nnnnn database pages.
Message emitted every 30 seconds during a 'dsmserv restore db', to show
restoral progress.
ANR4706W Unable to open file CommandLineBeanInfo.class to satisfy web session 31
From the Messages manual:
Explanation: A web browser requested a file and the server could not
find the file on the local file system. Special note, some files
requested from the server are not valid and are caused by a browser
error. For example, a request for CommandLineBeanInfo.class is not a
valid file request. However, a request of a GIF image or HTML page
should not produce this error.
From APAR: IX86373
The problem is that Internet Explorer thinks that the ADSM CommandLine
applet is a JavaBean and requests a file that does not exist. This error
can be ignored. As for resolving the error message created by Internet
Explorer, this is a problem with the browser not with ADSM. The applet
is not a JavaBean. Please note, error messages about missing GIF images
or HTML files should not be ignored. The user should check that the
file exists. If the file does exist, verify the permissions of the
file.
ANR5014E Unable to open disk 0A09 - error 12 received from DISKID.
Return code 12 from DISKID means that the volume is not reserved.
Before the volume was defined to ADSM, it had to have been reserved.
The DSMINST and DSMMDISK execs both prepare volumes for use, but it
sounds like something has happened to the volume, and it is no longer
reserved.
ANR5099E Unable to initialize TCP/IP driver - error binding acceptor socket 0
(rc = 0)
In OS/390 (MVS), TCP/IP runs as a task separate from the operating
system itself. Perhaps someone restarted TCP/IP, leaving all dependent
applications stranded: TSM needs to be restarted to pick up on its
communications. If this occurred right after an IPL, look for your OS
people having "planted" an OS change which took effect with the IPL,
which now keeps *SM from working (maybe OS definitions, TCP/IP software,
execution libraries, etc.). Make sure that the TSM server started task
userid has an OMVS segment with a UID=0 and GID=0.
ANR6913W PREPARE: No volumes with backup data exist in copy storage pool ____.
Have you run Set DRMCOPYstgpool to specify that the copy storage pool is
to be managed by TSM? (Check with Query DRMSTatus.)
ANR7804I An ADSM server is already running from this directory.
The ADSM server has attempted to open the adsmserv.lock file in the
current directory but failed to do so because the file indicates that a
previously started server already has the file open.
Examine the contents of the adsmserv.lock file. The PID for the server
that is or was running is recorded in this file. Two ADSM servers
cannot be started from the same directory. You may remove the
adsmserv.lock file and attempt to start the server ONLY if a 'ps -e'
does not show the PID to be dsmserv.
If there is no adsmserv.lock file, then the more trivial cause is that
you inadvertently invoked dsmserv and you are not root (superuser).
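The check of the recorded PID can be scripted on Unix. A minimal sketch in Python, assuming the PID is the first integer appearing in the lock file (the exact file layout is not given in this entry) and using hypothetical function names; it tells you only whether some process with that PID exists, so a 'ps' is still needed to see whether that process is dsmserv:

```python
import os
import re


def lock_file_pid(path):
    """Return the first integer found in the lock file, or None."""
    with open(path) as f:
        m = re.search(r"\d+", f.read())
    return int(m.group()) if m else None


def pid_is_running(pid):
    """Unix only: signal 0 tests for existence without disturbing the process."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Only when pid_is_running() is False, or 'ps' shows the PID to be something other than dsmserv, is removing the lock file in keeping with the advice above.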
ANR7807W Unable to get information for file _______. A file or directory in the
path name does not exist.
Typically seen in server restart where the administrator has renamed or
otherwise relocated the Recovery Log volume(s). The server has the
Recovery Log pathnames recorded in its database, and expects them to be
present at start-up. You need to reinstate the Recovery Log volumes in
your operating system file system.
ANR7823S Internal error LOGSEG871 detected.
ANR7837S Internal error LOGSEG871 detected.
Usually indicates that your Recovery Log is full during server restart.
You should have taken architectural steps, per the Admin Guide manual,
to prevent this; but now it is too late (but read on). If not at the
maximum Recovery Log size, refer to the Admin Guide to allocate more
space to your Recovery Log: use dsmfmt to create a new Recovery Log
volume; run 'dsmserv extend' specifying the added log volume; restart
your server. If you believe you are at the maximum Recovery Log size,
you still might be slightly under the absolute to-the-megabyte maximum
size, in which case you could allocate that many more megabytes and
hopefully get the server to start. If running in Rollforward mode, you
might instead run in Normal mode, but at the risk of transaction loss.
ANR7838S Server operation terminated - internal error BUF087 detected.
Will be accompanied by Activity Log messages like:
ANR9999D blkdisk.c(1198): Error writing to disk adsm-db1.primary.
ANR0252E Error writing logical page 141410 (physical page 141666) to
database volume adsm-db1.mirror.
Explanation (writev error): A file cannot be larger than the value
set by ulimit.
The server was started with Unix Resource Limits inadequate to encompass
the size of files which the server must deal with. Look first to the
filesize limit specs in the shell you are using (Bsh ulimit), which can
cause them to be lower than the operating system limits defined for the
username (as per AIX /etc/security/limits) - which may also need ceiling
boosts. The TSM server is normally run as root: in some systems, root
may not have had ceiling limits of sufficient size. Sometimes the TSM
server is run as other than root: in the OS, non-root users are
typically given much smaller Unix Resource Limits than root - much too
small to run the TSM server, and need adjustment.
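The file size limit actually in effect for a process can be inspected from within that process. A minimal sketch in Python using the standard resource module (Unix only); the function name file_size_limit_ok is hypothetical, and the check must be run from the same shell and user that start the server for the result to mean anything:

```python
import resource


def file_size_limit_ok(needed_bytes, soft_limit=None):
    """Return True if the soft file-size limit allows a file of needed_bytes."""
    if soft_limit is None:
        # Soft limit currently in effect for this process (what ulimit shows).
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= needed_bytes


# Example: would a 4 GB database volume fit under the current limit?
print(file_size_limit_ok(4 * 1024**3))
```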
ANR7860W INSUFFICIENT SPACE AVAILABLE FOR FILE file name.
Usually when DEFine SPACETrigger is in effect and you have not left
sufficient space for the expansion you specified to actually occur.
ANR8208W TCP/IP driver unable to initialize due to error in BINDing to Port
1500, reason code _____
(The reason code is the return code from the TCP/IP bind API, which is
to say the operating system error number, so see your particular
operating system reference for the meaning.)
The Tivoli message description has good advice: use the 'netstat'
command (or perhaps 'lsof') to look for another process which has
control of that port, and that your server options file does not specify
more than one use of that port number.
It may be that you are attempting to start another instance of the
server. Make sure that the server process is gone before trying to
restart.
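Whether the port is already taken can also be tested directly by attempting the same kind of bind. A minimal sketch in Python; the function name port_is_free is hypothetical, and the errno it returns is the operating system error number, comparable to the reason code in the message:

```python
import socket


def port_is_free(port=1500, host=""):
    """Try to bind host:port; return (True, None) or (False, errno)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True, None
    except OSError as e:
        return False, e.errno
    finally:
        s.close()
```

Run it while the server is stopped: a (False, ...) result means something else holds the port, and 'netstat' or 'lsof' will say what.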
ANR8209E Unable to establish TCP/IP session with 127.0.0.1 - connection refused
Your dsm.opt has 127.0.0.1 set for TCPCLIENTAddress, possibly put there
by the GUI preference editor. Remove the unnecessary, incorrect option
and restart the client scheduler.
ANR8212W Unable to resolve address for ____. (SESSION: ___)
That sounds like a DNS lookup problem. If no one has changed the
options files, then suspect the DNS service which provides lookup
service to that TSM server system. Make sure no one has changed the
/etc/resolv.conf on that system. In Unix, use the 'host' and/or 'dig'
commands to check things out. If no cause can be found, try putting the
full hostname into the options file - including the domain. If still
nothing, try putting the IP address rather than host network name into
the options file. One customer encountered a weird hostname showing up,
with a newline in it: upon restarting the CAD service on that client,
the error went away (but someone may have undone a bad change there.)
See also: DNSLOOKUP
(May be accompanied by msg ANR2716E.)
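The forward and reverse lookups that 'host' and 'dig' perform can also be scripted for a list of client hostnames. A minimal sketch in Python, to be run on the TSM server system so that it uses the same resolver; the function name check_name is hypothetical, and the two resolver arguments exist only so that the lookups can be substituted when testing:

```python
import socket


def check_name(hostname, forward=socket.gethostbyname,
               reverse=socket.gethostbyaddr):
    """Resolve hostname, then reverse-resolve the address; report each step."""
    result = {"hostname": hostname, "address": None,
              "reverse_name": None, "error": None}
    try:
        result["address"] = forward(hostname)
        result["reverse_name"] = reverse(result["address"])[0]
    except OSError as e:  # socket.gaierror and socket.herror are subclasses
        result["error"] = str(e)
    return result
```

An error with no address points at the forward lookup; an address with no reverse name points at the reverse (PTR) side.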
ANR8214E Session open with 111.222.333.444 failed due to connection refusal.
The TSM server is attempting to contact the TSM client on the system
identified by the given IP address, to initiate a scheduled session, as
with client option "SCHEDMODe PRompted" in effect, but could not.
Is the client system up, connected to the network, and the network
working (can you 'ping' it?)? If so, is the client schedule process (or
the CAD) not present, or stopped? Has the client been told to use an
unexpected port number? Is there a network problem, or a client problem
in performing data communications?
ANR8216W Error sending data on socket <number>. Reason 32.
The session was interrupted from the client end such that the connection
broke. For example, in an administrative session, the person performed a
Ctrl-C keyboard action to get out of the session, as in abandoning a
"'C' to cancel" cancelled Select operation, which otherwise can take
hours to terminate by itself. (The 32 is the Unix errno: EPIPE Broken
pipe.) The client dsmerror.log may contain "ANS1074W ***User Abort***".
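The reason number in messages like this one is an operating system errno, which can be translated without hunting through header files. A minimal sketch in Python; the function name describe_errno is hypothetical, and because errno numbering differs between operating systems it should be run on the same platform as the TSM server that issued the message:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message text) for an errno value on this system."""
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


# Reason 32 from the ANR8216W example above.
print(describe_errno(32))
```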
ANR8220W TCP/IP driver is unable to set the window size to 655360 for client
_____________. The default value will be used.
The tcpwindow size defined under TCP/IP under the operating system for
the client is set smaller than the tcpwindow size defined as the ADSM
client option. In AIX, use the 'no -a' command to see the tcpwindow
size (sb_max value).
ANR8263W End of tape detected on <device type> volume ______ in drive _____ of
library _____.
The server has detected end of tape for the specified volume. The volume
reached the end of tape before arriving at the estimated capacity value
specified in the device class.
The current process stops writing to the specified volume. The status of
the volume is set to read-only. The server accesses another volume if
more data must be stored.
Reduce the estimated capacity in the device class. You can issue the
Query Volume command to view the actual capacity of the volume after it
is full. Use the UPDate DEVclass command to change the estimated
capacity for the device class. Do not use the UPDate Volume command to
change the access mode.
ANR8290W Error sending data through shared memory. Reason <errno>.
The reason code is from the "msgrcv" system call, so it can be found
in /usr/include/sys/errno.h. In particular, 36 is EIDRM, which means
that someone (either the server or someone else using the ipcrm
command) has deleted the message queue which was being used for this
session. Look at the server activity log to see if this session was
canceled or timed out or what.
ANR8300E I/O error on library STK1 (OP=8401C058, CC=205, KEY=FF, ASC=FF,
ASCQ=FF, SENSE=**NONE**, Description=SCSI adapter failure)...
With DLT. This is actually a library failure, not a drive failure.
Do you have "fast load" enabled on the library? This is necessary to
run it with ADSM.
The OP code represents the IOCTL that the server issues to the device
driver. The values of opcodes are platform specific. You can decode
using the symbol table (option 2) in the device driver test tools,
i.e. mttest, optest and lbtest. You can also refer to the codes listed
in IBM Technote http://www.ibm.com/support/docview.wss?uid=swg21155888
("Decoding Opcodes for the TSM Device Driver IOCTLs").
ANR8300E I/O error on library STKLIB (OP=00006C02, CC=207, KEY=FF, ASC=FF,
ASCQ=FF, SENSE=**NONE**, Description=Device is not in a state capable
of performing request).
Seen with a computer word-length mismatch, as in the customer having
installed AIX 5.2 with a 32-bit kernel, but 64-bit TSM software. The
problem disappears with matching 64-bit AIX in place.
The problem may also be exhibited with message ANR8418E.
ANR8301E I/O error on library ______ (OP=________, SENSE=N/A, CC=________).
Notes: The OP and CC values are hexadecimal. OP is the operation code,
where the last four hex digits are 6Dxx (e.g., 6D32), where 6D is the
hex representation of ASCII letter 'm', which is part of the
"#define MTIO___" for the operation, as found in the
/usr/include/sys/mtlibio.h header file that is installed with the atldd
device driver, and the last two hex digits identify the specific MTIO___
operation:
31 MTIOCLM Mount a volume on a specified drive.
32 MTIOCLDM Demount a volume on a specified drive.
33 MTIOCLC Cancel a queued asynchronous library operation.
34 MTIOCLSVC Change the category of a specified volume.
37 MTIOCLQ Return information about the tape library and its
contents.
38 MTIOCLQMID Query the status of the operation for a given message
ID.
39 MTIOCLSDC Assign a category to the automatic cartridge loader for
a specified device.
3A MTIOCLEW Waits for an asynchronous library event to occur.
40 MTIOCLRC Release a previously reserved category.
41 MTIOCLRSC Reserve one or more categories.
42 MTIOCLSCA Set a category attribute.
The CC value is an MTCC_* number condition code returned from an I/O
control request. You can look them up in the manual "IBM TotalStorage
Tape Device Drivers: Programming Reference", topic "Error Description
for the Library I/O Control Requests", near the back of the manual. They
are also listed in the /usr/include/sys/mtlibio.h header file that is
installed with the atldd device driver.
Also refer to IBM site Technote 1171360 "How to understand 3494 error
message ANR8301E".
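The OP decoding described in these notes can be sketched in Python. This
is a minimal sketch built only from the table in this entry; the function
name is hypothetical, and any OP value not of the 6Dxx form, or not in
the table, simply yields None.

```python
# Operation names keyed by the last two hex digits of OP, from the table
# in the ANR8301E notes.
MTIO_OPS = {
    "31": "MTIOCLM", "32": "MTIOCLDM", "33": "MTIOCLC", "34": "MTIOCLSVC",
    "37": "MTIOCLQ", "38": "MTIOCLQMID", "39": "MTIOCLSDC", "3A": "MTIOCLEW",
    "40": "MTIOCLRC", "41": "MTIOCLRSC", "42": "MTIOCLSCA",
}

def decode_op(op_hex):
    """Map an ANR8301E OP value such as '005C6D37' to its MTIO name."""
    op = op_hex.strip().upper()
    if len(op) < 4 or op[-4:-2] != "6D":
        return None  # not a 6Dxx library operation code
    return MTIO_OPS.get(op[-2:])  # None if the code is not in the table
```

For example, decode_op("005C6D37") selects the table row for 37, the
library query operation.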
ANR8301E I/O error on library ______ (OP=005C6D37, SENSE=N/A, CC=00000023).
CC is the I/O completion code (some of which are documented in Appendix
B of the Messages manual), where 23 indicates that the device does not
exist in the library.
Seen when the mtlib status command shows a drive as "Device available to
Library.", but ADSM finds itself unable to actually use the drive (an
ADSM server 'SHow LIBrary' has it avail=0). Attempted mtlib will show:
mtlib -l /dev/lmcp0 -m -f /dev/rmt3 -V 000081
Mount operation Error - Device is not in library.
If the drive just underwent a card pack (electronics) replacement, the
classic cause of the problem is that the CE set the wrong drive serial
number into the new card pack: getting the same serial number as that of
another drive really confuses the Library Manager and causes mounts to
fail like this. In AIX, verify by first doing 'lscfg -vl rmt\*' and
then compare the true serial numbers from that report with what you get
from 'mtlib -l /dev/lmcp0 -D'. Have IBM fix any inconsistencies.
Look at the 3590 control panel: make sure that the drive is functional,
and that the appropriate Path is Online.
There was a defect in 3.1.x and 3.7.x Server code, affecting drives in a
3494 library, where an UPDate DRive command without a DEVIce
specification can cause the device number to be lost to the server.
(See APAR IC27477)
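The serial number cross-check described above can be sketched in Python.
This is a minimal sketch, assuming you have already transcribed the
serial numbers by hand from the 'lscfg -vl rmt\*' report (keyed by rmt
name) and from the 'mtlib -l /dev/lmcp0 -D' report (as a plain list).
The function name and the sample serial numbers are hypothetical, and no
attempt is made to parse either command's output.

```python
from collections import Counter

def check_serials(os_serials, library_serials):
    """Compare OS-reported drive serials against the library's list.

    os_serials:      dict of device name -> serial, from the OS report
    library_serials: list of serials as the Library Manager reports them
    Returns (duplicates, unknown): serials the library lists more than
    once, and OS devices whose serial the library does not list at all.
    """
    duplicates = sorted(s for s, n in Counter(library_serials).items() if n > 1)
    known = set(library_serials)
    unknown = sorted(dev for dev, s in os_serials.items() if s not in known)
    return duplicates, unknown

# Hypothetical values, for illustration only.
os_view = {"rmt1": "000A1111", "rmt2": "000A2222", "rmt3": "000A3333"}
lib_view = ["000A1111", "000A2222", "000A2222"]
duplicates, unknown = check_serials(os_view, lib_view)
```

A duplicate in the library's list together with an OS device that the
library does not know is the pattern to expect after a card pack was
given another drive's serial number.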
ANR8301E I/O error on library DAFFY (OP=004C6D31, SENSE=00.00.00.67).
ITSM volumes in the 3494 library have the wrong Category Codes. The 67
indicates no volumes of the required category in the library. You can
verify this via the 'mtlib' command. The most egregious cause of
something like this is a "teach" operation having been performed on the
3494 - which resets Category Codes to 0xFF00 (Insert category). Correct
Category Codes using the 'mtlib' or 'tapeutil' command. (IBM doc
indicates that AUDit LIBRary does *not* fix category codes.)
ANR8302E I/O error on drive ________ (/dev/rmt_) (OP=OFFL, CC=0, KEY=02,
ASC=3A, ASCQ=00)
This message with OP=OFFL is seen at ADSM start-up, when it is issuing
an MTOFFL ioctl command to rewind any tape that may have been left in
the drive and unload it, such that the library will put it away.
ANR8302E I/O error on drive ____ (____) (OP=READ, Error Number=23, CC=403,
KEY=08, ASC=14, ASCQ=01, SENSE=..., Description=Media failure).
Is this a case of a dirty read/write head? Is the same tape okay if
used in another drive? Try using tapeutil or the like to exercise the
tape. See topic "Tape drive cleaning".
ANR8302E I/O error on drive ________ (/dev/rmt_) (OP=WRITE, CC=0, KEY=03,
ASC=0C, ASCQ=00)
The ASC=0C indicates a failed Write. In a LABEl LIBVolume operation,
this may reflect the tape already having an internal label, but
OVERWRITE=YES was not specified as part of the command.
During a Backup, the situation is handled the same as writing a tape that
filled: a new tape will be mounted to continue the operation
uninterrupted.
ANR8302E I/O error on drive ________ (/dev/rmt_) (OP=READ, CC=0, KEY=20,
ASC=00, ASCQ=00, SENSE=F0.00.20.FF.FF.D8.50.48. ..., Description=An
undetermined error has occurred). Refer to Appendix B in the
'Messages' manual for recommended action.
Seen to result from a 'CHECKIn LIBVol' being done, but the tape has no
label. Run a dsmlabel on it.
Note that with a 3494 the tape will be spit out, so expect to find it
in the Convenience I/O Station.
Accompanied by message ANR8353E.
ANR8302E I/O error on drive 8MM (/dev/mt0) (OP=SETMODE, CC=207, KEY=05, ASC=26,
ASCQ=00, SENSE=70.00.05.00.00.00.00.18.00.00.00.00.26.00.00.80.00-
.04.00.01.00.00.00.12.BB.5E.00.00.D0.00.00.00.,
Description=Device is not in a state capable of performing request).
You probably used the drive with an operating system command, which
fouled up the settings as compared to what was originally defined via
SMIT in establishing the drive as an ADSM device. Remember that the
"/dev/mt" name indicates that you are using an ADSM driver to access the
device. Repeat the SMIT steps to re-establish the settings.
ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=WRITE, Error Number=121,
CC=0, KEY=00, ASC=00, ASCQ=00, SENSE=**NONE**, Description=An
undetermined error has occurred). Refer to Appendix D in the
'Messages' manual for recommended action.
When seen on a SCSI-attached library (e.g., 3583), this can indicate use of a
faulty device driver. For example, use of Adaptec 29160 SCSI cards, but
with the Adaptec driver instead of the IBM driver. Refer to the
IBMUltrium.Win2k.Readme.txt or similar file for guidance.
ANR8302E I/O error on drive TAPE0 (MT0.0.0.2) (OP=READ, Error Number=1235,CC=0,
KEY=2B, ASC=4B, ASCQ=00, ...
ASC=4B+ASCQ=00 is a Data Phase Error, meaning that an error occurred
during the Data Phase of a SCSI operation; as when a SCSI target device
receives a zero-length data frame, or too many parity errors have
occurred during the Data-In and Data-Out phases of an operation.
Could be caused by a bad (SCSI) cable (closely inspect the pins; try a
different cable) or faulty SCSI termination or exceeding the SCSI chain
length. If possible, try another drive. If a dsmserv restore, assure
that the device config you are using accurately describes the library
and drives. The library/drive combination may be seldom-used equipment,
in questionable condition: initiate testing at the OS level, writing to
a tape with a command like tar, then try to read the data back. If that
works, try using a command like tapeutil to get raw data blocks off the
tape that had an error: a failure there may point to the tape. For
FibreChannel HBA in Windows, check the MAXSGLIST spec (MAXimumSGList
parameter in the Windows registry).
And, as always, assure that the drive has been cleaned.
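When comparing many ANR8302E occurrences from the Activity Log, it can
help to pull the OP, CC, KEY, ASC and ASCQ fields out mechanically. The
following Python is a minimal sketch: the function names are
hypothetical, and the hint table holds only the two ASC meanings stated
in the ANR8302E notes, not a full SCSI sense table.

```python
import re

# Hints taken from the ANR8302E notes in this section only.
SENSE_HINTS = {
    ("0C", None): "failed Write",
    ("4B", "00"): "Data Phase Error",
}

# NAME=value, where NAME is one or two capitalized words ("Error Number").
# Values stop at a comma, whitespace or ')', so multi-word values such as
# Description are cut at their first word.
FIELD = re.compile(r"([A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)?)=([^,\s)]+)")

def parse_io_error(message):
    """Return the NAME=value pairs of an I/O error message as a dict."""
    return {name: value for name, value in FIELD.findall(message)}

def sense_hint(fields):
    """Look up ASC/ASCQ in the small hint table; None if not covered."""
    asc, ascq = fields.get("ASC"), fields.get("ASCQ")
    return SENSE_HINTS.get((asc, ascq)) or SENSE_HINTS.get((asc, None))

msg = ("ANR8302E I/O error on drive TAPE0 (MT0.0.0.2) (OP=READ, "
       "Error Number=1235,CC=0, KEY=2B, ASC=4B, ASCQ=00, ...")
fields = parse_io_error(msg)
```

Anything the hint table does not cover should still be looked up in the
'Messages' manual appendix that the message itself points to.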
ANR8304E Time out error on drive ____ in library ____
May be a mechanically faulty tape keeping a mount from succeeding:
examine the tape cartridge.
ANR8308I <ReqNo>: <Devtype> volume ______ is required for use in library ______;
CHECKIN LIBVOLUME required within __ minutes.
A volume is required for an operation (Backup Stgpool input, etc.) but is
not in the library. You have as many minutes as defined on your
Devclass MOUNTWait value to do the CHECKIn LIBVolume. Consider doing
CHECKLabel=No if your library is unchanging and you believe that
checking the volume label is superfluous: remember that a mount can take
considerable time.
ANR8310E An I/O error occurred while accessing library ____.
That could be anything. I recommend inspecting the tape involved in the
operation causing the error: in one case with a 3590 tape, I found that
its leader block had been flipped over (ostensibly by a human), making
it impossible for the drive to get ahold of the end of the tape to pull
it into the drive.
ANR8311E AN I/O ERROR OCCURRED WHILE ACCESSING DRIVE <DriveName> FOR
<low-level operation> OPERATION, ERRNO = <DriveErrno>.
Tivoli says: Ensure that the DEVICE parameter associated with the drive
is identified correctly in the DEFine DRive command, and that the drive
is currently powered on and ready. Otherwise...
Errno 22 (AIX: EINVAL) tends to indicate that your tape device driver
(e.g., Atape) is downlevel relative to the drive and needs to be
upgraded to understand what the drive is saying.
Errno 78 (AIX: ETIMEDOUT) A variation on errno 22. TSM is making
requests to these devices which they cannot satisfy.
Some customers who encountered this reported resolution by updating a
device driver or fixing a hardware component. Examples:
- Replaced SCSI cables
- Replaced or updated cardpack in drives
- Upgraded Atape driver
- Upgraded drive and SAN switch firmware levels
- Applied fix for AIX APAR IY10452 and upgraded SDG firmware
Can be because there is a tape in the drive when ADSM wants to mount
one there, and ADSM doesn't remember having mounted that one. This
can occur when the opsys was shut down with ADSM still up and a tape
left mounted per MOUNTRetention; or the library might be shared by
multiple ADSM servers without mediation. This condition will typically
be accompanied by message ANR8455E.
In some libraries, this can occur during AUDIT LIBRARY when there is a
cleaning tape in library: that tape will be loaded into the drive and
audited, generating a read error message.
From a developer's look at the logic: Basically, it's a burp from the
tape drives that gets propagated into our lowest level of code. At that
point, we're checking the return code off of the tape, recognize that
things aren't so whippy and bail out of that transaction.
Might be caused by an ill-behaving device on the SCSI chain, interfering
with the quality or content of the signals from an active device. If
encountered with 3590 drives or other drives having two SCSI ports on
the back, consider trying the alternate port to see if that eliminates
the problem. If multiple drives on the chain, try reducing the chain to
one device, and alternate among them to find the faulty one.
ANR8314E LIBRARY ________ IS FULL.
Well, maybe it is the case that the library is full. Issue an mtlib or
like command to get storage cell statistics from the library. Smaller
units, like the 3581 Ultrium Tape Autoloader, will have a row or matrix
of indicators for each cell that is occupied. The library may have a
false indication of its inventory, and a re-inventorying operation, as
may be performed at power-on time, may correct that. If the false
indication persists, the unit has a problem requiring service. Note
that some libraries have reserved cells and/or areas configured for
input-output operations, so the cells you see not being used may be
off-limits.
If you added a frame to your linear library or column to your silo
library and find it not being used: if a library manager is in effect,
assure that you told it that there is such a new area to be used. Assure
that a Teach operation was executed for the library to learn the
physical position of the new space. If the problem persists, it may be
that the library firmware level is not high enough for it to understand
the new library extension.
ANR8341I END-OF-VOLUME REACHED FOR <Device_Type> VOLUME <Volume_Name>.
The server has detected an end-of-volume condition for the given volume.
The volume is marked full. If more data must be stored, the server will
access another volume for it. (TSM will span a large file onto another
volume as another Segment.) This is a common, expected msg.
If this happens and the volume does not show full when you query it, it
may be the case that TSM is in the process of spanning a large file onto
another volume: a very large file will likely span volumes, and TSM has
traditionally not updated volume statistics until the aggregate or large
file has been completely written (to its final volume). If the volume
still shows Filling after the storage pool operation has completed, this
might be a problem induced by bad tape drive microcode.
ANR8353E 010: I/O error reading label of volume in drive ________ (/dev/rmt_).
With message ANR8302E, can result from a 'CHECKIn LIBVol' being done,
but the tape has no label. Run a dsmlabel on it.
If you know that the volume was in a stgpool, then it is likely that
something other than TSM wrote over the tape, as in a shared library
with inadequate controls. (Various OS utilities such as Unix 'tar' write
their tapes without labels.) You can try reading the tape on all the
drives in your library, as a valiant attempt. You can employ a utility
to print the first few blocks of the tape to get a sense of what wrote
over it. Bad drive microcode may also be at fault, as in writing an EOT
mark at BOT.
Note that for a 3494 the tape will be spit out of the robot, so expect
to find it in the Convenience I/O Station.
ANR8355E I/O error reading label for volume ______ in drive ______ (____)
The internal volume label cannot be read...
First: you're sure that this is a tape which has been in a TSM storage
pool, right?
Is this a new scratch tape in the TSM storage pool? You need to label
new tapes, either via 'LABEl LIBVolume' or the dsmlabel command. And you
should not trust "pre-labeled" tapes from a tape vendor to actually be
labeled correctly.
Or maybe your library is shared with other systems, and they took the
liberty of using a tape you thought was yours. This most typically occurs
in mainframe environments where the tape library is shared. If this is
the case, tread carefully, as another application may now have viable
company data on that tape. Also consider that this may have happened to
more than just this one tape.
Do you have level-adequate device drivers on your system to handle the
drive? Do your drive and library have the appropriate microcode level to
handle the media type being used?
If your tape technology is troublesome, you may be victim to a cranky
tape drive: it can be that the drive loses contact, rewinds the tape,
and then declares itself ready for more TSM data, resulting in the front
of the tape being written over.
Did you upgrade your server, and is this message only appearing with
tapes written with the new server code...tapes which had been in the
scratch pool? Maybe it's unhappy with the label content on those
previously-labeled tapes, such that running a 'dsmlabel' or 'LABEl
LIBVolume' with the new software may make the new server level happy
with them. Or maybe you need to upgrade your Atape level.
Possible TSM defect (2000/11): This problem happens when a process is
writing to tape and that tape reaches its EOT mark before all the data
that needs to be written has gone out. The tape is ejected, and generates
this error. A new tape is mounted and the operation continues normally.
The error is being generated erroneously. Apply maintenance.
Other analysis steps...
Via your operating system, use a utility (in MVS, the 'tapeedit' or
'ditto' commands; in Unix, the 'dd' command) to try to capture and
examine what data *is* at the beginning of the tape (making sure that no
label processing is attempted). If there is readable data there and
it's not a tape label, then the tape was written over: the data content
may give you a clue as to what did it. The tape is history. If it's a
Primary Storage Pool tape, try to use your Copy Storage Pool to restore
the volume.
If even the OS utility is having trouble reading the tape, try
physically examining the tape, including the surface at the beginning to
see if there is anything there that can be manipulated. Then try on all
drives available to you, to see if one can manage to read it.
In any case, you're stuck. Do a Query CONtent to see what's on the tape
and if it can be recreated or ignored as lost. And by all means research
the problem to see how it happened, so as to prevent recurrence.
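Once the first block of the tape has been captured to a file (for
example with 'dd', as suggested above), a quick classification can be
sketched in Python. This is a minimal sketch under stated assumptions:
that a labeled tape begins with the standard-label identifier "VOL1" (in
ASCII or EBCDIC), and that a tar archive carries the POSIX "ustar" magic
at byte offset 257. The function name is hypothetical; verify the label
assumption against a known-good tape from your own server before relying
on it.

```python
def classify_first_block(block):
    """Guess what wrote the start of a tape from its first raw block."""
    # Assumption: standard tape labels begin with 'VOL1'.
    if block[:4] == b"VOL1":
        return "standard label (ASCII)"
    if block[:4] == "VOL1".encode("cp037"):   # EBCDIC encoding of 'VOL1'
        return "standard label (EBCDIC)"
    # POSIX tar headers carry the magic 'ustar' at byte offset 257.
    if block[257:262] == b"ustar":
        return "tar archive"
    if not block.strip(b"\x00"):
        return "empty or all zeros"
    return "unknown"
```

An "unknown" result still leaves the captured bytes to be inspected by
eye for clues as to what wrote over the tape.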
ANR8359E Media fault detected on <Devtype> volume ______ in drive ____ (____) of
library ____.
Check your operating system error log for indications as to whether the
problem is the tape (media surface defect; dirt on media) or the drive
(head may need cleaning). If it looks like media, follow the advisory
in the Messages manual.
ANR8376I Mount point reserved in device class ______, status: RESERVED.
A Reserve also happens when a process needs multiple devices (e.g., two
tape drives) in *different* device classes and only one is available...
which is then reserved, while the process waits for another one to
become available, making sure the reserved drive is still available for
use when the other resource becomes available. Whereas most customers
will be operating with one library/drive type and a single device class,
this case would probably be uncommon.
You can do 'SHow MP' and possibly 'SHow LIBrary' to verify this status.
Do 'Query REQuest' to see if anything is outstanding. Your only quick
recourse may be to restart the TSM server.
Identify the drive involved and look back in the TSM server Activity Log
to try to determine the circumstances under which the condition occurred
and possible allied software components involved (library client,
storage agent) to help avoid the situation and perhaps lead to
correction.
See also: Reserve
ANR8381E LTO volume ______ could not be mounted in drive ____ (/dev/rmt_)
May be accompanied by messages:
ANR8945W Scratch volume mount failed 659ACP.
ANR1404W Scratch volume mount request denied - mount failed.
ANR8779E Unable to open drive /dev/rmt2, error number=46.
Check the obvious first - that the drive is ready and usable.
If the problem occurs just with that one volume, over multiple drives,
it may be the tape. Again, check the obvious first: that the
write-protect flipper on the cartridge is not set to prevent writing.
Look back in your Activity Log for indications of issues with the vol.
If the volume is, for some reason, assigned to a storage pool, do a
'Query Content ... F=D' to see if the volume, which as a scratch should
not contain any data, in fact does. If it does, do an Audit Volume to
try to fix its state. If not, do a Label Libvolume to fix any label
problem and reset the volume's state.
ANR8413E <Command>: DRIVE _____ IS CURRENTLY IN USE.
The drive is apparently in a Busy condition. The drive's front panel
should indicate what its situation is: perhaps it got left in an odd
state by the CE (in Service mode, etc.). If it's a Magstar drive, you
can use the mtlib command or the like to query its status. You should
additionally check the state of your operating system definition of the
drive. For example, a drive in AIX which shows Defined is unusable: it
has to be in an Available state. Similarly, the tapeutil command can be
used outside of TSM to test the drive. Follow "the chain" outbound from
the operating system and host to the drive and find where things are
awry. Don't overlook a bad cable or two drives with the same SCSI
address. Also important is knowing when this started happening, to track
it to an event or busy fingers. If the drive is physically unoccupied,
a power cycle may clear erroneous state.
ANR8418E DEFINE PATH: An I/O error occurred while accessing library STKLIB.
May be a computer word-length mismatch. See ANR8300E.
ANR8419E DEFINE DRIVE: the drive or element conflict with existing drive in
library.
Maybe because you defined the wrong drive as the SMC. Check the body of
documentation concerning your mini library and its configuration for
TSM, particularly against what you see for elements in OS queries.
In a 3570, this was caused by the hardware configuration setting on
the panel being SPLIT instead of BASE.
ANR8420E DEFINE DRIVE: An I/O error occurred while accessing drive ________
Typically seen when the drive you are trying to define is already in
use, either actively or because of an IDLE or DISMOUNTING type tape
still mounted. For example, you are using the drive for a Unix tar
operation, or the drive is shared with another server.
More problematic is when you go to define a physical drive that is
already defined and in use by TSM: you should not do DEFine DRive more
than once for a single physical tape drive, or you will encounter
conflicts during TSM operations which can result in the drives being put
offline.
A trivial cause is that the drive is not available, as for example being
in a Defined state in AIX, rather than Available.
I encountered this when doing a 'DSMSERV DISPlay DBBackupvolumes
DEVclass=OURLIBR.DEVC_3590 VOL=000001' and drive 301 was already in
use, busy with a tar operation. Can also occur if you invoke the next
command too soon, such that the dismount of the prior volume has not
yet finished. If you are attempting to do this in order to share drives
across *SM servers, be aware that *SM may not release the drive unless
you render it offline, which can make such sharing prohibitive.
If encountered in a server migration to a new platform (of the same OS),
via Restore DB, this may be remedied by doing DELete DRive, then DEFine
DRive, then DEFine PATH for the drive.
ANR8426E CHECKIN LIBVOLUME for volume ______ in library ________ failed.
Possibly, no tape drive available if CHECKLabel=no was not chosen.
Or you attempted to check in a tape whose Category Code is not FF00
(Insert category), as a physical insertion or Checkout would make it.
(This safeguards against the inadvertent adoption of another ADSM
server's tapes when multiple ADSM servers share a library. Consider
using the 'mtlib' command to change the category to FF00, then repeat
the Checkin.)
ANR8442E CHECKOUT LIBVOLUME: Volume ______ in library ________ is currently
in use.
You performed a CHECKOut, but the volume is either mounted or
dismounting. If mounted, try a DISMount Volume, if no process is
using it. If still not obvious what the problem is, do a direct inquiry
of your library to see what the state of the volume is, as in the 3494
action 'mtlib -l /dev/lmcp0 -vqV -V VolName'.
ANR8443E <Command>: Volume ______ in library _______ cannot be assigned a status of
SCRATCH.
You attempted 'CHECKIn LIBVolume ... STATus=SCRatch' or 'UPDate
LIBVolume ... STATus=SCRatch', but the volume is already known by TSM to
be in one of its storage pools. You really meant to use STATus=PRIvate.
ANR8444E DEFINE DRIVE: Library ______ is currently unavailable.
Most likely because you did a DEFine DRive without first having done the
prerequisite DEFine PATH.
ANR8444E Internal Operation: Library _______ is currently unavailable.
As seen with a 3494 library. Possibly, you communicate with the 3494
over ethernet and either something happened to your /etc/ibmatl.conf or
the network address specified in the file is no longer valid: it is best
to specify an IP address rather than a network name, to avoid name
service and reverse lookup problems.
Make sure your lmcpd daemon is running.
Go to the 3494 and make sure that its Library Manager is up, that the
library is in an Online, Automated Operation state, and that the host
from which you are trying to reach it is still allowed in its LAN Hosts
list.
If a TSM 5.1+ system, you may have a Path problem: do Query PATH and
check for problems.
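The advice to use an IP address rather than a network name in
/etc/ibmatl.conf can be checked with a short Python sketch. This is a
minimal sketch, assuming each non-comment line has the library's
symbolic name in the first field and its address in the second, that '#'
starts a comment, and that a second field beginning with /dev/ denotes a
serial attachment. That layout is an assumption to verify against your
own file; the function name and sample entries are hypothetical.

```python
import ipaddress

def entries_using_hostnames(conf_text):
    """Return (library_name, address) pairs whose address is not an IP."""
    flagged = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments, blank lines
        parts = line.split()
        if len(parts) < 2:
            continue
        name, address = parts[0], parts[1]
        if address.startswith("/dev/"):
            continue  # assumed serial attachment, not a LAN address
        try:
            ipaddress.ip_address(address)      # accepts IPv4 or IPv6 literals
        except ValueError:
            flagged.append((name, address))
    return flagged

# Hypothetical file content, for illustration only.
sample_conf = (
    "# library   address            host id\n"
    "3494a       192.0.2.10         tsmhost\n"
    "3494b       lib2.example.com   tsmhost\n"
    "3494c       /dev/tty1          tsmhost\n"
)
flagged = entries_using_hostnames(sample_conf)
```

Entries that are flagged depend on name service being healthy every time
lmcpd needs to reach the library.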
ANR8444E UPDATE DRIVE: Library ______ is currently unavailable.
Well, check the physical library and the paths to it.
ANR8447E No drives are currently available in library <LibName>
Simple cause: You let the Devclass MOUNTLimit default to 1 such that a
multi-tape operation like MOVe Data cannot proceed; or maybe you
specified a MOUNTLimit with a value more than the number of drives
available.
Can be caused when you try to CHECKIn a 3590 tape without the
parameter "DEVType=3590": it thinks you want other than a 3590 drive
and only 3590 drives are available.
Or you defined your drives in Devclass to be a type other than that
which can be used for your current tape types: you need to redefine the
drives.
Maybe the volume to be mounted is of a type or format which cannot be
processed by the available drive. This is a customer configuration issue
in having contradictory formats and capabilities.
Do 'SHow LIBrary' to see what *SM thinks of the drives. Supplement with
'mtlib' display of drive state.
In a tape drives upgrade (e.g., 3590E->3590H) you may have to delete all
the drives and paths and recreate them.
In a library shared by multiple servers, you need to define the number
of drives actually allocated to each server; otherwise, if you let
DRIVES prevail, each server may think it should have access to all the
drives in the library.
Check the state of the drives in your opsys: in AIX, via lscfg and
lsdev - where they should have a state of Available (not Defined).
If a 3494 library, use the 'mtlib' command to check the state of the
library and drive: the drives need to be online and available to the
library.
Do Query Path and check for drives in an Offline state.
There may be an accompanying "ANR8376I Mount point reserved" message.
If the TSM server is Windows, rebooting to remap the drives may help.
One 3466 customer reports that after a 3466 upgrade, tapes had a
device class of 'funny'/"Unknown": the tapes could be read, but not
mounted for writing. He had to do an UPDate Volume ACCess=READOnly on
all old tapes and add some new ones.
See also explanation for ANR1144W.
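The "simple cause" at the top of this entry amounts to two comparisons,
sketched here in Python. This is a minimal sketch: the function name and
its three numeric inputs are hypothetical, you supply them from your own
Devclass definition and drive count, and a MOUNTLimit of DRIVES is not
handled.

```python
def mountlimit_problems(mountlimit, drives_available, drives_needed):
    """Return a list of reasons a numeric MOUNTLimit may block an operation."""
    problems = []
    if mountlimit < drives_needed:
        problems.append(
            "MOUNTLimit %d is below the %d drives the operation needs"
            % (mountlimit, drives_needed))
    if mountlimit > drives_available:
        problems.append(
            "MOUNTLimit %d exceeds the %d drives actually available"
            % (mountlimit, drives_available))
    return problems
```

For example, a defaulted MOUNTLimit of 1 against a MOVe Data that needs
two drives yields the first problem; a MOUNTLimit of 6 with four usable
drives yields the second.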
ANR8448E Scratch volume ______ from library ________ rejected - volume name is
already in use
Most likely, you are trying to delete one volume from a Backup Series,
as in the Full that the subsequent Incremental(s) is/are dependent upon.
Use Query VOLHistory to inspect it in context. Try to invoke DELete
VOLHistory so that one whole, old Series is deleted.
ANR8452E Initialization failed for 349X library ____; will retry in 2 minute(s).
Well, the obvious thing to do when your system, which has been accessing
the 3494 library fine up until now, has a problem doing so is to at
least check the status of the library via the 'mtlib' command, if not
visit the library and check its status. Your operators may have opened
the library to deal with a tape problem, and failed to put it back into
Automated Operation mode, or the CE may be working on it, etc. Or there
may be a network problem. Do the basic continuity checks along the line
to isolate the problem. Customers with major libraries should have
active monitoring of them: don't wait for TSM to tell you, indirectly,
about problems with your library.
ANR8455E Volume _______ could not be located during audit of library ________
Typically occurs when a tape is idle in a drive when the opsys is shut
down without shutting down ADSM beforehand, thus causing the tape to
be trapped in the drive. When ADSM is restarted it cannot find the
tape in the library storage cells and thus this message.
Expect to see message ANR8311E during attempted use of the drive that
has the tape trapped in it.
ANR8463E <MediaType> volume ______ is write protected.
May indicate exactly what it says, indicating that the cartridge is set
to prevent writing. But in practice, this message is often seen on
tapes which have just recently been written and, in this mount case,
experience this condition despite the cartridge allowing writing. This
is apparently caused by invalid sensing by the tape drive, perhaps due
to faulty microcode. Note that TSM does not set the volume's Access to
Readonly - and it may keep trying to use that one volume.
ANR8469E Dismount of 3590 volume ______ from drive <Its_TSM_name> (/dev/rmt_) in
library ________ failed.
This may not be accompanied by an I/O error indication.
The AIX Error Log may contain an lmcpd SYSLOG message entry saying
"ERROR on <Libname>, ERA 6D Library Drive Not Unloaded"
and there may be TAPE_ERR4 entries. Further, when you go to manually
unload the drive there may not be any error code on the drive panel,
and the manual unload will work fine.
This set of circumstances suggests that the drive hardware is working
fine, but that there may be a fault in the "card pack" (electronics).
ANR8500E No paths are defined for library ______ in device configuration
information file.
Seen in one customer's attempt to perform a 'dsmserv restore db', though
the devconfig file was fine. The cause turned out to be another systems
guy having changed the network address of the 3494 without telling
anyone: TSM could not get to the library through lmcpd and the
now-inaccurate /etc/ibmatl.conf file.
ANR8555E An error (<Error Code, Error String>) occurred during a read operation
from disk <DiskName>.
The error number is from your operating system. Pursue that. In the
case of Windows, refer to a good error codes web page, like
http://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/
tsleft.web+WinBatch/Error~Codes+Windows~System~Errors.txt
ANR8749E Library order sequence check on library ________.
ADSM thinks that mount/dismount is pending, in progress, or done for
the volume, typically because ADSM thinks that the tape was left
mounted in the drive by some other action.
ANR8775I Drive ________ (/dev/rmt_) unavailable at library manager.
Spontaneous cause: Intervention Required condition; library panel
message that the drive has failed and requires service.
Manual cause: When we go to the 3494 console and use the Availability
menu to change a bad drive's status to Unavailable until it is repaired,
ADSM will sense this when it goes to use a drive. A 'SHow LIBrary' will
show "avail=0" for that drive.
ANR8776W Media in drive _______ (/dev/rmt2) contains lost VCR data; performance
may be degraded.
(This can be an individual tape problem; but if it happens on all
tapes, suspect a recent application of faulty 3590 microcode.)
Best to do a 'MOVe Data' to get data off the tape, then do a dsmfmt to
reinitialize the tape.
ANR8779E UNABLE TO OPEN DRIVE <DriveName>, ERROR NUMBER=<OS error number from
the open attempt>
The drive cannot be opened by *SM. In AIX, error number is the value
of errno returned by the operating system. In Windows, it is the
Windows Error Code (aka Exit Code, or Microsoft system error). In OS/2,
it is the value of the return code from the call to DosOpen.
Unix: Errno 2 is "No such file or directory". In the trivial case, this
may simply reflect having issued a DEFine DRive or DEFine PATH
command like "DEVIce=rmt1" instead of "DEVIce=/dev/rmt1" (there is
no 'rmt1' in your current directory). It can also indicate that
the drive name you are attempting to use in TSM does not match the
one defined in the operating system. In AIX, consider doing something like
'rmdev -dl fscsi0', then 'cfgmgr ...'.
Unix: Errno 6 is "no such device", indicating a problem in the
operating system's configuration with the device. Has the
/dev/rmt_ definition disappeared from your system (as in someone
doing an rmdev, or a cfgmgr with drives powered off)? On AIX, use
lsdev to assure that the drive has state Available, not just
Defined. You can double check it by using tapeutil/ntutil or the
like to try to open the drive as well.
Errno 16 is Resource Busy - Device Busy. Use an OS command, such
as lsdev in AIX, to check device status, and visit the drive's
front panel if necessary. In all cases, make sure that the drive
is properly cabled, has the right SCSI or like address, and is in
a proper state to be used by a host application. If the drive is
physically connected to more than one computer system, make sure
it's not in use by another system.
Errno 46 is Device Not Available, as in the tape drive being in an
Offline state. If AIX, use 'errpt' cmd to seek error detail. If
no entries reflected in error log, perhaps the device is not
defined to AIX: does 'lsdev -Cl rmt_' show Available? If a new
SCSI device, assure element numbers correct.
Errno 47 means that the media is write-protected.
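The errno values listed in this entry can be gathered into a small lookup
for scanning logs. This is only a minimal sketch: the table holds just the
values named above (AIX numbering; other Unix systems may number them
differently), and the function name and the message pattern are
illustrative assumptions, not part of any TSM tooling.

```python
import re

# Errno meanings as listed in the ANR8779E entry above (AIX numbering).
ANR8779E_ERRNO = {
    2: "No such file or directory: check the DEVIce= path and the OS device name",
    6: "No such device: check that the /dev/rmt_ definition exists and is Available",
    16: "Resource busy: check device status and whether another system holds the drive",
    46: "Device not available: drive may be Offline; check errpt and lsdev",
    47: "Media is write-protected",
}

# Hypothetical pattern for the message text; adjust to your activity log format.
_MSG = re.compile(
    r"ANR8779E\s+Unable to open drive\s+([^\s,]+),\s*error number\s*=\s*(\d+)",
    re.IGNORECASE,
)

def explain_anr8779e(line):
    """Return (drive, errno, hint) for an ANR8779E line, or None if no match."""
    m = _MSG.search(line)
    if m is None:
        return None
    err = int(m.group(2))
    hint = ANR8779E_ERRNO.get(err, "Not covered here: consult the OS errno documentation")
    return m.group(1), err, hint
```

For example, explain_anr8779e("ANR8779E Unable to open drive /dev/rmt1,
error number=2") yields the drive name, the number 2, and the first hint.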
ANR8782E Volume ______ could not be accessed by library ...
This message is issued when an ERA code is received from the library
manager indicating that it can't access the volume (ERA 64, 67, 6B, or
75). Use the 'mtlib' command to check for the volume being in the
library's inventory, or use the library's console.
ANR8806E Could not write volume label <VolName> on the tape in library _______.
Seen on LABEl LIBVolume. The tape drive failed to write the label on
the volume. May be due to a defect on the tape. Or, more
interestingly, can be caused by a gross physical defect in the mounting
itself, as can be verified via command external to *SM, such as doing
the mount via the 'mtlib' command, which may yield the error message:
"Mount operation Error - Internal error.". If this is an unused tape,
consider doing a 'CHECKOut LIBVolume REMove=No', getting it mounted on
an available drive, then running the 'tapeutil' command operations
"Read and Write Tests" and "Erase" - which will give the tape a workout
and *maybe* overcome tight winding or like problems (but don't get your
hopes up, even if these exercises seem to work). If those exercises
look good, do a LABEl LIBVolume and give the tape a try in a
non-critical way.
ANR8813W Unable to read the barcode of cartridge in slot element 48 in
library _________.
This has been seen with downlevel drive microcode, particularly in the
3575 libraries, which have been notorious.
ANR8820W REPAIRING VCR DATA FOR VOLUME volume name IN DRIVE drive name; MOUNT
MAY BE DELAYED.
The Volume Control Region of the cartridge in the drive is lost or
corrupted, which results in the inability of the drive to do fast
locates to file positions on the cartridge. The VCR is being rebuilt
during the volume mount process in order to avoid performance
degradation on this and future mounts of the volume. There may be a
long delay because the VCR is rebuilt by spacing the tape forward to the
end-of-data.
Solution: See "VCR data" topic.
ANR8824E I/O Error on library _____; request 0F0DF5AE for operation 004C6D31 to
the 3494 Library Manager been lost.
Explanation: A command for the operation was issued to the Library
Manager and a response was not received within the maximum timeout
period.
System Action: The operation and the transaction fails.
User Response: Verify that communications with the library is
operational, that it is online and ready for commands. If the problem
persists, provide your service representative with the 3494 Library
Manager transaction logs and the request id from the failed operation.
Note: If *SM was trying to perform a dismount, this may result in
subsequent ANR8469E Dismount ... failed messages appearing repeatedly:
an mtlib dismount would need to be performed. The AIX Error Log will
also have "Resource Name: lmcpd" errors.
ANR8830E Internal 3590 drive diagnostics detect excessive media failures for
volume XXXXX. Access mode is now set to read-only.
Accompanied by: ANR8831W Because of media errors for volume XXXXX, data
should be removed as soon as possible.
Reflects a MIM error (q.v.).
ANR8834E Library volume <Volname> is still present in library <Libname> drive
<Drivename> (<OSdevname>), and must be removed manually.
As seen during an AUDit LIBRary.
If this occurs for all the drives in the library, and particularly when
the Volname is reported as "**UNKNOWN**" and library inspection shows no
volumes mounted, then there is a serious problem with the library or in
getting valid information from it. This may be due to miscommunication
deriving from drivers in the operating system and/or microcode in the
library manager, which upgrades might fix. Consider utilizing an OS
command or interface provided by the library vendor (such as 'mtlib' in
the case of IBM Magstar) to query the library outside of TSM to try to
obtain the same type of information. Likewise do the same at the
library's control display, if it has one. If the problem persists,
power cycle the library at an opportune time and have it inventory
itself.
ANR8840E Unable to open device /dev/smc0 with error 50.
Seen in a 3584 tape library, or similar SCSI library. 50 is the Unix
errno: ENOCONNECT - Cannot Establish Connection. / No connection.
Has been experienced when the first drive was removed from the library
for repair, while leaving the library available for service. But it
turns out that the first drive, /dev/smc0, is the "master drive" for the
library, and with it removed, the library does not function.
Less obvious cause: Where Atldd and Atape are being used to control the
library and drives, old versions may be in play. If so, upgrade your
Atape and Atldd driver levels.
ANR8847E No 3590-type drives are currently available in library _____.
As during an attempted Label Libvolume. This command will not wait for
a drive to become available, even if one or more drives have Idle tapes
or are in a Dismounting state. Try again after a dismount has left at
least one drive available. It might alternately be the case that a
drive or two is offline: do 'Query DRive'.
ANR8848W Drive _______ of library _______ is inaccessible; server has begun
polling drive.
Look for operating system error log entries for the drive, detailing
problems with it. And there's no substitute for inspecting the drive.
On a 3494, do like: mtlib -l /dev/lmcp0 -f /dev/rmt2 -qD
to check the status of the drive, which may show "Device not available
to Library.". The 3494 Library Manager PC should now contain error
indications that the CE can examine to determine what went wrong. You
could try, from the LM's control panel, making the drive available to
the library again - and perhaps from that action get an indication of
its problem. It may result in an Intervention Required condition at the
Library Manager panel, which would help identify the problem. You could
try doing a Reset Drive from the drive's panel.
This can result from doing 'rmdev -l rmt_' in AIX to take the drive
offline without doing 'UPDate DRive ... ONLine=No' and then *SM tries to
use the drive.
Or perhaps you did not do 'UPDate DRive ... ONLine=No' before using the
drive for non-TSM purposes. (TSM will eventually give up on the drive
with message ANR8471E, and 'Query DRive' will show it Unavailable.)
Also seen in sharing 3590 drives in a 3494 library (auto-sharing). When
one server obtains the use of a drive, other servers requiring the use
of that drive will find that the drive is locked, and begin to poll the
drive. When the drive becomes free again, the following message results:
ANR8839W Drive _______ of library _______ is accessible. The pending
operation will then proceed on the second server. Note however, that
this behavior is governed by the MOUNTWAIT parameter; if this value is
set too low, the pending transaction will time out before the server
with ownership of the drive releases it.
ANR8914I Drive ____ (____) in library ________ needs to be cleaned.
The drive has returned indications that it needs to be cleaned. In an
automatic library, the library manager will take care of this; but TSM
needs to relinquish the drive.
ANR8939E The adapter for tape drive ________ cannot handle the block size
needed to use the volume.
In Windows, it may be due to the Registry value MAXimumSGList being too
low. See: Ultrium and FibreChannel and ...
ANR8972E Unrecoverable drive failures on drive RMTxxx; drive is now taken
offline.
Reflects a SIM error (q.v.).
ANR9627E CANNOT ACCESS NODE LICENSE LOCK FILE: file name.
Results when the server cannot get at the nodelock file, or the file
system containing it is full, as when doing REGister LICense.
ANR9613W Error loading ./dsmlicense for Licensing function: Exec format error.
Seen when boosting the TSM server level in AIX. "Exec format error"
occurs when the system goes to load a compiled program and can't
recognize its format. This can be due to a downlevel C library (fileset
xlC.rte) or, more likely, a bad mix of 32-bit vs. 64-bit software.
Seen in 64-bit AIX TSM as follows: Fileset tivoli.tsm.license.cert is
common between the 32-bit and 64-bit versions of TSM; but
tivoli.tsm.license.rte is specific to 32-bit and
tivoli.tsm.license.aix5.rte64 is specific to 64-bit, and both of those
resolve to /usr/tivoli/tsm/server/bin/dsmlicense. It may be the case
that you are running 64-bit AIX, but that both the 32-bit and 64-bit TSM
were installed, where the 64-bit server and license fileset were
installed first, but then the 32-bit server and license fileset
thereafter, so the 64-bit modules were supplanted by the undesired 32-bit
version. Then server maintenance was applied, which resulted in the
32-bit tivoli.tsm.server.rte being installed first, then the
tivoli.tsm.server.aix5.rte64 server. So then you have a 64-bit dsmserv
module but a 32-bit dsmlicense module. No good. You can verify which
version you have by comparing the mtime timestamps of directories and
files in /usr/lpp/tivoli.tsm.license.rte and
/usr/lpp/tivoli.tsm.license.aix5.rte64 vs the ctime (ls -lc) on your
/usr/tivoli/tsm/server/bin/dsmlicense and dsmserv.
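The timestamp comparison described above can be sketched in a few lines.
This assumes that "nearest timestamp wins" is a fair reading of the
comparison; that heuristic and both function names are illustrative
assumptions, so treat the result as a hint rather than proof.

```python
import os

def newest_mtime(root):
    """Newest modification time among root and everything beneath it."""
    newest = os.stat(root).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            newest = max(newest, os.lstat(path).st_mtime)
    return newest

def closest_fileset(binary_path, fileset_dirs):
    """Return the fileset directory whose newest mtime is nearest the binary's ctime."""
    ctime = os.stat(binary_path).st_ctime  # same value that 'ls -lc' displays
    return min(fileset_dirs, key=lambda d: abs(newest_mtime(d) - ctime))
```

It might be called as closest_fileset("/usr/tivoli/tsm/server/bin/dsmlicense",
["/usr/lpp/tivoli.tsm.license.rte", "/usr/lpp/tivoli.tsm.license.aix5.rte64"]).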
ANR9716E Device '/dev/lmcp0' is not recognized as a supported library type.
May occur when doing 'dsmlabel'. Indicates that the 3494 is in an
Offline state. Go to its operator station and make it Online.
ANR9718E Device '/dev/rmt0' is not recognized as a supported drive type.
Typically, you did not define the tape driver to ADSM, as it
requires. ADSM uses its own tape drivers: you cannot use those supplied
with AIX! Use SMIT to define the drivers, according to the ADSM Device
Configuration manual. The resultant device is typically /dev/mt0 and
/dev/mt0.1 .
ANR9725E The volume in drive '/dev/rmt?' is already labeled (VVVVVV).
You tried to use 'dsmlabel' to label a tape which was pre-labeled by
the vendor.
ANR9798E DELETE DRIVE: One or more paths are still defined for drive ____ in
library ____.
In response to 'delete drive <LibrName> <DriveName>': Do 'show path',
which will probably reveal a redundant drive. Do 'delete drivemapping
<ServerName> <LibrName> <DriveName>' on it.
ANR9969E Unable to open volume F:\TSMDB\SERVER1\LOG08.DSM. The most likely
reason is that another TSM server is running and has the volume
allocated.
Seen in the restart of the *SM server on a Windows machine, after some
odd event. Something is holding a lock. Rebooting the machine usually
clears the problem.
ANR9999D ...
Messages vary. This is a catch-all message number which the developers
use for internal server errors rather than create and document separate
message numbers for various, hopefully rare and unusual, conditions.
The DISAble EVents command's "SEVERE" operand can disable these.
Note that the number in parentheses, like in "ANR9999D dfmigr.c(3224)",
can be expected to be the source code line number, as via the ANSI C
__LINE__ definition.
Few customers look up ANR9999D in the Messages manual to gain
perspective on its intent...which is to provide diagnostic information
which may help you find an APAR which has already been created to
address the circumstance, or to provide diagnostic information to TSM
Support when it is a new problem. The content of the message is
intended more to assist the TSM Support person in handling the problem
rather than directing the customer in a course of action. Thus, if such
a message talked of performing an AUDITDB, I would not infer that it is
telling the customer to do so, but rather that, after looking at the
full picture, an AUDITDB may be a course of action. Keep in mind that
taking action yourself may be "playing doctor", and could result in
irreparable damage to your TSM system. If research on the IBM site does
not turn up an obvious solution to the situation, contact TSM Support
rather than undertaking actions yourself.
In the server, you can do 'Set CONTEXTmessaging ON' to get more info
when they occur.
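Since the parenthesized number can be expected to be a source line number,
the module name and line number make a handy key for tallying ANR9999D
occurrences or for searching APAR text. A minimal sketch follows; the
pattern covers only the forms seen in this list ("dfmigr.c(3224)",
"adminit.c (982)", "AFMIGR(500)") and the function name is an
illustrative assumption.

```python
import re

# Module name, optional ".c", optional space, then "(line)".
_ANR9999D = re.compile(r"ANR9999D\s+([A-Za-z0-9_]+?)(?:\.c)?\s*\((\d+)\)")

def parse_anr9999d(line):
    """Return (module_name_lowercased, source_line_number), or None.

    None is returned for ANR9999D messages that carry no module(line) prefix.
    """
    m = _ANR9999D.search(line)
    if m is None:
        return None
    return m.group(1).lower(), int(m.group(2))
```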
ANR9999D adminit.c (982) Insufficient log space to update Administrator -
.Attributes
Refer to server commands manual, "DSMSERV EXTEND LOG".
ANR9999D admstart.c(2191): Error 21 from lvmAddVol.
You did 'dsmserv extend log ...' but the volume specified is one
previously used by *SM. You cannot extend the Assigned Capacity of an
existing volume in a stand-alone manner: you have to do that online.
In stand-alone mode you can only add a new volume. The approach you
can take is to temporarily create a File type volume (as in /tmp) -
enough to get your system up - and once up, do an EXTend LOG and then
delete the temporary volume (DELete LOGVolume).
ANR9999D AFMIGR(500): Error checking pending volumes for completion of reuse
See "ANR0104E ASVOLUT(2202)".
ANR9999D afmigr.c(2574): Reconstruction of aggregates is disabled. Run audit
reclaim utilities to re-enable reconstruction of aggregates.
Instructions about this problem are in the 3.1 server README.SRV file.
ANR9999D AFMIGR(2619)
The older READMEs contain information about this. You start with
running an "AUDIT RECLAIM" command, then you "select * from
RECLAIM_ANALYSIS" and if this table is empty you "cleanup reclaim" and
this reactivates reconstruction processing.
ANR9999D asalloc.c(1195): Missing allocation storage pool.
Reportedly seen during reclamation on version 3.1.1.3 (AIX).
Upgrading to version 3.1.1.5 fixed the problem.
ANR9999D ASRTRV(494): End reached prematurely on volume ____
This message indicates that the database information for a particular
file was not consistent with the actual file data on your storage pool
volume. Using metadata from the database, the backup operation tried
to read a certain number of bytes from a volume, but encountered
end-of-volume before that number of bytes had been read. The root
cause of the problem is likely faulty drive microcode, one case being
the drive dropping tension after a long idle, but failing to verify
position after starting the next operation, wherein the tape had
slipped back a bit, thus causing new data to write over old. Tucson
calls it the "Chopped Block" problem.
Audit Volume Fix=Yes should eliminate this inconsistency by deleting
the problem file from the database.
ANR9999D asutil.c(210): Pool id 6 not found.
ANR9999D asutil.c(215): Pool id 27 not found.
A storage pool shows up numerically during *SM start-up or reclamation.
This message may be caused by the disappearance (unusual removal) of a
storage pool which *SM knew to be valid. Maybe you had some copy groups
(under management classes) that pointed to these storage pools. (*SM may
let you delete a storage pool even though you have copy groups using
it/them.) May be able to fix by updating the copy groups and activating
the policy set again, and these should clear. One user reports being
able to accidentally fix this by going into the admin graphical
interface (dsmadm); when he went to exit from the reclamation tab of the
3570 pool it asked if he wanted to save changes. He did not think he had
changed anything, but went ahead & let it save. That appeared to fix
the problem. Otherwise, run an AUDITDB to reconcile the database with
reality. (One customer reports that running 'dsmserv auditdb storage'
was efficacious.)
ANR9999D asvol.c(1043): ThreadId<ThreadNumber> NumEmptyVols went negative for
pool -2.
One of the storage pool descriptor records has a field which records the
number of empty volumes in a pool. Whenever a volume is deleted (as when
it goes empty or is deleted), this count is decremented. Logic checks
whether the count is already zero and, if so, this message is emitted.
The overall cause is a server defect.
ANR9999D asvolmnt.c(1586): Unknown result code (30) from pvrOpen.
Seen where a storage pool volume is physically in the library, but is
not in a TSM checked-in state. Do a Checkin.
ANR9999D BFCREATE(768): Bitfile aggregate 0.7514479 not found in any storage
pool.
ANR9999D BFCREATE(781): Bitfile aggregate 0.7514479 not found for delete.
ANR9999D BFCREATE(712): Inconsistent content for alias aggregates 0.14869371 and
0.7514479.
As encountered in a Delete Filespace. Accompanied by message ANR0859E
Data storage object erasure failure, DELETE FILESPACE process aborted.
You may have to contact Support for resolution. Possible cause is that
a storage pool volume went away, leaving the database entries for the
filespace objects orphaned. Run a Select on the Backups and Archives
tables, seeking the Object_Id that is the lower portion of the bitfile
number (7514479) and try to track the object name to the volume it is
supposed to be on. If the volume is inadvertently out of the picture
and can be reinstated, you would be in luck; else you may have to do an
Audit.
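Extracting the lower portion of a bitfile number and turning it into the
Select statements mentioned above can be sketched as follows. The id
forms handled ("0.7514479", "0 2361348", or a bare number) are those
seen in this list; the function names are illustrative assumptions, and
the statements are only built as text to be pasted into an
administrative session.

```python
import re

# Optional high part and separator (dot or space), then the low part.
_ID = re.compile(r"(?:(\d+)[. ])?(\d+)")

def object_id_from_bitfile(token):
    """Return the lower portion of a bitfile id as an int (the OBJECT_ID)."""
    m = _ID.fullmatch(token.strip())
    if m is None:
        raise ValueError("unrecognized bitfile id: %r" % token)
    return int(m.group(2))

def lookup_statements(token):
    """Build one Select per table (BACKUPS, ARCHIVES) for the given bitfile id."""
    oid = object_id_from_bitfile(token)
    return ["SELECT * FROM %s WHERE OBJECT_ID=%d" % (table, oid)
            for table in ("BACKUPS", "ARCHIVES")]
```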
ANR9999D bfcreate.c(1906): ThreadId<18> Destination switched from BACKUPPOOL to
MAGPOOL in the middle of a transaction. [The pool names don't matter.]
Typically seen when the TSM server is upgraded from v4 to v5 without
regard for the levels of existing clients, and a v3 client (possibly
employing DIRMc) attempts to then back up data to the upgraded server.
The server upgrade finally put the server out of the reach of the
antiquated client. IBM stipulates what client-server level mixes will
work (and those would be reasonably contemporary mixes): outside that
range, the mix is untested and unsupported - and obviously may not work.
ANR9999D bfutil.c(3276): ThreadId<51> Unnexpected error obtaining AUX bitfile
information. Callchain of previous message: 0x0000000100017c74 outDiagf
<- 0x00000001002082cc bfIsAuxFile <- 0x00000001003720f8 DoBackQry <-
0x0000000100380ccc SmNodeSession <- 0x0000000100434f84
HandleNodeSession <- 0x000000010043ad68 smExecuteSession <-
0x000000010042d920 SessionThread <- 0x0000000100007fd0 StartThread <-
0x09000000004e9244 _pthread_body <- (SESSION: 2439)
How ugly is that? Probably accompanied by ANR0538I. One customer
encountered this where database activity was too intense: spreading out
the workload relieved the situation.
ANR9999S Bitfile not found
Seen preceded by "Invalid Object header state in Retrieve Operation"
Could be when you have done a DELete Volume on a primary stgpool volume,
and/or when the primary tape is bad (i.e., unavailable or destroyed), and
some tapes in the backup stgp group are bad also.
ANR9999D dballoc.c(802): Sequence number mismatch for SMP page addr 417792;
HeaderSmpNum = -1, Expected = 408.
Database is corrupted, often involving a rude termination of the
server. MVS users have reported seeing this when deleting filespaces,
and believe that restarting the server right after a deletion helps
them.
ANR9999D DFQRY(449) Missing row for bitfile 0.29693264.
Has been seen when an attempt to delete a volume encounters message
"Volume still contains data", and the volume cannot be deleted. The only
reported solution is to 'DSMSERV AUDITDB FIX=YES DISKSTORAGE'.
ANR9999D dsalloc.c(1899): Error writing to volume /dev/rstorage_pool: execRc=-1,
summaryRc=-1.
Encountered when writing to this raw logical volume. Because of the
write error the volume is set to readonly (message ANR1411W).
ANR9999D dsvol.c(501): Error 2 creating bit vector DSKV0000010123 for disk
/dev/rlv-hsm-stgpvol2.
Experienced when performing a DEFine Volume to add a raw logical volume
to a storage pool. The error code 2 translates to BVRC_TOO_LARGE: the
*SM server does not allow the definition of volumes which are larger
than the largest size which the operating system supports for a file.
(TSM itself does not support a volume larger than 1 TB.)
Your recourse is to instead define smaller volumes.
ANR9999D Error reading from standard input; console input daemon terminated
Perhaps you didn't start the server from the server directory, and the
server cannot find some config files.
ANR9999D icrest.c(2076): ThreadId<0> Rc=33 reading header record.
Contrary to what an errno 33 is supposed to mean, this indicates that no
drives were available to mount the needed tape volume, as during a
dsmserv restore db.
ANR9999D icstream.c(1047): ThreadId<9> Invalid record header found in input
stream, magic=_____
You may be trying to restore a TSM db from an older TSM level to
5.1.x.x, which cannot be done. See APAR IC33690.
ANR9999D icvolhst.c(4329): Error Writing to output file
You have too little remaining disk space, as in either the file system
filling or because disk quotas prevent you from using more. In Unix,
use 'df' and/or 'du' commands to examine file system capacity, and
consider deploying the public domain 'lsof' command to see open files.
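A quick free-space check, similar in spirit to 'df', can be made before
starting an operation that writes an output file. This is a minimal
sketch with a hypothetical function name; note that it reports file
system free space only and knows nothing of disk quotas.

```python
import shutil

def has_room(path, bytes_needed):
    """True if the file system holding path reports at least bytes_needed free."""
    return shutil.disk_usage(path).free >= bytes_needed
```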
ANR9999D imexp.c(3405): Error comparing deletion date for object 0 5045722.
If you are using server-to-server functions, it may be that the times
on the two servers are not synchronized.
ANR9999D imutil.c(1296): Error deleting object (0 2361348)
As seen in a Delete Volume operation for a volume containing Archive
data. The second number in the pair is the OBJECT_ID in the Archives
database table: you could perform a Select to identify the name of the
file involved; and *maybe* you could perform a Delete Archive operation
from the owning client system to get the problem file out of the
database.
ANR9999D imutil.c(2555): Lock acquisition (ixLock) failed for Inventory node 17.
This has been seen to occur when you run a Query OCCupancy while an
Import is running.
ANR9999D imutil.c(5570): ThreadId<100> Bitfile id 0.496165168 not found.
Seen by a customer in an EXPIre Inventory. Indicates an inconsistency
in the database. Possible fix: do a Select search on the Backups or
Archives tables on that OBJECT_ID to identify the filespace object, then
via the Contents table identify the volume it is on, then do an
'AUDit Volume <VolName> Fix=Yes'.
ANR9999D Invalid attempt to free memory (invalid header); called from 10020e2e4
(aftxn.c (643)).
Seen when migration can't work because its target storage pool is
not writable. May be accompanied by msg "ANR1025W Migration process 5
terminated for storage pool <SomeStgpoolName> - insufficient space in
subordinate storage pool. (PROCESS: 5)".
ANR9999D LOGSEG(415) Log space has been over committed - OR ...
ANR9999D logseg.c(498): Log space has been overcommitted (no empty segment
found) - base LSN = 577969.0.0.
Accompanied by: ANR7837S Internal error LOGSEG871 detected. (q.v.)
ANR9999D lvminit.c(1915): ThreadId<0> The capacity of disk '/some/name' has
changed; old capacity 77056 - new capacity 102656.
ANR9999D lvminit.c(1671): ThreadId<0> Unable to add disk '/some/name'
Message set seen in a case where the TSM Database and Recovery Log
volumes were all lost after a TSM shutdown, and the customer responded
by creating new volumes and doing 'dsmfmt -log path size' (only), then
attempted to perform a 'dsmserv restore db [preview=yes]'.
The first message is informational: the second message indicates that
TSM cannot proceed with the volume.
Look out for the former dsmserv.dsk file still being in place, which
may identify the old log and database volumes rather than new ones.
Further, your device configuration server file may name file system
objects which were on the lost disks, and need reworking to reflect
your re-established server object on the replacement disks.
ANR9999D lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has changed;
old capacity 983040 - new capacity 999424.
Oh, you're using those dangerous Raw Logical Volumes, where there's no
file system in the logical volume to clue someone in that it's in use
for something, and it appears that someone did a 'chlv', 'extendlv', or
like command to change the size of the logical volume - which speaks to
deficient site administration practices. To deal with this, you first
have to discover who did it, why, and what may be set up to start using
this "empty" space. (If you don't head off this big truck which may be
coming at you, your attempts to recover may be futile.) Then you may
have to restore your db - depending upon what was on that volume - or
use prep routines to replace an empty volume.
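When sifting a server log for these capacity-change messages, it helps to
pull out the disk name and the old and new values so the size of the
change is obvious. A minimal sketch, assuming the message wording shown
above; the function name is hypothetical, and the values are reported in
whatever units the message uses.

```python
import re

_CAP = re.compile(
    r"The capacity of disk '([^']+)' has\s+changed;"
    r"\s+old capacity (\d+) - new capacity (\d+)"
)

def capacity_change(message):
    """Return (disk, old, new, delta) from a capacity-change message, or None."""
    m = _CAP.search(message)
    if m is None:
        return None
    old, new = int(m.group(2)), int(m.group(3))
    return m.group(1), old, new, new - old
```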
ANR9999D lvminst.c(323): ThreadId<0> Error creating Logical Partition Table
for LOG volume ________.
Seen when setting up to restore the *SM database onto another server
machine, and the Recovery Log size exceeds the architectural maximum.
ANR9999D mmsflag.c(4551): Operation 004C6D32 failed with Command Reject.
Accompanied by:
ANR8301E I/O error on library DAFFY (OP=004C6D32, SENSE=00.00.00.27).
Probably: The tape label could not be read. Try another drive to see if
it is a drive problem, rather than volume problem. Look for hardware
error indications. If it is a scratch tape, you could try relabeling the
tape and try mounting it again.
ANR9999D Monitor mutex acquisition failed; thread 0 (tid 537551472).
The BUFPoolsize is too large.
ANR9999D pvrfil64.c(1056): ThreadId<30> Error writing FILE volume
V:\TSM\RECLAIMPOOL1\00000DAA.BFS.
Is there space in the filesystem? The MAXCAPACITY of the device
class...is there that much space available?
ANR9999D pvrntp.c(1838): Error writing EOT to NTP volume xxxxxx
Encountered when someone opens the cap door and ejects a cartridge that
is in the process of being written.
ANR9999D pvrgts.c(4059): ThreadId<9> Invalid block header read from volume
______. (magic=5A4D, ver=20048, Hdr blk=5 <expected 0>, db=0
<262144,262144,0>)ANR9999D icrest.c(2076): ThreadId<0> Rc=30 reading
header record.
During a 'dsmserv restore db volumenames=______ devclass=____': If also
accompanied by messages (ANR8326I, ANR8335I) which talk of a device
class (GENERICTAPE) which differs from what is specified on the command
line, it may indicate that dsmserv could not read a possibly incorrect
devconfig file to ascertain the actual tape drive type, and as a result
may be misinterpreting the contents of the tape.
ANR9999D pvrserv.c(650): Error positioning SERVER volume ___________ to MM/DD/YY
HH:MM:SS 1:0.
May be accompanied by:
ANR9999D icrest.c(2076): ThreadId<0> Rc=30 reading header record.
Encountered during an Import: The label prefix in the device classes
has to be the same in both old source server and new source server
device classes. See APAR IC26603.
Encountered during 'dsmserv restore db': As when refreshing a test
version of a TSM server from your production version. Can be caused by
use of incorrect device configuration file, and the failure to format
the log and database volumes, and perhaps using a different server
options file than was on the source server (where it may have specified
a different devconfig file). Also seen where the new server was set up
with raw logical disk volumes whereas the original server was actually
set up with its disk as filesystems. Another customer reports
encountering this on a new Windows system where the tape drivers were
not installed (they are not installed by default for a WIN2K server
instance).
ANR9999D smadmin.c(2649): IMPORT: Error - Authorization Rule aleady exists.
Remember that the Import default is Replacedefs=No. An Authorization
rule is a specification that allows another user to either restore or
retrieve a user's objects from ADSM storage, and seems to already be
defined in your destination server. You'll either have to resolve the
conflict or allow replacement.
ANR9999D smexec.c(976): Session NOT allowed in standalone mode.
Some clients are attempting to connect to your server while you have it
in some kind of recovery mode.
ANR9999D smexec.c(1171): Session NNNN with client _____ (WinNT) rejected -
server does not support the UNICODE mode that the client was
requesting.
Maybe: ADSMv3 Windows NT client and server at version 2 and the nodename
parm was not used in the dsm.opt file.
ANR9999D sminit.c(656): ThreadId<20> SM Failed to Initialize - Time Out.
Seen when in a stand-alone dsmserv operation or doing UPGRADEDB, and a
client attempted to initiate a session, when the server is not in a
position to initiate sessions; thus this Session Manager Initiation
message. Might also be the result of a hacker doing port scanning at
that time, or the result of anti-virus software in action. See "Server,
prevent all access" for blocking clients during such activities.
ANR9999D smnode.c(6786): Bitfile not found for BackMigr, session NNNN, client
<NodeName> (<OpsysType>), bitfile 0.201155604.
Pursue as in other bitfile issues documented herein.
ANR9999D smnode.c(5323): Error validating inserts for event 14995.
Seen when using the TDP for MS SQL to view data stored on the server by
a different level of that TDP. Backups made with TDP for MS SQL
Version 1 CANNOT be queried or restored using Version 2 nor can backups
made with Version 2 be queried or restored using Version 1: you must
keep TDP for MS SQL Version 1 for as long as you have Version 1 backups
that may need to be restored. See User's Guide topic "Version
Migration/Coexistence Considerations".
ANR9999D smnode.c(7091): ThreadId<594> Error receiving EventLog Verb - invalid
data type, 24944, received for event number 4964 from node (WinNT)____.
Most likely, a screwup in the install of TSM on the named client, resulting
in inconsistencies in what the client is sending to the server. (For
example, imagine a client administrator who never follows instructions
when installing software, and leaves the client scheduler running while
he upgrades the TSM software underneath it, and doesn't reboot
afterward. Or a TSM upgrade was interrupted before it completed, and
the client admin went ahead and started the client scheduler anyway.)
The best course is probably to reinstall TSM on that client, by the
book, and reboot after doing it.
ANR9999D smnqr.c(1132): Bitfile 61238278 not found for retrieval.
Do 'SELECT * FROM BACKUPS WHERE NODE_NAME='UPPER_CASE_NAME' AND
OBJECT_ID=61238278' to get HL_NAME and LL_NAME of the file.
Then do 'SELECT * FROM CONTENTS WHERE FILE_NAME='{HL_NAME} {LL_NAME}''
to see whether the file exists on any volume. (Note that a Contents
search is time-consuming.)
It might be that there's a damaged volume that should be audited; or a
reclamation might dispose of the entry if it's old; or you may have to
Audit your database.
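The nested quotes in the two Selects above are easy to get wrong when
typed by hand, so building the statement text programmatically may help.
A minimal sketch: the function names are hypothetical, and doubling an
embedded single quote is standard SQL practice that is assumed, not
verified, to hold for the TSM Select.

```python
def sql_quote(value):
    """Wrap value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"

def backups_query(node_name, object_id):
    """First step: find HL_NAME and LL_NAME for the object (node name upper-cased)."""
    return ("SELECT * FROM BACKUPS WHERE NODE_NAME=%s AND OBJECT_ID=%d"
            % (sql_quote(node_name.upper()), int(object_id)))

def contents_query(hl_name, ll_name):
    """Second step: FILE_NAME is HL_NAME, a space, then LL_NAME, per the entry above."""
    return "SELECT * FROM CONTENTS WHERE FILE_NAME=%s" % sql_quote(hl_name + " " + ll_name)
```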
ANR9999D ssrecons.c(2210): Invalid magic number found in frame header
May be seen with optical media: "Invalid magic number" error messages
may be triggered because of not tracking the sides of double-sided
media. The error will typically arise during reclamation or
reconstruction.
ANR9999D tcpcomm.c(1567): SessionThread: return code from setsockopt is 22
Seen under Solaris server. Probably related to TCP buffer sizing in
Solaris. See http://www.sean.de/Solaris/tune.html (search for EINVAL,
which is the errno name for the 22 number). Clicking on the SUN TCP/IP
Admin Guide link there takes you to the gospel, at http://docs.sun.com/
ab2/coll.47.4/NETCOM/@Ab2PageView/1787?DwebQuery=ndd#FirstHit where it
says "Attempts to use larger buffers fail with EINVAL". You might try
changing your client and server TCPWindowsize options to see what might
improve things; or perhaps it may be a Solaris adjustment. Confer with
your Solaris people on this.
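To see outside of TSM which buffer sizes the operating system will
accept, one can try a descending list of sizes and note where EINVAL
(errno 22) stops appearing. This is a minimal sketch with a hypothetical
function name; it illustrates the failure mode only and says nothing
about how the TSM server itself sets its buffers.

```python
import errno
import socket

def set_rcvbuf_with_fallback(sock, sizes):
    """Try each size in order; return the first one setsockopt accepts.

    A size is skipped only when setsockopt fails with EINVAL; any other
    error is re-raised. Returns None if every size is rejected.
    """
    for size in sizes:
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
            return size
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise
    return None
```

A call might look like set_rcvbuf_with_fallback(socket.socket(),
[1048576, 262144, 65536]); the accepted size is a starting point for
discussing TCPWindowsize with your Solaris people.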
ANR9999D xibf.c(664): Return code 87 encountered in writing object 0.9041218 to
export stream.
As seen during a server Export operation. It *might* be simply the
result of a tape drive that needs cleaning. Otherwise, it could be a
storage pool file that has a length issue, for example: pushing the file
out of existence with repeated 'dsmc s' Selective backups could end the
problem. You can identify the file via 'SHow INVObject 0 9041218', as in
this instance.
ANR9999D smlshare.c(2174): ThreadId<81> Server-to-Server protocol error. unknown
verbType=20992.
Unrecognized verbs are your big clue to a level mismatch, as in a higher
level server using verbs which a lower level server is not programmed to
recognize, as in trying to mix a 5.2 and 5.1 server. Pay attention to
required levels and Readme files.
ANS-----(client messages)-----------------------------------------
ANS0101E NLInit: Unable to open message repository
'/usr/tivoli/tsm/client/ba/bin/<Lang>/dsmclientV3.cat'.
A common cause is the permissions on the file or its containing
directory preventing access by the invoker. Another cause is the
DSM_DIR environment variable being used, but pointing to the wrong
directory. In rare cases, the TSM client install package is faulty and
fails to create the <Lang> directory. Another obscure cause is when the
product changes names (e.g., ADSM -> TSM) and new names are used in the
path of installs, but the new installer doesn't uninstall the prior
version, making for a mixed and sometimes conflicting environment.
ANS0102W Unable to open the message repository tdpsdan.txt. The American English
repository will be used instead.
Products such as TDP key on the Locale settings of the machine in which
they are running. The operating system Locale setting may not be one
that the given product supports (Danish, in this case) and so it reverts
to English. To avoid the error message, follow the instructions in the
doc (README): set the LANGUAGE environment variable to "ENU" using the
GUI, or from the command line via 'TDPSQLC SET LANGUAGE=ENU'.
ANS0105E ReadIndex: Error trying to read index for message [message number] from
repository dscameng.txt
Typically, a permissions problem, probably the result of someone
meddling with the client file system.
NT: See if the file dscameng.txt is missing from the baclient directory.
ANS0106E ReadIndex: Message index not found for message _______
Check for the message repository file being in the standard directory
for your operating system. If so, use DSM_DIR to point to it and, if
you get the same ANS0106E error with that, it indicates that the
repository itself is defective. You might try a higher client level to
get a clean copy.
ANS0237E (RC2033) On dsmInit, the node is not allowed when
PASSWORDAccess=GENERATE.
Seen when invoking TDPs or buta. You failed to observe the instruction
in the manual, specifying that the PASSWORDAccess option should be
"prompt" in the Client System Options File.
ANS0239E
As seen using Notes Connect Agent, can be caused by someone having
named a folder in their Notes mailbox a wildcard character such as a
"*". The only way to fix it is to see what mailbox was being backed
up and then look at the folders in that person's mailbox and have them
rename the wildcard to something else.
ANS0263E (RC2230) Either the dsm.sys file was not found, or the Inclexcl file
specified in dsm.sys was not found.
It may not have been found because it's not where it should be: the
dsm.sys that TDP looks for should be in the api/oracle/bin directory
(older TDP) or /opt/tivoli/tsm/client/api/bin - not the dsm.sys that the
standard client software uses. In that TDP uses the *SM API, you can
set environment variable DSMI_DIR to the name of the directory which
contains your dsm.sys file.
If setting up a 64-bit client, assure that no 32-bit stuff is
inadvertently in the mix.
ANS0326E Node has exceeded max tape mounts allowed. (An API message)
The Messages manual fails to come right out and say that the node's
server-defined MAXNUMMP value has been exceeded, perhaps because of an
unusual number of client sessions. If warranted, have the server admin
perform an UPDate Node to boost the value.
ANS0500-0599 These are TDP For Oracle messages
ANS0599 TDP for Oracle: (2106): 05/09/2001:20:12:35 =>(ssrspsfp1-ora)
sbtclose(): oer = 7023, errno = 41.
Errno 41 from TDP means you have exceeded the maximum number of mount
points allowed for your node on the server. Check the value of
"Maximum Mount Points Allowed" for your node on the server by issuing 'q
node <NodeName> f=d' from the admin command line. Either don't start
more sessions than that value, or have the value raised.
ANS0944E dsmnotes error(s) occurred
Is basically telling you that something is missing or corrupted inside
the database. If the problem is not too severe, it will give you this
warning. But when the database is badly corrupted, it will not give you
any warning, it will just hang. There is no way for the Notes agent to
detect if the database is corrupted or not before the backup happens.
You should use other Notes tools to check and fix the corrupted database
before doing a backup using the Notes agent.
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 73, reason: 'A
connection with a remote socket was reset by that socket.'
The 73 is AIX errno ECONNRESET: Connection reset by peer. The client
detected this, and its peer is the TSM server: check the TSM server
Activity Log for that clock time for an indication of why the server
terminated the session. It may be that your TSM server implementation
is relatively new and still has default configuration values, where its
timeout specs need boosting (particularly COMMTimeout).
If both ends of the session see it simply disappear (the server did not
cause its demise) then something in between caused it: network
equipment, OS TCP/IP protocol stack. One possibility is value conflicts
with the router/switch, as where autonegotiation of settings is
involved. If no good reason is apparent, your TCPWindowsize values may
be conflicting with your operating system network buffer sizes.
See also: TCPWindowsize client option; ANR0480W.
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 104, reason :
'Connection reset by peer'.
The 104 is Linux errno ECONNRESET. Treat same as above.
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 232, reason :
'Connection reset by peer'.
The 232 is HP-UX errno ECONNRESET. Treat same as above.
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 10053, reason :
'An established connection was aborted by the software in your host
machine.'.
The 10053 TCP/IP errno indicates that session was terminated at the
other end of the connection, which is to say by the TSM server. Check
its Activity Log for reason.
The client may attempt reopen, msg ANS1809W.
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 10054, reason :
'Unknown error'.
The 10054 TCP/IP errno is probably from Winsock, reflecting a connection
being reset by peer, which in TSM terms means that the server terminated
the session with the client. Thus, the place to look is in the server
Activity Log, which should explain why it did so. It may be that server
timeout values are too low. If the server log also shows a mystery
disconnect, that indicates you are having networking problems.
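The errno values in the ANS1005E entries differ by client platform but mostly mean the same thing. As a quick reference, here is a sketch of a lookup table built from the values in these entries; the Winsock symbol names (WSAECONNABORTED, WSAECONNRESET) are the standard names for 10053 and 10054, and the function name is hypothetical. Keep in mind that an errno number is only meaningful for the platform it came from.

```python
# errno value -> (client platform, symbolic name), per the ANS1005E entries
CONNECTION_ERRNOS = {
    73: ("AIX", "ECONNRESET"),
    104: ("Linux", "ECONNRESET"),
    232: ("HP-UX", "ECONNRESET"),
    10053: ("Windows (Winsock)", "WSAECONNABORTED"),
    10054: ("Windows (Winsock)", "WSAECONNRESET"),
}


def describe_ans1005e_errno(value):
    """Return a one-line hint for the errno reported in ANS1005E."""
    entry = CONNECTION_ERRNOS.get(value)
    if entry is None:
        return ("errno %d: not in this table; consult the client "
                "platform's errno definitions" % value)
    platform, name = entry
    return ("errno %d: %s on %s; check the TSM server Activity Log"
            % (value, name, platform))
```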
ANS1005E TCP/IP read error on socket = <SocketNumber>, errno = 10054, reason :
'An established connection was aborted by the software in your host
machine.'
The TSM 5.2 server has the ability to prevent clients from initiating
either manual or scheduled sessions by setting the node's
SESSIONINITiation parameter to SERVEROnly. If you have the correct
HLAddress (IP addr) and LLAddress (port) specified and you get this
error, either when attempting to connect manually or via a client
polling schedule, then probably the parameter is set to the SERVEROnly
value. A value of Clientorserver is necessary for the client to be able
to spontaneously contact the server, as in a human-invoked session.
ANS1017E Session rejected: TCP/IP connection failure
See: ANS4017E
ANS1025E Session rejected: Authentication failure
May occur in the TSM 5.2.2.0 Windows client, if the password already
exists in the Registry key of this node:
HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\
Nodes\<Node_Name>
The workaround is to delete the registry key before authentication: the
key will be rebuilt during authentication, as Administrator initiates a
TSM client-server operation. (On rare occasions, the supposed Admin user
does not actually have the permissions expected, so check that.)
An accompanying Error 2 in Windows means File Not Found and may suggest
that the HKEY value is being subverted by an environment variable or
client option like PASSWORDDIR.
ANS1026E (RC136) Session rejected: Communications protocol error
Has been seen in restoring large files in the doofy old MVS TCP/IP
environment, with a buffer overflow in TCP/IP happening every max
sequencenumber: instead of beginning with zero again the TCP session
dropped.
Also seen with "bad" NIC drivers; examine your log for TCP communication
errors, or watch your switch: the NIC may be renegotiating (often) the
speed & duplex settings, which may be avoided by defeating
autonegotiation.
May accompany dsmerror.log msg "sessRecvVerb(): Invalid verb received."
See also ANR0484W
ANS1028S Internal program error. Please see your service representative.
Seen when Retrieving a file...waiting for a tape mount, the "Retrieving"
message appears, then ">>>>>> Retrieve Processing Interrupted!! <<<<<<".
See the dsmerror.log for supplementary info; and/or the server Activity
Log. There is often nothing reflecting a problem in the Activity Log,
which would indicate the client reacting to something within its
environment.
Can be caused by having Archived a file with an ADSMv2 client and then
trying to retrieve it with an ADSMv3 client.
If no Activity Log indications of a problem, the problem may be the
result of the client system operating system level having been boosted,
as seen when IRIX v5 started numbering errno values at 1000.
But another, much more trivial cause is the user having run out of disk
quota during a Retrieve or Restore.
During a backup, watch out for running out of disk space (or disk quota)
for the client logs.
ANS1029E Communications have been dropped.
Could be caused by having the PASSWORDAccess Generate option, and the
NODename option as well. If so, eliminate the latter.
ANS1030E System ran out of memory. Process ended.
A poorly phrased message which misleads most customers: This is not a
real memory issue, as modern computers use virtual memory; and it is
almost never the case that the system ran out of virtual memory, but
rather that your client process ran out of its allotment of virtual
storage, typically during an ordinary incremental backup, which
accumulates and sorts filenames in virtual memory. See the TSM message
description. Tends to be seen mostly on personal computers.
If a Mac OS 7-9 system, boost the application memory size in the Get
Info box.
If a Unix system (including Mac OS X, which is Unix):
- The problem usually is that you exceeded the Unix Resource Limits
values for memory utilization, defined for either your account, or
any process in the system.
Verify with 'ulimit -a' (sh, ksh, bash) or 'limit' (csh), per your shell.
In AIX, also check /etc/security/limits definitions and make sure
that root's memory utilization is not artificially constrained. (A
value of zero or -1 implies "unlimited".)
- In Solaris, this can be a consequence of using option
"LARGECOMmbuffers Yes", and happens principally for non-root users.
The fix for this problem is to do the following:
1. Become root
2. Append the following line to the file /etc/system:
set shmsys:shminfo_shmmax=2097152
If your /etc/system already contains such a line make sure its
value is at least 1500000.
3. Reboot the system by issuing "reboot"
- Check the Backup log to see where the thing ended: this may be the
case of a circular symlink causing ADSM to go in circles until
virtual memory is exhausted. You might also do a
'find DIRNAME -type l -ls' to inspect symlinks in suspected
directory.
- In AIX 4.3.3 there is an interesting JFS architectural situation
involving exhaustion of the .indirect segment for the file system
relative to files >= 32 KB.
See IBM site item swg21162093 and pTechnote0777.
If a Windows system:
- Close all unneeded applications and services, to free memory.
- Change LARGECOMmbuffers and/or MEMORYEFficientbackup (q.v.).
- Your system virtual memory may simply be inadequate: backing up 100
GB of small files via standard Incremental requires a lot of memory
for filename matching during the backup.
- See also notes under ANS9999E.
(Where a system's virtual storage is actually exhausted, there would be
major, obvious manifestations in the system. In the case of AIX, it
would issue SIGDANGER signals to all processes warning of impending
virtual storage exhaustion such that they could end gracefully before
AIX was forced to do SIGKILLs to contend with the problem. In the case
of Solaris, where /tmp is often defined as virtual storage, various
processes would fail in writing to /tmp. If Unix did have to kill off
processes, this would be evident in your Unix process accounting
records (check there). In addition, your AIX Error Log (errpt) should
contain PGSP_KILL entries around the time of the problem. If this was
not in evidence, then it suggests that it was the case that your
process exceeded its Unix Limits value for memory utilization.)
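For the Unix Resource Limits check described in this entry, the limits can also be read from inside a process, which shows what a program actually inherited rather than what your login shell reports. A minimal sketch using Python's standard resource module (Unix only); the function name and the choice of which limits to report are assumptions made for illustration.

```python
import resource


def memory_limits():
    """Return {limit name: (soft, hard)} for this process.

    A value of None stands for "unlimited" (RLIM_INFINITY).
    """
    names = ["RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_RSS", "RLIMIT_AS"]
    limits = {}
    for name in names:
        which = getattr(resource, name, None)
        if which is None:
            continue  # not every Unix flavor defines all of these
        soft, hard = resource.getrlimit(which)
        limits[name] = tuple(
            None if value == resource.RLIM_INFINITY else value
            for value in (soft, hard)
        )
    return limits
```

Run it under the same account, and started the same way, as the failing client process: limits are inherited, so a scheduler started from inittab may see different values than your interactive session.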
ANS1033E An invalid TCP/IP host name was specified
Check permissions on /etc/hosts, /etc/resolv.conf, /etc/nsswitch.conf
and like network configuration files to assure that they are publicly
readable. Check that you have DNS service.
ANS1035S Options file '/usr/tivoli/tsm/client/ba/bin/dsm.sys' not found
Multi-user systems (Unix) require both a dsm.sys and dsm.opt: make sure
you have both. In using the *SM API, set the DSMI_CONFIG environment
variable to the full-path name of your dsm.opt file. Check any DSM_DIR.
Might also happen if the file permissions don't allow the invoking user
to use the file.
See also ANS0263E.
ANS1036S Invalid option '_OptionName_' found in options file 'file-name' at line
number : _____
The client option is not considered valid by the software.
Do 'dsmc Query Option' to check client options. Refer to the B/A client
manual for details on coding options, as supplemented by any README
documentation supplied with the specific release level software.
Check to see if the option is in the right file (dsm.sys vs. dsm.opt),
or if it needs to be within a server stanza.
If using a TDP, assure that you are using its options file, not the one
used by the Backup/Archive client.
In TSM 5.1, where a Windows share name containing a dollar sign appears
on an Include (like include \\remote\test$\*), this was a defect, fixed
by IC36467.
ANS1038S Invalid option specified
In Unix, typically because you coded a dsm.sys option in dsm.opt.
ANS1063E Invalid path specification
Accompanied by messages like "ANS1228E Sending of object 'F:\*' failed"
or "ANS1228E Sending of object '\\something\g$' failed".
If an incremental backup, the account that is running the scheduler
service (SERVICE) does not have full rights/permissions to that drive,
as in SYSTEM account permissions no longer set for the drive. (If it
works manually, but not as a scheduled event, it is almost always a
permissions issue.) If Windows, and the failure had been on a drive
letter, try \\servername\share_name instead. If the problem persists,
you can try resetting the password for the domain admin ID under which
the problem child's TSM scheduler service is running. (Double-click on
the TSM Scheduler service listing, and switch to the "Log On As" tab.)
If performing a 'dsmc Backup Image', you probably specified /dev/____
instead of a file system name when the logical volume is defined as a
file system and is mounted. Or you specified the /dev/ character device
name for a device rather than its block device name.
ANS1068E Device is not local
In doing a Backup Image on AIX, you probably specified like /dev/hd2 but
/etc/filesystems does not contain that spec for the logical volume which
contains a file system.
ANS1071E Invalid domain name entered: '_________'
May result from doing like 'dsmc i /home/ians/projects/hsm*/* -su=yes':
you cannot use wildcards in directory / folder names.
If not using wildcards, and you are specifying like 'dsmc i /etc',
instead try 'dsmc i /etc/': TSM is rather dogmatic about specs, and
expects that an object specified without a trailing slash is a file
rather than a directory, and here /etc is a subdirectory of / rather
than its own file system. (TSM will recognize a true file system without
a trailing slash spec.)
ANS1073E File space correspondence for domain 'domain-name' is not known.
The number defining the correspondence between drive letter or file
(domain name) and volume label is not known to the server.
This might be caused by the specified name not being recognized by ADSM
as a Domain (filespace) because it specifies the filespace name as a
stem and is followed by a directory name which causes ADSM to think that
the whole thing is the domain name. The solution in this case is to set
off the filespace portion of the name with braces.
ANS1074I *** User Abort ***
May appear only in the dsmerror.log - with no complementary server
Activity Log error indications!
Some customers report experiencing this client-side message when the
server disk storage pool runs out of space, or lack of mount points.
The 4.1.2 client level had a defect in failing to emit other messages
describing the actual problem.
In TSM5, setting client option RESOURceutilization to 1 may prevent the
intermittent error, by preventing thread switching.
ANS1075E *** Program memory exhausted ***
*SM thinks: The program has exhausted all available storage.
*SM recommendation: Free any unnecessary programs, for example,
terminate and stay resident programs (TSRs), that are running and retry
the operation. Reducing the scope of queries and the amount of data
returned can also solve the problem.
If a Unix system, check Unix Limits values. Assure that the system is
not running out of virtual storage. If AIX, and still using ADSM, you
may be in need of more than the single memory segment that AIX allows by
default. (AIX TSM employs the Large Program Support conventions to
avoid this situation, as verified by Richard Cowen.) You can modify the
ADSM server module to use Large Program Support, as follows:
The amount of memory that the process needs may exceed the size of one
data segment (256 MB), one segment being the default number that a
process may use. The process is in this case killed by the system.
The work-around for this is to enable the program to be able to use
more than one data segment by enabling Large Program Support, using the
following commands:
cd /usr/lpp/adsm/bin
cp -p <Pgm_Name> <Pgm_Name>.orig
/usr/bin/echo '\0200\0\0\0' |
dd of=<Pgm_Name> bs=4 count=1 seek=19 conv=notrunc
which causes the XCOFF o_maxdata field (see <aouthdr.h>) to be updated.
This allows the program to use the maximum of 8 data segments (2 GB).
Choose the string to use for a given number of data segments from
the following table:
# segments    vm size    string
----------------------------------------
8             2 GB       '\0200\0\0\0'
6             1.5 GB     '\0140\0\0\0'
4             1 GB       '\0100\0\0\0'
2             512 MB     '\0040\0\0\0'
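The strings in the table follow a simple rule: the field holds the byte count (number of segments times 256 MB) as a 4-byte big-endian value, and the echo string is that value's first byte written in octal, followed by three zero bytes. A sketch that derives the strings, so you can check an entry before patching a binary; the function names are hypothetical, and the 1 to 8 range reflects the maximum of 8 segments stated above.

```python
import struct

SEGMENT_BYTES = 256 * 1024 * 1024   # one data segment, per this entry
DD_BYTE_OFFSET = 4 * 19             # dd bs=4 seek=19 writes at byte 76


def maxdata_bytes(segments):
    """Return the 4 big-endian bytes for the given number of segments."""
    if not 1 <= segments <= 8:
        raise ValueError("segments must be between 1 and 8")
    return struct.pack(">I", segments * SEGMENT_BYTES)


def echo_string(segments):
    """Return the octal-escape string to hand to echo, as in the table."""
    first_byte = maxdata_bytes(segments)[0]
    # The low three bytes are always zero for whole 256 MB segments.
    return "\\0%03o\\0\\0\\0" % first_byte
```

For example, echo_string(6) yields the same '\0140\0\0\0' shown in the table for 1.5 GB.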
ANS1076E *** Directory path not found *** (same as ANS4078E)
In the dsmsched log, the msg follows the problematic name.
The command was given a parameter which it took to be a file system
object name, went looking for it, but could not find it.
The most trivial cause is a misspelling, or that the DOMain or command
line specifies a subdirectory which was removed from the system.
Could be something as silly as forgetting the hyphen in front of a
command line option, like "description=" instead of "-description="
or "archmc=" instead of "-archmc=" such that it looks like a filespec
instead of an option.
You may have incorrectly specified the object to be backed up, as in
perhaps something like "/filesys/.../*" in a client schedule.
May be caused, in Unix, by an invalid symlink, such that you have to fix
it and repeat the operation which stumbled onto it.
Another possibility is that the file system type is odd...one that the
client is not programmed to recognize and handle (such as a newer file
system type in Linux, being tried from an older client).
Netware: Might be a Rights issue. Watch out for the situation where the
NWUSER is logging into the server and not the tree: in certain
applications, if you specify the server name instead of the NDS tree, it
will default to Bindery login. If Netware server Bindery Context is not
enabled, the volume might not be recognized since the needed
authentication did not occur. Less likely: try running a vrepair on the
affected volumes and then retry incremental backup.
A bizarre cause of this error was a user employing the Selective backup
command on the content of his Include lines.
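Where an invalid symlink is suspected as the cause, it can save time to list the dangling ones before rerunning the backup. A minimal sketch, assuming a Unix client and a hypothetical function name; it reports symlinks whose targets do not exist and does not follow links into other directories.

```python
import os


def broken_symlinks(top):
    """Return a sorted list of symlinks under 'top' whose targets
    do not exist."""
    broken = []
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() examines the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```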
ANS1078S *** Unknown system error <Error_code>; program ending ***
"An unknown and unexpected error-code occurred within the client
program. This is a programming failure and the client program ends."
In Unix: By reason, the "system error (nnn)" should reflect the Unix
errno global variable value returned by a system subroutine. However,
you may find that the errnos in your /usr/include/sys/errno.h do not go
up that high, which indicates that the errno value is garbage: perhaps
it was already there when the TSM client module called the system
subroutine, and upon return the client believes that the errno value is
meaningful, and reacts.
Might be due to running an old client on a new opsys, where the opsys
has error codes that are newer than when the client program was written.
Upgrading the client usually eliminates such errors. If not, then it is
purely a product defect and should be reported to the vendor.
Circumvention: Well, the error is there, and you need to get work done
despite it. Consider changing variables in a controlled manner seeking
one which helps, such as TXNBytelimit.
ANS1079E No file specification entered.
In creating a client schedule to Archive files, you may have forgotten
to specify the files to be Archived in the OBJects parameter of the
DEFine SCHedule command? That is, unlike the Incremental command,
Archive does not assume file objects; and the OBJects parameter is
required when ACTion=Archive. Note that your Include-Exclude list will
also be observed when the archive operation is actually performed, so
you can specify an alternate management class on an Include statement.
ANS1081E Invalid search file specification '/usr/stuff/*/fonts.info' entered
The given spec string contains invalid characters or a wildcard in the
file system name (Unix) or drive name (Windows). The most likely cause
is attempting to use wildcards for directories, particularly where the
restoral is "in place", rather than to an alternate place... As the Unix
Client manual says: "In a command, you can use wildcard characters in
the file name or file extension only. You cannot use them to specify
destination files, file systems, or directories." Alternately, it may
be that you invoked 'dsmc' to enter interactive mode, and then entered
the filespec in quotes, which might cause a client at a given
maintenance level to take the wildcards as literal characters instead of
as wildcards: quotes are for the OS command level, where you need to
keep the shell from expanding wildcards - don't use quotes in dsmc
interactive mode. Another possibility is there being multiple filespaces
with common name ingredients such that you need to explicitly delineate
the filespace portion with braces: {/usr/stuff}/*/fonts.info.
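The brace form is mechanical once you know which leading part of the path is the filespace name. A small sketch, with a hypothetical function name; it deliberately refuses cases where the filespace name does not end on a directory boundary, and it makes no attempt to handle the root (/) filespace.

```python
def brace_filespace(filespace, spec):
    """Delineate the filespace portion of a file spec with braces,
    e.g. ('/usr/stuff', '/usr/stuff/*/fonts.info')
      -> '{/usr/stuff}/*/fonts.info'."""
    if not spec.startswith(filespace):
        raise ValueError("spec does not begin with the filespace name")
    rest = spec[len(filespace):]
    if rest and not rest.startswith("/"):
        # '/usr' must not match '/usrlocal/...'
        raise ValueError("filespace name ends in mid-directory-name")
    return "{%s}%s" % (filespace, rest)
```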
ANS1082E Invalid destination file specification '/usr/here' entered
You attempted to perform a file system restoral like:
dsmc restore -su=yes '/adsmbkup/usr/there/*' /usr/here
where the destination is a new file system, and got this rejection, as
seen under the AIX TSM 3.7.2 client. The problem is that in the absence
of a trailing slash, TSM thinks that the destination is a file rather
than a directory; that is, you told it that the restoral was a "many to
one". What you have to do is specify the destination as "/usr/here/"
and it will work.
ANS1086E File not found during Backup, Archive or Migrate processing
The file was probably transient, and went away between the time that TSM
got a list of files to process and the time that it got to this file.
If you believe that this file should not have been included for
processing, see: Include-Exclude "not working".
ANS1092E No files matching search criteria were found [same as ANS4095E]
ANS1102E Excessive number of command line arguments passed to the program!
(Might also be seen/reported as "too many arguments passed to the
program.")
May be accompanied by "ANS1133W An expression might contain a wildcard
not enclosed in quotes".
The dsmc client command has a self-imposed limit on the number of file
specifications that may be passed on the command line. (The intention is
to "protect the customer from himself", as in inadvertent "runaway"
situations where a wildcard might supply a large number of filenames to
an operation.) Limits:
Query: 1 Restore: 2 Retrieve: 2
Archive: 20 Delete: 20 Selective: 20
It is with Archive, Delete, and Selective that you typically seek to
pass a large number of file names. If they are unique names, you are
forced to specify only up to 20 per command invocation. If they have
common elements, you may be able to use wildcards. In a Unix
environment, at least, you should then either put the whole file
specification in quotes, or put a backslash (\) before each wildcard
character, to keep the shell from expanding the wildcards.
One user inadvertently got this error by forgetting to put '#' before
comments in the dsmc sched line in the Unix /etc/inittab file.
See also: Continuation and quoting; "dsmc command line limits";
-REMOVEOPerandlimit; Wildcard characters
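When the names are unique and cannot be covered by wildcards, the practical answer is to split the list into groups no larger than the limit and run one command per group. A sketch of that batching, using the limit of 20 given above for Archive, Delete, and Selective; the function name is hypothetical, and actually invoking dsmc with each group is left to your own script.

```python
# Per-command operand limits for the bulk commands, from the list above.
OPERAND_LIMITS = {"archive": 20, "delete": 20, "selective": 20}


def operand_batches(names, command):
    """Split 'names' into consecutive groups that each fit within the
    operand limit for the given dsmc command."""
    limit = OPERAND_LIMITS[command.lower()]
    return [names[i:i + limit] for i in range(0, len(names), limit)]
```

With 45 file names and the Selective command, this gives three groups of 20, 20, and 5 names.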
ANS1103E Invalid management class entered
In Archive: The management class for this file does not have an archive
Copy Group, and so the file cannot be archived. This can be caused by
having defined a management class, but not having done the
ACTivate POlicyset command to have it participate in the Policy Set.
ANS1105E The management class for this file does not have a valid backup copy
group. This file will not be backed up.
Check your server definitions, and review administrative changes.
Remember that when you do a backup, you're doing more than backing up
current files - the client is also telling the server what files no
longer exist on the client, such that those objects which have existed
in server storage can now be marked for expiration. Those files in
server storage were associated with a given management class. If you
delete that management class definition and the files are still in
server storage, you might run into this situation. If this is the case,
recreate the old management class definition.
ANS1107E Invalid option/value: '-PITDate'
Lazy programming fails to be specific. The problem is typically that
the date format employed in the value is inconsistent with prevailing
date format options.
ANS1107E '-Clusternode='yes'' invalid option / value pair in dsm.opt file.
That is the format for command-line options: in the options file,
options should be specified like: CLUSTERnode Yes
ANS1108E Invalid option (-POSTSchedulecmd) for the INCREMENTAL command
Or similar. Commonly, you tried to specify an option on the command
line when it is legal only in the options file, per the manual.
ANS1115W File '_____' excluded by Include/Exclude list
When running an Archive or Selective operation against a file spec which
contains files that are excluded, this message should be issued for each
excluded file; and if the operation is via schedule or batch, the return
code should be 4. (This should occur regardless of whether QUIET is in
effect.) The rationale is that Archive and Selective operations are
explicit requests for objects to be sent to TSM storage, and that if
they are not sent because of an Exclude, you very much should be made
aware of that...particularly with the preservational intent of Archive.
In contrast, the message would not appear for an Incremental type backup,
where the file set is implicit, because that is an operation where
it is not required that files go to the server.
ANS1115W File '/tmp/whatever' excluded by Include/Exclude list
In Unix, /tmp is defined by Tivoli to not be backed up, so even if you
do not have /tmp excluded in your inclexcl, it does not want to back up
anything in /tmp, whether by Incremental or Selective backup. See: /tmp.
ANS1128S Invalid Management Class assigned to directories. Please see the
error log.
Are you using DIRMc, but it refers to a Management Class which doesn't
have a backup copy group assigned to it?
ANS1134E Drive \\MachineX\d$ is an invalid drive
Also known as the "Invalid drive specification D:" problem.
The simplest cause is that the system or invoker did not have
permission to use the drive. Perhaps the TSM scheduler is running as a
system account and mapped drives are not available: Try changing the
service to run using a local administrator account, and confirm that the
user account has the mapped drive in its profile.
The message may be seen with hard drives other than "C", which can
indicate attempting to operate on a remote drive, which may not be
resolvable because of UNC or other issues. On machine MachineX, you
can define drive D as shared with the name "D$" then it will be able
to back it up.
ANS1149E No domain available for incremental backup
Sounds like the Client User Options file (dsm.opt) got changed so that
it is now lacking a DOMain statement, as for ALL-LOCAL.
ANS1194E Specifying the schedule log
'/usr/tivoli/tsm/client/ba/bin/dsmsched.log' as a symbolic link is not
allowed.
May be followed by "ANS1190E Symbolic link
'/usr/tivoli/tsm/client/ba/bin/dsmsched.log' to '' was successfully
deleted", which suggests that one of the subdirectories in the path is a
symbolic link. (Some software will examine each path element in turn,
not just the final file name.) Also seen with file
/var/log/adsmclient/adsmclient.log where /var/log/adsmclient is
erroneously a file rather than a directory.
ANS1228E Sending of object '____' failed
During an Archive or Backup, the client tried to send the server
either the file itself, for addition to the server storage pool, or
information about the file (attributes update, expire the file), but
that interaction with the server failed due to something invalid. This
will typically occur every time the client job is run.
There may be accompanying messages to explain why, as in:
ANS1063E (lack of permissions); ANS1086E (file not found); ANS1310E
(object too large). Another cause, in Windows: path length exceeding
the maximum of 259 characters. A very ugly case we've seen is where the
files named in the message had been deleted from the client some time
ago, meaning that the Sending action involves the client trying to tell
the server to expire a file whose name is in the list of Active files
which the client obtained from the server when an Incremental Backup
started. The file name in such a case may contain "tough" characters -
probably binary, and most likely binary zeroes. The TSM software should
be programmed to be able to deal with bogus characters in file names, so
this failure should be considered a defect. Consider trying a backup
with -INCRBYDate to avoid filename passing between client and server.
Keep in mind that many message-issuance routines expect normalcy in the
strings they handle, and neither look for nor deal with inadvertent
non-displayable, binary characters. That kind of thing can always throw
off an investigation: what you see is not the reality.
If accompanied by "fsCheckAdd: unable to update filespace on server" in
the error log, it may be that database locks were in effect, as when a
serious operation like Export Node is happening at the same time as a
Backup. Be careful to not have conflicting things running.
With .NET, note that *.cch.* files are temporaries: consider excluding
them from backup.
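Since the message text may not display the characters that are actually in a troublesome name, it can help to screen candidate names programmatically. A sketch that flags the two name-related causes noted in this entry, over-long Windows paths and non-displayable characters; the function name is hypothetical, and the 259 figure is the maximum cited above.

```python
MAX_WINDOWS_PATH = 259  # maximum path length cited in this entry


def suspect_names(paths):
    """Return (path, reason) pairs for names likely to cause trouble."""
    findings = []
    for path in paths:
        if len(path) > MAX_WINDOWS_PATH:
            findings.append(
                (path, "path longer than %d characters" % MAX_WINDOWS_PATH)
            )
        # isprintable() is False for control characters such as binary
        # zeroes, which a message display may silently drop.
        if any(not ch.isprintable() for ch in path):
            findings.append((path, "non-displayable character in name"))
    return findings
```

Print the flagged names with repr() so that the hidden characters show up as escapes rather than vanishing.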
ANS1228E Sending of object 'c:\adsm.sys\EVENTLOG' failed
ANS1228E Sending of object 'c:\adsm.sys\IIS' failed
ANS1228E Sending of object 'c:\adsm.sys\WMI' failed
Accompanied by ANS4005E messages. The product has traditionally been
doing "exclude c:\adsm.sys\...\*" - but it should have been doing
"exclude.dir c:\adsm.sys", to avoid race issues. Amend your exclude list
to have the exclude.dir. See APAR IC40016.
ANS1228E Sending of object '/intermail/mss_db/mbox/18/db' failed Read Failure
The file to be backed up could not be read. Do the environmental
problem analysis to find out what the problem *actually* is: an OS error
log may reveal that there is a disk situation at play - which might be
something as resolvable as a loose cable or failed power supply, meaning
that the disk and its data may be intact but currently unreachable.
ANS1230E Stale NFS File Handle
See ANS4010E
ANS1245E (RC122) Format unknown [sometimes regarded as "unknown format"]
See: ANS4245E
ANS1256E Cannot make file/directory.
If a Windows machine, possibly one of the following:
1. "Permission Folder" was deleted by mistake
2. Illegal characters (maybe "?") used in file/dir name
3. File name (including directory path) exceeds 255 characters
You can check this by going to the subdirectory noted in the error
messages, then right clicking, selecting permissions and attempting to
reapply the permissions on this directory. Windows should give you
an error at this point saying that it can't apply valid security
permissions to a certain file(s). These are your offending
files... either rename / delete.
Of course, as an expedient you can Exclude the offending objects.
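Causes 2 and 3 above can be screened for ahead of time. This is a rough
Python sketch under stated assumptions: the function name is hypothetical,
the 255-character limit is the figure quoted above, and the character set is
the usual list of characters Windows reserves in file names.

```python
# Characters Windows does not allow inside a file or directory name.
RESERVED = set('<>:"|?*')

def path_problems(path, max_len=255):
    """Return a list of reasons why this Windows path may be refused."""
    problems = []
    if len(path) > max_len:
        problems.append("too long")
    # Ignore the colon of a leading drive letter such as "C:".
    rest = path[2:] if len(path) >= 2 and path[1] == ":" else path
    bad = sorted(set(rest) & RESERVED)
    if bad:
        problems.append("illegal characters: " + "".join(bad))
    return problems
```

An empty list means neither condition was detected; it says nothing about
the permissions cause, which still has to be checked by hand.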
ANS1262E Password is not updated. Either an invalid current password was
supplied or the new password does not fulfill the server password
requirements.
In addition to what the Messages manual advises, do Query Status on the
TSM server and check the Minimum Password Length value.
See: Set MINPwlength
ANS1287E Volume can not be locked.
ANS1287E Volume could not be locked
Seen during attempted restoral of an image backup. Some causes:
- The drive partition was inadvertently left open.
- The dsm.opt file directs log files to that volume.
- The TSM Journal service is running against that drive.
- Having an Internet Explorer window open to that drive.
- Remote users accessing that drive via IE across a network connection.
If it persists despite all windows being closed, do a quick format of the
drive, which has been seen to clear it: a similar message may appear,
but you get the option to go ahead and format anyway.
ANS1301E Server detected system error
See: ANS4301E Server detected system error
ANS1304W Active object not found
The most common, contemporary cause is that a prior Backup stored the
object in TSM server storage as a given type (e.g., regular file), but
in the most recent Backup the object is a different type (e.g.,
directory), and this confuses things. Seen mostly in Netware.
As of 2001/02, this is being seen as a result of a debacle in the
misprogramming of the 4.1.2 client series in the handling of
international characters in filenames - including the lowly question
mark (?). This is also known as "the umlaut problem".
This problem has occurred when migrating from a V2 client to a v3.1.0.5
(or 3.1.0.6) client with international characters in filenames or
directories. That case is addressed, but may not be fully fixed, by APAR
IC21764; you should also have: USEUNICODEFilenames No.
A fix is provided in the TSM 4.1.2.12 client. See its README
(IP22151_12_TSMCLEAN_README.1ST).
ANS1309W (RC9) requested data is offline
A nonsensical error encountered during a TDP Exchange *backup* - not a
restoral as the TSM error message explanation suggests. (Consider the
misleading message a programming defect.) The standard fix is to, in
one way or another, set the TDP Mountwait to Yes (/Mountwait=Yes or
options file change). Issue the TDPEXCC QUERY TDP command to verify.
Also check your server stgpool MAXSize value to assure that it allows
the storage of a large incoming blob, and that there are volumes
available for the management class that the client is using.
ANS1310E Object too large for server limits
The object is too large. The configuration of the server does not
accommodate, or allow, such a large object in the storage pool. The file
is skipped. The message is apparently referring to the Stgpool MAXSize
value, if not the physical capacity of the storage pool.
Expect it to be accompanied by: ANS1228E. Probable server message:
ANR0521W. Note that this condition may result when client compression
is enabled and an already-compressed file is sent - and the secondary
compression attempt causes it to expand.
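The expansion effect is easy to demonstrate. The sketch below uses Python's
zlib purely as a stand-in (the TSM client has its own compression routine,
so the exact numbers will differ), and random bytes stand in for an
already-compressed file.

```python
import os
import zlib

def compressed_growth(data):
    """Bytes gained (positive) or saved (negative) by compressing data."""
    return len(zlib.compress(data)) - len(data)

text_like = b"backup " * 10000        # repetitive data, compresses well
random_like = os.urandom(65536)       # stand-in for already-compressed content

print(compressed_growth(text_like))   # negative: the data shrank
print(compressed_growth(random_like)) # positive: the data grew slightly
```

A file that grows this way can exceed the size the client estimated for it,
which is how it can run afoul of the server's limits.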
ANS1311E Server out of data storage space
See: ANS1329S
ANS1312E Server media mount not possible. [Same as ANS4312E]
(Note that there may be no "timeout" reflected as the message
description suggests.)
Maybe no tapes available: check your MAXSCRatch value and the number of
tapes available on the server.
Maybe no mount points available: Is the server already busy servicing
other clients such that all drives are in use? Also check your
Storage Pool MOUNTLimit and Node MAXNUMMP values. (Some customers have
reported finding after an upgrade to 5.1.7 that their prevailing
MAXNUMMP value of 2 won't work: they boosted to 4 and msg disappeared.)
It might be that the server's drives are all busy with higher priority
operations than yours (restore is higher than backup, etc.). You
may simply have to adjust the day's scheduling of server administration
processes vs. client sessions to avoid contention for serial resources
such as tape drives.
See also "Insufficient mount points, 3590" in the CONDITIONS section,
further down in this document.
Starting in TSM 3.7 this can be caused by the REGister Node parameter
MAXNUMMP being zero in a backup/archive operation.
Do 'dsmc q mgm' to see if the client is using the appropriate management
class, and pursue the management class definitions in the server to see
if they lead to a faulty devclass definition such that no volumes of
that kind exist to be mounted.
If collocating by node and backing up directly to tape, the server will
want to append to the last tape that it was filling, but if that is in a
peculiar state it may immediately quit rather than going on to a scratch
or, if no scratches left, to append to any other node volume.
One uncommon cause is that the licensing is incomplete (probably lack of
Advanced Device Support license, as for using a 3494).
Watch out for Query DRive reporting GENERICTAPE instead of DLT, for
example. When backup is attempted, *SM errors out with this error
message. Has been seen to occur after powering off server and tape
drives, so review shutdown procedures.
Check the server Activity Log or dsmerror.log for indications. Possibly
it could not talk to the library to get a tape mounted.
ANS1315E (RC15) Unexpected Retry request
This lazy message does not begin to suggest that the problem is on the
server, in being unable to store the data which the client is trying to
send. Refer to the server Activity Log for the reason. One customer
found the cause to be tape write errors.
ANS1317E The server does not have enough database space to continue the current
operation
Well, your TSM server administrator should be monitoring the server
database over time, and has neglected that administration task such that
the server database has filled. Contact the admin.
ANS1328E An error occured generating delta file for ______, return code 4539.
Probably, your subfile backup cache is full.
ANS1329S Server out of data storage space [same as ANS1311E]
"Out of space": Typically, your server storage pools are full: you have
exhausted all tapes in that storage pool and need to either add more
tapes or perhaps lower your migration threshold or retention periods.
(Also assure that you are running Expiration regularly, to make space.)
Trivial cause: the destination storage pool is marked Readonly rather
than Readwrite.
Or could be that your Stgpool MAXSCRatch value is insufficient.
Or perhaps you believe you have plenty of free tapes - but are they
perhaps assigned to a different storage pool?
Did you do a 'VALidate POlicyset' as part of activating a policy change,
to check that your changes are consistent and logically correct?
Follow the server policy definitions (as used by the client) downward to
see if they lead to usable space. It's easy to forget to define volumes
or a scratch pool.
Is the incoming data larger than the size of any of the storage pools in
the hierarchy, or over the storage pool MAXSize?
Another cause is in the server not being properly licensed for the
number of clients in play.
Has also been seen with a symlink which points to nothing.
Might also be a Backup or Archive on a file system with complex
directory entries (e.g., NTFS) such that they by default go to the
storage pool with the longest retention, but that storage pool (probably
different from where your data would go) cannot be written to. Look into
ARCHMc and DIRMc.
See also: "Storage pool space and transactions"
TDP for SQL: Take a look at the following options in the User's Guide:
/LOGESTimate=numpercent /DIFFESTimate=numpercent
In some situations, the initial size estimate which the SQL server
relates to TDP is too low.
Further: At the start of the backup for the file, the server reserves
enough space in the storage pool to hold the file based on the client's
estimate. If storage pool caching is turned on then cached entries have
to be released. If the system can not reserve enough space for the file
in the storage pool then it is stored on the next storage pool that has
room for the file. Normally, at least one storage pool is defined with
no size limit, so this normally works. Then the file is transmitted to
the server. If it is not compressed or reduces in size with compression
it is stored in the reserved space and all is okay. If the file grows in
size with compression and COMPRESSAlways=No then the client will stop
sending the file and retransmit without compression and all is ok. But
with COMPRESSAlways=Yes the file will be transmitted until the reserved
space is used up. After that time the "server out of data storage space"
message is issued if there is no free space in the storage pool. Without
caching there is normally free space, but with caching the storage pool
is full by design. It would be nice if the client could wait for the
server to find more space in the storage pool or one of the next storage
pools and then continue the backup.
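The sequence just described can be summarized as a toy model. This Python
sketch only restates the paragraph above; it is not server code, and the
function and parameter names are invented for illustration.

```python
def backup_outcome(reserved, bytes_sent, compressalways, extra_free):
    """Toy model of the reservation logic described above (sizes in bytes).

    reserved        space set aside from the client's size estimate
    bytes_sent      what the client actually ends up transmitting
    compressalways  True models COMPRESSAlways=Yes, False models No
    extra_free      free space in the pool beyond the reservation
    """
    if bytes_sent <= reserved:
        return "stored in reserved space"
    if not compressalways:
        # The client notices the growth and resends without compression.
        return "client resends uncompressed"
    if bytes_sent - reserved <= extra_free:
        return "stored using extra free space"
    # A cached pool is full by design, so extra_free tends to be zero.
    return "ANS1329S server out of data storage space"
```

With caching on, extra_free is effectively zero, so any growth under
COMPRESSAlways=Yes lands on the last branch.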
See also: ANR0520W; ANS4329S Server out of data storage space.
ANS1351E Session Rejected: All server sessions are currently in use
May be just that: issue 'Query SEssion' server command and see what's
using them, and review the Activity Log for background. If there are no
sessions, maybe you have "DISABLESCHEDS YES" in your server options
file. Beyond that, consider boosting the "MAXSessions" definition in the
server options file.
ANS1353E Session rejected: Unknown or incorrect ID entered
Can occur when your operating system hostname is not a simple name, being
like "myhost.mycompany.com" instead of simply "myhost".
See also: dsmc SET Access
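To see quickly whether the host name is fully qualified, and what the simple
form would be, something like the following Python sketch may do. The
function name is hypothetical; whether the simple name is what your server
has registered is for you to confirm.

```python
import socket

def short_hostname(name=None):
    """Return the host name with any domain suffix removed."""
    if name is None:
        name = socket.gethostname()
    return name.split(".", 1)[0]

print(socket.gethostname(), "->", short_hostname())
```

If the two printed values differ, consider coding NODename explicitly in the
client options rather than relying on the default.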
ANS1357S Session rejected: Downlevel client code version
The server version and your client version do not match such that
sessions cannot proceed. The client code is downlevel relative to the
server. Possibly, the server administrator upgraded the server level
and you weren't advised that it was going to happen; or maybe they did
some rotation among multiple servers. Maybe there are multiple levels
of ADSM/TSM on your client system (as can happen with different versions
installing in different directories) and you invoked the wrong one.
Maybe your client configuration is not now pointing to the right server.
See also: ANR0428W
ANS1327E The snapshot operation for 'C:____' failed. Error code: 673.
Go to www.ibm.com and search on: +ANS1327E +673
Topic "TSM Client v5.2 Open File Support"
(http://www.ibm.com/support/docview.wss?uid=swg21121552) which says:
"There is known limitation in Microsoft Terminal Services server on
Windows 2000 that prevents the OFS feature from working over a Microsoft
Terminal Services session."
ANS1369E Session Rejected: The session was canceled by the server administrator.
This should be due to 'CANcel SEssion' on the server. Might also be due
to THROUGHPUTTimethreshold or THROUGHPUTDatathreshold in effect.
ANS1410E Can not reach the network path - or -
ANS1410E Unable to access network path
In a Backup, it may mean that the System account doesn't have access to
a drive.
In a Restore on Windows NT, you probably specified restoring a file to a
machine other than the one which did the backup, but using the same file
path name. As of version 3.1.0.5 of the client, ADSM now uses UNC names
for the files. This means that the machine name is part of the file
name. If you specified "original location", then ADSM tried to restore
the file to "node_one" because "node_one" is part of the file name
(i.e. \\node_one\c$\mydir\myfile.doc). Instead, try choosing another
location. The dialog allows you to select a drive and directory to
restore to, which will be the local drive and directory on machine
"node_two". Also check the filespace name on the server: it may need
renaming to accommodate the current client machine and disk names, or
vice versa.
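The machine name embedded in a UNC file name can be picked apart
mechanically. A minimal Python sketch follows, with invented function names,
using the example path from the text above.

```python
def split_unc(path):
    r"""Split \\machine\share\rest into (machine, share, rest), or None if not UNC."""
    if not path.startswith("\\\\"):
        return None
    parts = path[2:].split("\\", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        return None
    rest = parts[2] if len(parts) == 3 else ""
    return parts[0], parts[1], rest

def names_other_machine(path, local_machine):
    """True when the UNC name embeds a machine other than the local one."""
    parsed = split_unc(path)
    return parsed is not None and parsed[0].lower() != local_machine.lower()

print(split_unc(r"\\node_one\c$\mydir\myfile.doc"))
```

When names_other_machine() is true for the machine doing the restore,
"original location" will point back at the first machine, so choose another
destination.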
ANS1435E An Error Occurred saving the Key.
Accompanied by: ANS1428E Registry Backup function failed.
and maybe ANS4036E.
Make sure there is sufficient space on your system drive to hold the
staged registry files. Also check for a TSM temporary file left over
from the previous backup: it tries to delete such temp files, but if the
temp file has a SHR attribute, that will prevent deletion. If all else
fails, run the backup with client tracing to reveal the problem in
detail. Other things to check:
- Verify that all the .exe, .dll, and dsc*.txt files in your
..\tsm\baclient directory have the same timestamp on them (or at least
within a couple of seconds of each other).
- Verify that adsm32.dll, adsmv3.dll, dsmntapi.dll, dsmutil.dll,
dsmw2k.dll (if Windows 2000), and tsmapi.dll all have the same
timestamp as the files above.
- Verify that if you run DSMC SCHEDULE in the foreground (while logged
on) it works okay.
- Assuming that all of the above check out okay, try configuring the
scheduler service to use the Local System account. Also, don't do
anything else fancy; just use dsm.opt located in ..\tsm\baclient. Make
it as basic as possible. For now, don't bother with any kind of pre-
or post-schedule commands, include/exclude lists, or any other options
not necessary to test the basic function. For example:
COMMMethod tcpip
TCPServeraddress your.tsm.server.address
PASSWORDAccess GENERATE
NODename yournodename
SCHEDMODe PRompted
"NODename" and "SCHEDMODe" are not necessary if you are already using
the default values of the local machine name and "polling",
respectively.
If this works, then the problem may indeed be related to the particular
account being used, or something else in the configuration.
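The timestamp comparison suggested above is tedious by eye. Here is a small
Python sketch of one way to do it, assuming "a couple of seconds" means a
2-second tolerance around the median timestamp; the function name and that
tolerance are illustrative choices, not anything defined by the product.

```python
import os
import statistics

def timestamp_outliers(paths, tolerance_seconds=2.0):
    """Return the paths whose modification time strays from the median."""
    mtimes = {p: os.path.getmtime(p) for p in paths}
    if not mtimes:
        return []
    middle = statistics.median(mtimes.values())
    return sorted(p for p, t in mtimes.items()
                  if abs(t - middle) > tolerance_seconds)
```

Pass it the .exe, .dll, and dsc*.txt files from the baclient directory; any
path returned is a candidate for having come from a different client level.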
ANS1448E An error occurred while attempting to access NTFS security information
To back up NTFS files, the user also needs the "Manage Auditing and
Security log" user right.
May be accompanied by ANS1228E (q.v.).
ANS1449E A required NT privilege is not held
The user running the backup doesn't have access to the root of the
volume being backed up. If the scheduler is running the backup, you
have to give the SYSTEM id (or whatever id the scheduler is running
under) access to the volume root.
ANS1474E An error occurred using the Shared Memory protocol
This is a blanket message which tells you only that a session using that
protocol could not be established, but does not say why. During
client/server communications the server can close a shared memory
protocol session before the client is ready for it to close. As a
result, the client may still be expecting a message when the session is
closed, and so issues message ANS1474E. (But the
server code should have been fixed to keep this from happening.)
Perhaps you are not adhering to the rules for using Shared Memory
communication. Look at the server Activity Log for indications.
ANS1485E Schedule log pruning failed.
Like other permissions problems, this plagues NT systems. Get the
current schedule log out of the way and let ADSM create a fresh one.
ANS1497W Duplicate include/exclude option 'EXCLUDE *:\...\pagefile.sys' found
while processing the client options passed by the server.
Do 'dsmc Query Inclexcl' to check for such duplication. If not there,
then be aware that TSM respects the entries in registry subkey
HKLM\System\CurrentControlSet\Control\BackupRestore\FilesNotToBackup
and that pagefile.sys should be in this list (unless removed manually or
with some other tool). So if you have an include/exclude list that has
an exclude for this file, and it is in FilesNotToBackup, then that is
the source of the redundancy.
ANS1503E Valid password not available for server '________'.
Seen when trying to establish a PASSWORDAccess GENERATE type client
password via a dsmc operation. May be due to PASSWORDDIR being present
in the dsm.sys options file, but specifying a regular file rather than a
directory, or the directory not existing. Have a good look at the file
system object that your PASSWORDDIR specifies, and make sure that you
are running the dsmc as root.
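The "good look" at the PASSWORDDIR object can be scripted. A minimal Python
sketch, with a made-up function name; run it as the same user that runs dsmc,
giving it the path coded on PASSWORDDIR.

```python
import os

def diagnose_passworddir(path):
    """Classify the file system object named by PASSWORDDIR."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not a directory"
    # Creating a password file needs write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable by this user"
    return "ok"
```

Anything other than "ok" matches one of the causes described above.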
ANS1505E Trusted Communication Agent has terminated unexpectedly.
Look for the dsmtca module (in /usr/lpp/adsm/bin, or perhaps /usr/adsm)
having incorrect permissions, or zero length.
ANS1512E Scheduled event '____' failed. Return code = __.
Known "Return code" values:
1 May be accompanied by error like
GetHostnameOrNumber(): gethostbyname(): errno = 11004.
TcpOpen: Could not resolve host name.
The common cause is a faulty customer POSTSchedule or PRESchedule
command. IBM topic on this:
http://www.ibm.com/support/entdocview.wss?uid=swg21108971
May be accompanied by msg ANR2579E (q.v.).
4 Often caused by lack of proper volume label on PC type file
system. See also: ANS4036E
12 Can result when *SM tries to backup/archive a file which has
exclusive open. This may be due to a false indication from the
operating environment, such as Novell NetWare, where a service
pack update may be called for. An ANS9999E error in the
dsmerror.log may point out a problem file system object, which in
turn incites Severe error ANS1028S at the conclusion of a
scheduled backup, which results in return code 12.
Circumvention: Exclude problem files.
127 Typical in a client schedule having been defined with
ACTion=Command OBJects='Somecmd ...' where Somecmd is a command
name which is not in the Path which was in effect when the client
schedule process was started. If there may be any doubt about
command findability within Path, then by all means code the
command with a full path specification.
402 General "error processing request" code indicating that errors
occurred in processing the command. You need to look in the
dsierror.log and the like for reasons.
1837 Means you have all objects excluded from backup, as seen in an
Exchange backup where DSM.OPT has a goofy construct like
EXCLUDE "*\...\*" .
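For return code 127 above, command findability can be tested against a
given search path. This Python sketch uses the standard shutil.which();
the wrapper name is invented, and the path you pass should be the one the
scheduler process actually started with, which may differ from your login
shell's.

```python
import shutil

def command_findable(command, path=None):
    """Return the full path that would be run for command, or None."""
    # path=None means use the PATH of the current process.
    return shutil.which(command, path=path)

print(command_findable("Somecmd"))
```

A result of None means the schedule's OBJects value needs a full path
specification.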
ANS1809E Session is lost; initializing session reopen procedure.
Seen as an NT message, accompanied by preceding messages:
TcpRead(): recv(): errno = 10054
sessRecvVerb: Error -50 from call to 'readRtn'.
Seen as an AIX message, repeatedly during a session. The most innocuous
cause is preemption, where a higher priority process (e.g., Restore)
needs a tape drive which is in use by a lower priority process (e.g.,
Backup). Another common cause is a too-low IDLETimeout value (server
msg IC43445). Alternately, may indicate that you are having local
network problems, likely resulting from an intrinsic error in your
network configuration. Or, you are going through a firewall, with its
own timeout values, which conflict with those between the TSM client and
server, which can cause the session to be cut off and have to restart as
client communication idles while the client searches for candidate
backup files in the file system. Employ the traceroute command, ping -R,
or the like to determine what network elements you are going through.
One customer reported changing TCPServeraddress from a network name to a
numeric IP address to circumvent the problem - but a DNS thing like this
should not cure dunning errors.
See possible explanations under "ANS4017E" - could be a COMMTimeout
value problem.
See also ANS1005E
ANS1810E ITSM session has been reestablished.
Possibly, a networking problem caused the session to be interrupted,
and the client is re-establishing it. The server Activity Log will have
ANR0406I for the session (re)starting. Might be due to an overly
optimistic MAXNUMMP spec for the client node.
ANS1834S Unable to write to '/etc/security/adsm' for storing password.
As the message manual advises, check access permissions and disk space.
/etc/security/adsm should exist, be a directory, and be writable by
root. Are you running the TSM operation as root? (The first execution
after installing a client should be run as root, where PASSWORDAccess
Generate is in effect, to establish the client password in encrypted
form.)
ANS1840E File 'C:\adsm.sys\Registry\VEVPIL01\Users' changed during processing.
File skipped.
It is best to set SERialization in the backup copy group to be
SHARED STATIC, to avoid this error condition.
ANS1865E session rejected: Named pipes connection failure.
The Windows client is attempting to enter into a session with the
Windows server, via the prescribed Named Pipe communication method, but
cannot start the session. The first thing to check is that the server
is actually running and is viable. Also check that the file object
identified by the client NAMedpipename still exists, and is the same as
expected by the server. You can start the server (dsmserv process)
directly from your console, then you can see the messages and what is
happening on your server. Also look for supplementary error indications
in the client and server error logs. (Consider that your Windows system
may have been compromised by one of the innumerable Microsoft
programming gaffes - beware overnight operators taking liberties with
server PCs.) And, Named Pipes are just one client-server communication
choice: you could switch to another method, like TCP/IP.
ANS1874E Login denied to NetWare Target Service Agent '______'.
When logging in to Novell Netware, use a fully qualified NDS ID. For
example, you might use .TSM.BACKUP.BCIT as your user ID. Note that the
leading period needs to be there. Or: An increased number of client
threads consumes more Netware connections, so increasing the number of
available connections for the TSM/Novell ID in Nwadmin may fix it.
See also Novell Knowledge Base Technical Information Document 2944976.
One NOS engineer found: "A specific TSANDS is required in order to get
the Mainframe to login to a 5.1 server to perform backups. If I use
other than the 9/8/2000 TSANDS the server will not allow an unattended
login."
ANS1879E Netware NDS Error on restore processing:
Object .o=organization.ou=organizational_unit.cn=context_name
TSA Error FFFDFE83 - 603 User has no rights to the named object.
The NDS user ID that has been assigned to the client doing the restore
does not have the proper NDS rights assigned. Check the user's effective
rights to make sure that it has supervisor object and property trustee
rights.
ANS1899I ***** Examined 2,689,000 files ***** [sample]
Usually seen during a Restore (can also be in a Retrieve), where *SM is
reviewing the server list of files which may be candidates in servicing
the specifications of the restoral being performed. Expect to see low
CPU utilization for the TSM server, if this is the one demand upon it,
and high I/O (vmstat pi/po) there, and TSM db Cache Hit Pct dropping
(reflecting a lot of unique lookups). Expect the client to slow down,
as more of its memory is consumed, and paging increases. The message
will be prominent where the filespace involved has a very large number
of files (millions). Updating the options file to include "TESTFLAG
DISABLENQR" may be appropriate, to cause Classic Restore operation
instead of No Query Restore. (See notes on this elsewhere in this
document.)
ANS1931E An error saving one or more eventlogs.
May be accompanied by: ANS1228E Sending of object 'C:' failed.
A Windows Event Log could not be backed up. Most commonly, you don't
have access to the C: drive, because of permissions problems. (Someone
may have changed them.) If not that, check for having run out of space
on the C: drive. Check the dsmerror.log for indications, and the
Windows xx Event Logs themselves. Could be the result of a *SM defect:
upgrading the client level may fix.
More extreme: Try deleting the c:\adsm.sys directory, then see if the
event log backup still fails. If it does, then add the following lines to
your dsm.opt file:
tracefile c:\trace.txt
traceflags eventlog
Then re-run backup of *just* the event log, then examine the trace.txt
file.
ANS1950E Backup via Microsoft Volume Shadow Copy failed. See error log for more
detail.
Well, that Windows service may have failed and need restarting; or you
may need to reboot. The "error log" referred to is the Windows event
log; but don't overlook the dsmerror.log as a source of hints.
ANS2048E Named stream of object '\\server\share\full\path\to\file' is corrupt.
May be reported as "File has a corrupt named stream".
Seen during a Windows restoral. As explained by APAR IC33922:
"NTFS file systems support multiple data streams in a file. The part of
the file that you normally see via Windows Explorer or the DIR command
is the unnamed (default) stream. However, some applications also write
one or more named (secondary) streams to a file. For example, an
application that creates bitmap images might store the main image in
the file's default stream, and a "thumbnail" image in a named stream
(that is part of the same file). This APAR concerns itself with named
streams. Because named streams are supported only on the NTFS file
system, this APAR affects only the Windows NT-based platforms (NT 4.0,
2000, and XP). The Windows 9x-based family (98/Me) are unaffected.
When the default stream (the "main" part of the file) is restored
correctly (no TSM warning and error messages) and the named streams are
not restored correctly (ANS2048E) the TSM client shouldn't stop."
Circumvention, should the restoral stop: Use 'testflag continuerestore'
to skip the 'bad' file.
ANS2604S the web Client agent was unable to Authenticate with the Server
Requires an administrative account with owner privileges to the node.
ANS2609S TCP IP Communication failure between the browser and the client machine
Cause just after installation:
Did you install the web client via the wizard? The initial install
doesn't do it by default. Go to Utilities/Setup Wizard from the menu
bar and install, or check your services panel to see if this service is
installed and started (and set to automatic).
Causes during ongoing operations:
- The LAN connection to the TSM client machine went down.
- You are trying to connect to the TSM client machine using the wrong
port number.
- The Client Acceptor Daemon on the TSM client machine is not up and
running and accepting connections.
ANS2820E An interrupt has occurred. The current operation will end and the
client will shut down.
Mystery message in TSM 5.3. Reported to occur in the dsmerror.log and
dsmsched.log, when the scheduler concludes.
ANS3408W The volume /xxx/xxxx contains bad blocks
Seek it in the Messages manual as ANS13408W(!).
ANS3603E Error creating directory structure
Do not try to restore files via "~USERNAME" form.
ANS4001E Error processing '____': file space not known to server
May be a conflict with lower/upper case. Do Query Filespace to see
what's actually there vs. what you're specifying.
ANS4005E Error processing "<Filename>": File not found
In Novell Netware, usually caused by downlevel TSANDS and/or TSA600
NLM's.
ANS4007E Error processing '<FileName>': access to the object is denied.
In Unix, it may simply be that you are not the owner of an Archived
object being Retrieved, or perhaps you are trying to overwrite a
destination file to which you lack write permission.
If Archiving Files in Windows without being administrator, the user
needs the SE_SECURITY_NAME privilege. This privilege is granted
through the "Manage Auditing and Security Log" right. If the
SE_SECURITY_NAME privilege is not held, GetFileSecurity() (a Windows
function) issues a return code of 1314, which is what ADSM reports in
the dsmerror.log messages you are seeing. At this point there are
several options:
1) Grant the "Manage auditing and security log" right.
2) Code SKIPNTPermissions Yes in dsm.opt. ***** WARNING ***** If this
option is used, NT permissions will not be restored/retrieved when
the files are restored/retrieved.
3) Perform work from the System account.
4) If run from a scheduler, running as a service, and the schedule
references a UNC name directly then the service must be running
under a domain authorised account. Running under the Local System
account (which is the default) won't work because this account
doesn't have any access to domain resources. This could explain
why backup can work from the GUI but not the scheduler. Try
logging the service in as a domain admin account.
5) The file may be one which is always open, like NetWare print queues,
and thus you cannot back it up.
(Also seen as message:
ANE4007 (Sessio: ___, Nod: ______) E Error processing
'D:\labfiles\PHCT_32\OTS\49399900.OLT': access to the object is
denied.
or in Novell:
ANE14007 (Sessio: 1370, Nod: NOV_BLK_EDV_PROD) E Error processing
'SYS:/QUEUES/7702001.QDR/Q_0277.SRV' : access to the object is denied)
ANS4010E Error processing '<SOME_FILE_SYSTEM>': stale NFS handle
What this is *supposed to mean*:
SOMETHING attempted to mount this file system in an "NFS manner" at some
earlier time in this opsys uptime; but the mount failed, and remains
pending, hence the staleness. One way for this to have happened is via
an implied mount request, by virtue of being defined as an NFS file system
in /etc/filesystems or equivalent: at machine start the NFS mounter
would try to mount the remote filesystem, fail, and go on. (Eliminating
the unnecessary stanza in the /etc/filesystems will prevent recurrence.)
Another means of it happening is someone having done a manual mount
specifying "System:Filesystem". Or some facility might have issued a
system call to do it. But in any case the mount could not complete, and
so the stale handle.
The associated errno label is ESTALE, which would usually be returned by
statfs() or stat().
What this can mean due to faulty ADSM programming:
It is issued any time that ADSM makes a timed stat() system call on
any file system and the stat system call does not return in the allotted
time, as governed by the ADSM NFSTIMEOUT value. (In ADSMv3 PTF 7 you
can reportedly code "NFSTIMEOUT 0" for indefinite wait.)
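To tell a genuinely stale handle from a merely slow file system, you can make
your own timed stat() call. The following Python sketch only imitates the
behavior described above; it is not how ADSM implements it, the function name
is invented, and the 120-second default is simply borrowed from the
NFSTIMEOUT example in this entry.

```python
import concurrent.futures
import errno
import os

def timed_stat(path, timeout_seconds=120.0):
    """Classify a stat() of path as 'ok', 'stale', 'timeout', or 'error'."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(os.stat, path)
    try:
        future.result(timeout=timeout_seconds)
        return "ok"
    except concurrent.futures.TimeoutError:
        # stat() did not return in time: slow or hung, not provably stale.
        return "timeout"
    except OSError as exc:
        return "stale" if exc.errno == errno.ESTALE else "error"
    finally:
        # Do not block here; a hung stat() thread cannot be cancelled and
        # may still delay interpreter exit.
        pool.shutdown(wait=False)
```

A "timeout" result on a file system that is not NFS at all suggests the
misleading form of the message rather than a true ESTALE.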
One circumvention is said to be to remove the 'dsmstat' module.
Another circumvention (particularly with HSM) is to put the undocumented
NFSTIMEOUT operand into dsm.sys, with a 120-second timeout:
NFSTIMEOUT 120
You can also try the more extreme 'fuser -k <filesystem_name>', which
kills any NFS process associated with the file system.
Some PMR info about this:
The issues I was referring to are that a stale NFS error can cause the
client backup to fail instead of skipping the affected filespace, and,
that the stale NFS error should really be a stale FS error. The APAR
which contains these issues is IX86323.
The fix, however, is a bit more complex. The old way clients dealt
with the Stale NFS handle issue would cause file data to be
expired. There was a fix which caused ADSM to stop processing to avoid
that expiration, but now clients fail to complete backups. The planned
fix will be to skip these filesystems so that the backup can complete.
Work is still going on in this area and it looks like the fix will be
in 3.1.0.7, but there is still work that needs to be done to ensure
the safety of the fix so it may be delayed. A workaround is to try
and make the NFSTIMEOUT value larger to give the filesystem a chance
to return to the call.
The condition has also been seen when a CD-ROM is mounted in the
operating system, but the CD itself is physically removed from the
drive. That is, the device cannot respond.
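The timed stat() behavior described above can be illustrated with a minimal
sketch. It assumes nothing about how ADSM implements its own check: a worker
thread issues statvfs() and the caller waits only a fixed number of seconds.
The function name timed_statvfs and the status strings are hypothetical.

```python
import errno
import os
import threading

def timed_statvfs(path, timeout_seconds):
    """Probe a file system with statvfs(), waiting at most timeout_seconds.

    Returns 'ok', 'stale', 'timeout', or 'error:<errno name>'.
    """
    result = {}

    def probe():
        try:
            os.statvfs(path)
            result["status"] = "ok"
        except OSError as exc:
            if exc.errno == errno.ESTALE:
                result["status"] = "stale"
            else:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                result["status"] = "error:" + name

    # A daemon thread cannot be cancelled if the call hangs, but it will
    # not keep the process alive once the caller gives up on it.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    if worker.is_alive():
        return "timeout"
    return result.get("status", "error:unknown")
```

For example, timed_statvfs("/some/nfs/mount", 120) would wait up to the
120 seconds used in the NFSTIMEOUT example before reporting "timeout".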
ANS4014E Error processing '/some/file': unknown system error (157) encountered.
Program ending.
See: ANS1078S
ANS4017E Session rejected: TCP/IP connection failure [Same as ANS1017E]
This is what the client sees and reports, but has no idea why.
The cause is best sought in the ADSM server Activity Log for that time.
Could be a real datacomm problem; or...
Grossest problem: the TSM server is down.
If you get this condition after supposedly changing the client and
server to use a different port number (e.g., 1502), and the Activity Log
has no significant information about the attempted session, use
'netstat' or 'lsof' or similar utility in the server operating system to
verify that the *SM server is actually serving the port number that you
believe it should be. (You *did* code the port numbers into both the
client and server options files, right?)
An administrator may have done a 'CANcel SEssion'.
If during a Backup, likely the server cancelling it due to a higher
priority task like DB Backup starting and needing a tape
drive...particularly when there is a drive shortage. Look in the
server Activity Log around that time and you will likely see
"ANR0492I All drives in use. Session 22668 for node ________ (AIX)
being preempted by higher priority operation.".
Or look in the Activity Log for a "ANR0481W Session NNN for node
<NodeName> (<NodeType>) terminated - client did not respond within NN
seconds." message, which reflects a server COMMTimeout value that is
too low. Message "ANR0482W Session <SessionNumber> for node <NodeName>
(<ClientPlatform>) terminated - idle for more than N minutes." is
telling you that the server IDLETimeout value is too low. Remember that
longstanding clients may take considerable time to rummage around in
their file systems looking for new files to back up.
Another problem is in starting your client scheduler process from
/etc/inittab, but failing to specify redirection - you need:
dsmc::once:/usr/bin/dsmc sched > /dev/null 2>&1 # TSM scheduler
An unusual cause is in having the client and server defined to use the
same port number!
Might also be a firewall rejecting the TSM client as it tries to reach
the server through that firewall.
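Besides 'netstat' or 'lsof' on the server, the port question can be checked
from the client side. The following is a minimal sketch, not a TSM facility:
it simply attempts a TCP connection and reports whether one could be
established. The function name is hypothetical.

```python
import socket

def port_accepts_connections(host, port, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

A False result for the port coded in the client options file (1502 in the
example above) says only that nothing answered there from where the client
sits; it does not distinguish a down server from a wrong port or a
firewall.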
ANS4024E Error processing '<SomeFileName>': file write error
Usually a Rights issue when doing a restoral.
ANS4025E Error processing filespace ________: file ____ exceeds user or system
file limit
Check your login filesize limit.
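On Unix the filesize limit can also be inspected programmatically. A minimal
sketch, assuming a Unix system where Python's resource module is available;
the function name is hypothetical:

```python
import resource

def fits_file_size_limit(size_bytes):
    """Return True if size_bytes is within this process's soft RLIMIT_FSIZE."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_FSIZE)
    if soft_limit == resource.RLIM_INFINITY:
        return True  # no file size limit in effect
    return size_bytes <= soft_limit
```

The limit is inherited from the invoking login or shell, so the check is
meaningful only when run in the same environment as the client.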
ANS4042S Invalid option 'NODENAME' found in options file ____________
You coded a NODename which is the same as the system hostname, or the
NODename definition is not within a SErvername stanza.
ANS4028E Session rejected: Authentication failure
This message appears all over the console, usually accompanied by
dsmrecalld and similar processes seemingly looping. It signifies an
ADSM defect in having obliterated the client password entry in
/etc/security/adsm/<SRVRNAME> in the face of high activity.
At the client, as root, perform 'dsmc q sch' to trigger a prompt to
enter the password for the client, which will most likely re-establish
things. You should not have to perform an 'UPDate Node' command at
the server to re-establish the password, but be prepared to.
ANS4031S Error processing 'FILESPACE_NAMEPATH_NAMEFILE_NAME': destination
directory path length exceeds system maximum
Can be caused by too long a file name/path name. In NT, one can have
shared directories. If such a file is then given the maximum possible
pathlength (255 chars), that in conjunction with the real NTFS on the
disk causes the path that leads to the shared directory to be longer
than the 255 char max.
In Unix, this may be a recursive directory symlink, which would be
apparent in the reported object name.
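For the Unix case, a recursive directory symlink can be hunted down before
the backup runs. This is a minimal sketch with a hypothetical function name:
it reports directory symlinks whose target is the directory containing them
or one of its ancestors.

```python
import os

def find_ancestor_symlinks(root):
    """Return directory symlinks under root that point at their own
    containing directory or at one of its ancestors."""
    loops = []
    # os.walk does not follow symlinks by default, so the walk itself
    # cannot be trapped by the loops it is looking for.
    for dirpath, dirnames, _filenames in os.walk(root):
        real_dir = os.path.realpath(dirpath)
        for name in dirnames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.path.realpath(link)
            ancestor_prefix = target.rstrip(os.sep) + os.sep
            if real_dir == target or real_dir.startswith(ancestor_prefix):
                loops.append(link)
    return loops
```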
ANS4035W File '____________' currently unavailable on server.
This message is usually seen when the tape volume the required files are
on has suffered an I/O error such that the tape has gone at least
'read-only' (message ANR8830E), if not 'unavailable'. Refer to the
server Activity Log for the issue. Look for a corresponding ANE4035W
message therein, as well as perhaps ANR8359E and ANR0541W.
ANS4036E An error occurred saving the registry key.
Can be that the user attempting the backup is not authorized to back up
the registry. Or the C: drive was full: *SM requires space to therein
make a copy of the Registry (adsm.sys directory), to then back up that
copy. Sometimes, deleting the adsm.sys directory and trying again will
allow a successful operation. See also: ANS5166E
ANS4071E Invalid domain name entered: '/some/directory'
Typically means that what you entered was not a file system name, but
rather a subdirectory of a file system; or it is an arbitrary manual
mount point which is not one defined in /etc/filesystems.
If you really need to backup via subdirectory, consider using the
VIRTUALMountpoint option of the Client System Options file.
ANS4078E *** Directory path not found ***
See ANS1076E
ANS4089E File not found
Probably due to a link to a non-existent file.
ANS4090E Access to the specified file or directory is denied
DFS: The DFS ACL prevents access from Root or cell_admin.
ANS4095E No files matching search criteria were found. [same as ANS1092E]
ARCHIVING/RETRIEVAL: Possible problems...
- You forgot to put a hyphen (-) before a command line option such as
DIRMc.
- You attempted to archive a named pipe (FIFO) or special file.
- You are attempting the operation across nodes and the file system
architectures are incompatible.
- A defect in the ADSM client causes it to think, by virtue of the
file system name, that it is incompatible with the request.
- You may have to enclose the filespace portion of the file pathname in
braces {} to keep it from getting confused as it parses the pathname.
That is, if you have two filespaces, /archive and /archive/blah, how
is *SM to know which is meant when you say you want to go after
archived file /archive/blah/myfile? It's ambiguous unless you are
explicit as to which it is.
- Beware ADSM sensitivity to a slash (/) following the object name: it
basically says that the object is a directory and that the search is
to look for anything below that directory, while omitting the trailing
slash says to report only names matching that one.
In particular, when using the -dirsonly option, specifying a directory
name with a trailing slash (e.g., dsmc q ar -su=y /usr1/me/) will
fail, but leaving it off (e.g., dsmc q ar -su=y /usr1/me) will work.
Conversely, when using the -filesonly option, specifying a directory
name without a trailing slash (e.g., dsmc q ar -su=y -filesonly /usr1/me)
will fail, but adding a slash (dsmc q ar -su=y -filesonly /usr1/me/)
will cause it to work.
RESTORAL/QUERY BACKUP: Possible problems...
- Your username may not be the same as the one which backed up the
file(s). (Root will have universal access.)
- The file was erased and another backup took place, such that the file
is not Active: restore with -INActive.
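The filespace ambiguity behind the braces advice above can be made concrete
with a small sketch. It does not model the client's actual parsing; it only
lists every filespace name that is a directory prefix of a given path. The
function name is hypothetical.

```python
def matching_filespaces(path, filespaces):
    """Return every filespace that is a directory prefix of path,
    longest name first."""
    matches = []
    for fs in filespaces:
        prefix = fs.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            matches.append(fs)
    return sorted(matches, key=len, reverse=True)
```

With filespaces /archive and /archive/blah, the path /archive/blah/myfile
matches both, which is the ambiguity that writing {/archive/blah}/myfile
removes.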
ANS4103E Ran out of disk space trying to Restore <File_Name>.
Retry/skip/abort (r/s/a)? _
Can occur during a RESToremigstate=No file restoral of an HSM-managed
file system, as the restoral speed may overrun dsmmigrate's speed in
migrating files to tape to make room. Usually, by the time you ponder
the message, dsmmigrate has been able to clear space, as verified by
doing a 'df' on the file system. (Note that it is normal for the file
system to fill to 100% during RESToremigstate=No restorals, and that
dsmmigrate is usually able to keep up: you will see the restoral pause
when it is writing progress dots to the terminal, and then resume once
space becomes available.)
Note that you need to respond within the session IDLETimeout limit,
else suffer session cancellation, with manifestation message:
"ANS4017E Session rejected: TCP/IP connection failure".
ANS4105S Internal program error. Failing message value was 16.
Please see your service representative.
This is a message reflecting inadequate programming on the part of the
developers, who have failed to intercept and interpret all the error
conditions they should.
In an HSM file recall this error results from going after a file whose
size is larger than your Unix filesize limit (csh 'limit' command).
ANS4116I One or more files will be stored on offline media.
Do you wish to proceed?
Occurs when an ADSM operation will go to tape and your TAPEPrompt
client option says that you should be prompted. This message can
appear during Backup operations, and in HSM when you add data to a
file system, which in turn causes it to go to the storage pool, and
that pool's high migration threshold is exceeded such that it needs to
migrate some of its holdings to the next storage pool level, which
happens to be tape.
ANS4118I Waiting for mount of offline media.
As in backing up directly to tape and client option TAPEPrompt says to
show the mount wait message. Note that you will typically see an initial
flurry of files supposedly having already been sent to the server before
the mount message appears, then followed by Retry messages. This
reflects the communication medium (e.g., TCP/IP) having absorbed the
initial amount of data in its buffers before transmission actually
occurred; hence, the mount message did not appear after the first file.
Such a mount will also be required in the backup of migrated HSM data,
where the HSM client is in the same system as the *SM server such that
*SM will implicitly perform the backup from HSM storage pool volume to
backup storage pool volume, without recalling the data to the client
file system.
Refer also to "Network data transfer rate".
ANS4123E Unable to read commands entered from keyboard. Exiting...
You attempted to run dsmc in the background (perhaps from
/etc/inittab) but neglected to specify the "schedule" keyword, which
is the only way that command runs without a terminal.
ANS4132I Removal of file space "______' successfully completed.
The ADSM client performed a 'dsmc del filespace' operation. The above
message returns immediately - but the filespace has not actually gone
yet: it will take the server some time to delete all its file object
entries from the database.
ANS4228E Send of object 'somefile...' failed
If accompanied by: ANS4268E This file has been migrated. ...
You tried to 'dsmmigrate' a file explicitly, or perhaps ADSM tried to do
so automatically per the list of migration candidates in the
HSM-managed file system .SpaceMan/candidates file. But the file is
already migrated - you can't migrate it again. This is informational,
not a problem. If this was the result of ADSM trying to honor the
candidates list, run a 'dsmreconcile' on the file system to refresh that
list.
If accompanied by: ANS4089E File not found during Backup, Archive or
Migrate processing ...
Most likely, the file was in transition, as in existing during ADSM's
look at the file system repertoire, but no longer there when it came
time to perform the operation.
If accompanied by: ANS4312E Server media mount not possible ...
Typically means that some other session or process (like BAckup STGpool)
is using the tape drives. (See ANS4312E) This results in message
"ANS4638E Incremental backup of ____ finished with 1 failure" at the end
of the filespace backup, and a non-zero "failed" count in the job-end
summary statistics.
ANS4245E Format unknown [same as ANS1245E]
This message means that the data format is unexpected:
- You may be trying to backup or restore data using a client level which
is lower than was used to back up the data originally. (Note that this
can occur in a backup as the client is endeavoring to expire an older
file in the server storage pool.) As client software evolves, it
introduces new features which require changes in the format of the
data as stored on the server. Obviously, an older client cannot
understand data formatting which is beyond its programming.
- You may be trying to mix and match data handled by the API client vs.
either the command line or GUI client. They cannot be intermixed, and
the API cannot even query data stored by the "normal" clients.
See the "API" entry for further info.
ANS4251E File system/drive not ready
As seen in Backup output: Typically refers to an HSM-managed file which
HSM cannot serve, for some reason. One reason: the filespace was
imported and/or a RESToremigstate restoral was done to populate the
file system with stub files, across nodes; but that just yields the
stubs, with no file data in the HSM storage pool.
ANS4253E File input/output error
Seen on NT systems in the presence of a bad file, which will probably
be named in the dsmerror.log, like:
03/21/1998 07:47:03 TransWin32RC(): Win32 RC 1392 from
FioGetOneDirEntry(): getFileSecuritySize
03/21/1998 07:47:03 PrivIncrFileSpace: Received rc=164 from
fioGetDirEntries: E:
\NMCDATA\Images\VB4\TOOLS\GRAPHICS\ICONS\OFFICE
The return code 1392 is from Windows NT, and means that the file is
corrupt or otherwise unreadable. The RC 164 is the ADSM return code,
translated from the NT return code, that indicates a file I/O error
(i.e. same thing as the 1392). Run a SCANDISK against the E: drive
to clean up the corruption.
The Microsoft Windows NT and 95 error codes are in the WINERROR.H file,
which comes with Microsoft Visual C++ (it may come with some other
development packages like Visual Basic as well).
ADSM return code information can be found in the "Using the Application
Programming Interface" manual or the dsmrc.h file that is installed
with the ADSM API.
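Pulling the Windows return codes out of a dsmerror.log can be automated. The
sketch below is an illustration only: it assumes the "Win32 RC nnn" wording
shown in the sample above, and its lookup table holds just the two codes
whose meaning is given in this document. All names are hypothetical.

```python
import re

# Only codes explained in this document; anything else is reported unknown.
KNOWN_WIN32_CODES = {
    1392: "file is corrupt or otherwise unreadable",
    1450: "ERROR_NO_SYSTEM_RESOURCES",
}

WIN32_RC_PATTERN = re.compile(r"Win32 RC (\d+)")

def win32_codes_in_log(lines):
    """Return (code, description) pairs for each 'Win32 RC nnn' in lines."""
    found = []
    for line in lines:
        for match in WIN32_RC_PATTERN.finditer(line):
            code = int(match.group(1))
            description = KNOWN_WIN32_CODES.get(code, "unknown; see WINERROR.H")
            found.append((code, description))
    return found
```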
ANS4255E File exceeds system/user file limits
A file being restored or retrieved exceeds system set limits for this
user; so the file is skipped. Ensure that the system limits are set
properly. Seen in AIX 4.1 with a file of size 2147483640.
ANS4267E The management class for this file does not allow migration.
HSM is not activated in this MGmtclass. You need to do:
'UPDate MGmtclass ... SPACEMGTECH=AUTOmatic'.
ANS4268E This file has been migrated.
Usually follows an "ANS4228E Send of object 'somefile...' failed",
meaning that the file was *previously* migrated. (The "has been
migrated" terminology in the message misleads you to thinking that the
migration just happened.)
ANS4314E File data currently unavailable on server
As in an attempt to restore from a tape whose Access value is Unavailable,
which can be due to the tape having been involved in a past error
situation, having been checked out of the library, etc.
ANS4301E Server detected system error [same as ANS1301E]
May be seen when the server tape encounters an I/O error. See the
server Activity Log for the circumstances. Has been seen with a tape
stuck in the drive, as in a failed unload operation.
One customer reports TSM Support recommending use of the option
MEMORYEFficientbackup=Yes - rather specious. Another, Windows customer
reported getting by this by renaming the 'SYSTEM OBJECT' filespace to
'SYSTEM OBJECT OLD' and then reattempt the backup, suggesting a corrupt
filespace. But look in the server Activity Log for the reason for the
problem - don't shoot in the dark.
Might be due to the type of object on the server being different than on
the client, as in having previously backed up a name which was a file,
but has since been replaced on the client with a directory.
This can also occur when the time zone information is not properly
configured: see IBM site Solution swg21153685 "ITSM Server internal
clock does not reflect change in system clock?"
ANS4312E Server media mount not possible [Same as ANS1312E]
Typically occurs when all drives are currently in use: expect to see
ANR0535W in the server Activity Log. The TSM client, particularly a
scheduled backup, will typically wait for a drive to become available,
with msg ANS4118I Waiting for mount of offline media. Check that your
DEVclass MOUNTLimit is not artificially limiting mounts to below the
number of drives actually available.
ANS4314E File data currently unavailable on server
Has been seen in Restore operations. Reinvoke restoral, adding
"REPlace=No" to avoid waste.
ANS4329S Server out of data storage space.
Typically occurs with HSM when the storage pool quota either defaults to
the size of the file system or is otherwise exceeded by an attempt to
write more data into the file system.
See also: ANS1329S Server out of data storage space
ANS4353E Session rejected: Unknown or incorrect ID entered
The node is not known to the server. At the server, perform a
'REGister Node'.
ANS4475E Insufficient authority to connect to the shared memory region
You must be root to use shared memory for client connections.
ANS4503E Valid password not available for server '________'.
The root user must run ADSM and enter the password to store it locally.
ANS4638E Incremental backup of 'FileSystemName' finished with 2 failure
Message resulting from a Backup operation which encountered problems.
(A successful backup generates message "Successful incremental backup
of 'FileSystemName'", which has no message number.) Things seen when
a backup fails:
ANS4312E Server media mount not possible
ANS4089E File not found during Backup, Archive or Migrate processing
(which can occur when a transient file, as in the .Spaceman/logdir
directory, evaporates between file identification and the actual
backup attempt)
ANS4940E File '________' changed during backup. File skipped.
ANS4776E Unable to recall file from server due to error from recall daemon.
Seen when dsmrecalld daemon processes are looping. Has been cured by
at least killing the child process; but may also have to kill the
parent and reinvoke it.
ANS4847E Scheduled event 'SOME_SCHEDULE' failed. Return code = 4.
Appears in client log to indicate that something bad happened during
the scheduled event. There should be another message in there, as in
above Backup stats, saying what the problem was. And the
/dsmerror.log and the server Activity Log should also be consulted.
ANS4928E PASSWORDAccess is GENERATE, but password needed for server.
You need to establish or renew your client system server access
password, from the client root account.
ANS4931S File space [whatever] in System Options File is invalid.
Typically, for VIRTUALMountpoint you specified a file system
subdirectory which is not present; or you perhaps implicitly attempt to
reference a virtual mount point (as via 'dsmc query filespace') and you
are not the owner and are not superuser.
One thing you should not do is code a Virtual Mount Point which will be
a subdirectory once the file system is mounted, because when it is not
mounted there will be nothing there and this error will be produced
whenever anyone on the client issues a dsmc command.
Another possibility is that you did not code the VIRTUALMountpoint
within the appropriate dsm.sys server stanza.
ANS4999E (RC2120) Unable to log message to server: message too long.
API programming message. In the dsmInit() invocation, the application
identification string exceeds DSM_MAX_PLATFORM_LENGTH (16 chars).
ANS5092S Server out of data storage space. [See also ANS1329S]
You're out of space in your storage pools. Do 'Query LIBVolume' and see
if you are out of volumes. See if your volumes are writable (versus
unavailable/read-only). Boost MAXSCRatch if appropriate.
ANS5174E A required NT privilege is not held.
To backup NTFS files, the user also needs the "Manage auditing and
Security log" user right.
If you are using the schedule service, ensure the user for the service
(System, by default) has the rights to the files.
ANS5166E An Error Occurred Saving the Registry Key
See if there is enough space on the C Drive to allow the Registry key to
be saved to the adsm.sys directory. See also: ANS4036E
ANS5503E File '/usr/lpp/adsm/bin/dsm.sys', line 32, value
'DEFAULTServer ADSM.SRV5' is not a valid option.
The names to be used on DEFAULTServer and SErvername options are *not*
the 64 character server names used in SET SERVER commands, but instead
are stanza names, and are restricted to 8 characters. The person who
wrote the manual confused the two and said that you can use a name of up
to 64 characters on DEFAULTServer and SErvername.
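The 8-character restriction described above lends itself to a quick check of
an options file. This is a minimal sketch, not a replacement for the client's
own parser: it assumes that the capital letters in an option name mark its
minimum abbreviation, and it looks only at the first two words of each line.
All function names are hypothetical.

```python
from itertools import takewhile

MAX_STANZA_NAME_LENGTH = 8  # limit stated in the entry above

def option_matches(word, option):
    """True if word is an acceptable abbreviation of option, where the
    leading capitals of option (e.g. 'SE' in 'SErvername') are required."""
    minimum = "".join(takewhile(str.isupper, option))
    candidate = word.upper()
    return candidate.startswith(minimum) and option.upper().startswith(candidate)

def overlong_stanza_names(lines):
    """Return (line number, keyword, name) for each stanza name that is
    longer than MAX_STANZA_NAME_LENGTH."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 2:
            continue
        keyword, name = fields[0], fields[1]
        is_stanza_option = (option_matches(keyword, "SErvername")
                            or option_matches(keyword, "DEFAULTServer"))
        if is_stanza_option and len(name) > MAX_STANZA_NAME_LENGTH:
            problems.append((number, keyword, name))
    return problems
```

Fed the line from the message above, 'DEFAULTServer ADSM.SRV5', the check
flags the 9-character name ADSM.SRV5.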
ANS5628E Invalid host name.
Your dsm.sys file needs work, in terms of server identification,
TCPServeraddress.
ANS8001I Return code NN.
You used the administrative client (dsmadmc) to issue a server command,
and that command ended with the resulting return code indicated.
Refer to the Admin Ref appendix on Return Codes for possible numeric
values and symbolic names for the errors that you can use in Server
Scripts.
ANS8001I Return code 3.
In entering a continued server command, you may have neglected to leave
at least one space between operands in the way you continued the command
from one line to another.
ANS8017E Command line parameter 3: 'dataonly=yes' is not valid.
Or similar, where you are certain that, for example, the dsmadmc command
does indeed support the flagged parameter, but for some reason the
client is failing to recognize it as valid. This has been seen to be
caused by a faulty LANG (locale) being in effect for the user login.
ANS8023E Unable to establish session with server
As when you attempt to employ the dsmadmc command to conduct an
administrative session with the TSM server, where the server has either
failed or has not yet completed its initialization. There may be a fatal
condition preventing the server from coming up, such as a full Recovery
Log, in which case you need to start the server by going into its
directory and invoking 'dsmserv' (without the "quiet" option), to see
the failure message.
ANS8034E Your administrator ID is not recognized by this server.
Explanation: The administrator ID entered is not known to the requested
server.
This could also occur if you try to use SERVER_CONSOLE from an admin
client, which is prohibited because the userid is not
password-protected: as its name implies, you must use it from the server
console.
ANS9003E dsmrecall: file system for ____ is not in the dsmmigfstab file.
Typically because in performing the dsmrecall you specified a full path,
which includes a symbolic link that makes the path look unlike the one
which HSM manages. Instead, use the true path, or go into the directory
where the file lives and invoke dsmrecall on just the file name.
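Finding the true path can be done mechanically. A minimal sketch using only
the standard library; the function names are hypothetical:

```python
import os

def true_path(path):
    """Return path with every symbolic link component resolved."""
    return os.path.realpath(path)

def crosses_symlink(path):
    """True if the absolute form of path differs from its resolved form,
    meaning that some component of it is a symbolic link."""
    return os.path.abspath(path) != os.path.realpath(path)
```

If crosses_symlink() returns True for the name you were about to give to
dsmrecall, the result of true_path() is the name to use instead.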
ANS9094W dsmautomig: no candidates found in file system ________.
With MIGREQUIRESBkup=Yes in effect, data must be backed up before it can
migrate.
ANS9096E User is not the owner of file _____ so file is skipped.
Seen with HSM where some random user is trying to do the good-citizen
thing of a dsmmigrate on a file which the user previously dsmrecall'ed
to examine. Doing a dsmmigrate requires that the userid doing it be the
file owner.
ANS9101I No migrated files matching '<SomeFilename>' were found.
In HSM, you attempted an explicit 'dsmrecall' or implicit recall of a file
(migrated or not, as dsmls reports) and got this error. It can
trivially indicate just the condition that the error is saying.
Also seen in a physical (non-standard) HSM file system migration where
the stub-laden file system is imported, but there's nothing in the TSM
database reflecting any migrated files. (A wacky situation anyway.)
You might also see issues where you are attempting to use HSM on a
client but there's no HSM licensing in the server, where the file access
attempts would result in the following error message in the server
Activity Log:
ANR2812W License Audit completed - ATTENTION: Server is NOT in
compliance with license terms. (SESSION: 16)
Query LICence would show:
Number of space management clients in use: 0
Number of space management clients licensed: 0
ANS9126E dsmautomig: cannot get the state of space management for
/ssa/home04/sscphenk/tmp/exportfs: No such file or directory.
This can be due to the file name having a Newline character embedded
in it (or perhaps other binary) such that HSM takes the path preceding
the newline to be the whole file name, though there is more. Look in
the .Spaceman directory's candidates list, then do 'ls -lb' in
the actual pathnamed file system to expose any binary.
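As an alternative to eyeballing 'ls -lb' output, names containing embedded
control characters can be searched for directly. A minimal sketch with a
hypothetical function name; it checks only ASCII control characters
(codes 0-31 and 127), which includes Newline:

```python
import os

def names_with_control_chars(root):
    """Return paths under root whose file or directory name contains an
    ASCII control character."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
                suspects.append(os.path.join(dirpath, name))
    return suspects
```

Printing each result with repr() shows the offending character as an
escape such as \n.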
ANS9126E dsmmonitord: cannot get the state of space management for ____:
File table overflow.
ADSM defect, as in APAR IX71926, where a system has many HSM file
systems and an incremental backup causes the system inode table to be
exhausted.
ANS9178E : cannot open file /etc/adsm/SpaceMan/config/dsmmigfstab: No such file
or directory.
This is the HSM file systems control file.
ANS9183E dsmmigrate: file system / is not in the dsmmigfstab file.
You are going through a symbolic link to migrate from the file system.
Use the actual file system name.
ANS9148E dsmdu: cannot find mount point for file system ____
You issued the dsmdu command specifying a file name rather than a
directory name.
ANS9199S Cannot open /dev/fsm
Will appear if the HSM kernel extension (kext) is not loaded.
See: HSM kernel extension loaded? See also: /dev/fsm
ANS9230E Cannot unmount FSM from file system <FileSystemName>.
Message from umount command:
Invalid parameter: U
Typically occurs when you remove HSM management from a file system, as
via 'dsmmigfs remove <FileSystemName>'. Is symptomatic of a defect in
ADSM and/or AIX in performing the umount of the FSM which is mounted
over the JFS.
Otherwise, the problem can simply be that you were sitting in that
directory when you issued the 'dsmmigfs REMove' command, which makes it
impossible for the Unix 'umount' command to unmount the FSM file system
(a "Device busy" condition). 'cd' out of that directory and repeat the cmd.
ANS9267E dsmautomig: File system _________ has exceeded its quota.
Do 'dsmdf' on the file system name: it will probably report a Mgrtd KB
value which is way over the Quota value reported by 'dsmmigfs query'.
ANS9281E Space management kernel extension is downlevel from the user program.
Encountered when a new level of the HSM software was installed over a
live HSM system - a bad thing to do. See: installfsm
(Some customers, not knowing exactly what all the components are in the
client install package, install them all. This is unhealthy, in that it
can, as in the case of HSM, result in a kernel extension being added to
the system, and additional processes running.)
ANS9283K Attempting to access remote file.
HSM has to go to a storage pool to retrieve the involved file, which may
be on disk or tape.
ANS9285K Cannot complete remote file access.
The server may be unavailable. If you recently relocated your HSM
services, you may have neglected to update your client options file to
specify the new and correct HSM server: code MIgrateserver only if the
default server is not also the HSM server.
Or, for HSM, could well mean that there is not sufficient space (quota
or physical space) in the file system to recall the named file. Consider
doing dsmmigrate on some files to make room for the subject file. (The
dsmmonitord's design is such that it cannot detect such spurious
events.) Typical scenario:
Recalling 1,928,733 /hsm-file-system/hsm-file
ANS9285K Cannot complete remote file access.
** Unsuccessful **
ANS4227E Processing stopped; Disk full condition
Additionally, make sure that all the dsm* HSM daemons that should be
running are running, and that there are no duplicate, conflicting
processes.
Look in the server Activity Log for reasons for the failure, if not
indications that the HSM client is actually contacting the server.
If it looks like some other condition, check the usual:
Use 'dsmmigfs query' to assure that the file system is really under HSM
control. Make sure that dsmmonitord and dsmrecalld are running and that
/etc/adsm/SpaceMan/dsmmonitord.pid and /etc/adsm/SpaceMan/dsmrecalld.pid
reflect them. Consider using installfsm to query that the kernel
extensions are loaded and in effect. Check the client dsmerror.log for
problems.
Another cause is in having redefined the client environment, and
possibly restored the server, where the dsmrecalld and related daemons
are still using obsolete info.
Beyond that, check for filespace name and other consistencies.
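The check that the .pid files reflect running daemons can be sketched as
follows. It assumes that each .pid file holds just the process ID as text,
which this document does not confirm; the function name is hypothetical.

```python
import os

def pid_file_is_live(pid_file):
    """Return True if pid_file holds the PID of a process that exists."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or not a number
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)  # signal 0 tests for existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

It would be run against /etc/adsm/SpaceMan/dsmmonitord.pid and
/etc/adsm/SpaceMan/dsmrecalld.pid. A True result shows only that some
process has that PID, not that it is the expected daemon, so confirm with
'ps' as well.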
ANS9288I File __________ size ____ is too small to qualify for migration.
In a 'dsmmigrate', the file that ADSM HSM examined is itself a "stub"
file, and thus lacks the excess size (typically, >4KB) required for it
to serve as a stub as the file itself is migrated. This is an
informational message - there is no problem.
ANS9297I File ________ is skipped for migration: No backup copy found.
ADSM defaults to requiring the condition that "migrate requires backup",
as defined in the MGmtclass (do 'dsmmigquery -M -D' to check). You can
do: 'UPDate MGmtclass .. MIGREQUIRESBkup=No' to override.
ANS9501W dsmmigfs: cannot set event disposition on session 0 for file system
_______ token = 0. Reason : No such process
Has been seen when HSM has just been installed, but its daemon processes
(dsmrecalld, dsmmonitord) have not been started by virtue of the
/etc/rc.adsmhsm shell script being run.
ANS9528W dsmscoutd: cannot read from the state
file /etc/adsm/SpaceMan/config/dmiFSGlobalState.
As when trying to access an HSM file. This situation indicates a
problem with the /etc/adsm/SpaceMan/config/dmiFSGlobalState. Can be
fixed by recreating the file, as:
- cd /etc/adsm/SpaceMan/config
- If file dmiFSGlobalState exists, rename to some backup name, like
dmiFSGlobalState.ANS9528W
- Do 'dsmmigfs globalreactivate'
ANS9918E Cannot open migration candidates list for ________.
HSM file system has run out of physical space - expand the file
system. (Msg appears on console and in /dsmerror.log)
ANS9950E File: <file-spec> is not qualified for migration because the Space
Management Technique attribute is set to None.
You may lack "SPACEMGTECH=AUTOmatic" in the management class definition,
or you do but failed to activate the policy set containing it.
ANS9999E ntrc.cpp(879): Received Win32 RC 1450 (0x000005aa) from FileRead()
ANS9999E is the client equivalent of server message ANR9999E, used for
reporting debugging information where unexpected conditions occur, for
which there are no established error messages.
Seen in a Windows 2000 backup. Windows error code 1450:
ERROR_NO_SYSTEM_RESOURCES - Insufficient system resources exist to
complete the requested service. This is a Windows issue, encountered
when backing up big filesystems, or particularly large files. Windows
has a certain amount of memory pool space that it can allocate to
programs, and TSM is using the memory available from that pool such that
there is no more memory left to allocate. TSM is a victim of the Windows
architecture shortcoming. Windows 2000 and its ilk use 32-bit
addressing for memory. This only allows for 4 GB of addressable RAM,
which must be divided into various sections of virtual memory. The
kernel only has 2 GB to divide up and, in this distribution of
addresses, Windows allocates a paged-pool memory maximum size of 192
MB. (This is a good reason to avoid Windows and use a real operating
system.)
The following docs from the Microsoft Knowledge Base Articles describe
this error condition:
Q304101 - Backup Fails with Event ID 1450. [This article talks of
changing some Registry settings, which one customer reports
having resolved his backup problems.]
Q247904 - How To Configure the Paged Address Pool and System Page Table
Entry Memory Areas
Q142719 - Windows Reports Out Of Resources Error When Memory Is
Available
Q236964 - Delayed Return of Paged Pool Causes Error 1450 "Insufficient
Resources"
Q192409 - Open Files Can Cause Kernel to Report
The presence of the ANS9999E may cause the client scheduler to exit with
return code 12 (via ANS1028S).
BMR-----(SysBack messages)--------------------------------------------------
BMR0030E stbackup.c(805): Error from TSM API during SendData call: ANS0278S
(RC157) The transaction will be aborted.
That message doesn't tell you anything useful. If there is a dsierror.log,
check for actual problem indications therein. Otherwise check the TSM server
Activity Log - which may show a MAXSCRatch problem.
OBK-sbt, like:
(2651) OBK-sbt:<06/18/2001:15:10:22> odsmSess(): # of dsmInit retries = 1
(2650) OBK-sbt:<06/18/2001:15:10:22> sbtread(): End of file reached. oer =
7061, errno = 2505.
These are not Tivoli messages, but rather are passed back to EBU (Oracle
Enterprise Backup Utility) by the media management software. Contact your
respective Media Management Vendor for support.
BusinesSuite Module for Oracle error messages appear in the following format,
where pid is the process id and function is an internally defined function name:
(pid) OBK-sbt: <function>: <error message>
In addition, BusinesSuite Module for Oracle will write extended debugging
information in the file specified by the NSR_DEBUG_FILE environment variable.
SQL-----(DB2 messages)-----------------------------------------
SQLnnnn messages are from DB2 itself. Return Values tend to be return codes
from the TSM API. See also IBM message references like
https://aurora.vcu.edu/db2help/db2m0/frame3.htm#sql2000
SQL2025N An I/O error "_RC_" occurred on media "ADSM".
The RC values are from the API manual, Return Codes appendix.
41 means DSM_RS_ABORT_EXCEED_MAX_MP, which means that the client was
attempting to use more mountpoints for a backup or archive operation
than permitted by the server. From the Admin, run 'Query Node nodename
Format=Detailed' to determine the maximum allowed mountpoints for the
node. You may need to use UPDATE NODE to increase this value. If the
intent is for this client to back up to disk, you will need to check
other things in your configuration to understand why it is trying to go
to tape.
SQL2062N An error occurred while accessing media ____. Reason code: ___.
General notes: The reason code is from *SM itself. The TSM TDPs utilize
the database's API on one side, and the TSM API on the other side, to
effect backup and restoral. Thus, you should look in the API manual for
an explanation of the reason code (API Return Code).
Note that the "media" is usually db2tadsm.dll, the DB2-to-ADSM interface
module: that is, DB2 is writing its backup data to a conveyor module
rather than a tape device.
SQL2062N An error occurred while accessing media. Reason code: "-50".
This is a TCP/IP failure of the *SM API to connect to the *SM server.
You might look in client error logs for leads; and the *SM server
Activity Log may well reveal the circumstances.
SQL2062N An error occurred while accessing media
"/home/db2pet1/sqllib/adsm/libadsm.a". Reason code: "138".
As always, determine when the backup was last run successfully, and what
changed since then. API return code 138 suggests someone diddling with
permissions, or the software being run from an inappropriate or
authority-changed account. Doing an 'ls -l', 'ls -lu', and 'ls -lc' on
the lib file is always advisable, to ascertain when the lib was last
used and, per -c, when someone changed its attributes.
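Where comparing the output of three separate 'ls' invocations is awkward, the
same timestamps can be gathered in one pass. The following is only a sketch:
the helper name file_times is made up for illustration, and on Unix the
"changed" time is the inode change time that 'ls -lc' reports.

```python
import os
import time


def file_times(path):
    """Return the three timestamps shown by 'ls -l', 'ls -lu' and 'ls -lc'."""
    st = os.stat(path)

    def fmt(seconds):
        # Render an epoch timestamp in local time, like ls does.
        return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(seconds))

    return {
        "modified (ls -l)": fmt(st.st_mtime),
        "accessed (ls -lu)": fmt(st.st_atime),
        "attributes changed (ls -lc)": fmt(st.st_ctime),
    }
```

Calling file_times() on the library path named in the SQL2062N message gives
the last-used and last-changed times side by side.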
SQL2062N An error occurred while accessing media
"/home/db2inst1/sqllib/adsm/libadsm.a". Reason code: "185".
May be an incorrect version of the libadsm.a library, as in the ADSMv3
client having been installed on a system where ADSMv2 had been, without
uninstalling v2 first. In AIX, do 'lslpp -l "adsm*"' to list the ADSM
program products that are installed and if you find anything at Version
2, remove it.
SQL2062N An error occurred while accessing media "C:\SQLLIB\bin\db2adsm.dll".
Reason code: "406"
The 406 indicates that the program cannot locate your API options file.
You may have DSMI_CONFIG set, but pointing at the directory in which the
file resides, rather than naming the file itself.
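The directory-versus-file mistake is quick to test for. A minimal sketch,
assuming only that DSMI_CONFIG is meant to hold the full path of the options
file as described above; the function name and the wording of the diagnoses
are invented for illustration.

```python
import os


def check_dsmi_config(env=None):
    """Return a short diagnosis of the DSMI_CONFIG environment setting."""
    env = os.environ if env is None else env
    value = env.get("DSMI_CONFIG")
    if not value:
        return "DSMI_CONFIG is not set"
    if os.path.isdir(value):
        # The common mistake: the variable points at the containing directory.
        return "DSMI_CONFIG names a directory, not the options file"
    if not os.path.isfile(value):
        return "DSMI_CONFIG names a file that does not exist"
    return "DSMI_CONFIG names an existing file"
```

Run it under the same account and environment that DB2 uses, since the
variable may be set differently for an interactive login.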
SQL2062N An error occurred while accessing media
"/home/dbadm/sqllib/adsm/libadsm.a". Reason code: "610".
Seen when that module has been deleted or moved. Replace it.
106 (RC_ACCESS_DENIED).
This error code causes ADSM to skip the problem file and continue on
with the next file.
131 (RC_SYSTEM_ERROR).
This error code causes backup processing of the file system to stop.
0150 (S DSM_RC_UNKNOWN_FILE_DATA_TYPE)
Has been seen in an attempted DB2 restore where there were two copies of
the same DB2 logfile, one of them corrupt, and one of them not. The DB2
client does not offer the granularity to pick between two objects of the
same name.
NpPeek: No data.
This error occurs when running journaling in the client, where the
backup client is trying to read a response sent from the journal daemon,
which isn't available at the moment the read is being done. This error
can happen if the journal daemon ends (obviously a problem) or possibly
if the response the backup client is looking for from the journal daemon
is still in progress, meaning that the journal daemon hasn't finished
processing/sending it. In most cases the response is ready when the
backup client goes to read it, but if it isn't the backup client will
keep trying to read the response until it either arrives or a timeout
occurs. Customers with old (4.3) clients seriously need to upgrade.
APAR IC36144 (5.2.0.1 on Windows 2000) changes the message to the one below.
processSysvol(): NtFrsApiGetBackupRestoreSets(): RC = 2
processSysvol(): NtFrsApiDestroyBackupRestore(): RC = 0
Windows TSM 5.1 innocuous messages indicating that File Replication
Services is not present/active on your system. The extraneous messages
can be ignored; they will be eliminated in 5.2.
TcpOpen(): Warning. The TCP window size defined to ADSM is not supported by
your system. It will be to set default size - 33232
Usually, your client options file specifies a TCPWindowsize larger than
your operating system supports (see: TCPWindowsize client option).
Seen on Solaris: The session quits. Attempting to define TCPWindowsize
in dsm.sys results in:
ANS1036S Invalid option 'TCPWINDOWSIZE' found in options file
'/opt/IBMadsm-c/dsm.sys'
Was caused by a mismatch in duplex between the client and the 100Mb
ethernet switch.
The 103068111th code was found to be out of sequence. The code (3432) was
greater than (2259), the next available slot in the string table.
May be a defect in TDP. Another customer who had this problem with TSM
also found it to prevail for FTP, rcp, and other communication functions
- which resolved to a driver defect for the gigabit ethernet cards
10/100/1000 Base-TX PCI-X Adapter (14106902) under AIX, for which PTFs
are available. (As a stop-gap, you can set adapter attributes
chksum_offload and large_send to No.)
JBBERROR.LOG MESSAGES:
ERRNO SIGNIFICANCE:
74 AIX:
ENOBUFS: No buffer space available. Usually happens when you've
specified a TCPWINDOWSIZE setting that is larger than your operating
system TCP/IP configuration is set up to handle:
- In AIX, check the sb_max value (use the command 'no -a' to display the
current setting). sb_max is expressed in bytes, so dividing it by 1,024
gives the maximum setting you can use for TCPWINDOWSIZE. For example, if
sb_max is 65,536, then the maximum TCPWINDOWSIZE value you can use is 64.
- In HP-UX, the limit is the kernel parameter STRMSGSZ, which is
expressed in KB.
Try lowering TCPWINDOWSIZE so that it does not exceed the limit derived
from sb_max, and the messages should go away. Alternatively you can increase
sb_max. IMPORTANT NOTE: sb_max is a system-wide TCP/IP setting. You
should be familiar with tuning TCP/IP (or get help from someone who
knows how to tune TCP/IP) before changing sb_max or any other
system-wide TCP/IP settings.
132 Solaris:
Same as AIX 74.
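The sb_max arithmetic described under errno 74 can be written down as a small
check. This is only a sketch of the bytes-to-KB division stated above; the
function names are invented, and the sb_max value must be obtained from the
operating system yourself.

```python
def max_tcpwindowsize_kb(sb_max_bytes):
    """Largest TCPWINDOWSIZE (in KB) that fits within sb_max (in bytes)."""
    return sb_max_bytes // 1024


def window_fits(tcpwindowsize_kb, sb_max_bytes):
    """True when the configured TCPWINDOWSIZE does not exceed the sb_max limit."""
    return tcpwindowsize_kb <= max_tcpwindowsize_kb(sb_max_bytes)
```

With sb_max at 65,536, max_tcpwindowsize_kb() yields 64, so a TCPWINDOWSIZE
of 128 would not fit.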
GENERAL SITUATIONS:
Segmentation Fault
This is a software module failure resulting from a programming defect.
It often manifests itself where virtual memory is constrained: the
programming assumes much, and does not account for boundary conditions
(*SM "hits its head on the ceiling"). In Unix, boosting your Resource
Limits values can circumvent the problem. If encountered in the latest
level of a given piece of software, report it to the vendor. If you can
identify the event or process whose initiation seems to cause the
failure, relaying that information to the vendor will facilitate getting
the problem corrected; and knowing what incites it may make it possible
for you to avoid the failure.
CLIENT SITUATIONS:
Client schedule stays "Pending" for some minutes before it becomes "Started"
Schedules involve a "startup window". They do not necessarily start at
the leading edge of that window. See "Randomizing Schedule Start Times"
in the Admin Guide.
This may also be an effect of the PRESchedule task running, per your
client options file.
Scheduler stops:
You should see indications of the problem in the client dsmerror.log,
and perhaps the server Activity Log. One cause is TCP Read Buffer
errors: in AIX, for example, doing 'netstat -v' may show a non-zero
value for "No Receive Pool Buffer Errors:". Proper operating system
administration will notice such issues and adjust the configuration,
in this case the "Receive Pool Buffer Size" (where a value of 2048 is
typically good).
SERVER SITUATIONS:
Server crash
Look for the file dsmserv.err in the server directory. Sometimes when
the server crashes it puts useful info in there.
RETURN CODES, WINDOWS (see the Microsoft references for full list):
AIX MESSAGES:
SOLARIS MESSAGES:
SOLARIS SITUATIONS:
EMACS MESSAGES:
EXCHANGE MESSAGES:
MTLIB MESSAGES:
Demount operation Failed, ERPA code - 68, Library Order Sequence Check.
You requested a dismount for a drive upon which no tape is currently
mounted.
Mount operation Failed, ERPA code - 68, Library Order Sequence Check.
Means that it can't mount the tape you requested because it is already
mounted on that drive.
WINDOWS PROBLEMS/SITUATIONS:
dsmcsvc.exe looping - consuming most of the system's CPU time (high CPU
usage)
Has been seen with defective file systems. Run CHKDSK, SCANDISK, or
comparable OS utility to examine the file system and the disk containing
it.
Another possibility: A problem or conflict with Norton Anti-Virus (NAV)
running and the CreateFile() Win32 API; in other words, a problem in
Windows itself. TSM calls CreateFile() during backup and this is where
CPU usage rises to 100%. So consider shutting down NAV services (Alert and
Auto-Protect) during the backup. (Just disabling Auto-Protect may not
help.) See if there is a Norton upgrade which may help.
Explanation (fstat error): A file or directory in the path name does not
exist
The server is testing for its database and recovery log volumes, per its
dsmserv.dsk file, and cannot find them. This can be due to having
rc.adsmserv started in /etc/inittab, but being run too early in the
system start-up sequence, before the volumes containing your TSM
database and recovery log are mounted and ready. This is certainly the
case if you later have no trouble with starting the server from the
command line. Look into any untoward mount delays and/or consider
changing the position of your rc.adsmserv in the inittab or modifying
rc.adsmserv to wait for resources you know it needs.
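The "wait for resources" idea amounts to polling until the needed volumes are
visible. rc.adsmserv itself is a shell script, so take the following only as
an illustration of the logic: the function name, the timeout values, and any
paths you pass in are assumptions, not part of the product.

```python
import os
import time


def wait_for_paths(paths, timeout_s=300.0, poll_s=5.0):
    """Poll until every path exists; return the paths still missing at timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        missing = [p for p in paths if not os.path.exists(p)]
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(poll_s)
```

An empty result means it is safe to start the server; a non-empty result
names the volumes that never appeared, which is worth logging before giving
up.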
Trace/BPT trap(coredump)
This is a SIGTRAP (signal 5) condition.
Has been seen when swap space (paging space) was not active on an AIX
system.
Also, assure that the file system in which server resources are located
is not full, and has sufficient elbow room for any additional space
that it needs.
CONDITIONS:
Access denied...
With static serialization, the adsm client will try to obtain an
exclusive read-lock on a file before backing it up. If it fails to
obtain the lock, it will return an "Access denied..." message. This
is misleading, making it seem like a permissions problem.
Scheduler stopped/disappeared:
See "Scheduler has been stopped." under DSMSCHED.LOG ERROR MESSAGES
CLIENT TRACING:
IMPORTANT: Client tracing functions only when running under the CLI, not the
GUI!
TSM 5.1+ note: Newer TSM releases may not support some of the above flags,
such as INSTR_CLIENT_DETAIL - see above (but TRACEFLAGS itself still
works). Some alternatives:
- For TSM V5.1 (and earlier) clients:
dsmc s 20MBtest.file -password=your_pw -tracefile=trace1.out
-traceflag=perform,general > test1.txt
- For TSM V5.1.5 (and later) clients:
dsmc s 20MBtest.file -password=your_pw -testflag=instrument:detail
Or, add to dsm.opt: testflag instrument:detail
This will produce a file called dsminstr.report in the same directory
as your dsmerror.log (by default the baclient directory).
To activate all but a few flags, preface the flags with a dash.
Keep in mind that tracing adds its own overhead. Refer to the Trace
Facility Guide manual.
Example of a Backup which sends a small file and waits for a tape mount:
Example of a Backup which sends a large file and waits for a tape mount:
SERVER TRACING:
TRace END
TRace Disable Trace_Class(se)
During a client restoral you can examine the distance between files on tape
by running the following trace on the server (as when you feel that your
restore is stalled or slowed down). Enter the following two commands:
trace enable pvr as
trace begin data_set/file_name
After letting the restore run for a few minutes, enter the server command:
trace end
View the trace file and look for the following (non-contiguous) entries:
<31>pvrgts.c(3121): Positioning from block xxxx to block yyyyy
<31>pvrgts.c(3121): Positioning from block yyyyy to block xxxx
Ref: http://www.ibm.com/support/docview.wss?uid=swg21107022
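In a long trace file the positioning entries are easier to judge when reduced
to distances. A rough sketch, assuming only the "Positioning from block ... to
block ..." wording shown above with numeric block values; the function name is
made up, and the rest of the trace format is not relied upon.

```python
import re

# Matches the positioning entries quoted above, capturing both block numbers.
POSITION_RE = re.compile(r"Positioning from block (\d+) to block (\d+)")


def positioning_distances(trace_lines):
    """Return the absolute block distance of each positioning entry found."""
    distances = []
    for line in trace_lines:
        match = POSITION_RE.search(line)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            distances.append(abs(end - start))
    return distances
```

Feed it the lines of the trace file (for example, by iterating over the open
file). Consistently large distances suggest that the restore is spending its
time seeking between widely separated files on the tape.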
THROUGHPUT MEASUREMENT:
From time to time you will want to perform measurements of *SM throughput -
particularly when client people complain that they are not seeing the
performance they expect. Closely allied with that is the throughput one may
expect to a tape drive. The rate you get through an application like TSM is
dependent upon all the things that the application has to do in addition to
transferring data. The big factor in TSM is, of course, database updating
as part of file transfer (including the Versions expiration which occurs at
Backup time). The more (small) files you have, the more db updating, and
thus reduced throughput.
- Time a TSM client test in the same host system where the server resides:
This exercises TSM in a client-server arrangement, while eliminating
networking factors.
- Perform a Selective or Incremental backup on a single file, as large as
possible (to eliminate TSM db updating factors), containing random data
(to exercise tape drive data compression at a fairly representative
level). Consider utilizing Shared Memory as well as data communication
methods, for perspective comparison.
- Same conditions, but using a large number of small files, to gauge the
impact of TSM db updating while excluding network issues.
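A test file of random data is simple to generate, and the resulting rate is
simple arithmetic. The following is a minimal sketch; the function names are
invented, and the megabyte is taken here as 1024 * 1024 bytes, which is an
assumption you should keep consistent across your measurements.

```python
import os


def write_random_file(path, size_bytes, chunk_bytes=1024 * 1024):
    """Write size_bytes of random data to path, one chunk at a time."""
    remaining = size_bytes
    with open(path, "wb") as out:
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            out.write(os.urandom(n))
            remaining -= n
    return size_bytes


def throughput_mb_per_s(size_bytes, elapsed_s):
    """Average rate in MB/s (1 MB = 1024 * 1024 bytes) for a timed transfer."""
    return size_bytes / (1024 * 1024) / elapsed_s
```

Time the backup of the generated file with a stopwatch or the session
statistics, then pass the size and elapsed seconds to throughput_mb_per_s().
Bear in mind that fully random bytes are essentially incompressible, so such
a file shows the rate with little or no help from drive compression.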
It cannot be stressed strongly enough that you MUST obtain benchmark numbers
BEFORE deploying any facility, in order to both assure that it meets vendor
performance specifications (product acceptance test) and so that you have
numbers against which to compare when issues come up in production. Far too
many customers simply put a complex facility into production without having
done any basal measurements and later are frantic for an answer to what's
wrong with throughput. That's obviously a chaotic way to operate a data
processing complex. What you need to do is, during quiet times when clear
measurements can be made, conduct unit studies of the various components
which comprise a complex and make the numbers available to all site
personnel for later reference. You may have to engage subject matter
specialists (e.g., network people) to conduct some studies. Don't hesitate
to involve others: they will be impressed that you thought to pursue this.
In the study, you may uncover anomalies which can then be addressed and
corrected, before they bite you. Once you have basal numbers for a properly
operating amalgam of components, you can much more readily analyze problems
in operating the whole.
Network load, other than TSM, is a major factor in what TSM can get out of
the network. The whole point of a SAN is that you are dedicating networking
to storage access needs, excluding other types of traffic. If your TSM
traffic is going over a LAN, you are subject to contention with all the
other stuff going through it, not the least of which is the amazingly large
amount of traffic deriving from all the port scans and probes incited by
endless Microsoft Windows security lapses, as sites throughout the world -
and infected computers in your own site - attempt to exploit security holes
at every other site. (Use firewalls!)
You can also use the 'netstat -i' and 'netstat -v' commands on some Unix
systems to see ethernet statistics. If you are seeing a lot of Collisions,
your subnet may be overloaded. If there are a lot of Late Collision
Errors, you probably have a Full Duplex vs. Half Duplex configuration error
between your ethernet card and network access device, which results in
incredibly slow throughput. (Avoid Auto Negotiation.)
The 3590 tape drives are identifiable in /dev by their Major Device number
being 27, as revealed by 'ls -l /dev/rmt*'.
Retension on Close and Bytes per Inch (density) are not applicable to 3590s
because the drives perform such functions automatically.
The rmt*.smc is for controlling the SCSI Medium Changer (SMC), which is an
assembly on the front of 3590 drives in devices like 7331 and 7336, but not
3590 drives like in the 3494. Note: When running 'cfgmgr -v' to define a
3590 library, the 3590's mode has to be in "RANDOM" for the rmt_.smc file to
be created. (Note: With 3575 and 733* models, the device is /dev/smc_.)
You can issue SMC commands manually via the 'tapeutil' command. Mounts
occur by specifying that whatever tape is in a certain slot number is to be
mounted (it is not done by volser).
Reference: The device drivers manual "IBM SCSI Tape Drive, Medium Changer,
and Library Device Drivers: Installation and User's Guide",
Chapter 4, Special Files.
ADSM DATABASE STRUCTURE AND DUMPDB/LOADDB
(per David Bohm, ADSM server development, posted 19981201):
"The ADSM server data base contains different objects. Most of the objects
are b-tree tables. The cause of using more space for the LOADDB than was
actually used in the data base that was dumped with the DUMPDB command is a
result of the algorithm used to perform the DUMPDB/LOADDB and the
characteristics of a b-tree object...
When a record is to be inserted into a node in a b-tree and that record does
not fit then a split occurs. In a standard b-tree algorithm 1/2 of the data
goes in one leaf node and the other 1/2 goes into another leaf node. When
this happens randomly over time you get a tree where about 50% of the data
base is unused space. With the V2 ADSM server we added a little more
intelligence in the split process. There are many tables in the ADSM server
where a new record will always be the highest key value in that table. If
the insert is the highest key value then instead of doing a 1/2 and 1/2
split we just add a new leaf node with that single record. This results in
closer to 100% utilization in each of the leaf nodes in the ADSM server.
This now takes us to the DUMPDB/LOADDB process. One of the purposes of this
process is to recover from a corrupted data base index. What this means is
we ignore the index on the DUMPDB process and only dump the b-tree leaf
nodes (plus another type of data base object called a bitvector). These
leaf nodes are not stored physically in the data base in key order, which
means they get dumped out of key sequence. The LOADDB will take the records
from each of those leaf nodes and then perform inserts of those records into
the new data base. This means we take those pages that were nearly 100%
utilized because of the efficient b-tree split algorithm and convert them
into 50% utilized pages because of having to use the generic b-tree page
split algorithm.
We do not "compress" records in the data base. The data in the data base is
encoded to reduce space requirements. The data will always be written in
the encoded form to the data base as it is required for us to properly
interpret the data in the data base pages. This encoding is performed with
any writes of records into the ADSM data base, including the LOADDB since it
calls the same routines to perform the writes into the data base as the rest
of the server functions."
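The effect of the two split policies on page utilization can be seen in a toy
model. This is emphatically not the ADSM implementation: it models leaf pages
only, as sorted lists with a fixed record capacity, and every name in it is
invented for illustration.

```python
import bisect


def leaf_utilization(keys, capacity, append_on_highest):
    """Insert keys into sorted leaf pages and return the overall fill ratio.

    append_on_highest=True mimics the optimization described above: when the
    new key is the highest in the table and the last leaf is full, a fresh
    leaf holding just that key is added instead of a half-and-half split.
    """
    leaves = [[]]
    for key in keys:
        # The first key of every leaf after the first acts as a separator.
        firsts = [leaf[0] for leaf in leaves[1:]]
        idx = bisect.bisect_right(firsts, key)
        leaf = leaves[idx]
        if len(leaf) < capacity:
            bisect.insort(leaf, key)
            continue
        is_highest = idx == len(leaves) - 1 and key > leaf[-1]
        if append_on_highest and is_highest:
            leaves.append([key])
            continue
        # Generic split: half the records stay, half move to a new leaf.
        mid = capacity // 2
        right = leaf[mid:]
        del leaf[mid:]
        leaves.insert(idx + 1, right)
        bisect.insort(right if key >= right[0] else leaf, key)
    used = sum(len(leaf) for leaf in leaves)
    return used / (len(leaves) * capacity)
```

For 1000 ascending keys and a capacity of 10 records per leaf, the model
fills every leaf completely when append_on_highest is True, and leaves the
pages roughly half full when only the generic split is used.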
A full database audit, having taken the TSM server down to run the batch
command 'DSMSERV AUDITDB FIX=NO', which just identifies problem areas:
Note that you might choose to run the audit with FIX=YES to get everything
taken care of in one execution, in that fully supported execution method.
Alternatively, you may want to try fixing problems via partial audits. Be
advised that the partials are not necessarily documented for customer use,
and may be appropriate only under TSM Support guidance, to correct problems
that have been fully identified as lying in a given area. You may proceed to
use the following known, available partial audits, accepting any risk:
MACROS:
Because ADSM updates the access time (atime) of each file that it reads to
back up, and implicitly updates the access time of every directory that it
traverses to get to files, one would like to mount file systems for backup
in read-only (R/O) mode, on the host where the file system is native, in
order to leave files undisturbed. (This is important for mail program
functionality, is an issue in user file privacy, and is needed for system
administration in knowing when a user last accessed files.) However,
read-only remounting is not a readily achievable goal...
AIX will happily remount a JFS file system R/O on its native host, via the
AIX command: 'mount -r /FSname /MountPoint'. And thereafter you can
traverse the remounted file system via AIX commands and get at all the data.
However, because the file system is still mounted read-write, via its
primary mount, all file access to the read-only version *still* results in
file access times being updated! Thus, regardless of the file system being
read-only, what AIX is going after at a low level is still the same
read-writeable data, and so it updates the inodes accordingly.
The same situation applies if you try to get around this via NFS
remounting... Let's say you export the file system to its local host
(trivial case of NFS) and then remount it via NFS as 'mount -r
ThisHost:/FSname /MountPoint'. All file accesses will still result in inode
updates. Let's say you export the file system to some other host and then
remount the file system there in read-only mode as 'mount -r
SrvrHost:/FSname /MountPoint'. In this case you would hardly expect inode
updates to occur, and yet they do, because again all accesses go back to the
original host where the file system is mounted read-write.
I additionally pursued two more ideas: making the mount point permissions
purely 'r'; and employing the -ro option on 'exportfs'. Neither helps.
ADSM does not like file systems which are remounted without NFS: attempting
to perform incremental backup on a remounted file system fails with error
message "ANS4071E Invalid domain name entered: '/MountPoint'". No
combination of VIRTUALMountpoint and/or DOMain definitions within ADSM will
get by this. You can get ADSM to back up the file system by using NFS
remounting, but the inodes will get updated, because the file system is
mounted read-write on its native system.
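The remount experiments above can be repeated with a small check of whether a
read through a given mount point still changes a file's access time. A
minimal sketch with an invented function name; note that the check itself
reads the file, so use a scratch file rather than one whose access time
matters.

```python
import os


def read_updates_atime(path):
    """Report whether reading one byte of path changes its access time."""
    before = os.stat(path).st_atime_ns
    with open(path, "rb") as handle:
        handle.read(1)
    after = os.stat(path).st_atime_ns
    return after != before
```

A True result shows that reads through that mount are updating the inode. A
False result is weaker evidence: some file systems record access times
coarsely or lazily, so repeat the test after a pause before concluding that
access times are being left alone.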
You cannot do a DSMSERV RESTORE using an ADSM database backup tape created
on a machine with a different architecture: in such cases you must perform
an Export-Import.
CLIENT SERVER
When the client and server first intercommunicate, they exchange and agree
upon various settings. Among them, the client learns the TXNGroupmax value
of the server and will observe that when sending data to the server: if
either the number of files accumulated to transmit to the server exceeds the
TXNGroupmax value, or the size of the data in KB exceeds the TXNBytelimit
value, the accumulated files are sent as one transaction. That is, though
TXNGroupmax is a server option, the client knows of and operates according
to it.
Note that TXNBytelimit implies that the client creates a holding area of the
specified size, which is independent of the operating system communications
buffer size, and will typically be much larger than it. When it comes time
for the client to send the data or receive data, the size disparity will
typically make for much shoveling to get the full contents of the client
holding area sent to the server, or received from it.
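The interplay of the two limits can be illustrated with a simple grouping
routine. This is only a sketch of the idea, not the client's actual logic:
the function and parameter names are invented, and it assumes that a batch is
closed just before a file would push it past either limit, which may differ
from the real boundary behavior.

```python
def group_into_transactions(file_sizes_kb, txn_group_max, txn_byte_limit_kb):
    """Split a sequence of file sizes (in KB) into transaction batches."""
    batches, current, current_kb = [], [], 0
    for size_kb in file_sizes_kb:
        too_many = len(current) + 1 > txn_group_max
        too_big = current_kb + size_kb > txn_byte_limit_kb
        if current and (too_many or too_big):
            # Either limit would be exceeded: close the batch and start anew.
            batches.append(current)
            current, current_kb = [], 0
        current.append(size_kb)
        current_kb += size_kb
    if current:
        batches.append(current)
    return batches
```

With many small files the file-count limit closes each batch; with a few
large files the byte limit does. A single file larger than the byte limit
still travels, as a batch of its own.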
ON CONTEMPORARY BACKUP/RESTORAL:
The size of modern disks and their contained file systems can be
characterized as "huge". That size alone means that traditional backup and
restoral mechanisms are problematic in whole-disk recovery, partly because
of speed and largely because of the sheer volume of data that would be lost
in recovering since the most recent backup.
Guarding against disk disaster these days calls for some flavor of
mirroring, which is a form of continuous backup. With it, recovery from a
failed disk can be immediate, and in many cases transparent, with no loss of
data. The commodity pricing of today's disks makes mirroring very
practicable.
In TSM processing, particularly with disk storage pools, you may encounter
disk problems. Here we explore approaches to dealing with the situation.
The first thing to appreciate is that reacting to the disk problems from the
application (TSM) level is the wrong first course of action. Taking action
at the TSM level without first determining what the problem actually is can
result in inappropriate actions, wasted time, and lost data. For example,
consider a disk having intermittent electronics problems: If you react by
performing an AUDit Volume, it may unwittingly deem files to be bad when in
fact they are entirely viable on the drive. You need to approach disk
problems from the operating system level, where there is more substantive
information and diagnostics to analyze the problem.
In the general case, consider the following elements that are involved in
access to the disk, from the computer outward, and what can go wrong with
them:
- The disk adapter (e.g., SCSI card) plugged into the bus slot.
In some instances, the card may not be properly seated in the slot. In
rare instances, the adapter card may fail.
- The cable and cable connectors connecting the disk adapter card to the
disk.
This is a classic problem area. Often, computer room personnel connect
SCSI cables and don't bother to secure the connection with provided
screws or clips, and so over time it is easy for the cable to work
loose, particularly as the cables hanging from the back of a computer
system are jostled by people working behind the systems. Bent or broken
pins inside the connectors are not unknown. In the case of SCSI, people
may unknowingly make the chain too long and it suffers degraded signal
quality; or they fail to terminate the chain or use the wrong type of
terminator. (SCSI is a *very* confusing black art which makes SSA and
Fibre Channel all the more attractive.)
- The electronics collocated with the disk drive which interface it to the
cable connection and govern the actions of the disk drive.
Disk drives always have attached to them a printed circuit card
containing interface and driver electronics, plus power and signal
connectors. The electronics sometimes fail. With spare disks on hand,
you could replace the electronics portion of the drive, typically held
in place by screws.
- Within the disk drive, the drive motor, the disk arm, read/write heads at
the end of the arm, and the oxide-coated platter surfaces of the spinning
disk.
Disk drives which have seen a long life often experience their bearings
wearing out or lubricants drying out. Worn bearings can cause vibration
or wobble which makes for bad track alignment and difficulty reading
previously-written data. When lubricants dry out, the disk arm may
experience difficulty moving, and the spindle of a disk that is turned
off for some period of time may fail to spin-up when turned on. Head-
disk assemblies (HDAs) are hermetically sealed, so should never
experience problems resulting from dirty computer room air. But platter
surfaces can be ruined if the disk heads, which typically fly over their
surfaces at very high speed, come into contact with the platter
surfaces. How can this happen? Consider how many computer systems (with
internal drives) and disk cabinets are placed on desks or tables that
get bumped or jarred. Consider uneven computer room raised floor tiles
that serve as teeter-totters when people walk by adjoining equipment.
Consider carts being rolled through computer rooms and accidentally
bumping equipment, or the custodian with a vacuum cleaner. Disk drives
have relatively high G ratings, but don't push your luck.
- Dust
Sounds like a joke, right? Dust as a component of disk drives? Yes,
because it's unavoidable and pervasive. Consider that almost no computer
rooms are sealed environments: people open doors to walk in and out,
stuff is rolled through, cardboard boxes are routinely opened in
computer rooms, plumbers take down ceiling tiles to work on overhead
pipes and drill holes, etc. Take a look inside any computer or disk
drive and you'll find a disturbing amount of dust covering components
and blocking air flow. Get enough of it blanketing heat-sensitive
electronics components and you get overheating that leads to reduced
life. (Tests I have conducted reveal that ordinary dust is not
conductive, so it should not be the cause of short circuits.) The dust
problem is aggravated by equipment routinely designed to pull air across
innards with no filter of incoming air. Have your computer room people
take advantage of long downtimes to vacuum the inside of cabinets. A set
of small vacuuming tools for use with ordinary vacuums can be obtained
for about $15, and is well worth having.
Note that commercial data processing practice mandates having spares for all
disk drives in the shop, so as to minimize downtimes. Realize that it can
take hours or days to get a replacement drive from your hardware service
people or your hardware supplier. You need to have a spare ready to either
wholly take the place of a failed drive, or provide parts for repair of the
failed drive. At that point you order a new spare, when you can afford to
wait for that one.
If you call in a problem with a supported level of the software, will you
get a fix? Maybe. When there are two levels of software being "supported"
at one time (for example, 3.7 and 4.1), you can probably get a fix for the
newer one, but not the older one, despite both being "supported". The
typical procedure is to open APARs only against the most current release of
the client, as that is where maintenance will be applied. If a problem
exists on a 3.7 client but is not reproducible on a 4.1 client, an APAR may
be opened against the 3.7 client, but this does not necessarily mean that a
fix will be made available for the 3.7 client. Depending on the nature of
the problem and how severe an issue it is, the APAR may be closed 'fixed in
next release' and the resolution would be to apply the 4.1 level of code.
If the APAR is severe enough, a fix may be provided at earlier levels, but
this is usually not done automatically. Fixtests on previous versions are
usually only made on request, and only if the APAR is deemed serious enough,
or there are compelling reasons why the customer cannot upgrade to the
current release that contains the fix. This is because pursuit of a fix
takes development time, and the vendor doesn't want to put time into an
older (yet supported) release when they could be using the time to address
higher severity issues and new client functionality. If the customer has a
compelling reason (such as an old 3.1 server that the 4.1 client is not
going to be compatible with), that needs to be made known so that
development can consider whether the fixtest is justified.
Make sure that your business has firmly defined objectives when it goes to
compare storage management products: don't simply look at packages and
compare them. You want a solution which will fulfill defined needs, not a
vendor's sales objectives.
I had a Bad Feeling when IBM took the product from the hardware people
(Adstar) and gave it over to Tivoli, in that Tivoli impressed me as just a
generic, market-to-executives type of organization. I was hoping that my
impressions would be wrong, but realities proved otherwise. Customers
suffered one defective maintenance level after another, speaking to the
utilization of outmoded development and testing techniques, and some
combination of lack of interest by IBM as a corporation and lack of
technological leadership on the part of Tivoli management. Such conditions
create a stressful environment for the staff, and a high probability that
the best people will leave for better situations. Many customers stuck with
(older) releases they knew worked, rather than spend scarcely available
time hunting for a newer release whose defects were few enough, and of such
a type, that the release would work in their shop. In a nutshell, the
product was no longer in the hands of technologists, and the product and
its customers were suffering.
In 2003, IBM folded things back into the main company, and quality improved.
The product continues to have defects, but not fiascos.
Learn about and understand the systems for which you are responsible:
It cannot be stressed strongly enough that as part of implementing any
major system, it is absolutely essential that we become familiar with
it, which means having read the manuals and having become familiar with
both the elements of the system and where to look up information,
particularly problem handling information.
Keep records:
When changing any significant system file, always make a copy of it
first. (I created a 'bkupfile' command, which does a 'cp -p' to make an
image of the file, appending a .YYYYMMDD datestamp to indicate the date
of change; and that command is religiously used at our site.) Leaving
tracks like this is invaluable in both pointing out when changes have
been made, and providing something to revert to.
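The 'bkupfile' command itself is not reproduced in this document. As a rough
illustration of the idea only, here is a minimal Python sketch which assumes
that a plain copy alongside the original, with a .YYYYMMDD suffix, is all
that is wanted. The function name is borrowed from the text; the `when`
parameter and the refusal to overwrite an existing image are assumptions
added for illustration. Note that shutil.copy2 keeps timestamps and
permission bits but, unlike 'cp -p' run as root, does not attempt to keep
ownership.

```python
import shutil
import time
from pathlib import Path


def bkupfile(path, when=None):
    """Copy `path` to `path.YYYYMMDD`, keeping timestamps and mode bits.

    `when` is an optional time.struct_time (handy for testing); it
    defaults to the current local time.
    """
    src = Path(path)
    stamp = time.strftime("%Y%m%d", when or time.localtime())
    dest = src.with_name(f"{src.name}.{stamp}")
    if dest.exists():
        # Assumption: never clobber an image already taken today, since
        # the first copy of the day reflects the pre-change state.
        raise FileExistsError(f"{dest} already exists")
    shutil.copy2(src, dest)  # copy2 preserves mtime and permission bits
    return dest
```

Calling bkupfile("/etc/inittab") on 2005/03/09, for example, would leave an
image named /etc/inittab.20050309 beside the original.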
In some way, preserve the contents of your Activity Log to an age which
encompasses the re-use of your oldest tape. Only your Activity Log will
give you a clear picture of tape usage over time, and that is invaluable
when trying to find out what was used, when - particularly in recovery
situations.
Avoid mixing long and short retention data on the same serial storage media:
This is another configuration design issue. You may have multiple
filespaces mingled on the same serial volumes (tapes). If the files
thereon have wildly different retention values, the volumes will
prematurely end up with a lot of "holes" where the short-period data has
expired, which in turn elevates the amount of reclamation you have to
do - which is mechanically bad for tape drives, and interferes with
other schedules.
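The arithmetic behind those "holes" can be sketched as follows. This is a
simplified illustration, not a model of how the server actually accounts
for space: it assumes every file was written to the volume on day 0 and
expires outright once its retention period has passed. The function name
and the sample volumes are hypothetical.

```python
def reclaimable_pct(files, day):
    """Percent of a volume's written space occupied by expired data.

    `files` is a list of (size_mb, retention_days) pairs, all assumed
    written on day 0; a file counts as expired once `day` reaches its
    retention value.
    """
    total = sum(size for size, _ in files)
    expired = sum(size for size, keep in files if day >= keep)
    return 100.0 * expired / total


# Hypothetical volumes, each holding 100 MB.
mixed = [(50, 30), (50, 365)]        # short and long retention mingled
segregated = [(50, 365), (50, 365)]  # long retention only

for day in (29, 30, 364, 365):
    print(day, reclaimable_pct(mixed, day), reclaimable_pct(segregated, day))
```

Under these assumptions the mixed volume is half empty after a month and
stays that way for most of a year, inviting reclamation, whereas the
segregated volume stays full until all of its data expires together, at
which point it can simply be reused.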
Woe unto those who run anti-virus software on file systems being backed up:
There have been innumerable problems created by anti-virus software, for
any backup product, when run at the same time on a file system which is
undergoing a backup. Performance and functionality suffer.
Expect warts:
Don't expect any new release of any software to be perfect. Every
release of anything has some warts. For example, you upgrade to TSM 5.2.4
to solve some problems and get the minor annoyance of Query PRocess
having misaligned text for Space Reclamation processes. The watchword
in life: "It's always something."
Tivoli: http://www.ibm.com/software/tivoli/
Contacting Tivoli (TSM publications feedback):
http://www.ibm.com/software/tivoli/contact.html
Glossary: http://publib.boulder.ibm.com/tividd/glossary/termsmst04.htm
Search: http://www.ibm.com/software/sysmgmt/products/support/
Software Support downtime web page notice:
http://www.ibm.com/software/support/outages.html
TSM 3.7:
http://www.redbooks.ibm.com/redpieces/abstracts/sg245477.html
Manuals (clients, messages, but not server manuals):
http://ezbackup.cornell.edu/techsup-v3.7/ibmdocs/index.html
TSM 4.1:
http://www.tivoli.com/products/documents/updates/
storage_mgr_enhancements.html#4.1
Manuals (clients, messages, TDPs, but no server manuals):
http://ezbackup.cornell.edu/techsup-v4.1/ibmdocs/
TSM 5.3:
http://publib.boulder.ibm.com/infocenter/tivihelp/index.jsp
LTO
Ultrium roadmap:
http://www.lto-technology.com/newsite/html/format_roadmap.html
http://www.qualstar.com/146252.htm
Ultrium vs. Super-DLT:
http://www.storage.ibm.com/hardsoft/tape/lto/prod_data/ltovsdlt.html
"IBM LTO Ultrium Performance Considerations"
ftp://ftp.software.ibm.com/software/tivoli/whitepapers/wp-tsm-lto.pdf
IBM Tech Support:
Updating firmware:
http://ssddom02.storage.ibm.com/techsup/webnav.nsf/support/
ltofaqs_updatefw_drivefw
LTO - A New Robust Tape Standard:
http://www.storage.ibm.com/tape/lto/white_papers/ltowhitepaper.html
LTO Data Compression:
http://www.storage.ibm.com/tape/lto/white_papers/pdf/
whitepaper_compression.pdf
LTO Ultrium cleaning issues:
http://www.t10.org/ftp/t10/document.03/03-204r1.pdf
LTO Sense Data: http://www.tuganz.org/filemgmt_data/files/SenseData_04.pdf
Ultrium tape recording method (animated overview):
http://www.ultrium.com/newsite/html/about_tech.html
Sense Data:
"Tivoli Storage Problem Determination Guide - Understanding Sense Data"
http://www.ibm.com/support/entdocview.wss?uid=swg21063859
"SCSI Sense Data Structure and Example"
http://www.ibm.com/support/docview.wss?uid=swg21063859
LMCPD (atldd) and 3590 (Atape) driver software (found via "Support" on the
3494 home page):
ftp://service.boulder.ibm.com/storage/devdrvr/ ...or...
ftp://index.storsys.ibm.com/devdrvr
To suspend getting email, but remain a member of the List, you can
adjust your personal settings on the Listserver for "NOMail". Send email
to LISTSERV@VM.MARIST.EDU with the one-line body: SET ADSM-L NOMail
Problem situations:
- Mail back with Subject "Rejected posting to ADSM-L@VM.MARIST.EDU"
and body saying "Your message is being returned to you unprocessed
because it appears to have already been distributed to the ADSM-L
list. ..." This is because some idiot List member is rejecting his
incoming ADSM-L mail back to the listserver. Examine the expanded
mail headers to determine the offending site.
UCSD's 3494:
http://www-act.ucsd.edu/act/ibm3494.html
HSM:
Redbook: "Using ADSM Hierarchical Storage Management" (SG24-4631)
Tivoli Field Guide: TSM for Space Management:
http://www.ibm.com/support/entdocview.wss?rs=0&uid=swg27002498
IBM redbooks, for online viewing and download:
http://www.redbooks.ibm.com
Send feedback email to: redbook@us.ibm.com
IBM product information, emailed to you:
http://isource.ibm.com/world/index.shtml
Lotus/Domino redbooks: http://www.lotus.com/developers/redbook.nsf
IBM Techdocs: http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/Flashes
APARs, PTFs (APAR repository/APAR database):
TSM: http://www.ibm.com/software/sysmgmt/products/support/
IBMTivoliStorageManager.html
where you can enter words or phrases without quoting
General: http://www.ibm.com/support/
Enter phrases in double quotes.
Other:
http://service.software.ibm.com/cgi-bin/support/rs6000.support/databases
http://www.tivoli.com/asktivoli/cgi-bin/cast.cgi (need userid, password)
http://www.ibm.com/software/sysmgmt/products/support/
-> select IBM Tivoli Storage Manager -> select "Solutions"
Be aware that many are the typing and spelling errors in the databases,
which can thwart searches. (There is no IBM editor assigned to review
the coherency and correctness of what technicians write therein.)
For a given TSM level, you can get a list of the APARs fixed at that level
by searching the IBM site for a phrase like: "APARs fixed in V5.1 PTFs".
Disaster recovery:
Redbook: "Disaster Recovery Strategies with Tivoli Storage Management"
(SG24-6844)
http://www.redbooks.ibm.com/abstracts/sg246844.html
http://www.redbooks.ibm.com/redbooks/SG246844.html
http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg246844.pdf
Windows bare metal restore:
MS Knowledge Base article "How to Move a Windows 2000 Installation to
Different Hardware":
http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q249694
Education/Training:
http://www.tivoli.com/services/education/courses/
http://www.rdperf.com/ (R&D Performance Group)
Media (3590 tapes):
http://www.emtec-magnetics.com/
http://www.mtc-open.net/ Magnetic Tape Cartridge technology
http://www.mtc-open.net/Infocenter/Linkpage/
http://www.thic.org/pdf/Oct00/imation.jgoins.001003.pdf
Tape Media Guide (table):
In: http://www.storage.ibm.com/pguide/SSGProductsweb0503.pdf
http://fujifilmmediasource.com/specs/new/misc/tapewip02.pdf
Oxford annual ADSM/TSM symposium: http://tsm-symposium.oucs.ox.ac.uk/
Papers/presentations/seminars:
http://tsm-symposium.oucs.ox.ac.uk/callfor.html (current)
http://tsm-symposium.oucs.ox.ac.uk/papers (papers dir.)
http://adsm-symposium.oucs.ox.ac.uk/1999/callfor.html
or http://adsm-symposium.oucs.ox.ac.uk/1999/papers/
http://adsm-symposium.oucs.ox.ac.uk/2001/callfor.html
or http://adsm-symposium.oucs.ox.ac.uk/2001/papers/
"The TSM Client - Diagnostics":
http://adsm-symposium.oucs.ox.ac.uk/2001/papers/Raibeck.Diagnostics.PDF
http://tsm-symposium.oucs.ox.ac.uk/ (TSM Symposium 2003)
HSM on Windows 2000 (NT 5):
http://www.highground.com/rsm/rsmoverview.htm
Microsoft Windows error numbers:
http://msdn.microsoft.com/library/wcedoc/wcesdkr/appendix_2.htm
http://msdn.microsoft.com/library/psdk/psdkref/errlist_9usz.htm
http://www.mvps.org/btmtz/win32errapp/ (Win32 Error Codes application)
Salary surveys:
http://adsmsalarysurvey.8m.com/ As of 2001/05/15 replaced by:
http://tsmsalarysurvey.8m.com by Mark Mooney <m.mooney@ais-nms.com>
www.salary.com
Tivoli Decision Support for Storage Management Analysis (TDS for SMA)
http://www.tivoli.com/products/index/decision_support_storage_mgt/
Said to help you A) analyze your current storage situation, and B)
predict your longer term storage needs. 2003/06: will be going into
retirement fairly soon, to be supplanted by Tivoli Data Warehouse and
TEC.
TSM management:
IBM has a Guide that runs with 'Tivoli Decision Support' called 'Storage
Management Analysis' that is for reporting *SM data. See redbook
"Tivoli Storage Management Reporting" (SG24-6109).
User implementations:
Cornell EZ-Backup, and fee for services:
http://www.ezbackup.cornell.edu/overview
Linux:
Supported devices:
http://www.ibm.com/software/sysmgmt/products/support/
IBM_TSM_Supported_Devices_for_Linux.html
Manuals:
http://publib.boulder.ibm.com/tividd/td/StorageManagerforLinux5.1.html
Client, 3.7:
ftp://ftp.software.ibm.com/storage/tivoli-storage-management/maintenance/
client/v3r7/Linux/LATEST/
or:
ftp://service.boulder.ibm.com/storage/tivoli-storage-management/
maintenance/client/v3r7/Linux/
Windows:
Redbook: Deploying the Tivoli Storage Manager Client in a Windows 2000
Environment (SG24-6141)
http://www.redbooks.ibm.com/redbooks/SG246141.html
http://www.redbooks.ibm.com/redbooks/pdfs/sg246141.pdf
"Microsoft Installer (MSI) Return Codes for Tivoli Storage Manager Client &
Server": http://www.ibm.com/support/docview.wss?uid=swg21050782
Backup/Archive products in general:
http://windows.about.com/cs/backupswproducts/
Adabas backups:
ADINT/ADSM (http://www.ibm.com/de/entwicklung/adint_adsm/index.html)
Veritas vs. TSM (a limited comparison, sponsored by Veritas...):
http://www.keylabs.com/results/veritas/veritas.html
IBM Tape Solutions (Scott Hoyle presentation, HPSS User Forum, 2000/07/26;
3590 vs. 9840, 3580 Ultrium/LTO, DLT 8000 tape drives):
http://www4.clearlake.ibm.com/hpss/Forum/2000/AdobePDF/
Freelance-Graphics-IBM-Tape-Solutions-Hoyle.pdf
3590 vs. 3580 Ultrium/LTO:
Redbook "The IBM TotalStorage Tape Selection and Differentiation Guide"
http://www.redbooks.ibm.com/redbooks/pdfs/sg246946.pdf
Torture-testing Backup and Archive Programs: Things You Ought to Know But
Probably Would Rather Not, a 1991 paper by Elizabeth D. Zwicky, SRI
International, for LISA V.
http://ftp.at.linuxfromscratch.org/utils/archivers/star/testscripts/
zwicky/testdump.doc.html
(This ADSM/TSM Quick Facts document was made available on the web 2000/05/18.
It is known to be indexed by:
http://dir.adsm.org/FAQ/ http://dir.adsm.org/Cool/
http://www.coderelief.com/depot.htm
http://www-backup.univie.ac.at/ (Vienna University; click on "FAQs")
http://www.meduniwien.ac.at/itsc/services/backup/literatur.php
http://www.akh-wien.ac.at/medwrz/services/backup/literatur.shtml
http://adsm0.cso.uiuc.edu (University of Illinois, Urbana-Champaign
Campus Information Technologies and Educational Services)
http://folk.uio.no/kjetilk/tsmserver.html (TSM doc. at Oslo University)
http://www.tsmgg.nl/Links.htm (Netherlands TSM users group)
http://www.uni-ulm.de/urz/Dienste/ADSM.pdf (University of Ulm)
http://revelstoke.cit.cornell.edu:8080/
http://www.living-wreck.de/bm/reinhold_htmltab.htm
http://www.tsmgg.nl/Links.htm (Oxford University TSM 2001 Symposium)
http://www.autovault.nl/linksnl.html (AutoVAULT, Nederlands)
http://www.jasi.com/TSMUG/Useful_Links/useful_links.html
(The TSM User Group for Baltimore, Washington DC, and Northern Virginia)
http://www.tuganz.org/links.php (Tivoli User Group/Australia, New Zealand)
http://www.lrz-muenchen.de/services/datenhaltung/adsm/sonstiges/
)
"When you can measure what you are speaking about, and express it in numbers,
you know something about it; but when you cannot measure it, when you cannot
express it in numbers, your knowledge is of a meager and unsatisfactory kind:
it may be the beginning of knowledge, but you have scarcely, in your thoughts,
advanced to the stage of science." -- William Thomson, Lord Kelvin
"Today's computers and software are like toddlers, who have to be continually
watched. What scares me is the future, when they become adolescent types and
are convinced they know more than we do..." -- me
"It's not what you know, it's knowing where to find it."
-- Andy Raibeck, Oxford 2001 seminar
"I never waste memory on things that can easily be stored and retrieved from
elsewhere." -- Albert Einstein
http://people.bu.edu/rbs/ADSM.QuickFacts