Troubleshooting Techniques
for UNIX (Lessons)
100-002372-A
Copyright © 2008 Symantec Corporation. All rights reserved. Symantec,
the Symantec Logo, and Veritas are trademarks or registered trademarks of
Symantec Corporation or its affiliates in the U.S. and other countries.
Other names may be trademarks of their respective owners.
THIS PUBLICATION IS PROVIDED "AS IS" AND ALL EXPRESS OR
IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT,
ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS
ARE HELD TO BE LEGALLY INVALID. SYMANTEC CORPORATION SHALL
NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN
CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE
OF THIS PUBLICATION. THE INFORMATION CONTAINED
HEREIN IS SUBJECT TO CHANGE WITHOUT NOTICE.
No part of the contents of this book may be reproduced or transmitted in
any form or by any means without the written permission of the publisher.
Veritas NetBackup 6.x Troubleshooting Techniques
Symantec Corporation
20330 Stevens Creek Blvd.
Cupertino, CA 95014
http://www.symantec.com
COURSE AND LAB DEVELOPERS
Lisa Goldring
Paul Johnston
Sandy Tipper
ADVISORY BOARD MEMBERS
Albrecht Scriba
Carly Jacques
Chris Amidei
Christian Rabanus
Dave Little
David Rogers
Don Anderson
Freddie Gilyard
Graeme Gofton
Joseph Gallagher
Kleber Saldanha
Mauricio Julian Paredes
Mike Williams
Ray Katos
Robert Owen
Roy Freeman
Satoko Saito
Stephen Williams
Sue Rich
Suzanne Trigg
Tomer Gurantz
TECHNICAL REVIEWERS AND CONTRIBUTORS
Rich Armstrong
James Dandeneau
Wm. M. Drazkowski
Scott Frohreich
John Gerhardson
Kevin Holtz
Dave Little
Steve Schwarze
Debbie Wilmot
Table of Contents
Course Introduction
Lesson 9: Troubleshooting Performance Issues
Topic 1: NetBackup Performance Overview
Topic 2: Isolating Bottlenecks
Topic 3: Addressing Bottlenecks
Glossary
Index
Course Objectives
After completing this course, you will be able to:
• Upgrade from NetBackup 5.x to NetBackup 6.0 or NetBackup 6.5.
• Detect problems using tools, such as the NetBackup reports, nbsupport, and
administrative commands.
• Understand the functions of services, daemons, and processes that drive
NetBackup operations.
• View and manage NetBackup debug logs.
• Back up, recover, and troubleshoot the EMM database.
• Rapidly and accurately isolate the root cause of a backup failure.
• Troubleshoot a wide variety of media, device, and network-related problems.
• Isolate and address problems with backup performance.
• Run backups and restores with a higher rate of success and efficiency.
Course Prerequisites
Before attending this course, you should have:
• Knowledge of backup, restore, and shared storage concepts
• Completed the NetBackup 6.x Administration course, or equivalent experience
This includes working knowledge of:
– Storage units and devices
– Media management
– Policy attributes and schedules
– Restore and import operations
– NetBackup reports
– Common NetBackup command-line utilities, such as bpexpdate and
available_media
• Two years of experience in Windows or UNIX system administration, including
knowledge of:
– System Logs
– Tape device configuration
– Networking
– File management
• Course Introduction
• Lesson 1: Ensuring a Successful Upgrade
• Lesson 2: Troubleshooting Methods and
Tools
• Lesson 3: NetBackup Process Flow
• Lesson 4: Using Debug Logs
• Lesson 5: Understanding the NetBackup
Database
• Lesson 6: Troubleshooting Devices
• Lesson 7: Troubleshooting Media
• Lesson 8: Troubleshooting Network
Issues
• Lesson 9: Troubleshooting Performance
Issues
Lesson 1: Ensuring a Successful Upgrade
• Reading available documentation
– Release notes
– Installation manuals
– Latest TechNotes
• Verifying prerequisites
– Minimum required disk space
– Minimum required memory
– Minimum previous NetBackup version
• Deployment planning
– Rollout and staffing schedule
– Complete catalog backup of the previous installation
– A plan to back out
– Procedure to remove and install the new version, including the latest maintenance pack (MP)
or Release Update
• Postinstallation testing
– Catalog backup of the new installation
– Sample client backups
– Sample client restores
Upgrade Difficulties
An automated upgrade of a complex or custom environment may be difficult. If
the Global Device Database and the Volume Pool Databases are located centrally
on the same host prior to the upgrade, merging these catalogs into the Enterprise
Media Manager (EMM) Database is simplified. If you are using supported Disk
Staging Storage Units (DSSUs), the upgrade is quite direct. Difficulties arise when
custom disk staging is implemented, involving Disk Storage Units that use cron
jobs to initiate the duplication of backup images to media. When upgrading from
NetBackup 5.x to 6.x, install the latest 6.x maintenance pack or release update
before running nbpushdata -add to merge legacy 5.x catalog entries into the
EMM database properly.
In a large environment, it is impractical to upgrade every NetBackup server and
client simultaneously, so mixed versions must coexist during the enterprise-wide
rollout. The upgrade process supports a rolling upgrade for the days or weeks it
may take, as long as the EMM database knows that previous-version media servers
are still in active duty (by using nbpushdata -modify_5x_hosts).
Upgrade Prerequisites
Before performing an upgrade, ensure that:
• There are no 5.x database inconsistencies.
NetBackup commands and the NetBackup Consistency Check (NBCC) utility
are available to perform database checks.
• Your current 5.x backups run correctly.
If backups are failing in NetBackup 5.x, they will also fail after an upgrade to
NetBackup 6.x.
• NetBackup Advanced Reporter (NBAR) and Global Data Manager (GDM)
have been removed.
NetBackup Advanced Reporter and Global Data Manager are not compatible
with NetBackup 6.0 and must be removed from all servers before you can
upgrade. NetBackup Operations Manager (NOM) is not an upgrade of these products, but it provides new functionality that replaces
their roles. See TechNote 281578, How to preserve Veritas NetBackup
Advanced Reporter (NBAR) data before upgrading to NetBackup 6.0.
• Global and client settings have been captured to a text file.
An upgrade to NetBackup 6.5 may lose some customized configuration
settings for global and client attributes, resetting them back to the defaults.
Therefore, you should save these settings before the upgrade as a safety
precaution.
First, run the bpconfig -L > global_attributes.txt command.
Then, run the bpclient -All -L > client_attributes.txt command.
After these two text files are created, move them to a new location outside of
the NetBackup directory structure. Retain them until after the upgrade is
completed and the environment is running correctly.
For more information about capturing global and client attribute settings prior
to a NetBackup 6.5 upgrade, see TechNote 294899.
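The capture step can also be scripted so that both files land outside the NetBackup directory structure in one pass. The following is a minimal sketch, not a Symantec-supplied tool. It assumes that bpconfig and bpclient are on the PATH of the user running it; the function names and the run parameter are hypothetical and exist only for illustration.

```python
import subprocess
from pathlib import Path

# Commands and output file names are taken from the text above.
CAPTURE_COMMANDS = {
    "global_attributes.txt": ["bpconfig", "-L"],
    "client_attributes.txt": ["bpclient", "-All", "-L"],
}


def capture_plan(dest_dir):
    """Return (argv, output_path) pairs without running anything."""
    dest = Path(dest_dir)
    return [(argv, dest / name) for name, argv in CAPTURE_COMMANDS.items()]


def capture_settings(dest_dir, run=subprocess.run):
    """Run each command and save its standard output under dest_dir.

    dest_dir should be outside the NetBackup directory structure.
    The run parameter is a hypothetical hook so the sketch can be
    exercised on a system without NetBackup installed.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for argv, out_path in capture_plan(dest_dir):
        result = run(argv, capture_output=True, text=True, check=True)
        out_path.write_text(result.stdout)
        written.append(out_path)
    return written
```

Because check=True is passed, a command that exits with an error stops the capture instead of leaving an empty file that could be mistaken for a valid record of the settings.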
1. Open a case with support.
2. Download and run NBCC.
3. Send the output to support.
Outcomes shown on the slide: Your environment is ready for the upgrade; Run fix scripts; Contact consulting to help with the upgrade.
Note: It is strongly recommended that you run NBCC before performing the
upgrade, and that you obtain assistance during the upgrade.
• Hostnames or fully qualified domain names (FQDNs) and reverse
name lookup are required.
Example: hostname train5, FQDN train5.symantec.com, IP address 192.168.27.105
Network Issues
Ensure that there are no connectivity issues between the servers. Not only must
each server be reachable, but name resolution must be consistent and include
hostnames or fully qualified domain names (FQDNs) and reverse name lookup.
Resolve any DNS problems before you proceed.
Alternatively, use a hosts file on every master and media server containing the
IP address, FQDN, and short name of every master and media server to prevent
any DNS issues.
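A quick way to confirm that forward and reverse lookups agree is to resolve each server name and then map the resulting address back to a name. The sketch below is one possible check using the Python standard library, not a NetBackup utility. The function name is hypothetical, and the forward and reverse parameters are hooks added only so the logic can be tried without a live DNS.

```python
import socket


def check_name_resolution(name, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that name resolves to an IP and the IP maps back to name.

    Returns a list of problem descriptions. An empty list means the
    forward and reverse lookups agree.
    """
    try:
        ip = forward(name)
    except OSError as exc:
        return [f"{name}: forward lookup failed ({exc})"]
    try:
        primary, aliases, _addresses = reverse(ip)
    except OSError as exc:
        return [f"{name}: reverse lookup of {ip} failed ({exc})"]
    # Accept a match on the FQDN, on an alias, or on the short name.
    known = {n.lower() for n in [primary, *aliases]}
    short_names = {n.split(".")[0] for n in known}
    if name.lower() not in known | short_names:
        return [f"{name}: {ip} maps back to {primary}, not {name}"]
    return []
```

Run the check from every master and media server against every other server, because a name that resolves correctly on one host can still be inconsistent on another.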
Hint: If it takes a long time to restart NetBackup on the master server or to run the
nbpushdata or vmoprcmd commands, it is possible that a name resolution
issue is causing delays and eventual timeouts. Resolve the name resolution issue,
and, if necessary as a workaround, remove all SERVER entries from the bp.conf
file (UNIX) or from the Registry (Windows) for each unreachable media server.
When you are ready to reintegrate these servers, upgrade their software, if
applicable, add their SERVER entries, and run the appropriate nbpushdata
commands.
If you have a firewall between any of the media servers and the master server,
ensure that all required ports are open.
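A simple TCP connection test from the media server to the master server can show whether a firewall is blocking a port. The sketch below probes the two port numbers that appear on the process flow slides in this course, 1556 (pbx_exchange) and 13724 (vnetd). Treating these two ports as the complete set is an assumption of the sketch; confirm the ports that your configuration requires. The function name and the connect parameter are hypothetical.

```python
import socket

# Port numbers shown on the process flow slides: 1556 (pbx_exchange)
# and 13724 (vnetd). Using only these two is an assumption.
DEFAULT_PORTS = (1556, 13724)


def closed_ports(host, ports=DEFAULT_PORTS, timeout=3.0,
                 connect=socket.create_connection):
    """Return the ports on host that do not accept a TCP connection."""
    unreachable = []
    for port in ports:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Refused, filtered, timed out, or the name did not resolve.
            unreachable.append(port)
        else:
            conn.close()
    return unreachable
```

A successful connection shows only that the port is reachable in one direction. Repeat the test from the master server toward the media server to check the return path.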
[Slide diagram: virtual NetBackup server and NetBackup clients]
Clusters
The virtual name and IP address of the cluster is the server name used by
NetBackup, so the actual node running the software is not important. Only the
active node communicates with other NetBackup servers (such as for the
nbpushdata -add command).
Ensure that any required patches to the cluster software are applied prior to
upgrading to NetBackup 6.x. See TechNote 278307, Veritas Storage Foundation
4.1 and 4.2 HA for Windows - Patch for Enterprise Agent NetBackup to use
AgentFile and AgentDirectory attributes, as an example for VCS 4.1 or 4.2 on
Windows.
After you have run the upgrade procedure, you must run nbpushdata on the
active node of a cluster. Running nbpushdata pushes the data to the EMM
database from the existing shared database files and from the local database files
on the inactive nodes. To do this, nbpushdata obtains a list of all of the nodes in
the cluster, uses bpcd to obtain a copy of each local database file from all inactive
nodes, and stages these files on the active node. The data from the staged files is
then pushed to the EMM database.
Mixed Environments
Mixed environments are supported as a migration tool, but they are not meant to be
permanent.
Clusters
Mixing server versions is not supported in a clustered environment. All clustered
nodes must run the same operating system, service pack level, and version of
NetBackup.
[Slide diagrams: mixed-version examples involving NBU 6.5, 6.0, 5.1, and 5.0 MP4 systems, including Media 6.0, Media 5.1, Media 5.0 MP4, and EMM 6.0]
an individual system must run the same version.
• Backup images created under an older version of NetBackup are always
recoverable with a newer version of NetBackup.
Upgrade Guidelines
Follow these guidelines when upgrading to mixed-version environments:
• No client can run a later version than its media server.
• No media server can run a later version than its master server.
• All of the master servers and the EMM server must run 6.5.
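The three guidelines above can be checked against a planned topology before the rollout begins. The sketch below is illustrative only. The function names, the host names, and the data layout are hypothetical, and maintenance pack levels such as MP4 are ignored when versions are compared, which is a simplification.

```python
REQUIRED = "6.5"  # version required for master and EMM servers (from the text)


def parse_version(text):
    """Turn '6.5' or '5.0 MP4' into a comparable tuple such as (6, 5)."""
    release = text.split()[0]  # drop any maintenance pack suffix
    return tuple(int(part) for part in release.split("."))


def guideline_violations(master, emm, media_servers):
    """Check one master server domain against the upgrade guidelines.

    media_servers maps a media server name to a pair:
    (version, {client_name: client_version}).
    """
    problems = []
    for role, version in (("master server", master), ("EMM server", emm)):
        if parse_version(version) != parse_version(REQUIRED):
            problems.append(f"{role} runs {version}; it must run {REQUIRED}")
    for name, (version, clients) in media_servers.items():
        if parse_version(version) > parse_version(master):
            problems.append(
                f"media server {name} ({version}) is newer than "
                f"the master server ({master})")
        for client, client_version in clients.items():
            if parse_version(client_version) > parse_version(version):
                problems.append(
                    f"client {client} ({client_version}) is newer than "
                    f"media server {name} ({version})")
    return problems
```

For example, a 6.0 client assigned to a 5.1 media server is reported, while a 5.1 client assigned to a 6.0 media server is accepted.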
Upgrade Overview
When upgrading from one version of NetBackup to another, install the NetBackup
software in the order shown on the slide.
Note: This is not the sequence you use to run the nbpushdata command.
The nbpushdata command moves data from your current database files (a
subset of the NetBackup catalog) into a newly created EMM database. The slide
shows the sequence of when and on what systems you run the nbpushdata
command.
For more information on upgrading to NetBackup 6.x and running nbpushdata,
refer to the Veritas NetBackup Upgrade Portal at
http://seer.entsupport.symantec.com/docs/290185.htm.
[Slide: upgrade overview flowchart. Recoverable labels: Back up the NBU catalogs; 1. Prepare, upgrade, patch master servers; 2. Populate the EMM DB; 7. Upgrade the clients; 8. Activate the policies and run a test backup; Back up the NBU catalogs. Steps 3 through 6 are not legible.]
1
are running.
2. Run nbpushdata -add on the upgraded systems in the
following order:
a. The host that was the 5.x Global Device Database host
b. All upgraded master servers
c. All upgraded 5.x Volume Database hosts
3. Restart the services.
4. Run nbpushdata -modify_5x_hosts on all 6.x master
servers.
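The ordering in step 2 can be worked out ahead of time from the roles that each host held in the 5.x environment. The sketch below builds that ordered list and drops repeated hosts, so a single machine that held all three roles appears once. It only plans the order and does not run nbpushdata. The function name and the host names in the usage example are hypothetical.

```python
def nbpushdata_add_order(global_device_db_host, master_servers,
                         volume_db_hosts):
    """Return the hosts on which to run nbpushdata -add, in order.

    Order follows step 2: the 5.x Global Device Database host first,
    then all upgraded master servers, then all upgraded 5.x Volume
    Database hosts. A host that holds several roles is listed once,
    at its earliest position.
    """
    ordered = []
    for host in [global_device_db_host, *master_servers, *volume_db_hosts]:
        if host not in ordered:
            ordered.append(host)
    return ordered


if __name__ == "__main__":
    # Hypothetical environment with a separate Global Device Database host.
    for host in nbpushdata_add_order("gdd1", ["master1", "master2"],
                                     ["vol1", "master2"]):
        print(f"{host}: nbpushdata -add")
```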
The nbpushdata command is run after the 5.x to 6.x upgrades are complete and
all maintenance packs have been applied on the master server. This is important
because maintenance packs may include nbpushdata-related changes.
1 To run nbpushdata, the daemons or services and processes must be running
on the EMM server and on the systems where the nbpushdata command is
being run.
2 In most NetBackup 5.x environments, the master server, the Global Device
Database host, and the Volume Database host are the same machine, and
nbpushdata -add only needs to be run once on the master server.
However, if the master server, the Global Device Database host, and the
Note: The 5.x media server must be up and reachable. The nbpushdata
-modify_5x_hosts command modifies the 5.x servers so that they can
operate in a mixed environment.
If mistakes are made during the upgrade, for example, running nbpushdata
-add in the wrong order or aborting the command accidentally, work with
Technical Support to clear and repopulate the EMM database. Technical Support
may ask you to run commands similar to the following:
• nbpushdata -remove host_name
This command removes information populated into the EMM database when
nbpushdata -add was run from host_name.
• nbemmcmd -deletehost host_name
This command deletes references to host_name from the EMM database.
• nbpushdata -add
This command repopulates the EMM database.
1
2. Run the installation.
3. Apply the latest maintenance packs.
4. Restart the services.
5. Upgrade any add-on components.
6. Install the latest maintenance packs for the add-on
components.
1
2. Install the latest maintenance packs.
Note: You must back up your NetBackup catalogs before and after any NetBackup
upgrade, including maintenance packs. Catalog backups are only readable
by the same version of software that created them.
• Key Points
– In this lesson, you reviewed the main reasons for problems when
upgrading to NetBackup 6.x.
– You learned about some of the prerequisite considerations to
performing an upgrade.
– You also reviewed the general procedure for upgrading from NetBackup
5.x to NetBackup 6.x.
• Reference Materials
– NetBackup Troubleshooting Guide
– NetBackup Installation Guide
– NetBackup Commands
– The support Web site at: http://entsupport.symantec.com for
maintenance packs, release updates, and TechNotes
– TechNotes 236274, 265806, 267965, 277242, 278307, 279038, 281578,
281789, 282017, 282159, 282162, 285223, 290185, 294899
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 1 Details:
Ensuring a Successful Upgrade,” page A-6.
• Appendix B provides complete lab instructions and solutions. See “Lab 1
Solution: Ensuring a Successful Upgrade,” page B-6.
Lesson 2: Troubleshooting Methods and Tools
NetBackup?
Step Description
Observe Observe the situation to generate a description of a
symptom or group of symptoms.
1 Refine and repeat the observation.
2 Formulate a new hypothesis.
3 Make a new prediction.
4 Experiment by retesting.
              Resolution                      Workaround
Disadvantage  Requires more time to develop   Is usually a temporary solution
Using nbsupport
nbsupport is a command-line-based utility. The output nbsupport produces
is determined by the parameters it is given. The two primary parameters are:
• Host type
The host type specifies the type of NetBackup system from which
nbsupport will be run. The host type is specified by using
-master, -media, or -client. The host type also determines the type of
reports that NetBackup attempts to generate for the host.
When running nbsupport, determine where you need nbsupport to run.
One problem may warrant collecting nbsupport output from just one
NetBackup system, such as the master server; another problem may warrant
output from several NetBackup systems, such as a media server and two failing
clients.
• Level of detail
The level of detail determines, at a high level, how many reports should be run.
The available options are -detail {low | medium | high}.
In addition, nbsupport can be configured to exclude specific reports from the
output by creating touch files in the appropriate folder. This may be desirable on a
master or media server where a significant amount of time is required to generate a
report.
The slide on this page shows some examples of using the nbsupport utility.
[Slide: the support Web site at http://entsupport.symantec.com, showing Knowledge Base Search, common support links, and Hot Topics]
Tape Lists                  bpmedialist -mlist
Disk Reports
  Images on Disk            bpimmedia -disk -U
  Disk Logs                 bperror -disk -dt 0 -U
  Disk Storage Unit Status  N/A
  Disk Pool Status          N/A
Parameter                   Description
-hoursago number_of_hours   Use -hoursago instead of the -d (start time) and -e (end time) switches.
-U                          Use -U for “user readable.”
-l                          Use -l for long-winded.
-L                          Use -L for really long-winded.
notification
• Configure the nbmail.cmd script.
(Windows only)
[Slide: NetBackup notify scripts, some labeled "Per Job": bpstart_notify, diskfull_notify, bpend_notify, backup_notify, backup_exit_notify, restore_notify, parent_end_notify, userreq_notify, dbbackup_notify*, session_notify, mail_dr_info*]
• Key Points
In this lesson, you learned about situations that require troubleshooting, a
standard troubleshooting methodology, and how to gather information using
nbsupport. You also reviewed NetBackup tools used for troubleshooting.
• Reference Materials
– NetBackup System Administrator’s Guide
– NetBackup Commands
– NetBackup Troubleshooting Guide
– NetBackup Operations Manager (NOM) Getting Started Guide
– http://entsupport.symantec.com
– http://msdn.microsoft.com
– http://www.blat.net
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 2 Details:
Troubleshooting Methods and Tools,” page A-16.
• Appendix B provides complete lab instructions and solutions. See “Lab 2
Solution: Troubleshooting Methods and Tools,” page B-18.
Lesson 3: NetBackup Process Flow
Topic 3: Backup Process Flow Summarize the process flow that occurs
during an automatic backup operation.
Topic 4: Restore Process Flow Summarize the process flow that occurs
during a restore operation.
[Slide diagram: master server (NBU catalogs, Jobs) and client processes, including bprd, bpdbm, bpjobd, nbpem*, nbjm*, nbgenjob* (6.0 only), nbproxy, vnetd/bpcd, bpbkar(32), and tar(32). * Part of the IRM]
PBX-based processes:
• nbpem
• nbjm
• nbrb
• nbemm
• nbvault
• nbsvcmon
• nbnos
• nbsl
• nbgenjob (6.0 only)
Legacy processes:
• bprd
• bpdbm
• bpjobd
• bpbrm
• bptm
• bpdm
• bpcd
• bpbkar(32)
• tar(32)
Process Types
Private Branch Exchange (PBX) is a communication mechanism in NetBackup 6.x
that allows for reduced network port usage. PBX processes use the
pbx_exchange process for all incoming communications. Certain features in
NetBackup, such as Service Monitoring and Unified Logging, are supported only
with these PBX-based processes. PBX processes exist on the master server, EMM
server, and media server. Processes relating to BMR and NDMP are not shown in
the slide.
Legacy processes are processes that do not receive communications from
pbx_exchange. In NetBackup 6.x, if a legacy process requires direct
communication with another process over the network, it may receive the
communication through vnetd (default). Legacy processes exist on all
NetBackup systems.
[Slide diagram: master server (bprd, bpjobd, nbpem, nbjm, nbgenjob, nbrb, nbemm, bpdbm, master server catalogs), EMM database and engine, media server (bpbrm, bptm/bpdm), and client (vnetd, bpcd, bpbkar). Notes: nbproxy is not shown; nbgenjob is N/A in 6.5]
bpbkar and passes
it to bpdbm
bptm Tape manager bpbrm (Child) Manages transfer of
bpdm Disk manager bptm backup images
between the client
bpdm
and the storage
device (tape or disk)
[Slide: nbrb and nbemm with the EMM database and engine. Callouts: Part of the IRM. • Stores and manages media and device information. • Provides an interface to access this information.]
[Slides: scheduling and job start diagrams. Recoverable labels: GUI / CLI; Policy Change; bprd, bprd child, and bpjobd; example entries "train2 servers full 9:30 PM Not Due" and "train5 unix-client incr 9:30 PM"; command net start "NetBackup Request Manager"]
[Slide diagram: resource allocation involving nbjm, nbrb, nbemm (MDS, DA), bpjobd, and bpdbm with the master server catalogs. Labels: logical resources, physical resources, media and drive information, job try]
[Slide diagram: media mount on the media server, showing bpbrm, bptm, ltid, txxd, txxcd, vmd, avrd, the robot arm, and the tape drive. Labels: establish client connection, move media, load and scan drive]
Media Mount
The slide shows the process flow during a media mount.
1 The nbjm process on the master server receives information on which specific
drive and media are available for the operation.
2 Explicit instructions on which resources to use are passed from nbjm to
bpbrm, the backup and restore manager, on the media server.
3 Before proceeding with the mount request, bpbrm establishes a socket
connection with the client system.
4 After the connection is established, and before backup data flow begins,
bpbrm sends instructions to mount a specific drive and media to bptm, the
tape manager.
5 bptm forwards the request to the device manager daemon (ltid).
6 ltid calls the robotic drive daemon, for example txxd.
7 txxd calls the robotic control daemon, for example txxcd, on the robot
control host to issue SCSI commands to mount the media.
8 After the media is mounted, txxd scans the drive to verify that the correct tape
is loaded.
vmd is used for remote media and device management and avrd is used for bar
code and recorded label recognition.
Now nbjm is notified, and the backup data flow from the client to the media
server begins.
[Slide diagram: media server and client during a backup, showing bpbrm, bpbkar, bptm parent, and bptm child. Labels: establish client connection, metadata, backup image]
[Slides: network communication diagrams. Port 1556: PBX-based processes, pbx_exchange, and a legacy process. Port 13724: a PBX-based process, (bp)inetd, vnetd, bpcd, and legacy processes]
[Slide diagram: nbproxy instances, bpdbm, and the master server catalogs]
nbproxy
The nbproxy process is used on master and EMM server systems to
communicate with bpdbm for access to the master server catalogs. nbproxy acts
as an adapter that lets the IRM and EMM processes communicate with bpdbm.
This communication is necessary because the bpdbm process is single-threaded
and cannot receive direct communications from the multithreaded IRM processes.
In this way, the IRM processes can access the master server in a synchronized,
consistent way. The policy configurations, Image database, and global settings for
the server group are made available to IRM through nbproxy.
Processes that use nbproxy in this way are nbpem, nbjm, and nbrb. A separate
instance of nbproxy runs persistently for each of these three processes.
[Slides: backup process flow diagrams showing the master server (bpdbm, nbpem, nbemm, master server catalogs), the EMM database and engine, the media server (bpbrm, bptm), and the client (bpbkar, data)]
[Slides: restore process flow diagrams showing the master server (bprd, bpjobd, Jobs database, nbjm, nbrb, bpdbm, nbemm, master server catalogs), the EMM database and engine, the media server (bpbrm, bptm), and the client (tar, data)]
• Key Points
In this lesson, you reviewed the functions of the master server, EMM
server, media server, client, and their daemons or services and
processes. You looked at the communication methods used between
NetBackup processes. You also followed the processes through an
automatic backup operation and a restore operation.
• Reference Materials
– NetBackup System Administrator’s Guide
– NetBackup Troubleshooting Guide: Appendix A, Functional Overview
– http://entsupport.symantec.com/docs/282015 for Details on the
VERITAS NetBackup Backup and Restore Process Flow
– TechNote 278996
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 3 Details:
NetBackup Process Flow,” page A-25
• Appendix B provides complete lab instructions and solutions. See “Lab 3
Solution: NetBackup Process Flow,” page B-29
Lesson 4: Using Debug Logs
Topic 3: Viewing Debug Logs • Extract relevant data from legacy logs.
• Use the vxlogview command to format
data from raw unified logs.
bprd X bpdm X
bpdbm X ltid X
bpjobd X vmd X
nbpem X txxd/txxcd X
nbjm X vnetd X
nbproxy X bpcd X
nbgenjob X bpbkar X
nbsvcmon X tar X
pbx_exchange X bpmount X
nbrb X bpbackup X
nbemm X bprestore X
bpbrm X bparchive X
bptm X bplist X
After completing this topic, you will be able to:
• Enable legacy logging.
• Use the vxlogcfg command to manage unified and robust log settings.
[Slide: legacy debug logs by tier. Tier labels: Media, EMM, Master, Clients. Logs shown: bpbrm, bpdm, bptm, vnetd, bpcd, bprd, bpdbm, bpdbjobs, admin, bpbkar, tar]
Step Action
Perform the following steps to enable the NetBackup debug logs:
1 Create directories for the NetBackup logs.
For those processes that use legacy logging, create a log directory for each
process to be logged as follows:
– UNIX: /usr/openv/netbackup/logs/process_name
– Windows: install_path\NetBackup\logs\process_name
The mklogdir script can be used to create all the legacy logging directories.
However, this script also enables all daemons or services and processes to be
logged the next time that they are restarted, causing excessive disk I/O
operations and disk space usage. Use caution with this script and tailor it
accordingly.
Run the mklogdir script as follows:
– UNIX: /usr/openv/netbackup/logs/mklogdir
– Windows: install_path\NetBackup\logs\mklogdir.bat
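When only a few processes need to be logged, creating their directories individually avoids the extra disk I/O that the mklogdir script can cause. The following is a minimal sketch of that approach, assuming the UNIX log location given in step 1. The function name is hypothetical, and the process names in the usage example are placeholders for whichever processes you are investigating.

```python
from pathlib import Path

# UNIX legacy log location from step 1. On Windows, pass
# install_path\NetBackup\logs as log_root instead.
UNIX_LOG_ROOT = Path("/usr/openv/netbackup/logs")


def create_legacy_log_dirs(process_names, log_root=UNIX_LOG_ROOT):
    """Create a legacy log directory for each named process.

    Returns the directories that were newly created. Directories that
    already exist are left untouched.
    """
    created = []
    for name in process_names:
        directory = Path(log_root) / name
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    return created


if __name__ == "__main__":
    # Example: log only the media server processes under investigation.
    for path in create_legacy_log_dirs(["bpbrm", "bptm"]):
        print(f"created {path}")
```

Keeping the returned list makes it easy to remove the same directories after the failure has been captured, which stops the logging again.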
Media Manager logging is a form of legacy logging that may be requested by
NetBackup Technical Support. The processes logged by media manager logging
play a lesser role in NetBackup 6.x than in previous versions due to the
introduction of the EMM server.
Media Manager debug logs are enabled by creating a directory with a name, as
shown in the slide.
Additional debug and informational messages from the robot and drive processes
(txxcd and txxd) are also logged to the /var/adm/messages file (UNIX)
and the Event Viewer application log (Windows).
• The txxd/txxcd processes write to log files in the robots directory.
• The vmd daemon writes to log files in the daemon directory.
• The tpcommand log directory contains debug log files used by the
tpconfig and the tpautoconf commands.
MM_SERVER_NAME = pc1train07
MEDIA_ID_BARCODE_CHARS = 0 8 1:2:3:4:5:6
VERBOSE
[Slide: unified logs by tier. Master: nbpem, nbjm, nbsvcmon, nbsl. Clients]
Unified logging was introduced with NetBackup 6.0. Unified logging is simply a
format for log file names and messages that is planned to become standard across
most Symantec products.
By default, unified logging is enabled on all NetBackup 6.x master servers and
EMM servers. Unified logging primarily covers components new to NetBackup
6.x. These components include the Intelligent Resource Manager (IRM), the
Enterprise Media Manager (EMM), Private Branch Exchange (PBX), Bare Metal
Restore (BMR), and NetBackup Operations Manager (NOM).
Unified logging cannot be completely disabled; however, the level of detail can be
adjusted by configuring its verbose levels without requiring a restart of the
daemons or services.
Raw unified logs are located in the /usr/openv/logs directory (UNIX) or the
install_path\NetBackup\logs folder (Windows).
Note: Legacy logging is still required with NetBackup 6.x. Unified logging has not
superseded legacy logging, particularly in the case of the bp commands.
Product IDs
NetBackup uses two Product IDs, as follows:
• NetBackup (NB) is Product ID 51216.
• Infrastructure Core Services (VxICS) is Product ID 50936.
Originator IDs
The following table is a partial list of originator IDs and short names:
Use the NetBackup Host Properties to change the unified logging diagnostic
message level for the nbrb, nbpem, and nbjm logs on a master or EMM server.
Only diagnostic level messages are changed through Host Properties, not debug or
application messages. Unified logging message types are discussed later in this
lesson.
The diagnostic logging level has six possible numeric values, 0 - 5, representing
the amount of detail to be logged. In addition, an individual unified log may be set
to No Logging (disabled) or Same as global.
Use the vxlogcfg command to configure unified logs for processes other than
nbrb, nbpem, and nbjm, or for diagnostic level messaging.
Robust debug logging is also referred to as legacy logging file rotation. When
robust logging is enabled, a NetBackup process no longer creates a single log
file of unpredictable size per day. Instead, each process is limited in the size to
which its debug log file can grow before it starts a new log file (rollover mode).
Additionally, robust logging controls the number of logs (per process) that can
exist at one time. After the limit is reached, robust logging deletes the oldest file.
Robust logging does not affect Media and Device Management debug logs.
Robust log file names include an incremental number designation:
MMDDYY_NNNNN.log, where NNNNN is an incrementing number from 00001 to
99999.
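The naming pattern is easy to build and to recognize in a script, for example when sorting the robust log files in a legacy log directory. The sketch below follows the MMDDYY_NNNNN.log pattern described above. The function names are hypothetical.

```python
import re
from datetime import date

# MMDDYY_NNNNN.log, where NNNNN runs from 00001 to 99999.
ROBUST_LOG_NAME = re.compile(r"^(\d{2})(\d{2})(\d{2})_(\d{5})\.log$")


def robust_log_name(day, sequence):
    """Build the robust log file name for a date and sequence number."""
    if not 1 <= sequence <= 99999:
        raise ValueError("sequence must be between 1 and 99999")
    return f"{day:%m%d%y}_{sequence:05d}.log"


def parse_robust_log_name(name):
    """Split a robust log file name into (MM, DD, YY, sequence).

    Returns None when the name does not follow the robust pattern.
    """
    match = ROBUST_LOG_NAME.match(name)
    if match is None:
        return None
    month, day, year, sequence = match.groups()
    return (month, day, year, int(sequence))
```

For example, the first log file written on March 7, 2008 is named 030708_00001.log.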
For specific unified logs, use vxlogcfg to customize rollover mode and the
number of log files. Legacy logs cannot be configured individually; to configure
the robust settings for all legacy logs, specify originator ID 112.
Parameter Description
-l Lists configuration settings
-a Modifies the product’s unified logging settings, in conjunction with
other vxlogcfg options
Product and Originator ID (required for both the list and change functions):
Periodic, None logs to roll over to the next file
MaxLogFileSize           1–4294967295   Maximum size (in KB) a log can grow before rollover occurs, if FileSize is the rollover mode
RolloverPeriodInSeconds  1–2147483648   Period of time (in seconds) a log file is used before rollover occurs, if Periodic is the rollover mode
RolloverAtLocalTime      00:00–23:59    Time of day a log rollover occurs, if LocalTime is the rollover mode
NumberOfLogFiles         1–4294967295   Maximum number of log files that can exist before the oldest file is removed
LogRecycle               True or False  Enables or disables automatic purging of old log files, keeping the latest NumberOfLogFiles
Note: A restart of the daemons or services is not required when enabling verbose
debug logs or when changing vxlogcfg options.
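Before passing new values to vxlogcfg, it can help to check them against the ranges in the table above. The sketch below performs only that range check and does not call vxlogcfg. The function name is hypothetical, and options that are not listed with a range above, such as the rollover mode, are skipped rather than judged.

```python
import re

# Integer ranges copied from the table above.
INTEGER_RANGES = {
    "MaxLogFileSize": (1, 4294967295),
    "RolloverPeriodInSeconds": (1, 2147483648),
    "NumberOfLogFiles": (1, 4294967295),
}
TIME_OF_DAY = re.compile(r"([01]\d|2[0-3]):[0-5]\d")  # 00:00 to 23:59


def invalid_settings(settings):
    """Return the names of settings whose values are out of range."""
    bad = []
    for name, value in settings.items():
        if name in INTEGER_RANGES:
            low, high = INTEGER_RANGES[name]
            ok = (isinstance(value, int) and not isinstance(value, bool)
                  and low <= value <= high)
        elif name == "RolloverAtLocalTime":
            ok = (isinstance(value, str)
                  and TIME_OF_DAY.fullmatch(value) is not None)
        elif name == "LogRecycle":
            ok = isinstance(value, bool)
        else:
            continue  # not covered by the table; left unchecked
        if not ok:
            bad.append(name)
    return bad
```

An empty result means that every checked value falls within the documented range. It does not confirm that the combination of options is sensible for the chosen rollover mode.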
After completing this topic, you will be able to use the vxlogmgr command to
prepare unified logs for sending to NetBackup Technical Support.
[Slide: example of where to create logs for a status code 96. Master: nbjm. Clients: Logging is not relevant]
This slide shows an example of where to create logs and which logs to create for a
given problem. The problem identified is a status code 96, indicating that
NetBackup does not have sufficient media to complete the backup operation.
1 Based on knowledge of this type of failure, and the roles of the various
NetBackup tiers, first eliminate the need to collect logs on the client side.
Failure to locate sufficient media is clearly a server-side problem.
2 Based on knowledge of process flow, determine which specific debug logs
should be enabled, or which logs should be examined first. In this example,
nbjm, nbrb, nbemm, MDS, bpbrm, and bptm play some role in media
allocation, or communicating with processes involved in media allocation.
For example, although bptm does not directly determine media availability, it
does report on failed media operations that may result in frozen media.
3 Determine which specific NetBackup hosts are involved in the failure.
In this example, the master server is able to successfully perform backups to
media, indicating that the master and EMM servers are not the most likely
sources of this problem. Focus your investigation on the media server involved
in the failing backup.
The vxlogmgr command is used to manage unified log files. Log file
management includes actions such as copying, moving, or deleting log files. The
following is a partial list of parameters associated with the copy function (-c) of
the vxlogmgr command, with examples.
Originator ID
Time Period
*Note: The date format is set from the current locale at run-time and is locale-
specific. In UNIX, use single quotes to enclose the date. In Windows, use double
quotes.
Use the -s option in place of the -c option in the examples on the slide to list logs
meeting the specified criteria before actually copying them. For example, to view
the raw log entries for originator ID 117 over the last two days, enter:
vxlogmgr -s -o 117 -n 2
Disable debug logs unless you are actively trying to log a failure.
Unless you are actively trying to capture a failure scenario, it is very important to
disable, or at least reduce, debug logging. Logs can have a tremendous impact on
NetBackup performance and disk space on the volume where logs exist.
Disable and purge legacy NetBackup logs by deleting the folder for the
corresponding logs in the NetBackup logs directory. The AltPath and
user_ops folders, if present, are part of normal NetBackup operations, and must
not be deleted. Also, do not delete the mklogdir script.
Unified logs cannot be completely disabled, but the amount of detail logged and
the corresponding performance overhead varies a great deal based on the verbose
settings.
The following commands disable debug and diagnostic unified log messages by
setting the verbose message levels to zero (0):
vxlogcfg -a -p 51216 -o ALL -s DiagnosticLevel=0
vxlogcfg -a -p 51216 -o ALL -s DebugLevel=0
Note: Changing the unified logging message level in the Host Properties only
affects debug messaging, and only for the nbpem, nbrb, and nbjm logs.
To reclaim disk space, use the vxlogmgr command. The following command
purges existing unified logs based upon the NumberOfLogFiles configuration
setting:
vxlogmgr -d -a
By default, legacy logs store a full day of logging in a single file per process, and
can grow quite large. Reading a log can become complicated when multiple
instances of a process are running and logging concurrently to the same log file.
When viewing the logs from a UNIX or Linux host, the inherent tools, such as
grep and vi, are often sufficient to manage the large, complex text files. From a
Windows host, it may be preferable to seek a third-party text editor with advanced
features for searching and marking lines. Isolating a specific subset of log entries
can greatly improve the readability of a log.
Note: The host in question is the system viewing the log files. It does not matter
whether a UNIX or Windows host originally wrote the log file.
• Message text:
The message text contains the actual activity or message reported by the
logged process.
Use the following procedure to extract a set of process IDs from legacy debug
logs.
1 Determine a relevant log and process ID (PID).
In this example, the log is bpbrm and the PID is 11340.
Note: The appropriate PID may be found in the detailed status of the Activity
Monitor for the failed job. You may also find the PID by navigating to the time
of the job start or failure within the log, or by searching the log for part of the
error message.
4 Copy and paste the selected entries to a new text file.
5 Repeat for all relevant PIDs.
This log excerpt can now be examined for clues regarding the job failure.
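The manual copy-and-paste procedure above can also be scripted. The following Python sketch assumes that each legacy log entry starts with a timestamp followed by the PID in square brackets; that layout, the sample text, and the function name are assumptions for illustration and should be checked against your own logs.

```python
import re

# Assumed line layout: a timestamp, whitespace, then the PID in square brackets.
PID_FIELD = re.compile(r"^\S+\s+\[(\d+)\]")


def entries_for_pids(lines, pids):
    """Yield the log lines whose bracketed PID field is in pids, keeping file order."""
    wanted = {str(pid) for pid in pids}
    for line in lines:
        match = PID_FIELD.match(line)
        if match and match.group(1) in wanted:
            yield line


# Hypothetical bpbrm entries; PID 11340 is the one from the example above.
sample_log = [
    "05:56:58.101 [11340] <2> hypothetical entry A",
    "05:56:58.105 [11402] <2> hypothetical entry B",
    "05:56:59.000 [11340] <16> hypothetical entry C",
]
selected = list(entries_for_pids(sample_log, [11340]))
```

Lines that do not begin with the assumed timestamp and PID fields, such as wrapped continuation text, are skipped by this sketch, so compare the result with the original log before relying on it.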
The fields shown in a unified log are determined by the -d switch on the
vxlogview command. The excerpt shown on this slide was taken using the
-d all switch, which displays all available columns. By changing the
parameters used with -d, you may remove any of these columns. The columns
shown with the -d all switch are:
• Date: The date (mm/dd/yyyy) on which the log entry was posted
• Time: The timestamp (hh:mm:ss.millisecond) at which the log entry was
posted
• Message Type: The classification of the message as Application, Diagnostic or
Debug
• Product ID (short name): For example, NB = NetBackup
• Product ID (numeric value): For example, 51216 is NetBackup
• Originator ID (short name): For example, mds is MDS
• Originator ID (numeric value): For example, 143 is MDS
• PID: Process ID of the process posting this log entry
• TID: Thread ID of the process posting this log entry
• Message Text: The content of this log entry
At present, unified log outputs do not report their verbose level or the
vxlogview command syntax used. When providing raw logs to NetBackup
Technical Support, specify which verbose level was used, and in the case of a log
output, what command syntax was used.
Use the vxlogview command to selectively pull and format data from the raw
unified logs. The vxlogview command produces human-readable text output
that is concise and relevant to the problem. Unless the output is redirected to a text
file, as shown in the examples on this slide, the data is displayed on the screen.
The following tables describe the switches and parameters for the vxlogview
command.
Message Type
Time Frame
-n number_of_days
    Limits the time frame of the log entries to the last number of days specified.
    The vxlogview command starts at the beginning of the current day, and counts backwards in 24-hour increments. For example, if the command is run with the option -n 2 at 8:00 a.m., 32 hours of logging are displayed.
-t hh:mm:ss
    Limits the time frame of the log entries displayed from the current time, looking backwards the amount of time specified.
-b "mm/dd/yy hh:mm:ss AM/PM"
-e 'mm/dd/yy hh:mm:ss AM/PM'
    Limits the time frame of the log entries displayed to the period of time specified with the begin (-b) and end (-e) switches.
    Note: In UNIX, use single quotes to enclose the date. In Windows, use double quotes.
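The 32-hour figure in the -n description follows from counting back to midnight of the current day and then adding a further 24 hours for each additional day. The Python sketch below reproduces that arithmetic. It is based only on the example given above, the function name is hypothetical, and it ignores daylight saving changes.

```python
from datetime import datetime, timedelta


def n_days_window(now, number_of_days):
    """Return (start, hours_covered) for a -n window ending at now."""
    if number_of_days < 1:
        raise ValueError("number_of_days must be at least 1")
    # The window starts at the beginning of the current day ...
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # ... and moves back 24 hours for every day beyond the first.
    start = midnight - timedelta(days=number_of_days - 1)
    hours_covered = (now - start).total_seconds() / 3600
    return start, hours_covered


# The example from the text: -n 2 run at 8:00 a.m. (the date itself is arbitrary).
window_start, window_hours = n_days_window(datetime(2008, 1, 15, 8, 0), 2)
```

With these inputs the window starts at midnight on the previous day, which gives 8 + 24 = 32 hours.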
Log Layout
D  Date
T  Time
m  Message type
p  Process ID
t  Thread ID
P  Product ID
O  Originator ID
s  Application entry severity
x  Text of the message
o  Originator short name
• Key Points
– In this lesson, you reviewed how to establish legacy logging for both
NetBackup and Media Manager. You also learned how to use the vxlogcfg
command to set both unified and robust log settings.
– You learned how to use the vxlogmgr command to prepare unified logs to
be sent to NetBackup Technical Support.
– You learned how to extract relevant data from legacy logs and how to use
the vxlogview command to format data from raw unified logs.
• Reference Materials
– NetBackup System Administrator’s Guide
– NetBackup Troubleshooting Guide
– NetBackup Commands
– The support Web site at: http://entsupport.symantec.com
– TechNote 279929
Lab 4: Using Debug Logs
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 4 Details: Using
Debug Logs,” page A-32.
• Appendix B provides complete lab instructions and solutions. See “Lab 4
Solution: Using Debug Logs,” page B-37.
• Course Introduction
• Lesson 1: Ensuring a Successful Upgrade
• Lesson 2: Troubleshooting Methods and
Tools
• Lesson 3: NetBackup Process Flow
• Lesson 4: Using Debug Logs
• Lesson 5: Understanding the NetBackup
Database
• Lesson 6: Troubleshooting Devices
• Lesson 7: Troubleshooting Media
• Lesson 8: Troubleshooting Network
Issues
• Lesson 9: Troubleshooting Performance
Issues
Topic 2: The EMM Domain
    Identify the components of the EMM domain and view excerpts of component logs to develop a familiarity with the EMM server interactions.
Topic 3: Client Backup Process Flow Through the EMM Server
    Trace a successful job as it flows through the EMM server components, in order to detect problems with nbrb, MDS, nbemm, DA, and the NBDB.
Topic 4: Catalog Backup and Recovery
    Back up, recover, and protect the NBDB.
Topic 5: Maintaining the NBDB
    Perform various tasks to maintain the NBDB.
The NetBackup Database (NBDB)
The NetBackup Relational Database (NBDB) is created during the installation of
the master server. The NBDB stores information used by both the system and the
EMM server. The NBDB runs on Sybase Adaptive Server Anywhere (ASA) 9.0.x.
The NBDB consists of three database files and is supported by configuration files
and a transaction log.
Database Files
The database files contain information used by the Sybase server, NetBackup
daemons or processes and services, and the EMM server. Each of these files is
considered a dbspace. The database files that compose the database are:
• NBDB.db
Used by the Sybase server and some NetBackup daemons, services, or
processes
• EMM_DATA.db
Accessed by the EMM server (nbemm)
• EMM_INDEX.db
Note: Do not install two instances of Adaptive Server Anywhere (ASA) on the
same NetBackup server. For example, do not install NOM, which installs its
own Sybase Server, on the EMM Server, which also installs its own Sybase
Server.
Configuration Files
The configuration files are used for startup and during other operational tasks. The
configuration files include:
• server.conf
• vxdbms.conf
• databases.conf
NBDB-Related Commands
This slide shows common NBDB-related commands.
Commands that start all daemons or processes and services, such as
netbackup start (UNIX) and bpup (Windows), also start the NBDB.
Commands that stop all daemons or processes and services, such as
netbackup stop (UNIX) and bpdown (Windows), also stop the NBDB.
By default, when NetBackup starts, the Sybase server automatically starts the
NBDB (NBDB autostart). Use the nbdb_admin -auto_start NONE
command to prevent the database from being started.
ASA-Related Commands
This slide shows common ASA-related commands. In addition, you can use the
NetBackup Activity Monitor to determine if ASA is running, or to start or stop
ASA.
7. Create the NBDB files.
8. Set up the database schema.
9. Populate mapping and attribute information.
For more detailed information about the steps shown on this slide, see the NBDB
Creation Process appendix at the back of the Lab Guide for this course.
To list table information from the NBDB about the systems in the EMM domain,
enter:
nbemmcmd -listhosts -verbose
[Slide diagram labels: nbemm components; nbemm (OID=111); nbrb (OID=118); MDS (OID=143); DA (OID=144); emmlib; nbjm; nbproxy; bpdbm; GUI/CLI; scan hosts; legacy processes; master server catalogs]
If there is a problem with EMM server components, you may see Resources
Cannot be Allocated or Resources Unavailable messages (status
code 800). To troubleshoot these types of errors, you must understand the
functions of the EMM server components, the role they play in backup and restore
operations, and what to look for in their log files.
Communication to bpdbm
The NetBackup Notification Service (nbnos) and the NetBackup Job Manager
(nbjm) are used by the EMM server to connect with bpdbm to obtain resource
information, such as image information and policy details, which are not stored in
the NBDB.
nbemm
nbemm queries and modifies the NBDB by sending SQL statements to the Sybase
server’s database engine for execution. nbemm interacts with the NBDB for the
following operations:
• Configuring resources (devices, storage units, volume pools, media)
• Allocating and deallocating resources
• Displaying resource information through the command line and the GUI
• Changing or deleting resources
nbemm must always be running on the EMM server. To determine if nbemm is
running, use the Activity Monitor or the bpps command.
nbemm
vxlogview -o 111
nbrb
The Resource Broker (nbrb) manages resource requests and allocations and
monitors configuration changes. nbrb receives resource requests from the Job
Manager (nbjm) and passes allocation requests to media and device selection
(MDS). When jobs are complete, nbrb notifies MDS that allocated resources can
be released.
nbrb also monitors configuration changes by performing a full evaluation of the
configuration every 30 minutes and a partial evaluation every 5 minutes.
nbrb must always be running on the EMM server, even though nbrb is a part of
the Intelligent Resource Manager (IRM). To determine if nbrb is running, use the
Activity Monitor or the bpps command.
nbrb
vxlogview -o 118
MDS
vxlogview -o 143
emmlib
emmlib allows legacy processes and services to communicate with the NBDB.
(Legacy processes and services typically start with bp.) emmlib is also the path
for command-line-initiated display and change operations to be sent to nbemm.
As a component of nbemm, emmlib is not visible as a running activity, and
emmlib does not log to an originator ID (OID).
[Slide diagram labels: nbemm; NBDB; DA; heartbeat; scan host; shared tape drives]
five minutes to keep connections open. The persistent connection is used to start
and stop scan host functions and dynamically reassign scan host responsibilities to
a different media server.
When a drive has been allocated, the DA notifies the scan host to stop scanning
that drive. When the drive is deallocated, the DA notifies the scan host to start
scanning the drive again. If a scan host has a problem, the EMM server is notified
and can dynamically reassign a new scan host. The new scan host does not need to
register or provide any configuration information.
DA
vxlogview -o 144
[Slide diagram: job flow steps 1 through 5 among nbjm, nbrb, MDS, nbemm, and the NBDB on the EMM server]
2 nbjm sends a resource request to nbrb for backup job xx.
[Slide diagram: job flow steps 6 through 10 among nbjm, nbrb, MDS, nbemm, and the NBDB on the EMM server]
7 nbjm initiates the job on the media server. The commands are now sent to
bpbrm and job processing continues as usual.
8 nbjm tells nbpem and nbrb that the job was successful.
9 nbrb notifies MDS that allocation IDs xxx, xxx, xxx have been released.
The following table defines terminology that is used in parameters in the
nbrbutil command:
Terminology                     Description
Orphaned media                  Media that has been reserved in the NBDB, but has not been allocated
Orphaned drives                 Drives that have been reserved in the NBDB, but have not been allocated
Orphaned storage units (STUs)   Storage Units that have been reserved in the NBDB, but have not been allocated
nbrbutil -dump
If Technical Support instructs you to …    Then use …
CAUTION Use the commands shown on this slide only under the direction of
Technical Support.
[Slide diagram: catalog backup steps 1 through 4. Labels: NBDB.log (truncated); image DB and other files]
2 A child job starts, which backs up the files from the staging directory or folder
to the storage unit specified in the catalog backup policy. This job backs up the
files in a single stream. The files do not remain in the staging directory or
folder; they are deleted automatically.
3 Another child job starts, which backs up files from the following directories:
– UNIX:
/usr/openv/netbackup/db
/usr/openv/var
/usr/openv/netbackup/vault
/usr/openv/var/global
Note: At the completion of a hot catalog backup, a disaster recovery (DR) file is
written to a user-defined directory on the master server. This file is required
for successful catalog recovery. For additional protection, configure
NetBackup to e-mail the DR file to one or more valid e-mail addresses.
CAUTION You must recover the catalogs to a system running the same version,
including the MP levels. For example, a catalog backup taken on a
NetBackup 6.0 system and recovered to a NetBackup 6.0 MP3
system fails with a schema mismatch error. Likewise, a catalog
backup taken on a NetBackup 6.0 MP3 system cannot be recovered
to a NetBackup 6.0 system.
It is a best practice to back up the catalogs prior to and immediately after any
upgrade, including Maintenance Packs.
The graphic on this slide describes what happens when you use either the
NetBackup Catalog Recovery wizard or the bprecover -wizard command to
recover the NBDB files after a hot, online catalog backup.
1 The NBDB files are written to the staging directory at
/usr/openv/db/staging (UNIX) or
install_path\VERITAS\NetBackupDB\staging (Windows).
2 The database files are moved to /usr/openv/db/data (UNIX) or
install_path\VERITAS\NetBackupDB\data (Windows).
3 Transactions recorded in the NBDB.log are applied.
4 The configuration files are created.
Step Action
3 Use the full path of that copy to recover the catalogs using the
Catalog Recovery Wizard or the bprecover -wizard
command.
Step Action
• If the catalog backup spanned tapes
• The media ID of any incremental catalog backups that occurred after the last
full backup
[Slide diagram: cold, offline catalog backup steps 1 through 4. Labels: NBDB files; configuration files; NBDB.log (truncated); image DB and other files]
The slide shows what happens when you perform a cold, offline catalog backup.
1 The Sybase Server is queried for the location of the NBDB files.
2 The NBDB is shut down, but the Sybase Server daemon or service continues to
run.
3 The NBDB files, the Image database files, and the other files are backed up.
The NBDB files that are backed up include:
The images and the other files that are backed up include:
– UNIX:
/usr/openv/db/data
/usr/openv/netbackup/db
/usr/openv/var
To run a cold, offline catalog backup, select Catalog—>Actions—>Perform
offline backup of the NetBackup catalog.
If you are using the Vault option to back up catalogs, you can only perform hot,
online catalog backups.
Step Action
1 Re-create the NBDB directories, if necessary.
› Service Monitor (nbsvcmon)
› Request Manager (bprd)
› Database Manager (bpdbm)
› Device Manager (ltid)
› Volume Manager (vmd)
3 On every media server, stop nbsvcmon, ltid, and vmd.
CAUTION Only perform this procedure under the direction of Technical
Support.
1 Record the media ID and density of the most recent catalog backup; it can be
from a hot or cold catalog backup.
2 Stop NetBackup.
3 Prevent the Sybase server from starting the NBDB automatically as follows:
nbdb_admin -auto_start NONE
4 Start the Sybase server as follows:
– UNIX: nbdbms_start_stop start
– Windows: bpup -e ASANYs_VERITAS_NB
Note: After the catalogs have been completely recovered, verify that the EMM
Device Mappings are in sync with remote media servers. Type
tpext -get_dev_mappings_ver on each media server and update
media server mappings as necessary.
Note: A job does not appear in the Activity Monitor when the commands on the
slide are executed.
Step Action
1 Perform a full catalog backup.
Step Action
1 Perform a full catalog backup.
bpdbm -consistency
Step Action
1 Stop all NetBackup daemons or services and processes.
CAUTION If you have changed the database password from the default, you
must provide the new password to Technical Support. Without the
password, Technical Support cannot access your database.
Step Action
1 Create the script.
Use the following parameters in your script (optional):
1 Create the script.
– UNIX: Create
/usr/openv/netbackup/bin/mail_dr_info.sh. To execute in
UNIX, the mail_dr_info script must have permissions of 755.
– Windows: From the master server, copy
install_path\NetBackup\bin\nbmail.cmd to
install_path\NetBackup\bin\mail_dr_info.cmd.
Optionally, you may use the following four parameters in your script. These
parameters are passed by NetBackup after a hot, online catalog backup:
Parameter Description
%1 The e-mail addresses specified in the DR tab
%2 The subject line of the DR e-mail
%3 The name of the DR e-mail
%4 The name of the DR file
Note: If the mail_dr_info script exists, NetBackup does not send the DR e-
mail, even if it was configured in the catalog backup policy; NetBackup
passes the parameters to the script and runs the script instead.
• Key Points
– In this lesson, you were introduced to the components of the NBDB and the
EMM domain.
– You learned how the NBDB is created during a master server installation.
– You followed the flow of a backup job through the EMM Server.
– You looked at how to back up, recover, protect, and maintain the NBDB.
• Reference Materials
– NetBackup System Administrator’s Guide
– NetBackup Troubleshooting Guide
– NetBackup Media Manager System Administrator’s Guide
– NetBackup Commands
– The support Web site at: http://entsupport.symantec.com
– TechNotes 240584, 276098, 281818
• Display EMM domain information.
• View resource allocations.
• Protect and recover the NBDB.
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 5 Details:
Understanding the NetBackup Database,” page A-38.
• Appendix B provides complete lab instructions and solutions. See “Lab 5
Solution: Understanding the NetBackup Database,” page B-46.
• Did devices suddenly stop working in a stable environment with no recent
changes?
Ensure that robots and drives are enabled. Disabled devices appear with a red X in
the Device Manager, and backups and restores that need them fail. If the device is
disabled, enable it by clicking the Enable Device button in the Drive Properties dialog box.
Symantec provides tape drivers for most tape drives; however, the manufacturer’s
drivers should also work with NetBackup. Check the support Web site at
entsupport.symantec.com to verify that the driver you are using is
supported.
• Locate-block positioning
• Quantum SDLT performance optimization
• SCSI reserve/release operations
The table on the slide identifies the SCSI pass-through driver requirements.
An online text version of the NetBackup Device Configuration Guide is available
in /usr/openv/volmgr/NetBackup_DeviceConfig_Guide.txt
(UNIX) or C:\Program Files\VERITAS\Volmgr\
NetBackup_DeviceConfig_Guide.txt (Windows).
Operating system: Tape device commands
NetBackup:
• robtest
• Tape device commands
• Robot and drive diagnostics
Using Tape Device Commands
• To move tapes between slots and drives, enter m s# d#, where # is the slot or
drive number.
• To move tapes between drives and slots, enter m d# s#.
Note: Drive diagnostics fail if there is not an available tape in the NetBackup
volume pool.
3 Use the Device Monitor or use the vmoprcmd command to verify that the tape
was mounted in the drive.
5 Use the Device Monitor or the vmoprcmd command to verify that the tape
was unmounted from the drive.
If existing devices are replaced, the database should be updated with the
new robot and drive information automatically.
Determine if there are discrepancies: tpautoconf -report_disc
Index tpconfig
Hosts vmoprcmd
Ready vmoprcmd
Shared tpconfig
• tpconfig -dl
• robtest
• scan -changer
• vmglob -listall -b
• vmdareq
You may also use the vmglob -listall command to confirm the physical
drive number, as follows:
• 1 = SSO
• 2 = NDMP
• 4 = Remote Client
• 7 = all
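Because 7 equals 1 + 2 + 4, the values above read like bit flags that can be combined. Under that assumption, which the list itself does not state outright, a combined value can be decoded as in the following Python sketch; the function name is hypothetical.

```python
# Flag values as listed above; treating them as combinable bits is an assumption.
FLAG_NAMES = {1: "SSO", 2: "NDMP", 4: "Remote Client"}
ALL_FLAGS = sum(FLAG_NAMES)  # 1 + 2 + 4 = 7, listed above as "all"


def decode_flags(value):
    """Return the names of the flags whose bits are set in value."""
    if value < 0 or value & ~ALL_FLAGS:
        raise ValueError(f"value {value} contains bits outside the known flags")
    return [name for bit, name in sorted(FLAG_NAMES.items()) if value & bit]
```

For example, a value of 3 would decode to SSO and NDMP, and 7 to all three.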
3 If the EMM server is not aware of the media server, restart NetBackup
on the media server. If the EMM server is still unaware of the media server,
run the nbemmcmd -addhost command from the EMM server, as follows:
nbemmcmd -addhost -machinename machine_name
-machinetype machine_type
-masterserver master_server_name
-netbackupversion version_number
-operatingsystem operating_system
tpext
The file names are similar to Mappings_6_nnnnnn.tar (UNIX) and
Mappings_6_nnnnnn.zip (Windows), where nnnnnn is the TechNote
Document ID. For example, Mappings_6_293476.tar relates to
http://seer.entsupport.symantec.com/docs/293476.htm.
2 Download and extract the mapping file to a temporary location on the EMM
server. This creates two files: Readme.txt and external_types.txt.
3 Copy the external_types.txt file to the following location:
– UNIX: /usr/openv/var/global
– Windows: install_path\NetBackup\var\global
4 Run the tpext command as follows to update the EMM database and related
device mappings from the new external types file.
a NetBackup 6.0: Type tpext on the EMM server.
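The relationship between a mapping file name and its TechNote, described in step 1, can be expressed as a small lookup. The Python sketch below generalizes the URL pattern from the single example given in the text, so treat that pattern as an assumption; the function name is hypothetical.

```python
import re

# File names as described above: Mappings_6_nnnnnn.tar (UNIX) or .zip (Windows).
MAPPING_NAME = re.compile(r"^Mappings_6_(\d+)\.(tar|zip)$")


def technote_url(file_name):
    """Return the TechNote URL implied by a device mapping file name."""
    match = MAPPING_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a recognized mapping file name: {file_name}")
    document_id = match.group(1)
    # URL pattern inferred from the one example in the text.
    return f"http://seer.entsupport.symantec.com/docs/{document_id}.htm"
```

This can help confirm that a downloaded archive matches the TechNote you intended before you copy external_types.txt into place.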
In certain situations, you may need to deactivate the media server. If a media
server needs maintenance, it is possible to deactivate the media server using the
GUI (Media and Device Management—>Devices—>Hosts) or the vmoprcmd
command. Jobs currently in progress on that media server continue to run to
completion. Queued jobs are routed to alternate media servers, if possible. After
the maintenance is complete, the media server can be activated again using the
GUI or the vmoprcmd command.
Device Discovery
NetBackup automatically discovers devices if:
• The SCSI pass-through driver exists
• The robot and tape drives support serialization
[Slide diagram: Path 2 is DOWN. Shared tape drives.]
[Slide diagram: Tape Alert. All paths are DOWNed. Shared tape drives.]
*CRT = Critical
Robot control host: vmd, txxcd, ltid
txxcd handles robotic arm requests for TLD and TL8. All other robot controls
are handled by txxd drivers. In addition, robot and drive control are separate, and
can be shared under different servers. The xx in txxd or txxcd is interpreted as
follows ([c] indicates the robot control daemon for some types of robots):
• The first x is the device type
• The second x is the density or media type
t x x [c] d
First x: l = Library, s = Stacker
Second x: 4 = 4mm, d = DLT, 8 = 8mm, h = ½”
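The naming convention above can be decoded mechanically. The Python sketch below covers only the letters listed in this section, so other robot types are rejected rather than guessed; the function name and the returned field names are hypothetical.

```python
# Letter meanings copied from the naming key above.
DEVICE_TYPES = {"l": "Library", "s": "Stacker"}
MEDIA_TYPES = {"4": "4mm", "d": "DLT", "8": "8mm", "h": "1/2 inch"}


def decode_robotic_daemon(name):
    """Decode a txxd or txxcd daemon name into device type, media type, and role."""
    if len(name) not in (4, 5) or name[0] != "t" or name[-1] != "d":
        raise ValueError(f"{name} does not follow the t x x [c] d pattern")
    is_control = len(name) == 5
    if is_control and name[3] != "c":
        raise ValueError(f"{name} does not follow the t x x [c] d pattern")
    device, media = name[1], name[2]
    if device not in DEVICE_TYPES or media not in MEDIA_TYPES:
        raise ValueError(f"{name} uses a letter that is not in the key above")
    return {
        "device_type": DEVICE_TYPES[device],
        "media_type": MEDIA_TYPES[media],
        "robot_control": is_control,  # True for the txxcd control daemon
    }
```

For instance, tldcd decodes as the robot control daemon for a DLT library, and tl8d as the non-control daemon for an 8mm library.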
• Key Points
In this lesson, you learned how to troubleshoot device errors.
• Reference Materials
– NetBackup Device Configuration Guide
– NetBackup Troubleshooting Guide
– OEM Web Sites
– http://entsupport.symantec.com
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 6 Details:
Troubleshooting Devices,” page A-44.
• Appendix B provides complete lab instructions and solutions. See “Lab 6
Solution: Troubleshooting Devices,” page B-55.
An input/output (I/O) error occurred while NetBackup was writing the backup.
• NetBackup status code 85
An input/output (I/O) error occurred while NetBackup was reading the backup.
• NetBackup status code 86
An I/O error occurred while NetBackup was positioning media.
• NetBackup status code 129
The disk storage unit is full.
nbemmcmd -listhosts
NBEMMCMD, Version:6.5
The following hosts were found:
server train11
master train11
media train12
Command completed successfully.
– If a media server has the wrong host defined for its EMM server, change
the EMM server host name as follows:
tpautoconf -set_gdbhost emm_server_host
• Is the correct host specified for the storage unit in the NetBackup
configuration?
bpstulist -U refers to the media server as the Host Connection. This
should be the host that has drives attached to it.
• Is the media in the correct volume pool and unassigned, or is the active media
available at the required retention level (also known as retention period)?
There are several ways to display the retention level for each piece of active
media:
– bpmedialist -p pool_name
– The available_media script
– NetBackup Administration Console Media—>Volume Pools section
A pool may be exhausted because the retention period is too
generous, which means that images on media are retained longer than
necessary. This affects how quickly media can be recycled for reuse. The best
retention period is based upon your business needs and legal data recovery
obligations.
NetBackup has 25 retention levels available for use. There are 10 default levels
(retention levels 0 through 9), which offer typical retention values. Retention
level 9, with a value of infinity, is the only retention level that cannot be
changed.
Define a custom retention level that best suits your needs by using the
NetBackup Administration Console Host Properties—>Master Server—>
Properties—>Retention Periods or by using the bpretlevel command.
CAUTION Do not relabel the media if the media contains valid data that needs
to be restored. Confirm that the media does not contain NetBackup
images by entering the following command:
bpmedialist -mcontents -m media_ID
Monitor the job using the Activity Monitor or bpdbjobs, and verify that the
label operation is successful.
bplabel -m media_id -d media_density -o
-p volume_pool_name -n device_name
The media types and densities listed in step 2 also apply to the bplabel
command.
The vmpool command can also be used to verify the volume pool name. To
obtain the name of the drive, enter:
tpconfig -d
If … Then …
The tape drive needs cleaning, Use tpclean on the affected devices.
configured for variable mode. If the drive is configured incorrectly for fixed
mode, a status code 84, 85, or 86 may result. Variable mode is set in the
st.conf file.
Note: Ensure that you do not have duplicate volume labels, which are also known to
cause status code 86 (positioning) errors.
Sources:
• ANSI document on SCSI primary commands:
http://www.t10.org/ftp/t10/drafts/spc4/spc4r06.pdf
• ASC/ASCQ additional sense data information:
http://www.t10.org/lists/1spc-lst.htm
The media server catalogs read, write and position errors in an errors file located
at:
• UNIX: /usr/openv/netbackup/db/media/errors
• Windows: install_path\NetBackup\db\media\errors
Displaying the content of the errors file produces:
08/03/06 05:56:58 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 07:58:48 GAZ715 1 POSITION_ERROR rob0d2
08/03/06 08:05:16 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 08:16:34 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 05:21:59 GAZ715 1 POSITION_ERROR rob0d2
Notice that the same media is giving errors on multiple drives more than once.
This strongly indicates a media- rather than a drive-related problem.
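The same reasoning can be applied to a larger errors file with a short script. The Python sketch below assumes the six whitespace-separated fields seen in the excerpt above (date, time, media ID, a numeric field, error type, drive name); the function names and the two-drive threshold are hypothetical choices for illustration.

```python
from collections import defaultdict


def summarize_errors(lines):
    """Group error records by media ID as {media_id: {"count": n, "drives": set}}."""
    summary = defaultdict(lambda: {"count": 0, "drives": set()})
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        _date, _time, media_id, _number, _error_type, drive = fields
        summary[media_id]["count"] += 1
        summary[media_id]["drives"].add(drive)
    return dict(summary)


def suspect_media(summary, min_drives=2):
    """Return media IDs that logged errors on at least min_drives different drives."""
    return sorted(
        media_id for media_id, info in summary.items()
        if len(info["drives"]) >= min_drives
    )


# The records shown in the excerpt above.
errors_file = """\
08/03/06 05:56:58 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 07:58:48 GAZ715 1 POSITION_ERROR rob0d2
08/03/06 08:05:16 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 08:16:34 GAZ715 0 POSITION_ERROR rob0d1
08/03/06 05:21:59 GAZ715 1 POSITION_ERROR rob0d2
""".splitlines()
report = summarize_errors(errors_file)
```

A media ID that fails on several drives points at the media, while a drive that fails with many different media IDs would point at the drive; the same summary could be regrouped by drive to check the second case.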
Problem Resolution
If you are experiencing a positioning problem (and the tape is not truly erased, as in
this example), clean the drive heads and attempt to duplicate any salvageable images
from the tape. Whether or not the duplication is successful, expire the images on the
known faulty media using the bpexpdate command, as shown on the slide,
delete the volume from the EMM database, and dispose of the tape securely.
Frozen Media
NetBackup does not write to frozen media. After a tape volume has been frozen, it
can still be used to restore data, to duplicate images, to import images, or to verify
files, but no further backup images can be written to it.
To determine which media in your NetBackup environment are frozen, run the
available_media script, the Media List report, the Media Logs report, or the
All Log Entries report.
NetBackup freezes media for various reasons, including when:
• NetBackup attempts to write a backup to a piece of media that contains non-
Note: Both of these workarounds destroy any data contained previously on the
media.
Note: The first time NetBackup encounters a critical error on a piece of media or a
drive, the media is frozen, or the drive is downed.
tape.
3 To correct the problem, use one of the following methods:
– If the robot has a bar-code reader, change the bar-code label to match the
recorded media ID.
– If the robot does not have a bar-code reader, use the NetBackup
Administration Console to move and track volumes in and out of the
library, using the correct (recorded) media ID.
Suspended Media
NetBackup does not write to suspended media. After a tape volume has been
suspended, backups stored on the suspended media are still available for restores,
but no further backup images can be written to it.
If the backup images on a suspended volume have expired, you must import them
before you can restore.
NetBackup does not automatically suspend media; however, NetBackup options
may suspend media. For example, NetBackup Vault suspends media when
they go offsite.
You may use the bpmedia command to suspend or unsuspend volumes manually
as a temporary means of software write-protecting volumes, just like freezing
volumes.
A condition similar to suspension occurs when media reaches its volume
expiration. A volume may expire because it has exceeded its:
• Maximum mount count
• Media expiration date (not to be confused with NetBackup image expiration
based on retention periods)
By default, media has an infinite mount count and no expiration date. A maximum
mount count or expiration date must be set manually by the NetBackup
administrator when new media is introduced to NetBackup in order to enforce a
limited use of media. The values are based upon:
• Your previous experience with overused media (referred to as tired or spent
media)
• Key Points
In this lesson, you learned how to troubleshoot common media
errors and how to correct media that is FROZEN or SUSPENDED.
• Reference materials
– NetBackup Media Manager System Administrator’s Guide
– NetBackup Troubleshooting Guide
– NetBackup Commands
– The support Web site at: http://entsupport.symantec.com
– TechNotes 234412, 269177, 270101, 273849, 278996, 280309
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 7 Details:
Troubleshooting Media,” page A-46.
• Appendix B provides complete lab instructions and solutions. See “Lab 7
Solution: Troubleshooting Media,” page B-57.
• Course Introduction
• Lesson 1: Ensuring a Successful Upgrade
• Lesson 2: Troubleshooting Methods and
Tools
• Lesson 3: NetBackup Process Flow
• Lesson 4: Using Debug Logs
• Lesson 5: Understanding the NetBackup
Database
• Lesson 6: Troubleshooting Devices
• Lesson 7: Troubleshooting Media
• Lesson 8: Troubleshooting Network
Issues
• Lesson 9: Troubleshooting Performance
Issues
[Figure: Sample network. Master server master01 (10.1.5.20) with a tape library connects over a WAN to a LAN that hosts media server media02 (10.7.4.5) and client client21 (10.7.4.17).]
Packets
TCP/IP transfers data using packets. Packets have headers that contain information
used to deliver the packets. Packets include the MAC (media access control)
address, the IP address, and the port number.
IP Addresses
The IP address is a unique number assigned by the system administrator to identify
a logical network interface. IP addresses are hardware-independent and are used at
the OSI Network layer.
Routing
TCP/IP uses routing to direct packets from the source to the destination. Packets
may have to hop through many different gateways (routers) to reach the
destination. When a gateway receives a packet, it either delivers the packet to its
destination or delivers the packet to the next gateway as determined by the routing
table.
Host Names
To isolate physical network failures, you need to know the host names of the
systems. Host names are configured in the NetBackup policy configuration or in
the hosts file in /etc (UNIX) or in
%SystemRoot%\system32\drivers\etc (Windows). Host names are
also configured in NIS and DNS (if applicable).
ping passes, but a telnet to one of the NetBackup ports fails to connect, then a
firewall/router/NetBackup configuration problem may exist.
Recent Modifications
If there have been recent modifications, ensure that the modifications have not
introduced the problem by verifying the following items:
• Is the client operating system (OS) supported by this version of NetBackup?
• Has the client software been installed?
• Are the servers at the same or a higher NetBackup version than the client?
• Is the client’s binary still supported?
• Have all the latest patches been installed?
Note: Mismatched binaries may cause a status code 25 error: cannot
connect on socket.
bpclntcmd -sv
The -sv option displays the NetBackup version number of the master server.
Similar to the -pn query, this operation waits for a response back from the known
master server. Both bpclntcmd -pn and bpclntcmd -sv are effective, yet
simple commands that can be used to prove high-level connectivity between a
NetBackup host and its master server.
[Figure: REQUIRED_INTERFACE in the NetBackup configuration and vm.conf (master server, media server, client).]
Note: In this context, the NIC name is the name associated with the IP address of
the interface, for example, with DNS. This is not the operating system’s
name for the interface, for example, as returned by ipconfig /all
(Windows) or ifconfig -a (UNIX).
When there are multiple NICs on a system, each NIC has a unique name and IP
address. If there are multiple NICs on a system and NetBackup has not been
configured to use a specific NIC, the operating system determines which NIC is
used for backup and restore operations.
If the NIC chosen by the operating system does not have an entry in the
NetBackup configuration, a communication error occurs. Depending on the
direction of the communication (client -> master server, master server -> client,
master server -> media server), error symptoms may include:
• Status code 59: Access to the client not allowed
• Hung oprd processes
CLIENT_NAME = rocky-bu
REQUIRED_INTERFACE = rocky-bu
• bpcd/vnetd startup
– On Windows servers and clients, verify that the Client service is
running.
– On UNIX servers and clients, verify that the /etc/inetd.conf
file has a bpcd and a vnetd entry.
• Ensure that DNS, WINS, or NIS host name information
matches the NetBackup policy and host name
configuration.
– Check the Servers tab and Client Names tab for Windows
servers and clients.
– Check the bp.conf file on UNIX servers and clients.
• If Network Information Service (NIS) is being used,
ensure that NetBackup services are included.
Network Errors
If the NetBackup configuration is not the problem, you may have a network error.
Network errors may be caused by:
• NetBackup Client service startup
• DNS, WINS, or NIS
• Port problems
• Clients with multiple network interface cards (NICs)
• Firewall issues
• Network timeouts
DNS
The domain name system (DNS) handles the mapping between host names and IP
addresses, in addition to other host information in distributed environments.
Before DNS, system administrators had the overwhelming job of maintaining the
hosts file. The hosts file needed an entry for every possible host with which
the user communicated. DNS uses a client-server method of providing network
services.
WINS
The Windows Internet name service (WINS) is a mechanism implemented by
Windows systems to centrally record host name-to-IP translations. WINS is
especially useful when dynamically assigned IP addresses are used. A server or
servers must be configured to run WINS, and all clients that use it must be
configured with the address of the WINS servers to which they register and query.
NIS
The Network Information Service (NIS) is a mechanism implemented by Sun for
keeping major files synchronized between hosts. NIS can be used to manage
/etc/hosts, /etc/services, /etc/passwd, and other files required
for hosts on a network. On Solaris, you can use NIS, DNS, or /etc/hosts. You
can also use DNS as a backup to NIS.
Port Problems
Common causes of port problems include:
• The socket is being used by a non-NetBackup process.
• The operating system has not timed out of its previous use of a socket.
• Random port assignment has been disabled, and a port range that is too small
has been defined for NetBackup’s use.
To verify that a port can be contacted, enter the telnet command as follows:
telnet host_name port_number
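Where telnet is not installed, the same reachability test can be scripted. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the port used in the demonstration (13782, the port registered for bpcd) is an assumption about your site, so substitute the host and port you need to test.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake,
        # which is the part of "telnet host_name port_number" that matters here.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Demonstration against the local host; replace with the host under test.
print(port_is_reachable("localhost", 13782, timeout=2.0))
```

A result of False does not say why the connection failed; a refused connection, a firewall drop, and a name-resolution failure all look the same to this check.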
Status code 213/219: Storage unit is not available
Note: When using debug logs to verify a port problem, set the verbose level to 5
before retrying the backup.
Step Action
1 Stop the NetBackup processes (if possible) on the server and the client.
To check NetBackup connectivity to a server, enter:
bptestbpcd -host server_name
To display connectivity check details, enter:
bptestbpcd -verbose -debug -host server_name|client_name
• Problem: Test from a master server to a system that does not have NetBackup
installed (train5):
DEFAULT_CONNECT_OPTIONS = [ 0 | 1 ] [ 0 | 1 ] [ 0 | 1 | 2 ]
• First value: port type to use
• Second value: bpcd connect-back method
• Third value: daemon connection port
About CONNECT_OPTIONS
Use CONNECT_OPTIONS to specify one of three options designed to enhance
firewall efficiency with NetBackup:
• Connect to the host using a reserved or non-reserved port number.
• Connect to the host using the traditional call-back method or using the Veritas
Network Daemon (vnetd).
• Connect to the host using one of the following methods:
– vnetd or, if vnetd fails, the daemon's port number
– vnetd only
– The daemon's port number only
The bptestbpcd utility can help identify which NetBackup
CONNECT_OPTIONS are required for your network environment. By default,
NetBackup is “firewall friendly,” using the fewest possible ports required, but the
default may not work for a legacy site.
After you find an operable and acceptable connectivity method, you may force
NetBackup to use a new DEFAULT_CONNECT_OPTIONS or even allow per-
client/per-server CONNECT_OPTIONS overrides.
CONNECT_OPTIONS = pc3train11 1 0 2
CONNECT_OPTIONS = pc3train12 1 1 0
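To read entries such as the two above, it can help to decode the three numbers into words. The sketch below is illustrative only: it assumes the numeric values map to the choices in the order this lesson lists them (reserved then non-reserved port; traditional call-back then vnetd; vnetd with fallback, vnetd only, daemon port only). Confirm the mapping against the NetBackup documentation for your release. The function and table names are hypothetical.

```python
# Assumed value-to-meaning tables, following the order of the options in the text.
PORT_TYPE = {0: "reserved port", 1: "non-reserved port"}
CALLBACK = {0: "traditional call-back", 1: "vnetd call-back"}
DAEMON_PORT = {
    0: "vnetd, falling back to the daemon's port",
    1: "vnetd only",
    2: "daemon's port only",
}


def parse_connect_options(line):
    """Decode a 'CONNECT_OPTIONS = host a b c' line into readable settings."""
    key, _, value = line.partition("=")
    if key.strip() != "CONNECT_OPTIONS":
        raise ValueError("not a CONNECT_OPTIONS entry: %r" % line)
    host, port_type, callback, daemon = value.split()
    return {
        "host": host,
        "port_type": PORT_TYPE.get(int(port_type), "unrecognized"),
        "callback": CALLBACK.get(int(callback), "unrecognized"),
        "daemon_port": DAEMON_PORT.get(int(daemon), "unrecognized"),
    }


for entry in ("CONNECT_OPTIONS = pc3train11 1 0 2",
              "CONNECT_OPTIONS = pc3train12 1 1 0"):
    print(parse_connect_options(entry))
```

Values outside the assumed tables are reported as "unrecognized" rather than guessed.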
following scenario:
• A firewall is installed between the master/media server and the client.
Network Timeouts
Status codes 41 and 54 are similar errors (network timeouts) caused by problems
connecting to a client or maintaining a connection to a client.
• Status code 41: Network connection timed out
• Status code 54: Timed out connecting to a client
• arp -a
• ifconfig -a
• ipconfig /all
• netstat -an
• netstat -nr
• nslookup
• netsh
• tasklist /svc
• tasklist /m
• Key Points
In this lesson, you learned how to detect and correct physical
network errors, NetBackup and network configuration errors, and
other errors, such as port shortages, multiple network interface
issues, and firewall issues.
• Reference Materials
– NetBackup Release Notes
– NetBackup System Administrator’s Guide
– NetBackup Troubleshooting Guide
– NetBackup Commands
– NetBackup Port Usage Guide (TechNote 281623)
– The support Web site at: http://entsupport.symantec.com
– TechNote 278569, Flowchart for Error Code 25 Troubleshooting
– TechNote 278427, Flowchart for Error Code 54 Troubleshooting
– Additional TechNotes: 234618, 267977, 286035
Labs and solutions for this lesson are located on the following pages:
• Appendix A provides step-by-step lab instructions. See “Lab 8 Details:
Troubleshooting Network Issues,” page A-48.
• Appendix B provides complete lab instructions and solutions. See “Lab 8
Solution: Troubleshooting Network Issues,” page B-59.
Performance goals:
• Achieve all backups within the allotted window
• Minimize backup impact on client systems
• Reduce time to recover data
[Figure: Data stream path from client processing through the network, the media server, and the SAN to the tape device.]
Throughput Bottlenecks
For a given flow of data, the slowest single point in that flow is considered a
bottleneck. The bottleneck may exist because it is simply the slowest device in the
chain, or because it is where multiple data streams converge.
Every data stream in NetBackup has a bottleneck, and addressing one bottleneck
always exposes another. In an ideal configuration, each point in the data flow has
throughput potential roughly equal to the next. A situation to avoid is where a
single low performance operation throttles the throughput of high performance
operations in the chain.
When maximizing performance in an environment, it is important to identify not
only where a bottleneck is, but how close it is to the maximum potential
performance for the environment (the subsequent bottlenecks). This helps
determine whether addressing a particular bottleneck is worthwhile.
The slide shows a sample data path, and how addressing one bottleneck leads to
the next.
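The idea that fixing one bottleneck exposes the next can be shown with a few lines of arithmetic. This is a minimal sketch with made-up stage names and throughput figures (in MB/s); they are illustrative and are not measurements from the slide.

```python
def find_bottleneck(stages):
    """Return (stage_name, throughput) for the slowest stage in the data path."""
    return min(stages.items(), key=lambda item: item[1])


# Hypothetical throughput potential of each stage, in MB/s.
stages = {
    "clients": 400,
    "network links": 360,
    "media server bus": 200,
    "HBA": 100,
    "tape drives": 150,
}

first = find_bottleneck(stages)
print("current bottleneck:", first)

# Suppose the slowest stage is upgraded; the next-slowest stage now limits the path.
upgraded = dict(stages)
upgraded[first[0]] = 400
second = find_bottleneck(upgraded)
print("next bottleneck:", second)

# The gain from the upgrade is bounded by the gap between the two.
print("maximum gain in MB/s:", second[1] - first[1])
```

Comparing the first and second bottlenecks before spending money shows how much an upgrade can deliver at most.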
[Figure: Sample data path and its bottlenecks. Labels: total concurrent network data from all clients (~400 MB/s and ~300 MB/s); 1000BaseT network links (90 MB/s each); NetBackup media server with quad 1000BaseT NIC, 66 MHz bus (200 MB/s), and HBA (100 MB/s); tape drives (75 MB/s each, compressed); tape robot.]
Nonmultiplexed backup
1155138742 1 4 4 pc1train07 438 438 0 pc1train07 bptm
successfully wrote backup id pc1train07_1155138161, copy
1, fragment 2, 2796542 Kbytes at 6926.923 Kbytes/sec
Multiplexed backup
Restore
Make an adjustment and measure performance.
collecting the information already available:
• How many clients are affected?
• Do they have anything in common?
– Network segment
– Operating system
– Data type (database, many small files, and so on)
• Which media servers are affected?
• Has performance always been poor?
[Figure: Shared memory buffers (64K each) between producer and consumer during a backup to a tape storage unit.]
• Producer: bpbkar (local backup), bptm child (remote backup)
• Consumer: bptm (local backup), bptm parent (remote backup)
• Example values: NUMBER_DATA_BUFFERS = 8, SIZE_DATA_BUFFERS = 65,536 (64K)
[Figure: Shared memory buffers (64K each) between producer and consumer during a restore from a tape storage unit.]
• Producer: bptm (local restore), bptm parent (remote restore)
• Consumer: tar (local restore), bptm child (remote restore)
• Example values: NUMBER_DATA_BUFFERS = 8, SIZE_DATA_BUFFERS = 65,536 (64K)
[Figure: Shared memory buffers during a backup, showing the consumer's WAIT and DELAY counters.]
[Figure: Shared memory buffers during a backup, showing the producer's WAIT and DELAY counters.]
• Producer (bpbkar):
<4> tar_backup::OVPC_EOFSharedMemory: INF - bpbkar
waited 3959 times for empty buffer, delayed 3991
times
• Consumer (bptm):
<2> write_data: waited for full buffer 13530 times,
delayed 30376 times
• For both producer and consumer calculate:
(data transferred)/(block size)=(# of blocks)
(# of waits)*100/(# of blocks)= wait percentage
• Guideline: Values greater than 5% are “high.”
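The calculation above can be worked through as follows. This is a sketch only: it combines the wait counts from the sample log lines with the 2796542 Kbytes reported in the bptm message earlier in this lesson and the 64K example buffer size. Those lines may come from different jobs, so the resulting percentages are illustrative, not a diagnosis. The function name is hypothetical.

```python
def wait_percentage(kbytes_transferred, block_size_kb, waits):
    """Return waits as a percentage of the number of blocks transferred."""
    blocks = kbytes_transferred / block_size_kb  # (data transferred)/(block size)
    return waits * 100 / blocks                  # (# of waits)*100/(# of blocks)


KBYTES = 2796542       # from the bptm "successfully wrote backup id" message
BLOCK_SIZE_KB = 64     # SIZE_DATA_BUFFERS = 65,536 bytes in the example values

producer = wait_percentage(KBYTES, BLOCK_SIZE_KB, waits=3959)    # bpbkar waits
consumer = wait_percentage(KBYTES, BLOCK_SIZE_KB, waits=13530)   # bptm waits

for name, value in (("producer", producer), ("consumer", consumer)):
    rating = "high" if value > 5 else "low"
    print(f"{name}: {value:.1f}% ({rating})")
```

Always take the data size, block size, and wait counts from the same job when applying this to a real problem.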
stream?
Both:
Producer Low/Consumer Low
Producer High/Consumer High
Client-Side Bottlenecks
Using bpbkar to process client data is an effective way of isolating a bottleneck
to the client system. The bpbkar process is used to process and package data to
be sent from the client to the media server for backup. When used as documented
in the slide, the client goes through the normal motions for processing the data, but
stops short of sending it to the media server.
The process is complete when the bpbkar (UNIX) or bpbkar32 (Windows)
process exits. Performance statistics can then be collected from the bpbkar log
similar to the following:
TAR - backup: 11124 files
TAR - backup: file data: 1073230716 bytes
TAR - backup: image data: 27099136 bytes 1 gigabytes
TAR - backup: elapsed time: 82 secs 13424889 bps
The most accurate measurement can be taken from the bytes per second (Bps)
report in the bpbkar log.
13424889 Bps = 13110 KBps = 12.8 MBps
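The conversion can be checked from the other figures in the same log excerpt. The sketch below assumes that the "image data" line reports whole gigabytes (of 1024^3 bytes) plus a remainder in bytes; that reading is an interpretation of the sample, supported by the fact that it reproduces the logged rate.

```python
GIB = 1024 ** 3  # bytes per gigabyte, as assumed for the log's "gigabytes" field

# Values from the sample bpbkar log: "27099136 bytes 1 gigabytes", "82 secs".
image_bytes = 1 * GIB + 27099136
elapsed_secs = 82

bytes_per_sec = image_bytes // elapsed_secs
kbytes_per_sec = bytes_per_sec / 1024
mbytes_per_sec = kbytes_per_sec / 1024

print(bytes_per_sec, "Bps")
print(int(kbytes_per_sec), "KBps")
print(round(mbytes_per_sec, 1), "MBps")
```

Compare the resulting MBps figure with the throughput of the slow backup job to judge whether the client is the limiting factor.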
If this performance is similar to that of the poorly performing backup, it is likely
that the client is the bottleneck. Increasing performance for this backup stream is
dependent upon improving the client’s ability to process data for backup.
OS utilities, such as perfmon in Windows and vmstat in Solaris, may reveal
the specific client resource that is “choking” on the data.
If you want to test the … Then, …
Network throughput between the client and media server, FTP data from the client to the media server.
[Figure: Client data passes from bpbkar through the shared memory buffers (64K each) to bptm and on to tape. Tape drive status: stop, rewind, reposition.]
• Debug logging
• Notify scripts
• Open file handling
• Many small files
• Compression
• Encryption
Virus scanning:
• Taxes client resources
• Consider disabling “outbound” or “on backup” virus scanning.
[Figure: Firmware and driver versions, with release dates, for each component in the data path: client NIC, network switch, media server NIC, HBA, SAN switch, fibre/SCSI bridge, RAID controller, tape device firmware, OS tape device drivers, and tape robot.]
stream is delivered.
• Check with the tape device and the HBA manufacturer
for the recommended size.
• Carefully document the configuration and the results.
• Make changes in small increments.
• Thoroughly test restores on all required media servers.
SIZE_DATA_BUFFERS:
NUMBER_DATA_BUFFERS:
Formula:
NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS_BYTES * drives *
MPX_level = total shared memory
Example: 16 * 262144 * 2 * 4 = 32768K (32 MB)
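The formula is easy to evaluate for candidate settings before changing anything on a media server. A minimal sketch follows; the function name is hypothetical, and the inputs reproduce the example above.

```python
def total_shared_memory(number_data_buffers, size_data_buffers_bytes,
                        drives, mpx_level):
    """Return the shared memory, in bytes, implied by the buffer settings."""
    return number_data_buffers * size_data_buffers_bytes * drives * mpx_level


total = total_shared_memory(
    number_data_buffers=16,
    size_data_buffers_bytes=262144,  # 256K
    drives=2,
    mpx_level=4,
)
print(total // 1024, "KB")
print(total // (1024 * 1024), "MB")
```

Doubling any one of the four inputs doubles the shared memory requirement, so check the result against the memory available on the media server.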
shared memory kernel parameters.
Recommended minimum shared memory values for the Solaris
/etc/system file:
AIX systems use dynamic shared memory allocation, and they do not require
tuning.
maximum block size.
HKEY_LOCAL_MACHINE\SYSTEM\
CurrentControlSet\Services\
controller_id\Parameters\Device\
MaximumSGList:REG_DWORD:0x21
• To determine the value based on the desired block size:
(block_size_in_KB / 4) + 1 = MaximumSGList decimal
value
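As a worked example of this formula, the sketch below converts a block size to a MaximumSGList value and back. The function names are hypothetical, and nothing here touches the registry; it only does the arithmetic.

```python
def maximum_sg_list(block_size_kb):
    """Return the MaximumSGList decimal value for a block size given in KB."""
    return block_size_kb // 4 + 1


def block_size_kb_for(sg_list_value):
    """Invert the formula: return the block size in KB for a given value."""
    return (sg_list_value - 1) * 4


# The registry value shown above, 0x21, is 33 decimal.
print(block_size_kb_for(0x21), "KB")

for size_kb in (64, 128, 256):
    value = maximum_sg_list(size_kb)
    print(size_kb, "KB ->", value, "decimal,", hex(value))
```

The registry editor shows the value in hexadecimal by default, so the hexadecimal form is printed alongside the decimal result.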
Note: Always consult the device manufacturer before modifying this value.
Configuring a value unsupported by the device may result in data loss or
hardware failure.
Multiplexing Considerations
There are many conditions where individual client data stream performance cannot
be improved to the point of “maxing out” the speed of the tape device. In such
cases, multiplexing is a highly effective way of maintaining peak performance of
the tape devices.
Only raise the multiplexing level to the point of making the tape device the
bottleneck for backup operations. Setting the multiplex value too high can have a
negative impact on restore performance. Measure performance for both backups
and restores before deciding on a multiplexing level.
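As a rough starting point for that measurement, the number of streams needed to keep a drive busy can be estimated from the drive speed and the per-client rate. This is a simplistic sketch under stated assumptions: every stream delivers the same steady rate, there is no multiplexing overhead, and the figures used (a 75 MB/s drive and 12.8 MB/s clients) are illustrative. It is not a substitute for measuring backups and restores.

```python
import math


def min_mpx_level(tape_mb_per_sec, stream_mb_per_sec):
    """Return the smallest stream count whose combined rate reaches the tape speed."""
    return math.ceil(tape_mb_per_sec / stream_mb_per_sec)


level = min_mpx_level(tape_mb_per_sec=75, stream_mb_per_sec=12.8)
print("estimated multiplexing level:", level)
```

Going above the estimate adds no backup throughput once the drive is the bottleneck, and it may slow restores.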
[Figure: Production network example (1000 megabit, full duplex). Populate HOSTS files for faster name lookup.]
• Client HOSTS files: MediaServA, MediaServB, MediaServC
• Media server HOSTS files: Master, ClientA-bu, ClientB-bu, ClientC-bu, ClientD-bu
Performance Guideline: Configure for a size greater than SIZE_DATA_BUFFERS (media server).
How to Configure It:
• /usr/openv/netbackup/NET_BUFFER_SZ (default = 32K)
• NET_BUFFER_SZ_REST (default = 32K)
Performance Guideline: Configure for a size greater than SIZE_DATA_BUFFERS (media server).
How to Configure It:
• install_path\VERITAS\NetBackup\NET_BUFFER_SZ (default = (data_buffer_size * 4) + 1024)
• NET_BUFFER_SZ_REST (default = (data_buffer_size * 2) + 1024)
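The two default formulas can be evaluated for a given data buffer size. This sketch assumes that data_buffer_size is the SIZE_DATA_BUFFERS value in bytes (65,536 in the earlier example values); the function names are hypothetical.

```python
def default_net_buffer_sz(data_buffer_size):
    """Default NET_BUFFER_SZ per the formula above: (data_buffer_size * 4) + 1024."""
    return data_buffer_size * 4 + 1024


def default_net_buffer_sz_rest(data_buffer_size):
    """Default NET_BUFFER_SZ_REST: (data_buffer_size * 2) + 1024."""
    return data_buffer_size * 2 + 1024


size_data_buffers = 65536  # 64K, assumed to be in bytes

backup_default = default_net_buffer_sz(size_data_buffers)
restore_default = default_net_buffer_sz_rest(size_data_buffers)
print("NET_BUFFER_SZ default:", backup_default)
print("NET_BUFFER_SZ_REST default:", restore_default)

# Both defaults satisfy the guideline of exceeding SIZE_DATA_BUFFERS.
print(backup_default > size_data_buffers, restore_default > size_data_buffers)
```

If SIZE_DATA_BUFFERS is raised on the media server, re-evaluate any explicitly configured network buffer size against the guideline.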
Synthetic backups:
• Reduce the need for frequent full backups
• Reduce client resource usage
• Key Points
In this lesson, you learned how to examine an existing NetBackup
configuration in order to isolate and address bottlenecks. You also
learned how to improve performance through configuration
adjustments, such as by tuning data and network buffers, rather than
through adding physical resources.
• Reference Materials
– NetBackup Administration (Fundamentals II)
– NetBackup Troubleshooting Guide
– NetBackup Snapshot Client Quick Start Guide
– NetBackup Snapshot Client Administrator’s Guide
– NetBackup Backup Planning and Performance Tuning Guide (TechNote
2818420)
– Operating System Vendor Web sites
– The support Web site at: http://entsupport.symantec.com
– TechNote 273532, 288300
secondary disk See mirror.
server independent restore Restoring files by using a NetBackup server
other than the one that was used to write the backup. Because NetBackup
Server installs on one system only, this feature is available only with
NetBackup Enterprise Server.
server list The list of servers that a NetBackup client or server refers to
when establishing or verifying connections to NetBackup servers. On a
Windows server and Microsoft Windows clients, you update the list through a
dialog box in the interface. On a UNIX server and UNIX and Macintosh
clients, the list is in the bp.conf file. On NetWare target and OS/2
clients, the list is in the bp.ini file.
shared drive A tape drive that is shared among hosts when the Shared
Storage Option (SSO) is installed. SSO applies only to NetBackup Enterprise
Server; therefore, a shared drive applies only to NetBackup Enterprise
Server. See Shared Storage Option (SSO).
shared resource tree (SRT) In Bare Metal Restore, a compilation source of
baseline system resources, including the means to rebuild the client system
and restore all system files.
Shared Storage Option (SSO) A separately priced Veritas software option
that allows tape drives (stand-alone or in a robotic library) to be shared
dynamically among multiple NetBackup and Storage Migrator servers.
W
wakeup interval The time interval at
which NetBackup checks for backups that
are due.
wildcard characters Characters that
can be used to represent other characters in
searches.
Windows (adjective) Used to describe a
specific product or clarify a term, for
example Windows 2000, Windows .NET,
Windows servers, Windows clients, or
Windows GUI.
Windows (noun) See Microsoft
Windows.
Windows Display Console A
NetBackup-Java interface program that
runs on Windows platforms that are
supported by Symantec. Users can start
this interface on their local system, connect
to a UNIX system that has the NetBackup-
Java software installed, and then perform
any user operations that their permissions
allow.
WORM media Write-once, read-many
media. Can be tape or optical disks.