5.0
EMC CORPORATION
CORPORATE HEADQUARTERS:
HOPKINTON, MA 01748-9103
1-508-435-1000
WWW.EMC.COM
Copyright and Trademark Notices
This document contains information proprietary to EMC. Due to continuing product development,
product specifications and capabilities are subject to change without notice. You may not disclose or
use any proprietary information or reproduce or transmit any part of this document in any form or
by any means, electronic or mechanical, for any purpose, without written permission from EMC.
EMC has made every effort to keep the information in this document current and accurate as of the
date of publication or revision. However, EMC does not guarantee or imply that this document is
error free or accurate with regard to any particular specification. In no event will EMC be liable for
direct, indirect, incidental or consequential damages resulting from any defect in the documentation,
even if advised of the possibility of such damages. No EMC agent or employee is authorized to make
any modification, extension or addition to the above statements.
EMC may have patents, patent applications, trademarks, copyrights or other intellectual property
rights covering subject matter in this document. The furnishing of this document does not provide
any license to these patents, trademarks, copyrights or other intellectual property.
The Avamar Agent for Microsoft Windows incorporates Open Transaction Manager (OTM), a product
of Columbia Data Products, Inc. (CDP). CDP assumes no liability for any claim that may arise
regarding this incorporation. In addition, EMC disclaims all warranties, both express and implied,
arising from the use of Open Transaction Manager. Copyright 1999-2002 Columbia Data Products,
Inc. Altamonte Springs. All rights reserved.
Avamar, RAIN and AvaSphere are trademarks or registered trademarks of EMC in the US and/or
other countries.
All other product names and/or slogans mentioned herein may be trademarks or registered
trademarks of their respective companies. All information presented here is subject to change and
intended for general information.
Copyright 2002-2010 EMC. All rights reserved.
Protected by US Patents No. 6,704,730, 6,810,398 and patents pending.
Printed in the USA.
TABLE OF CONTENTS
Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5
Scope and Intended Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Product Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Your Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Typeface Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Notes, Tips and Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7
Guide Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Lifecycle Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Best Practices Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Assumptions and References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Top 10 Operational Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .23
Avamar Client Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Restrictions and Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Scheduling Activities During the Course of a Day. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Defining Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Defining Schedules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Defining Retention Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Tuning Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Tuning Client Caches to Optimize Backup Performance . . . . . . . . . . . . . . . . . . . . . . . . . 39
Client Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Overview of Cache Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
File Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Hash Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Impact of Caches on Memory Consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Cache Information in the avtar Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Changing the Maximum Cache Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Tuning the Maximum Cache Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Rules for Tuning the Maximum Cache Sizes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Tuning the File Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Tuning the Hash Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Using cacheprefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Customizing Maximum Hash Cache Settings for Exchange and SQL Servers . . . . . 47
Tuning Replicator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Product Information
For current documentation, release notes, software updates, and information
about EMC products, licensing and service, go to the EMC Powerlink web site at
http://Powerlink.EMC.com.
Your Comments
Your suggestions will help us continue to improve the accuracy, organization and
overall quality of the user publications. Please send your opinion of this document
to:
SSGDocumentation@emc.com
Please include the following information:
• Product name and version
• Document name, part number and revision (for example, A01)
• Page numbers
• Other details that will help us address the documentation issue
Typeface Conventions
The following table provides examples of standard typeface styles used in this
publication to convey various kinds of information.
Guide Organization
This best practices guide is organized as follows:
Lifecycle Indicators
The introduction to each chapter indicates which of the following Avamar server
lifecycle phases it covers:
Planning and design. Topology and architecture options, risks and limitations
and any other planning and design issues that must be considered prior to
implementing the design.
Best practices:
Deploy the Avamar server with reliable, high-performance RAID arrays for
back-end storage.
Protect the data on the Avamar server by replicating the data to another
Avamar server.
Monitor the Avamar server on a daily basis. Interpret all system warnings
and errors.
Protect the Avamar server from the Internet by providing full firewall
protection.
Change all factory default passwords except the passwords for the
backuponly, restoreonly, and backuprestore software application users.
The chapters that follow provide more details on these best practices, and also
provide additional best practices.
Avamar Architecture
To ensure the long-term reliability, availability, and supportability of the Avamar
server, you must design it carefully.
Several processes run on the Avamar server nodes. Key processes include:
• Avamar Administrator server and the Avamar Enterprise Manager server on
the utility node.
• Avamar data server on all active data nodes.
The Avamar data server is also known as GSAN (Global Storage Area Network).
The Avamar data server stores, processes, and manages the variable-sized
chunks that the client sends during a backup. An average chunk is about
10 KB in size, depending on the customer data. Through the patented deduplication
technology, only unique data chunks are sent to the Avamar data server.
Stripes
The term “stripe” refers to the container an Avamar data server uses to manage
the data in the system. Stripes are files of various sizes that are based on the kind
of stripe.
Each stripe has a unique name, and the Avamar server can identify and access a
stripe by name only. The following table describes four kinds of stripes:
Atomic data. Contains data that originates on the customer system and is read
during a backup.
Index. Maps a hash to the stripe that contains the corresponding data. This is
the essence of a “content addressed” store.
Parity. Provides simple XOR parity that can be used to reconstruct data when a
failure occurs. If RAIN is used, every stripe belongs to a parity group that
protects it. A protected stripe is called a “safe” stripe.
Restore. Restores the data stored on the Avamar server to the Avamar client.
hfscheck. Validates one of the server checkpoints every day through a process
called hfscheck.
Garbage collection. Deletes the orphaned chunks of data that are no longer
referenced within any backups stored on the system.
Set up log scanning to monitor and report hardware issues, and set up
email home.
Best practices:
Always enable RAIN for any configuration other than single-node servers
and 1x2’s (two active data nodes). Minimum RAIN configuration is a
1x3+1 (three active data nodes plus a utility node and spare node).
Double-disk failures on a node, or a complete RAID controller failure can
occur. Either of these failures can corrupt the data on a node. Without
RAIN, the only recourse is to reinitialize the entire system and replicate
the data back from the replication target.
When deploying non-RAIN servers, you must replicate the data on them
to ensure that the data is protected. Non-RAIN servers have no data
redundancy, and any loss of data requires that the system be
reinitialized.
Limit initial configurations to 12 to 14 active data nodes so that nodes can
be added later if needed to recover from high-capacity utilization
situations.
Replication. The Avamar system can efficiently replicate data from one Avamar
server to another on a scheduled basis. This ensures complete data recovery if the
primary backup Avamar server is lost.
Replication is useful for more than recovering a single client. Replication moves
data to another system that can be used for data recovery in the event of an
unexpected incident. Replication is, by far, the most reliable form of redundancy
that the system can offer because it creates a logical copy of the data from the
replication source to the destination. It does not create a physical copy of the
blocks of data. Any corruptions, whether due to hardware or software, are far less
likely to be propagated from one Avamar server to another. In addition, multiple
checks of the data occur during replication to ensure that only uncorrupted data is
replicated to the replication target.
Therefore, if maximizing the availability of the backup server for backups and
restores is important, you should set up a replication system as quickly as
possible.
Best practices:
Protect the data on the Avamar server by replicating the data to another
Avamar server.
Best practice:
Leave the checkpoint retention policy at the default values. The default is
set to retain the last two checkpoints, whenever created, and the last
validated checkpoint.
NOTE: During certain support actions, your Customer Support
Representative might temporarily change the checkpoint retention policy
to ensure that certain critical checkpoints are retained during the support
action. Ensure the checkpoint retention policy is restored to the default
setting when the support action is completed.
Recovery time objective (RTO). When the Avamar system performs a restore, all
data that must be restored is compressed and sent from the Avamar server to the
Avamar client, where it is uncompressed. However, no deduplication is performed
on the restored data.
The primary advantage of backing up to a remote Avamar backup server (Option 1)
is that the restore can be done directly from that server across the local area
network to the client. This is important if a recovery time objective (RTO)
requirement must be satisfied.
Large multi-node servers. If large multi-node servers are required to back up
all data in a remote office, there might not be a significant reduction in the
number of Avamar servers that are deployed, even if Option 1 is selected. In
this case, the cost of deploying, managing, and supporting the Avamar servers
is roughly the same, regardless of whether these Avamar servers are deployed as
remote Avamar backup servers or as centralized Avamar backup servers.
Unless you cannot meet your restore time objectives, design the system
so that clients first back up directly to a large, active, and centralized
Avamar server. Then replicate the data to another large centralized
Avamar server.
Many of the operational best practices described throughout this document are
targeted at understanding the average daily change rate or managing the per-
node capacity.
Operating System Capacity. This is the total amount of data in each data
partition, as measured by the operating system. This amount is not particularly
useful to an external observer because the server manages disk space itself.
CAPACITY THRESHOLD    DEFAULT VALUE    USED FOR COMPARISON    BEHAVIOR
Monitor all variations with incoming data to prevent the system from
becoming read-only.
Monitor all variations with maintenance jobs to prevent these jobs from
failing.
Reassess retention policies to see if you can decrease the retention, and
therefore, reduce the capacity use.
Carefully review the client support matrix with the presales technical
engineer. Ensure that clients and applications you want to protect with Avamar
software are fully supported. In particular, verify that the specific details
of the deployment, such as revisions, clusters, third-party plugins and
add-ons, are supported.
Backup Window. The portion of each day reserved for performing normal
scheduled backups.
Default Settings The default backup window begins at 8 p.m. local server time
and continues uninterrupted for 12 hours until 8 a.m. the
following morning.
Blackout Window. The portion of each day reserved for performing server
maintenance activities (primarily checkpoint and garbage collection) that require
unrestricted access to the server.
Default Settings The default blackout window begins at 8 a.m. local server
time and continues uninterrupted for three hours until 11
a.m. that same morning.
Maintenance Window. The portion of each day reserved for performing routine
server maintenance activities (primarily checkpoint validation).
Defining Domains
Domains are distinct zones within the Avamar server accounting system that are
used to organize and segregate clients. The real power of domains is that they
provide the ability for a domain-level administrator to manage clients and policies
within that domain. This is known as hierarchical management.
Another possible reason for segregating clients by domain is for billing other
internal organizations for backup services. Segregating clients by department or
work group might be a convenient way to bill them.
If you are not going to use hierarchical management, then you should register all
clients in the /clients domain.
Best practice:
Minimize the number of domains you create. Ideally, you will register all
clients in the /clients domain.
Defining Groups
A group defines the backup policy for the clients assigned to it, and includes the
following three policy elements: dataset policy (including the source data,
exclusions and inclusions), backup schedule (or backup window), and retention
policy.
Best practices for defining groups:
Minimize the number of groups you define. Each dataset policy can
include separate dataset definitions for various client plugins. In other
words, a single dataset policy can define independent datasets for
Windows, Linux, Solaris and other clients. You do not need to define
separate groups to back up various kinds of operating system clients.
Leave the default group disabled. By default, all new clients that have
been activated with the Avamar server are automatically added to the
default group. If you enable the default group, then any clients that are
activated with the Avamar server will automatically be backed up
according to the default group policy. To help manage the capacity on the
Avamar server and to avoid being surprised by unexpected clients, leave
the default group disabled. To back up any clients in the default group to
the Avamar server, you can add the client to an enabled group and
remove the client from the default group.
Defining Datasets
Best practices for defining datasets:
Defining Schedules
The default schedule runs nightly during the server’s backup window. Depending
on the amount of data in your largest clients, this might not be enough time, and
you might need to extend the server's backup window. Before extending the
backup window, you must evaluate the time required for checkpoint, garbage
collection, and hfscheck to determine that extra time is available after completing
these activities on a daily basis.
Best practices for defining schedules:
Set appropriate expectations for how long the longest client backups
should run every night, and validate that the long-running client backups
meet the expectations.
Minimize the number of clients that need to be backed up outside of the
server's backup window. When setting up backup schedules, remember
that mobile laptop clients might need to be scheduled to back up during
the day when they are connected to the network. The system can handle
a small number of exceptions. In this case, you will want to overlap the
backup of this handful of exception clients with the server’s maintenance
window.
Email notification system. You can set up email notification in either of the
following ways:
• To batch email notifications that are sent twice daily according to the
default notification schedule.
• To send emails as the selected events occur.
Avamar Enterprise Manager dashboard. To manually monitor the Avamar system,
check the overall health of the Avamar backup infrastructure through the
Avamar Enterprise Manager dashboard. Avamar server issues are immediately
obvious because they are flagged by a red “X” under Server Status.
Monitor the Avamar Activity Monitor on a daily basis and understand how
to interpret all activity warnings and errors.
• Activities - Failed
This report lists all activities in the specified period that failed due to errors.
• Clients - No Activities
This report lists all clients that did not have any activities in the specified
period.
Refer to the Avamar System Administration Guide for descriptions of these and
other reports that are available.
Best practice:
Monitor backups every day and investigate all failed backups, missing
clients, and backups that completed with exceptions.
Client Caches
At the beginning of a backup, the avtar process loads two cache files from the var
directory into memory. The var directory is found in the Avamar installation path.
The first of the cache files is the file cache (f_cache.dat). The file cache stores a
20-byte SHA-1 hash of the file attributes, and is used to quickly identify which files
have previously been backed up to the Avamar server. The file cache is one of the
main reasons subsequent Avamar backups that occur after the initial backup are
generally very fast. Typically, when backing up file servers, the file cache screens
out approximately 98 percent of the files. When backing up databases, however,
the file cache is not effective since all the files in a database appear to be modified
every day.
The second cache is the hash cache (p_cache.dat). The hash cache stores the
hashes of the chunks and composites that have been sent to the Avamar server.
The hash cache is used to quickly identify which chunks or composites have
previously been backed up to the Avamar server. The hash cache is very important
when backing up databases.
The following figure shows the effectiveness of the file cache when backing up file
servers in contrast to databases. In file servers, the file cache is typically 98
percent effective in filtering previously backed-up files.
File Cache
If the file cache is deleted, unused, or undersized, every file that is not a hit in
the file cache must be read, chunked, compressed, and hashed before the avtar
process finds that the hashes were previously sent to the Avamar server. If a file
hits in the file cache, then the file is never read, which saves significant time and
CPU.
By default, the file cache could consume up to one-eighth of the physical RAM on
the Avamar client. For example, if the client has 4 GB of RAM, the file cache will be
limited to 4 GB/8, or 512 MB maximum.
The file cache doubles in size each time it needs to increase. The current file cache
sizes are in megabytes: 5.5 MB, 11 MB, 22 MB, 44 MB, 88 MB, 176 MB, 352 MB,
704 MB and 1,408 MB. Since the avtar program is a 32-bit application, the
maximum file cache size that can be used is limited to less than 2 GB. In an
example where a client has 4 GB of RAM, the maximum size of the file cache will
be 352 MB.
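The cap-and-doubling rule above can be sketched as a short calculation. This is an illustrative helper, not part of the avtar tooling; it assumes the doubling sequence starts at 5.5 MB and that the default cap is one-eighth of physical RAM, as described in the text.

```python
def max_file_cache_mb(ram_gb: float) -> float:
    """Largest step in the file cache doubling sequence (5.5, 11, 22, ... MB)
    that fits under the default cap of one-eighth of physical RAM."""
    cap_mb = ram_gb * 1024 / 8   # default limit: one-eighth of physical RAM
    size = 5.5                   # smallest file cache size, in MB
    while size * 2 <= cap_mb:
        size *= 2
    return size

# The 4 GB example from the text: the cap is 512 MB, so the file cache
# tops out at the 352 MB step of the doubling sequence.
print(max_file_cache_mb(4))  # -> 352.0
```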
Each entry in a file cache comprises a 4-byte header plus two 20-byte SHA-1
hashes (44 bytes total):
• SHA-1 hash entry of the file attributes.
The file attributes include: filename, filepath, modification time, file size,
owner, group and permissions.
• SHA-1 hash entry for the hash of the actual file content, independent of the
file attributes.
The file cache rule: each entry is 44 bytes, so if the client comprises N
million files, the file cache must be at least N million files x 44 bytes per
file. This means that the file cache must be at least N x 44 MB, where N is
the number of millions of files in the backup.
Example If a client has 4 million files, the file cache must be at least 176 MB (4 x 44 MB). In
other words, the file cache must be allowed to increase to 176 MB to
accommodate all the files.
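The file cache rule and the doubling increments combine as follows; function names here are illustrative, under the 44-bytes-per-entry assumption stated above.

```python
def min_file_cache_mb(millions_of_files: float) -> float:
    """File cache rule: 44 bytes per entry, so N million files need N * 44 MB."""
    return millions_of_files * 44

def next_increment_mb(required_mb: float) -> float:
    """Smallest step in the doubling sequence (5.5, 11, 22, ... MB) that
    accommodates the required size."""
    size = 5.5
    while size < required_mb:
        size *= 2
    return size

# The 4-million-file example from the text: at least 176 MB, which is
# itself a step in the doubling sequence.
print(min_file_cache_mb(4))    # -> 176
print(next_increment_mb(176))  # -> 176.0
```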
Hash Cache
If the avtar process finds that a hash of a chunk is not contained in the hash
cache, it queries the Avamar server for the presence of the hash.
By default, the hash cache could consume up to one-sixteenth of the physical RAM
on the Avamar client. Using the same client with 4 GB of RAM described in File
Cache (page 41), the hash cache will be limited to 4 GB/16, or 256 MB maximum.
The hash cache also doubles in size each time it needs to increase. The current
hash cache sizes are in megabytes: 24 MB, 48 MB, 96 MB, 192 MB, 384 MB,
768 MB, and so forth. In this example where a client has 4 GB of RAM, the
maximum size of the hash cache will be 192 MB.
Each entry in a hash cache comprises a 4-byte header plus one SHA-1 hash per
chunk or composite, which is the hash of the contents of the chunk or composite.
The hash cache rule: if the client comprises Y GB of database data, the hash
cache must be at least (Y GB / average chunk size) x 24 bytes per entry.
Using 24 KB as the average chunk size for all backups, the hash cache must be
at least Y MB, where Y is the number of gigabytes of database data in the
backup.
Example If a database client has 500 GB of database data, the hash cache must be allowed
to grow to at least 500 MB. In other words, the hash cache must be allowed to
grow to the next incremental size (768 MB) to accommodate the hashes for all the
chunks in a database backup.
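The hash cache rule reduces neatly because the 24-byte entry size and the 24 KB average chunk size cancel. A small sketch, with illustrative function names, assuming the defaults given in the text:

```python
def min_hash_cache_mb(database_gb: float, avg_chunk_kb: float = 24) -> float:
    """Hash cache rule: one 24-byte entry per chunk, with chunk count equal to
    data size divided by the average chunk size. At the default 24 KB chunk
    size this reduces to Y MB of cache per Y GB of database data."""
    entry_bytes = 24
    return database_gb * entry_bytes / avg_chunk_kb

def next_hash_increment_mb(required_mb: float) -> float:
    """Smallest step in the hash cache doubling sequence (24, 48, 96, ... MB)."""
    size = 24
    while size < required_mb:
        size *= 2
    return size

# The 500 GB database example from the text: at least 500 MB, which forces
# growth to the 768 MB step of the doubling sequence.
print(min_hash_cache_mb(500))       # -> 500.0
print(next_hash_increment_mb(500))  # -> 768
```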
also fluctuates during the backup depending on the structure of the files that are
being backed up by avtar.
The file cache and hash cache can increase to maximum sizes of one-eighth and
one-sixteenth of the total RAM in the system, respectively. For a client that has
more than one-half GB of RAM, for example, the file and hash caches contribute
more to the overall memory use than the rest of the avtar process. This is because
both caches are read completely into memory at the start of the avtar backup.
Also, by default, the overall memory that the client caches use is limited to
approximately three-sixteenths of the physical RAM on the Avamar client.
Millions of Small Files. If the client has millions of small files, then you might
need to increase the file cache from the default size. Generally, for every one
million files on the Avamar client, the client requires 512 MB of physical RAM. If a
client has one million files, then, by using the formula in File Cache (page 41), a
minimum of 44 MB (1 x 44 MB) is required just to store all the file hashes for a
single backup. Since the file hashes must be stored for several backups, more
than 44 MB is required.
The file cache doubles in size each time it needs to increase. The current file cache
sizes are in megabytes: 5.5 MB, 11 MB, 22 MB, 44 MB, 88 MB, 176 MB, 352 MB,
704 MB, and 1,408 MB.
By default, the file cache is limited to one-eighth of physical RAM. With
512 MB of RAM per million files, the cache can increase to a limit of 64 MB,
which means that the default value of one-eighth of RAM for the file cache is
adequate.
You must be particularly alert to this situation when configuring a Windows client
to back up a large file server (for example, a NAS filer) through a network mount.
Large Databases. If the client has a few large files, then usually the default of
one-sixteenth for the hash cache is insufficient. For example, for a 240 GB
database, a minimum of 240 MB is required when using the formula in Hash Cache
(page 42). This amount can only store the hashes for a single backup.
The hash cache also doubles in size each time it needs to grow. The current hash
cache sizes are in megabytes: 24 MB, 48 MB, 96 MB, 192 MB, 384 MB, 768 MB,
and 1,536 MB.
The next increment available is 384 MB. Therefore, if this client has 4 GB of
RAM, the hash cache must be allowed to increase to one-eighth of the RAM. If
the default of one-sixteenth of the RAM is used, the hash cache will be
limited to 192 MB, and an undersized hash cache will result. In the case of
databases, since very few files are backed up, the file cache will be
considerably smaller, so the net memory use is still about one-eighth to
three-sixteenths of the RAM.
Never allow the total combined cache sizes to exceed one-fourth of the
total available physical RAM.
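The one-fourth ceiling above is easy to check mechanically. A minimal sketch, with an illustrative function name:

```python
def caches_within_limit(ram_gb: float, file_cache_mb: float,
                        hash_cache_mb: float) -> bool:
    """True if the combined cache sizes stay within one-fourth of physical RAM."""
    return file_cache_mb + hash_cache_mb <= ram_gb * 1024 / 4

# A 4 GB client at the default caps (512 MB file cache, 256 MB hash cache)
# stays under the 1,024 MB ceiling; growing both caches one step does not.
print(caches_within_limit(4, 512, 256))  # -> True
print(caches_within_limit(4, 704, 768))  # -> False
```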
Set the maximum file and hash cache sizes to a fraction of the total available
physical RAM. Specify the file and hash cache sizes by using negative integers.
Limit the total cache sizes to approximately one-fourth of the physical RAM. Set
one of the caches to be -5 (20 percent), and set the other cache to be -32 (3
percent). For example, for a large database client use the following settings:
--filecachemax=-32
--hashcachemax=-5
If you use something other than the default cache sizes, include the customized
maximum cache settings in the avtar.cmd file on the client.
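Based on the examples in the text ("-5 (20 percent)" and "-32 (3 percent)"), a negative value -N appears to mean one-Nth of physical RAM; that interpretation, and the function name, are assumptions here, not documented avtar behavior.

```python
def cache_limit_mb(ram_gb: float, setting: int) -> float:
    """Interpret a negative --filecachemax/--hashcachemax value as a fraction
    of RAM: -N means one-Nth of physical RAM (so -5 is 20 percent and -32 is
    about 3 percent), per the examples in the text."""
    assert setting < 0, "positive values would be absolute sizes in MB"
    return ram_gb * 1024 / -setting

# A large database client with 4 GB of RAM, using the flags shown above:
print(cache_limit_mb(4, -5))   # hash cache cap -> 819.2 (MB, 20% of RAM)
print(cache_limit_mb(4, -32))  # file cache cap -> 128.0 (MB, ~3% of RAM)
```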
Sometimes your only choice may be to increase the amount of physical RAM on
the client. You might also be able to back up the client by using multiple smaller
datasets.
If you need to limit the sizes of the caches below the optimum values, then
remember the following:
• For a typical file server, first allocate the required RAM to the file cache.
• For a typical database client, first allocate the required RAM to the hash cache.
Using cacheprefix
If the client does not have enough memory to accommodate cache files of the
appropriate sizes, you can still back up the client and get the full benefit
of appropriately sized cache files. To do so:
• Break the client file system into multiple smaller datasets.
• For each dataset, set appropriate maximum file and hash cache sizes and
assign a unique cacheprefix attribute.
Example Assume a client has 5.5 million files but only 1.5 GB of RAM. One volume has 2.5
million files and three other volumes have 1 million files each. You can break this
client file system into four datasets. For the volume with 2.5 million files, a file
cache of at least 110 MB (2.5 x 44 MB) is required. The next increment that
accommodates this is 176 MB. The datasets could be defined as follows:
• C:\ drive (2.5 M files)
filecachemax=220
hashcachemax=30
cacheprefix=driveC
• E:\ drive (1.0 M files)
filecachemax=88
hashcachemax=30
cacheprefix=driveE
Tuning Replicator
Work with your EMC Practice Consultant to configure and tune the replicator. The
Practice Consultant will perform the following tasks:
1. Compute the bandwidth-delay-product (BDP) to determine whether the
BDP is high enough to require customized tuning.
2. Verify that the expected bandwidth is available between the replicator
source utility node and the replicator destination data nodes.
3. Test the WAN link with the Avamar system components to verify that the
Avamar system can utilize about 60 to 80 percent of the available
bandwidth.
4. Set up the appropriate replication parameters to optimize utilization of the
available bandwidth.
5. Test the replicator to verify its performance.
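Step 1 uses the standard bandwidth-delay product: link bandwidth multiplied by round-trip time gives the amount of data in flight on the WAN link. The sketch below assumes hypothetical link parameters; it is not an EMC-provided formula or tool.

```python
def bandwidth_delay_product_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    """BDP = bandwidth x round-trip time, in bytes. Converting Mbit/s to
    bytes/s (* 1e6 / 8) and ms to s (/ 1000) combines into a single / 8000."""
    return bandwidth_mbps * 1e6 * rtt_ms / 8000

# A hypothetical 100 Mbit/s WAN link with an 80 ms round-trip time keeps
# about 1 MB in flight, a level where custom tuning may be worth evaluating.
print(bandwidth_delay_product_bytes(100, 80))  # -> 1000000.0
```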
Host. The client hostname as defined in DNS. During backups, this is the client
that backs up data to the Avamar server. During restores, this is the client that
receives the restored data. This is not the client that sourced the data.
StartValue. The Unix start time of this activity. The Unix start time is in the local
time of the Avamar server.
StartTime. The date and time this activity was initiated. The StartTime is in UTC
(or GMT).
Root. The name of the dataset that was used during the activity, if applicable.
NumFiles. The total number of files scanned during this activity (less those files
that were excluded through exclusion rules).
NumModFiles. The total number of modified files associated with this activity.
Overhead. The number of bytes for COMPOSITEs and DIRELEMs used to store
data. It is the amount of nonfile data sent by the client to the server, and includes
such things as indexing information, requests from the client to the server for the
presence of specific data chunks, ACLs, directory information and message
headers. On any relatively active file system, this is generally a very small
percentage of the file data that is sent to the server.
ClientVer. The Avamar client software version, where the Avamar client is as
defined previously under Host.
Status. The FINAL status of the client activity is one of the following:
• Activity completed successfully
• Activity completed with exceptions
• Activity cancelled
• Activity failed - timed out before starting
• Activity failed - timed out before completion
• Activity failed - client was given a workorder, but did not acknowledge its
receipt
• Activity failed - client error(s)
• Activity failed - client has no data specified by dataset
• Dropped Session - No progress reported
Background on Backups
This section provides background information about how the Avamar client
performs backups, including key statistics.
Dataset Size. Begin with the dataset size (TotalBytes - column L) equal to
451,940,965,688 bytes (421 GB). This is the total bytes in the dataset. Any files
that are excluded due to exclusion rules are not counted in this total. However,
this total includes those open files that could not be backed up, perhaps because
the file system was not frozen.
Modified Files. When scanning through a file system, the client obtains each
file's metadata and computes the SHA-1 hash of that metadata. It then looks up
the hash in the file cache on the client. If the hash is present, opening and
reading the file is unnecessary. Therefore, a high percentage of hits in the file
cache makes the overall backup proceed very quickly.
Any file whose metadata hash gets a miss in the file cache is considered a
modified file (that is, a file that was modified since the last backup). The Mod
prefix in NumModFiles (column H), ModReduced (column I), ModNotSent
(column J), and ModSent (column K) is therefore shorthand for "associated with
modified files" (that is, files that must be opened and read so that all the data in
each file can be chunked, compressed, and hashed).
The following flowchart shows the roles of the file cache, the hash cache, and the
server is_present requests in determining which data to send to the server.
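The decision flow the flowchart describes can be approximated in Python. This is a conceptual illustration only, not actual Avamar client code; the function names (should_read_file, chunks_to_send, server_has) are hypothetical:

```python
import hashlib

def should_read_file(metadata: bytes, file_cache: set) -> bool:
    """A file is treated as modified only if its metadata hash misses the file cache."""
    meta_hash = hashlib.sha1(metadata).digest()
    return meta_hash not in file_cache  # miss -> open, read, and chunk the file

def chunks_to_send(chunks, hash_cache, server_has):
    """For a modified file, decide chunk by chunk what must be sent.

    hash_cache: chunk hashes already known locally to be on the server
    server_has: callable modelling an is_present request to the server
    """
    to_send = []
    for chunk in chunks:
        h = hashlib.sha1(chunk).digest()
        if h in hash_cache:       # already on the server (counts toward ModNotSent)
            continue
        if server_has(h):         # subfile commonality found on the server
            hash_cache.add(h)     # also counts toward ModNotSent
            continue
        to_send.append(chunk)     # new data (counts toward ModSent)
    return to_send
```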
ModReduced. When the modified files are backed up, the data is chunked,
compressed, and then hashed. Because the compression takes place on the
client, the amount by which the data was reduced through compression is
reported as ModReduced. In the example row, ModReduced = 55,023,086,382
(51 GB).
ModNotSent. If subfile-level commonality exists, the data is not sent. ModNotSent
is therefore shorthand for bytes in modified files that did not have to be sent to
the Avamar server because of subfile-level commonality factoring. In the
example row, ModNotSent = 4,115,205,370 (3.8 GB), which means that 3.8 GB
of compressed chunks were already on the Avamar server.
ModSent. Any new bytes that must be sent to the server are reported as
ModSent. In the example row, ModSent = 393,498,485 (0.37 GB).
Definition of Commonality
The Avamar change rate is equivalent to ModSent / TotalBytes.
The Avamar commonality = 1 - (change rate).
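Using the example row's figures, the two definitions work out as follows. This is a simple arithmetic check of the formulas above, not Avamar code:

```python
total_bytes = 451_940_965_688   # TotalBytes (column L), 421 GB
mod_sent    = 393_498_485       # ModSent (column K), 0.37 GB

change_rate = mod_sent / total_bytes
commonality = 1 - change_rate

print(f"change rate  = {change_rate:.4%}")   # roughly 0.09%
print(f"commonality  = {commonality:.2%}")   # roughly 99.91%
```

In other words, after commonality factoring, less than a tenth of one percent of this 421 GB dataset actually crossed the network.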
Running Backups More Frequently Than the Retention Policy (page 62)
Windows:
Inclusions   C:\Users\*\Desktop
             C:\Users\*\Documents
             C:\Users\*\Favorites
Macintosh:
Exclusions   /Users/*
Inclusions   /Users/*/Desktop
             /Users/*/Documents
             /Users/*/Library/Favorites
Dataset Caveat
In an environment that contains Windows XP desktop or laptop computers along
with Vista or Windows 7 desktop or laptop computers, backups can appear to fail if
both Windows XP clients and Vista or Windows 7 clients use a single dataset that
specifies the C:\Documents and Settings folder and the C:\Users folder. A backup
in such an environment displays a backup failure on the status bar and writes an
error similar to the following to the log file:
Path not found
In an environment that contains both Windows XP and Vista or Windows 7 clients,
add the --x18=256 flag to the dataset to prevent the Path not found error.
To add the --x18=256 flag to the dataset, perform the following:
1. Start Avamar Administrator.
2. Select Tools > Manage Datasets.
The Manage All Datasets dialog box appears.
3. Select the dataset from the list and click Edit.
The Edit Dataset dialog box appears.
4. Click Options and select the plug-in from the Select Plug-In Type list.
5. Click More.
6. Type --x18 in the Enter Attribute text box.
7. Type 256 in the Enter Attribute Value text box.
8. Click the add button.
The attribute/value pair (--x18=256) appears in the large text box as
shown in the following figure:
9. Click OK twice.
• Fast Ethernet
• 802.11g
• Ethernet
• V.92
• V.34
The recommendation to bring clients online in sequentially targeted groups can
best be achieved by using the directory information tree or search capability of
Avamar Activation Manager. You can select appropriately sized and situated
organizational units by using the tree structure. Or, you can use search terms to
define a group with your target number and type of clients. Then, using its drag-
and-drop capability, you can activate and initiate first-time backups of correctly
sized and connected groups of clients. This way, you avoid the problems
associated with overtaxing the throughput capabilities of your network.
In a standard deployment, the servers are backed up at night when they are least
active. To accommodate this, the Avamar daily maintenance tasks run during the
day.
For Avamar Desktop/Laptop, the client backups must run during the day, when the
clients are most likely to be powered-up and connected. The Avamar daily
maintenance tasks often must be changed to run during the night to avoid
conflicts with client backups and restores. Refer to Scheduling Activities During
the Course of a Day (page 25) for more information.
The daily maintenance task of garbage collection requires a quiet system. If any
backups are running, the task will not start. Since garbage collection reclaims
space on the Avamar server, the failure to successfully complete this task can
quickly create capacity problems.
Protect the Avamar server from the Internet by providing full firewall
protection.
Changing Passwords
If your Avamar passwords have not changed from the factory default values,
change them by running the change-passwords utility. Refer to your Avamar
System Administration Guide for additional information.
There are three operating system users (root, admin, and dpn), two SSH keys
(admin_key and dpnid), and three root-level software application users (root,
MCUser, and repluser). You must change the passwords for all of these users. Just
changing the passwords for the operating system users is not sufficient to prevent
someone from logging in to the Avamar server nodes. If you have not changed the
two SSH keys, then someone could use the factory-default SSH keys to log in to
your Avamar server.
Best practice:
Change all factory default passwords except the passwords for the
backuponly, restoreonly, and backuprestore software application users.
Assigning Users
The Avamar software includes access audit logging capability. The broad intent of
this feature is to maintain a log of every action taken on vital system components/
objects.
The data in this log enables enterprises deploying Avamar to do the following:
• Enforce security policies
• Follow up on security breaches or deviations from policies
• Hold appropriate users accountable for those actions
To fully leverage the audit logging capability, follow these best practices:
checkpoint 27
  storage capacity impact 20
maintenance window 18, 26, 27, 31, 57
Management Console Server 19
monitoring
  Avamar server 33, 34
  backup operations 32, 35
  daily backups 35, 36
  event notifications 33
  hardware 13
  replication 10, 37
  server capacity 26
N
network
  bandwidth 49, 55
  issues 35, 36
  mount 44
  traffic 62
  transmission rates 60
notifications. See event notifications
O
on-demand backups 51
on-demand restores 26, 51
optimizing performance 38
orphaned data chunks 12, 22
P
passwords 10, 65, 66
performance
  addressing issues 37
  bottlenecks 15
  degradation 63
policies
  backup 30
  dataset 30
  default group 30
  retention 16, 30, 31, 62
power management 63
precrunching 12
protected network 65
R
RAID 10, 13
RAIN 12, 13, 14, 18, 19, 22
RAM
  allocations for cache files 45
  file and hash cache usage 43
  insufficient amount 44, 45
read-only state 20, 21, 63
read-only threshold 19, 20
redundancy methods 11
Redundant Array of Independent Disks. See RAID
Redundant Array of Independent Nodes. See RAIN
remote backups 16
remote clients 16, 62
remote offices. See remote clients
replication
  bottlenecks 27
  central destination 16
  described 14, 27
  domain 15
  monitoring 37
  nightly 37
  options 15
  overlapping with backups 27
  scheduling 28
  using temporary server 21
replicator
  bandwidth usage 15
  testing 48
  tuning 48
restore failures 37
restore time 17
restore work order 23
restoring data 12, 16, 23, 24, 55
retention policies 19, 21, 28, 30, 31, 62
rework work order 36
S
schedules
  default 31
  garbage collection 31
  hfscheck 31
  notification 33
scheduling
  activities 23, 25
  email notification 33
  large backups 24
  replication 28
security administration 67
server
  backup window 10, 27
  capacity 26, 30
  maintenance plan 66
SHA-1 hash entry 42
SNMP 33
status messages
  cancelled 36
  failed 36
steady state 22, 49, 63
storage capacity. See capacity
storage groups. See groups
stripes 12, 21
suspended jobs 19, 21
swapping 38, 45
syslog 33
T
thresholds, capacity 19
tuning
  Avamar system 7, 24
  cache file sizes 44
  client caches 36, 39
  file cache 45
  hash cache 46
  performance 38
  replicator 48
U
unacknowledged events 34
uncompressed chunks 16
unique data 11, 12, 18, 21, 31
V
VPN 56
W
Wake-on-Lan (WoL) 63
WAN
  bottlenecks 15
  connection 23
  throughput 17, 27
warning threshold 21, 63
wireless connection 62
work order 23
work order ID 51