Supporting
PATROL Agent version 3.6
PATROL Central Operator Web Edition version 7.1.10.01
PATROL Central Operator Microsoft Windows Edition version 7.5.00
PATROL Configuration Manager version 1.5.01
PATROL Console Server version 7.5.00
PATROL Knowledge Module for Event Management/
PATROL Notification Server version 2.6.00
PATROL Knowledge Module for Log Management version 2.0.01
PATROL RTserver version 6.6.00
PATROL Infrastructure Monitor version 7.5.00 (formerly the PATROL
Infrastructure Knowledge Module version 7.1.10)
Distribution Server version 7.1.20
BMC Impact Integration for PATROL version 7.1.01
PATROL Enterprise Manager Console Server Connection version 7.1.00
PATROL Integration for HP Network Node Manager version 7.1.00
Copyright 2005 BMC Software, Inc., as an unpublished work. All rights reserved.
BMC Software, the BMC Software logos, and all other BMC Software product or service names are registered trademarks
or trademarks of BMC Software, Inc.
IBM is a registered trademark of International Business Machines Corporation.
Oracle is a registered trademark, and the Oracle product names are registered trademarks or trademarks of Oracle
Corporation.
All other trademarks belong to their respective companies.
PATROL technology holds U.S. Patent Number 5655081.
BMC Software considers information included in this documentation to be proprietary and confidential. Your use of this
information is subject to the terms and conditions of the applicable End User License Agreement for the product and the
proprietary and restricted rights notices included in this documentation.
Customer support
You can obtain technical support by using the Support page on the BMC Software website or by contacting Customer
Support by telephone or e-mail. To expedite your inquiry, please see Before Contacting BMC Software.
Support website
You can obtain technical support from BMC Software 24 hours a day, 7 days a week at
http://www.bmc.com/support_home. From this website, you can
- read overviews about support services and programs that BMC Software offers
- find the most current information about BMC Software products
- search a database for problems similar to yours and possible solutions
- order or download product documentation
- report a problem or ask a question
- subscribe to receive e-mail notices when new product versions are released
- find worldwide BMC Software support center locations and contact information, including e-mail addresses, fax numbers, and telephone numbers
Before contacting BMC Software
Have the following information available when you contact Customer Support:
- product information
  - product name
  - product version (release number)
  - license number and password (trial or permanent)
- operating system and environment information
  - machine type
  - operating system type, version, and service pack or other maintenance level such as PUT or PTF
  - system hardware configuration
  - serial numbers
  - related software (database, application, and communication) including type, version, and service pack or maintenance level
- messages received (and the time and date that you received them)
Contents
Chapter 1   Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter 2   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Analysis Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Completing the Enterprise Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Chapter 3   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
RT Cloud Logical Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
RTserver Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Characterizing RTserver Traffic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Deciding Initial RTserver Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
RTserver Backup/Failover Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Visualization Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Which PATROL Central Console is Appropriate? . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Which Locations Need PATROL Console Servers? . . . . . . . . . . . . . . . . . . . . . . . . . 28
Is One PATROL Console Server Enough? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Logically Sizing the PATROL Console Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
PATROL Console Server Failover Considerations . . . . . . . . . . . . . . . . . . . . . . . . . 30
PATROL Central Operator Web Edition Placement . . . . . . . . . . . . . . . . . . . . . . . 32
Visualization Performance Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Event Management Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Notification Server Usage and Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Centralized Customization Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Designing for PATROL Management/Upgrade/Patching . . . . . . . . . . . . . . . . . . . . . 36
Using the Distribution Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Security Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Chapter 4   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Hardware Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Single Server Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
RTserver Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
PATROL Console Server Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Sample Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Example Stakeholder Roster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Example Location Analysis Worksheet for Houston . . . . . . . . . . . . . . . . . . . . . . . . 90
Example Location Analysis Worksheet for Austin . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Example Location Analysis Worksheet for New York . . . . . . . . . . . . . . . . . . . . . . 92
Example Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Figures
Example of a Multiple RT Cloud Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Example of Multiple RT Clouds with Hub and Spoke . . . . . . . . . . . . . . . . . . . . . . . . . 63
Example of Cross-Linked RT Clouds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Example of RT Cloud-to-Cloud Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Case 1 "Before" Small Hub-and-Spoke Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Case 1A "After" One Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Case 1B "After" Small Multi-Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Case 2 "Before" Large Hub-and-Spoke Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
MainCore, Inc. Enterprise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
MainCore, Inc. Houston Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
MainCore, Inc. Austin Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
MainCore, Inc. New York Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Tables
Requirements for Small Single-Server PATROL Central Environment (Up to 500,000 managed objects) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Requirements for a Medium Single-Server PATROL Central Environment (Up to 3 million managed objects) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Requirements for One RTserver on a Dedicated Computer (Up to 750 RT clients) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Requirements for Two RTservers on a Dedicated Computer (Up to 1000 total RT clients) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Requirements for Two to Four RTservers on a Dedicated Computer (Up to 2000 RT clients) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Requirements for a Medium PATROL Console Server Environment (Up to 3 million managed objects) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Requirements for a Large PATROL Console Server Environment (Up to 10 million managed objects) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Requirements for PATROL Central Operator Microsoft Windows Edition (based on profile size) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Chapter 1
Introduction
This chapter presents the following topics:
Purpose of this Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Planning the Implementation Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
While this guide does not answer every possible technical question or provide for
every design contingency, it will almost always lead to a design that uses PATROL in
the best way possible. Choosing not to follow the guidelines presented in this
document, or neglecting to take into account the factors discussed herein, can lead to
a PATROL implementation that is unstable, that requires an excessive amount of time
to maintain, or that can experience a major failure. If, at the end of the design process,
solution architects have doubts about the proposed implementation, they are
encouraged to consult with BMC Software and the PATROL Line Of Business for
design validation before proceeding. Failure to do so can result in lost time, added
expense, and disruption of service.
Best Practice Recommendation BP2
Always consult with BMC Software and with the PATROL Line of Business before deviating
from the Best Practices presented in this guide. Failure to do so can lead to major software
failures and system outages.
To view the latest BMC Software manuals, technical bulletins, flashes, and white
papers visit the BMC Software Customer Support page at
http://www.bmc.com/support_home.
Solution architects should start with the "big picture" and work their way down to
progressively greater levels of detail until all their objectives have been met. Chapters
2 and 3 in this document will discuss this process in detail. In general, though, there
are four steps in the process.
1. Solution architects should first identify the implementation stakeholders in order
to engage them in the design and review process. A solution diagram should then
be built that provides PATROL support for the number of managed servers and
consoles that the solution requires.
Every infrastructure component should be treated as if it required a dedicated
server, but with geography and network topology in mind. The diagram should
indicate the speed of all network connections, note the locations of any firewalls
that may be present, and reference any existing equipment to be used for PATROL
infrastructure.
2. The number of RT clouds and RTservers needed to support the PATROL solution
should be determined. Provision should not yet be made for recoverability or
failover, but key considerations include
Chapter 2
This chapter describes how to complete the planning forms used to record
information collected during analysis of the enterprise to be managed with PATROL.
The forms can be found in Appendix B, Infrastructure Planning Forms.
This chapter presents the following topics:
Analysis Tools. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Completing the Enterprise Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Analysis Tools
The Stakeholder Roster Worksheet provides a convenient place to note the people
and groups who will be involved with the implementation and day-to-day operations
using PATROL. Identifying contacts early in the process will aid in analysis and
communication and help reduce missteps during implementation planning.
The Location Analysis Worksheet provides a standardized format for recording
relevant information about each logical and physical location to be managed. The
analysis performed and the data recorded on this form will be used to develop a
logical design for the PATROL implementation. The thoroughness of the analysis and
accuracy of the information recorded will have a significant impact on the quality of
the design and the overall success of the PATROL implementation. Use of this form
will facilitate communication and review of the proposed design.
Best Practice Recommendation BP4
When planning a PATROL Central implementation, the minimum recommended PATROL
Agent version is 3.5.32.20; version 3.6.00.05 or later is preferred. These versions
incorporate many performance and scalability improvements related to PATROL Central. If
existing agents must be upgraded, upgrade them to the latest Generally Available
version.
ACTION 1
Begin by completing the Stakeholder Roster. List all the groups or individuals who
will participate or have an interest in this implementation of PATROL. If a group is
listed, identify a person to serve as communications contact for the group. Refer to
the roster during the remainder of the process when questions arise or when
communicating about the progress of the implementation. Frequent communication
will help highlight issues early in the process and reduce rework later. At a
minimum, identify the people directly involved with:
Document the role or primary interest of each stakeholder. Typical stakeholders will
include System Administrators, Database Administrators, managers of business
applications such as SAP, PeopleSoft, or Oracle Financials, Operations personnel, and
PATROL Administrators. Key management sponsors or business partners may also
be listed. Include anyone who will be directly involved in the implementation or
needs to be kept informed of its progress.
Make a high-level diagram of the entire business enterprise to be managed using
PATROL, similar to the sample shown in Appendix C, Sample Solution Planning
Forms & Diagrams. While it need not be overly detailed, it must indicate any wide-area, lower-speed (less than 10 Mb/sec), or other non-local network links, and any
firewalls, network switches, or other infrastructure that could impact connectivity. If
portions of a physical location are isolated by local firewalls, identify each isolated
area as a separate logical location and treat it as a physically separate location for the
remainder of the analysis process.
Make a copy of the Location Analysis Worksheet for each location shown on the
diagram and write the name of each location in the field provided on each worksheet.
Enter the firewall and network reliability information on each location worksheet.
Some PATROL Central components must reside on the same LAN speed/quality
network segment because WAN or other lower-speed/reliability connections will
degrade performance. Accurate connectivity information is key to the design process.
Enter the total number of nodes to be managed by PATROL on the worksheet for
each location. Include all systems where a PATROL Agent will be installed in this
count. This data is for environment sizing, so individual host names, OS vendors, and
other similar details are unnecessary.
List the name of each PATROL KM to be run on ANY managed node at each location.
This information is vital to estimating the total number of objects to be managed and
impacts PATROL Console Server placement and management profile definition.
Note any KM which may have an unusually high number of Application Class
instances. An example might be an operating system KM running on a large file
server with an unusually large number of disk and file system instances.
Enter on each worksheet whether or not PATROL Reporting will be used to
aggregate data from managed nodes at that location.
If events are to be forwarded from managed nodes to an event management tool,
enter the name of the destination tool on the worksheet for each location.
For each location where PATROL Central consoles are to be used, complete the
Visualization Information section of the worksheet.
To determine the number and placement of PATROL Console Servers, an estimate of
the number of managed objects to be concurrently visualized is needed. The number
of managed objects in a PATROL Console Server is the product of the number of
concurrent console sessions multiplied by the number of managed objects in each
session profile. The Visualization Information section of the worksheet is used to
document estimates for these factors.
Estimate by location the typical number of KMs needed per node, and enter that
information on each worksheet. Using the object count guidelines in the following
list, determine the total number of managed objects needed on the typical managed
node for each location and enter the number on each location's worksheet.
1000-2000 objects per large application KM (for example, PATROL for Microsoft
Exchange Servers or PATROL for BEA WebLogic)
450-800 objects per medium-sized KM (for example, PATROL for Unix and Linux,
PATROL for Microsoft Windows Servers, or PATROL for Oracle)
200 objects per small application KM (for example, PATROL for Compaq Insight
Manager or PATROL for Dell OpenManage)
NOTE
If a KM was previously identified as likely to have an unusually large number of Application
Class instances, increase the estimated object count for the affected nodes accordingly.
The number of concurrent managed objects limits the number of console sessions that
can be handled by a single PATROL Console Server. The object count is estimated by
multiplying the number of managed objects in the typical management profile by the
number of concurrent console sessions. Record the number on the worksheet for each
location as the total number of concurrent managed objects.
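The worksheet arithmetic described above can be sketched as follows. This is an illustrative calculation only: the KM mix, per-profile node count, and session count are invented example inputs, and the per-KM object counts are midpoints of the guideline ranges given earlier in this chapter.

```python
# Guideline objects per KM, by size class (midpoints of the stated ranges:
# large application KM 1000-2000, medium KM 450-800, small KM ~200)
OBJECTS_PER_KM = {"large": 1500, "medium": 600, "small": 200}

def objects_per_node(km_mix):
    """km_mix maps a KM size class to how many KMs of that class
    run on a typical managed node at the location."""
    return sum(OBJECTS_PER_KM[size] * count for size, count in km_mix.items())

def concurrent_managed_objects(profile_objects, concurrent_sessions):
    """Concurrent managed objects = objects per management profile
    multiplied by the number of concurrent console sessions."""
    return profile_objects * concurrent_sessions

# Example: a typical node runs one medium OS KM and one small hardware KM
per_node = objects_per_node({"medium": 1, "small": 1})   # 800 objects per node
# A typical management profile covers 40 such nodes
profile = per_node * 40                                  # 32,000 objects per profile
# Five consoles use that profile concurrently
total = concurrent_managed_objects(profile, 5)           # 160,000 concurrent objects
print(per_node, profile, total)
```

The resulting total is the figure to record on the worksheet and later compare against the PATROL Console Server capacity tables.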
Chapter 3
Overview
In a general sense, the logical design step in the process involves adding to the
enterprise diagram produced in the previous chapter and indicating where various
PATROL Central infrastructure services will reside. The logical design phase is not
concerned with the physical devices that will provide these services, only with their
locations and in what numbers the services are needed.
The first refinement to the enterprise diagram centers on designing domains which
consist of managed nodes and one or more RTservers. In other documentation
and/or presentations, these domains have variously been referred to as "messaging
domains," "agent clouds," "RT clouds," and "clouds."
The second refinement addresses the visualization needs of the enterprise. It includes
numbers and placement of PATROL Central Operator Microsoft Windows Edition
installations, PATROL Console Servers, and PATROL Central Operator Web
Edition installation(s) (if browser-based visualization is needed). Design decisions
may result in RTservers dedicated to supporting visualization components. These
domains have also been referred to by several names, such as "console domain,"
"console cloud," "RT cloud," and "cloud."
In this document, where the association of RTservers with managed nodes is
important, the term "agent cloud" will be used. Similarly, "console cloud" will be used
to specify the combination of visualization components and RTservers. If the
association is not essential to the discussion, the more generic term "RT cloud" may be
used to describe either agent or console clouds.
The first step involves high-level RTserver placement based on the number of
managed nodes present and the characteristics of the message traffic the RTserver
will handle. Once basic RTserver placement has been completed, a review of the
enterprise will be conducted to identify potential points of unacceptable service
interruption in the event of an RTserver outage, and to consider strategies to reduce
the impact of any such outages.
It is generally desirable to design a self-contained RT cloud for each location with 50
or more managed nodes, placing RTservers in the same geographical location as the
managed nodes they support. Starting with the release of PATROL Central version
7.3, RTserver design decisions have become much simpler.
Best Practice Recommendation BP5
No RT cloud should include more than 500-700 managed nodes. If PATROL Reporting will
be used to aggregate data from a location, or if a Notification Server will be used, AND the
RTserver will be installed on the same node as either one, that RTserver can support no more
than 500 managed nodes.
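BP5's caps translate directly into a minimum RT cloud count for a location. A rough sketch of that arithmetic, using hypothetical node counts:

```python
import math

def rt_clouds_needed(managed_nodes, rtserver_shares_host=False):
    """Minimum number of RT clouds for a location under BP5.

    The 500-node cap applies when the RTserver shares its host with
    PATROL Reporting or a Notification Server; otherwise the upper
    bound of the 500-700 node range is used."""
    cap = 500 if rtserver_shares_host else 700
    return math.ceil(managed_nodes / cap)

# Hypothetical location with 1200 managed nodes
print(rt_clouds_needed(1200))                              # 2 clouds (<=700 nodes each)
print(rt_clouds_needed(1200, rtserver_shares_host=True))   # 3 clouds (<=500 nodes each)
```

Scaling out by adding clouds, rather than enlarging one cloud, follows the guidance below against "hub and spoke" RTserver configurations.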
Typically, an RT cloud will need no more than two RTservers (a primary and a
backup). Scalability is achieved by creating additional RT clouds rather than
configuring "hub and spoke" RTservers to create larger domains.
NOTE
An exception to this general rule is presented in Appendix A, Frequently Asked
Questions, in the question When should hub-and-spoke RT cloud architecture be used? on
page 62.
In the case of managed nodes within DMZs, many companies have policies dictating
the direction in which connections can be initiated to/from the DMZ. Typically,
connecting to a node in the DMZ from the corporate intranet is permissible but
connecting from a node in the DMZ into the intranet is not. In these scenarios, an RT
cloud can be created for each DMZ and PATROL Console Server can be configured to
connect to the RTservers in that RT cloud.
For additional information that may affect the number and placement of RT clouds,
see RTserver and RT clouds on page 57.
ACTION 3
Using the traffic characteristics and best practice recommendations for guidance,
update the solution diagrams for each location to reflect the number and placement of
RTservers.
1. An RTserver that is servicing managed nodes fails. In this case, all managed nodes
connected to the failed RTserver will be shown as disconnected in their respective
management profiles on all active consoles and the managed nodes will begin their
failover process (if available). Since a managed node RTserver typically services
only a portion of a given management profile, recovery times are shorter than for
services which impact an entire management profile.
2. An instance of PATROL Central Operator Web Edition fails. When this happens,
all browser-based sessions being served by the failed instance will cease to
function. Only browser-based sessions will be affected; any PATROL Central
Operator Microsoft Windows Edition sessions will continue to function.
3. An RTserver supporting a console cloud fails. When this happens, consoles served
by the failed RTserver will be unable to communicate with the PATROL Console
Server and will not be usable until the RTserver service is restored or the consoles
and PATROL Console Server establish communication through a backup RTserver
(if available).
4. A PATROL Console Server fails. As with a console RTserver failure, all console
sessions served by the failed PATROL Console Server will cease to function. For
more information, see PATROL Console Server Failover Considerations on
page 30.
The way in which a component goes offline also affects recovery time. If a component
is stopped gracefully, all other components recognize the loss almost immediately
and begin their respective failovers. If a component is lost unexpectedly due to a
power or hardware failure, abrupt system shutdown, or physical network
interruption, loss of the component may not be recognized by other components for
several minutes and recovery time may be lengthened.
Best Practice Recommendation BP7
If a PATROL Central design includes backup RTservers, place a backup RTserver in each RT
cloud. If a separate RTserver is provided to handle traffic between the PATROL Console
Server and PATROL Central consoles (or PATROL Central Operator Web Edition) and
redundancy is needed, place one backup RTserver in the console cloud.
ACTION 4
Review the solution diagrams for all locations, looking for areas where an RTserver
outage would create an unacceptable interruption of monitoring. If any such areas are
identified, reconfigure the RT cloud and add backup RTservers to allow the
infrastructure to continue to function if an RTserver service fails.
Visualization Design
This phase of the design process involves determining the placement and number of
PATROL Console Servers (and optionally PATROL Central Operator Web Edition
instances) to support the anticipated level of PATROL Central console usage. At this
stage, placement decisions should be made without regard for the possibility of
sharing hardware with other services. The only consideration should be logical
placement of visualization services based on best practice recommendations.
There are multiple PATROL Console Server placement scenarios that will work for
nearly any given situation. The best practice recommendations presented here
represent a compromise between implementation complexity, run-time performance,
and recovery time in the event of a service outage.
Starting with the release of PATROL Central version 7.3, the PATROL Console Server
has the ability to visualize managed nodes from multiple RT clouds. This simplifies
PATROL Console Server placement design decisions.
If a location with console users does not have a local PATROL Console Server,
examine nearby locations for excess PATROL Console Server capacity that can be
used to service it. If no excess capacity exists, either an additional PATROL Console
Server must be installed at the remote location or a PATROL Console Server (and
potentially other PATROL Central infrastructure components) must be installed
locally. Bandwidth and reliability of the network connection to the remote location
must be factored into the decision to use a remote PATROL Console Server, since the
network link is a single point of failure.
The primary reason to install a PATROL Console Server where it would not
otherwise be justified is for improved performance. If console users in a particular
location only need to view managed nodes in the same location, the improved
reliability and performance gained by keeping network traffic local may justify the
additional administrative and equipment cost of an additional PATROL Console
Server.
Maintenance of multiple PATROL Console Servers requires additional
administrative work. While PATROL Console Server 7.5 introduced an online backup
capability, there are still no provisions for automatically copying data between
PATROL Console Servers, nor is there any functionality to synchronize or reconcile
changes when the same type of data (privilege assignments, for instance) has been
changed on two or more PATROL Console Servers. Operational policies and manual
procedures must be implemented to work around these limitations.
To arrive at an optimal design, each location must be evaluated independently and
PATROL Console Server placement decisions must be made on a case-by-case basis.
Version 7.3 of the PATROL Console Server also introduced overload protection. The
overload protection feature allows the PATROL Console Server to reject additional
operator connections when they will cause preconfigured limits to be exceeded.
Best Practice Recommendation BP9
Regardless of the estimated number of concurrent managed objects, plan for no more than 25
(on 32-bit hardware) or 100 (on 64-bit hardware) concurrent console sessions to be served by
a single PATROL Console Server.
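The BP9 ceiling can be applied mechanically when estimating how many PATROL Console Servers a console population requires. A minimal sketch, with hypothetical session counts:

```python
import math

def max_sessions(hardware_64bit):
    """BP9 ceiling on concurrent console sessions per PATROL Console Server."""
    return 100 if hardware_64bit else 25

def console_servers_needed(expected_sessions, hardware_64bit):
    """Minimum PATROL Console Servers to honor the BP9 session limit."""
    return math.ceil(expected_sessions / max_sessions(hardware_64bit))

# Hypothetical site with 60 expected concurrent console sessions
print(console_servers_needed(60, hardware_64bit=False))  # 3 servers on 32-bit hardware
print(console_servers_needed(60, hardware_64bit=True))   # 1 server on 64-bit hardware
```

Note that BP9 is a cap, not a target; the concurrent managed-object estimate may force a lower session count per server.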
ACTION 5
Update the solution diagram for each location to show the placement of primary
PATROL Console Servers.
Install the PATROL Console Server on high-availability cluster hardware, store the
configuration files on a shared disk, and define a cluster failover package that
includes the PATROL Console Server.
Install a secondary PATROL Console Server on a separate node and use the online
backup feature to periodically snapshot data from the primary PATROL Console
Server.
The only way to provide transparent service restoration in the event of a PATROL Console
Server failure is by using a high-availability cluster. In this scenario, the PATROL Console
Server service will be restarted by the cluster management software. Once restarted,
the console cloud will reestablish connectivity and console sessions will resume
functioning.
In the absence of high-availability hardware, a secondary PATROL Console Server
allows PATROL Central availability to continue during planned outages of the
primary PATROL Console Server or during a hardware or other failure local to the
primary PATROL Console Server node. A secondary PATROL Console Server does not,
however, provide unattended recovery for a failed primary PATROL Console Server.
If a secondary PATROL Console Server is activated in order to recover from an
outage, its configuration data (for example, profiles and impersonations) will only be
current to the last time the configuration files for the secondary PATROL Console
Server were copied from the primary PATROL Console Server. Depending on the
configuration of the online backup feature in the primary PATROL Console Server
and the network-shared disks between the primary and the secondary, administrator
intervention may be required to copy the latest data to the backup PATROL Console
Server and then restart it.
NOTE
Copying configuration files from a running or active PATROL Console Server is supported
only by using the online backup capability. Copying configuration files into a running or
active PATROL Console Server is not supported. When using the admin_copy tool, copying
configurations between PATROL Console Servers requires that both PATROL Console Server
processes be shut down.
After a secondary PATROL Console Server has been started either by high-availability cluster management software or through manual intervention, additional
time is required for console sessions to be reestablished. This additional time will
vary with the number and size of the management profiles involved.
The way in which a component goes offline also affects recovery time. If a component
is stopped gracefully (PATROL Console Server or RTserver), other components
recognize the loss almost immediately and begin their respective failovers. If a
component is lost unexpectedly due to a power or hardware failure, abrupt system
shutdown, or physical network interruption, loss of the component may not be
recognized by other components for several minutes and overall recovery time may
be lengthened.
For more information, see the PATROL Console Server and RTserver Getting Started
guide.
ACTION 6
Using these best practice recommendations and keeping in mind the failover
considerations discussed in this section, update the solution diagram for each
location to reflect the presence of any secondary PATROL Console Servers.
NOTE
Starting with version 7.5.00 of PATROL Central Operator Microsoft Windows Edition,
management profiles can be configured such that all KMs from an agent's preloaded KM list
are loaded in the profile automatically. This feature allows profiles to automatically stay in
sync with changes to the preloaded KM list on each managed node.
If there are distinct groups of console users, PATROL Console Server performance
can be improved by configuring separate PATROL Console Servers for each group of
users. This approach is only practical if the enterprise is large and complex, and if
there is no need to share management profiles between groups.
If the console user community is large but there is overlap in management profile use,
it is better to host the PATROL Console Server on hardware with more capacity than
to divide the console session workload between PATROL Console Servers. The
additional manual effort needed to keep common management profiles synchronized
usually outweighs the performance improvement provided by a second PATROL
Console Server.
PATROL Central Operator Microsoft Windows Edition is fully supported under
Microsoft Windows Terminal Services. The memory requirement for each console
session depends on the number of concurrent managed objects, and the CPU
utilization depends on the activity of each console session. A good technique for
sizing the Terminal Services host system is to configure a typical session and multiply
its resource consumption by the anticipated number of concurrent console sessions.
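The sizing technique described above is simple multiplication. The following sketch illustrates it; the per-session figures and session count are hypothetical placeholders, not measured values from BMC documentation:

```python
# Sketch of the Terminal Services sizing technique described above.
# All numbers are hypothetical; measure a real representative session first.
per_session_memory_mb = 300  # memory consumed by one typical console session
per_session_cpu_percent = 4  # average CPU used by one typical console session
anticipated_sessions = 12    # expected number of concurrent console sessions

# Multiply one session's resource consumption by the anticipated session count
required_memory_mb = per_session_memory_mb * anticipated_sessions
required_cpu_percent = per_session_cpu_percent * anticipated_sessions

print(required_memory_mb)    # 3600
print(required_cpu_percent)  # 48
```

Remember to leave headroom beyond these totals for the operating system and for activity spikes.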
Preloading KMs
Poll times
Disabling selected application classes
Placement of PCM in the enterprise is dictated by the needs of its users. A single
instance of PCM can propagate rulesets to all nodes across an enterprise, so its
geographical or network location is not critical. Distribution of rulesets through a
firewall requires an open port for PCM.
Some managed node attributes cannot be controlled using PCM because of the way
those attributes are implemented. An example is data collection blackout. Some KMs
are hard-coded to assure that their data collectors are available continuously, so if
data collection is stopped with PCM, the local KM code will automatically re-enable
it. Always verify compatibility of KM attributes before attempting to control them
with PCM.
PCM implements locking on its configuration files, so only one modification session
may be open at a time.
If responsibilities for the enterprise are divided among multiple administrators,
consider implementing separate PCM instances for each administrative team.
Best Practice Recommendation BP17
Include no more than 700 managed nodes in any agent group (agent.ini file). Configure PCM
to allow no more than 20 concurrent ruleset distributions.
NOTE
All components and patches for PATROL products are currently packaged for use
with the Distribution Server.
The more components defined in a collection, the greater the chance for a version
conflict and subsequent distribution failure.
Deployment times are higher for large collections than for small ones.
Small collections deploy more rapidly than large ones, but may require multiple
reboots of the same computer (depending on the software being deployed).
Large collections may require more administrative effort than small ones, since the
collection must be updated when any component in it changes.
Security Considerations
PATROL security level settings are defined by each customer's security policy and
are largely beyond the scope of this document, except to note that PATROL Console
Server 7.5.00 supports limited interoperability between consoles and agents
running at different security levels.
Starting with the 7.5.00 release, the RT cloud connection to a PATROL Console Server
can be configured with its own security level. All other components in that cloud
(agents and consoles) have to be at the same security level, but RT clouds at different
security levels can be connected to the PATROL Console Server transparently to the
consoles and agents in those different clouds. For more information, see the PATROL
Console Server and RTserver Getting Started guide.
Chapter 4
Overview
Once the logical design is complete, PATROL Central infrastructure components
must be assigned to specific servers. To accomplish this, all components must be
reviewed by type and assigned to computers in the solution diagram. This process
typically requires several passes through the design as the capacities of existing
servers and their abilities to host particular components are assessed.
ACTION 7
Using guidelines presented in this section, determine the hardware needed to host all
components identified in the logical design phase. Update solution diagrams to
reflect CPU, memory, and disk capacity of all infrastructure servers.
Hardware Requirements
The servers needed to host PATROL infrastructure vary in configuration based on the
components they are called upon to run and the workload they are required to bear.
While these factors vary widely from one environment to another, broad guidelines
are presented in the following sections to help with computer selection and
configuration. Except as noted, these computers are dedicated systems that only run
PATROL infrastructure and do not run major applications.
Table 1
Resource        Minimum Requirements                 Recommendation
Processor       Single Processor,                    Dual Processor,
                Intel Pentium III at 800 MHz         Intel Pentium III at 800 MHz
  Solaris       Single Processor,                    Dual Processor,
                SUN Netra X1 at 500 MHz              SUN 420R UltraSPARC II at 450 MHz
  AIX           Single Processor,                    Dual Processor,
                IBM pSeries POWER 3 II at 450 MHz    IBM pSeries POWER 3 II at 375 MHz
Server Memory   512 MB                               1 GB
Disk Space      300 MB                               300 MB
Table 2
Resource        Minimum Requirements                  Recommendation
Processor       Dual Processor,                       Quad Processor,
                Intel Pentium 4 Xeon at 2000 MHz      Intel Pentium III at 900 MHz
  Solaris       Dual Processor,                       Quad Processor,
                SUN V240 UltraSPARC III at 1000 MHz   SUN V480/880 at 900 MHz
  AIX           Dual Processor,                       Quad Processor,
                IBM pSeries POWER 4 at 1000 MHz       IBM pSeries POWER 4 at 1000 MHz
Server Memory   3 GB                                  4 GB
Disk Space      1 GB                                  2 GB
RTserver Hardware
If the requirements of a PATROL infrastructure server exceed the recommended
capacity of a single computer or if special circumstances warrant, RTservers can be
installed on separate computers (primarily to support one or more clouds of PATROL
Agents). In such circumstances it is usually best to choose a computer with a
relatively fast CPU and large memory in order to allow the RTserver to maintain high
throughput under peak conditions. This need notwithstanding, a dedicated RTserver
computer will typically have spare capacity for use by other PATROL infrastructure
components such as a Notification Server or a PATROL Reporting aggregator. There
are scenarios where multiple RTservers can be hosted on the same computer to
reduce the hardware costs associated with hosting PATROL infrastructure.
Table 3 shows the hardware requirements for a single RTserver running on a
dedicated computer, separate from the PATROL Console Server or PATROL Central
Operator Web Edition.
Table 3
Resource        Minimum Requirements                  Recommendation
Processor       Single Processor,                     Single Processor,
                Intel Pentium III at 733 MHz          Intel Pentium 4 at 1.4 GHz
  Solaris       Single Processor,                     Single Processor,
                SUN UltraSPARC IIi at 400 MHz         SUN UltraSPARC IIe at 500 MHz
  AIX           Single Processor,                     Single Processor,
                IBM pSeries POWER 3-II at 333 MHz     IBM pSeries POWER 3-II at 500 MHz
Server Memory   512 MB                                1 GB
Disk Space      300 MB                                300 MB
Starting with the release of RTserver version 6.5.01, PATROL Central supports
configurations with more than one RTserver on the same computer. Depending on its
speed and workload (the total number of RT clients connected to all RTservers on the
computer), a single computer can host up to four RTservers. Table 4 on page 43 and
Table 5 on page 44 show the recommended hardware for different workloads:
Table 4
Resource        Recommendation
Processor       Single Processor, Intel Pentium 4 at 1400 MHz
  Solaris       Single Processor, SUN UltraSPARC IIe at 500 MHz
  AIX           Single Processor, IBM pSeries POWER 3-II at 375 MHz
Server Memory   1 GB
Disk Space      300 MB
Table 5
Resource        Recommendation
Processor
  Linux         Dual Processor, Intel Pentium 4 at 1400 MHz
  Solaris       Single Processor, SUN V210 UltraSPARC IIIi at 1 GHz, or
                Dual Processor, SUN 280R UltraSPARC III at 750 MHz
  AIX           Single Processor, IBM pSeries POWER 4 at 1200 MHz, or
                Dual Processor, IBM pSeries POWER 3 II at 375 MHz
  Windows       No more than two RTservers can run on the same
                Windows-based computer. For more information on
                configuration, see RTserver Hardware on page 42.
Server Memory   2 GB
Disk Space      300 MB
For more information about configuring more than one RTserver to run on the same
computer, see the PATROL Console Server and RTserver Getting Started guide.
When running more than one RTserver on the same computer, the number of backup
and primary RTservers on the same system should be balanced. If two agent RT
clouds are supported by two computers, for example, one computer should serve as
the primary for cloud A and the backup for cloud B while the other computer serves
as the primary for cloud B and the backup for cloud A. This distributes the workload
across both RTservers and minimizes the impact of losing a single computer.
An RTserver can be run on the PATROL Console Server computer to support a cloud
of PATROL Central consoles, but all agent RT clouds should be hosted on other
hardware.
Best Practice Recommendation BP22
Set PATROL Console Server overload protection limits based on the estimated workload of
each PATROL Console Server in the solution design. For more information about overload
protection and the type of limits that are available, see the PATROL Console Server and
RTserver Getting Started guide.
Table 6
Resource        Minimum Requirements                  Recommendation
Processor       Dual Processor,                       Dual Processor,
                Intel Pentium 4 at 1400 MHz           Intel Pentium 4 at 2000 MHz
  Solaris       Dual Processor,                       Dual Processor,
                SUN 280R UltraSPARC III at 750 MHz    SUN V240 UltraSPARC IIIi at 1 GHz
  AIX           Dual Processor,                       Dual Processor,
                IBM pSeries POWER 3-II at 450 MHz     IBM pSeries POWER 4 at 1200 MHz
Server Memory   3 GB                                  4 GB
Disk Space      1 GB                                  2 GB
Table 7
Resource        Minimum Requirements                  Recommendation
Processor
  Linux         Dual Processor,                       Quad Processor,
                Intel/HP Itanium 2 at 900 MHz         Intel/HP Itanium 2 at 900 MHz
  Solaris       Dual Processor,                       Quad Processor,
                SUN V240 at 1280 MHz                  SUN V480/880 at 900 MHz
  AIX           Dual Processor,                       Quad Processor,
                IBM pSeries POWER 4 at 1450 MHz       IBM pSeries POWER 4 at 1200 MHz
  Windows       Due to the virtual memory limitations of 32-bit hardware,
                large PATROL Console Server configurations cannot be hosted
                on a single Windows computer.
Server Memory   4 GB                                  6 GB
Disk Space      3 GB                                  4 GB
Processor                                         Memory
Single Processor, Intel Pentium III at 500 MHz    128 MB
Single Processor, Intel Pentium III at 700 MHz    512 MB
Single Processor, Intel Pentium III at 700 MHz    1 GB
NOTE
Profiles with 500 or more agents take several minutes to finish loading.
All infrastructure components can now coexist with other infrastructure components
provided the host has sufficient memory and CPU bandwidth.
A Common Connect client (for example, the PEM CSE client) can be installed on the
same host as the PATROL Console Server and RTserver.
The system requirements of the PATROL Console Server are relatively high, so when
possible place the PATROL Console Server and RTserver on a dedicated host. Try to
keep that system free of other applications.
Never install a PATROL Console Server or RTserver on a computer running any of the
following products:
Always check product release notes to obtain the latest information about
incompatibilities and configuration techniques for co-locating components.
PATROL Central Operator Web Edition version 7.1.10 cannot be installed on the same
computer as PATROL Central Alerts Web Edition version 7.2.01. See the following
Product Flash for further information:
http://documents.bmc.com/supportu/documents/37/38/43738/Output/090f44b180282615.htm
When the environment or the hardware mandates the presence of multiple RT clouds,
provide an RTserver for PATROL Central clients that is separate from the RTserver used
by the PATROL Agents. The RTservers used to implement the console RT cloud can be
hosted on the same computers as the PATROL Console Server and PATROL Central
Operator Web Edition.
Do not connect more than 700 PATROL Agents directly to a single RTserver or RT cloud.
RT Cloud Configuration
RTservers are capable of handling large numbers of client connections and can be
configured to support messaging failover through appropriate client configuration.
Their proper configuration is key to the performance and stability of a PATROL
Central implementation.
Two or more RTservers in the same infrastructure create what BMC Software refers
to as an RT cloud (or sometimes a messaging domain). An RTserver can join an RT
cloud in either of two ways:
1. By obtaining the names of other RTservers in an RT cloud from its configuration
file (rtserver.cm)
2. By invitation from another RTserver
When deciding which RTservers should establish connections with others, the
following guidelines help prevent or reduce unexpected performance and stability
issues:
Best Practice Recommendation BP26
Ensure that communication between any two RTservers is unidirectional; never let the
server_names variable in the configuration files of two RTservers contain one another's
names.
For the simple case of a cloud containing just a primary and a backup RTserver, set the
server_names variable in the primary RTserver to UNKNOWN and set the
server_names variable in the secondary RTserver to point to the primary RTserver.
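For this primary/backup case, the corresponding rtserver.cm entries might look like the following sketch. The host name rt-primary is illustrative, and port 2059 follows the convention used elsewhere in this document; verify exact syntax against the PATROL Console Server and RTserver Getting Started guide.

```
# rtserver.cm on the primary RTserver: never initiates RT-to-RT connections
server_names = UNKNOWN

# rtserver.cm on the secondary (backup) RTserver: points at the primary,
# keeping the connection unidirectional
server_names = tcp:rt-primary:2059
```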
Plan and control the direction of initialization through firewalls. The preferred
configuration consists of separate RT clouds within each DMZ, with the PATROL
Console Server initiating connections to each through the firewall.
If several DMZs or remote sites are linked into a single RT cloud using hub-and-spoke
architecture, the directional connections should be configured as follows. To permit
server RT-1 within the corporate intranet to contact server RT-2 on the other side of the
firewall and invite it to join the RT cloud, put the name of RT-2 in the server_names
variable on RT-1 and set the server_names variable on RT-2 to "UNKNOWN". This will
prevent RT-2 from joining unless RT-1 establishes the relationship.
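A sketch of the two configuration files for this RT-1/RT-2 example follows. Port 2059 is the port used in other examples in this document; the option syntax should be confirmed in the PATROL Console Server and RTserver Getting Started guide.

```
# rtserver.cm on RT-1 (inside the corporate intranet):
# names RT-2 so that RT-1 initiates the connection through the firewall
server_names = tcp:RT-2:2059

# rtserver.cm on RT-2 (in the DMZ):
# UNKNOWN prevents RT-2 from initiating any RT-to-RT connection
server_names = UNKNOWN
```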
Do not assign more than two RTservers to a server_names variable; experience has
shown it to be of no benefit.
If you use the online backup capability, consider the maximum number of
concurrent read-write profile users when determining the frequency of the online
backup schedule, especially in cases where relatively large profiles (more than 200
agents) are opened for read-write use on a regular basis. Backup frequencies of
once per hour or longer will minimize performance impact in large environments.
Depending on the disk I/O performance of the PATROL Console Server computer,
the backup frequency can be increased or decreased.
Install separate copies of the PATROL Console Server on local private disks of
each participating host in the cluster.
Use the same configuration information for all hosts in the cluster.
Use the same impersonation database for all hosts in the cluster.
Use the same access control database for all hosts in the cluster.
Use the same Management Profiles for all hosts in the cluster.
Use the same PATROL Console Server log files for all hosts in the cluster.
For more information on replication, see the PATROL Console Server and RTserver
Getting Started guide.
Chapter 5
Perform the following steps for every RT cloud, one cloud at a time:
A. Install RTservers
B. Install PATROL Agents on all managed nodes
Install the PATROL Console Server and configure it to service the required RT
clouds
Planning Ahead
As a matter of good planning, it is helpful to determine in advance those systems
where special privileges are needed to install, configure, and test PATROL
infrastructure. Whenever possible, early arrangements should be made for creation of
the necessary accounts and privileges and for assistance from appropriate system
administrators. If changes require that any systems be re-booted before they take
effect, those activities should be coordinated in advance so that users are not
interrupted and normal business is not disrupted. The Stakeholder Roster completed
during the Analysis Phase can be used to identify people who may be affected.
Appendix A
NOTE
With the release of PATROL Agent version 3.5.32.08, it is possible to configure RTserver
connection information with a PCM rule. The object name is "/AgentSetup/rtServers" and its
value is the configuration string (for example, value = tcp:bmchost:2059,tcp:bmchost2:2059).
With the release of PATROL Central Operator Microsoft Windows Edition version
7.2.00 it is possible to configure RTserver connection information through the GUI.
For more information, see the PATROL Console Server and RTserver Getting Started
guide.
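Based on the object name and value format given in the note above, such a rule could be sketched as a standard PATROL configuration change entry. The host names bmchost and bmchost2 are examples taken from the note; verify the exact rule syntax against your PCM and PATROL Agent documentation.

```
PATROL_CONFIG
"/AgentSetup/rtServers" = { REPLACE = "tcp:bmchost:2059,tcp:bmchost2:2059" }
```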
In cases where only PATROL Central consoles are using the cloud, the recommended
limit of 700 agents provides for an additional 50 RT connections for other types of
PATROL applications. In cases where PATROL Reporting or Common Connect are
using the cloud as well, the recommended limit for the number of agents is 500 (550
total RT clients).
The RTserver configuration option max_client_conns controls the maximum number
of connections to the RTserver. The default value for RT 6.6.00 is 500, but it may be
increased to any value less than or equal to 750.
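As a sketch, raising this limit is a single rtserver.cm entry (750 being the documented maximum):

```
# rtserver.cm: raise the client connection limit from the default of 500.
# Values above 750 are not supported.
max_client_conns = 750
```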
Figure 1
Keep the setting for max_client_conns the same for all RTservers in the same cloud.
This ensures that if a failover occurs, the backup RT will be able to accept as many
connections as the primary.
If an environment has a single cloud or a dedicated cloud for consoles, place the
primary RTserver on the same computer as the PATROL Console Server and the
backup RTserver on the same computer as PATROL Central Operator Web
Edition.
An exception to this rule exists for cases where platform support for the RTserver
differs from that of the PATROL Console Server or PATROL Central Operator
Web Edition. In such cases, RTservers must be placed on separate computers from
the PATROL Console Server or PATROL Central Operator Web Edition.
In the case of a two-RTserver cloud, all RT clients connected to the cloud should
have the same RT locator string to ensure that they use the same RTserver for
primary and backup. For instance, in Figure 1 on page 60 the order of RTservers
for the "Consoles" cloud is the same in PATROL Central Operator Microsoft
Windows Edition, in PATROL Central Operator Web Edition, and in the
Acfg_7_1_0_RTCloud_Connection configuration instance for the PATROL
Console Server.
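For example, every client of the "Consoles" cloud would carry an identical two-entry locator string, in the same order. The host names rtA and rtB are hypothetical:

```
# Same locator string, same order, configured in PATROL Central Operator
# Microsoft Windows Edition, PATROL Central Operator Web Edition, and the
# PATROL Console Server's RT cloud connection instance
tcp:rtA:2059,tcp:rtB:2059
```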
behind the corporate firewall, and by configuring the RTserver behind the firewall to
connect to the RTserver(s) in the DMZ. Starting with PATROL Console Server version
7.3.00, an alternative configuration is preferred. Instead of using RTserver-to-RTserver connections across the firewall, RTservers in the DMZ can be configured as
a standalone RT cloud and the PATROL Console Server inside the firewall can
connect directly to the RT cloud in the DMZ. For added security, the PATROL
Console Server configuration for each cloud includes an option that prevents the
PATROL Console Server from advertising its presence. For more information on this
option, see the PATROL Console Server and RTserver Getting Started guide.
The total number of agents in the hub-and-spoke cloud does not exceed 700.
The total number of RTservers in the hub-and-spoke cloud does not exceed 12.
Keep in mind that each remote office should have two RTservers.
The hub RTserver should be positioned on the same side of the WAN connection
as the PATROL Console Server.
Figure 2
Building on the example in Figure 2, the cloud named "Agents 2" has been modified
to use a hub-and-spoke configuration. The hubs are RTservers E and F. The RTservers
in each remote office serve as spokes. In this example, all of the RTservers in remote
offices should be configured with their server_names option set to
tcp:E:2059,tcp:F:2059. Note that the total number of RTservers in the "Agents 2" cloud
is 10 (below the limit of 12). If each remote office has 50 agents, the total number of
agents in the central office (those connected to RTservers E and F) should not exceed
500 (700 max - (4 x 50)).
Figure 3
PATROL Console Server 7.5.00 records the following types of log messages when it
encounters this second scenario:
ERROR:1/25/2005 10:09:22 AM:::RT Cloud 'Agents - 1' has duplicate rtservers with RT cloud 'Agents - 2'
WARNING:1/25/2005 10:09:22 AM:COS_BASE_SUBS:::cos_framework.cpp(663):Cloud[Agents - 2] RTServer[0]: /_C_8401
WARNING:1/25/2005 10:09:22 AM:COS_BASE_SUBS:::cos_framework.cpp(663):Cloud[Agents - 2] RTServer[1]: /_D_8293
WARNING:1/25/2005 10:09:22 AM:COS_BASE_SUBS:::cos_framework.cpp(663):Cloud[Agents - 2] RTServer[2]: /_E_8312
WARNING:1/25/2005 10:09:22 AM:COS_BASE_SUBS:::cos_framework.cpp(663):Cloud[Agents - 2] RTServer[2]: /_F_8927
ERROR:1/25/2005 9:23:19 AM:::RT Cloud 'Agents - 1' has duplicate rtservers with RT cloud 'Agents - 2' forcing disconnect.
INFORM:1/25/2005 9:23:19 AM::11.221:Disconnecting from 'Agents - 2'
In general, these types of messages are an indication that there is a problem either
with the RT to RT configurations or the definition of the RT clouds in the PATROL
Console Server configuration. To correct these problems, list the RTservers identified
in each of the PATROL Console Server configuration entries for the cloud names
listed in the message. For each of those RTservers, locate the server_names option in
their respective rtserver.cm files. Based on the server_names values, map out the RT to
RT connections and compare that to the list of RTservers from the PATROL Console
Server configuration entries. Depending on what you find, correct either the PATROL
Console Server configuration file or the RT to RT configuration to eliminate the
redundancy. In this particular example, the problem is in the RT to RT configuration
for RTserver D.
NOTE
Changes to PATROL Console Server configuration require a restart of the console server, and
changes to RTserver configuration require a restart of the RTserver.
Figure 4
An agent can be manually moved from one RT cloud to another using a procedure
outlined in the PATROL Console Server 7.3.00 Release Notes (September 30, 2004). Refer
to the section on "Failover of PATROL Agents From One RTserver Cloud to Another".
Can RTserver versions 6.2, 6.5.01, and 6.6 be mixed in the same cloud?
Create one cloud for consoles and one or more clouds for agents.
Remove the hub RTservers that only connected to other RTservers in the previous
architecture.
The following three figures show before-and-after scenarios for a small hub-and-spoke
design with four RTservers (Figure 5 on page 69). In this example there are two
possible configurations for the "after" case depending on the hardware available
(Figure 6 on page 70 and Figure 7 on page 71).
Figure 5
In the first case, as illustrated in Figure 6 on page 70, the hub-and-spoke
implementation in Figure 5 can be reduced to just two computers.
Figure 6
In the second case, as illustrated in Figure 7 on page 71, the small hub-and-spoke
cloud in Figure 5 on page 69 can be divided into two clouds of two RTservers each:
one cloud for consoles and one cloud for agents.
Figure 7
Figure 8
In this example, the number of RTserver computers can be reduced by two because
hubs E and F can be removed. The resultant multi-cloud configuration is equivalent
to that illustrated in Figure 7 on page 71: one cloud for consoles composed of
RTservers A (primary) and B (backup), and one cloud for agents composed of
RTservers C (primary) and D (backup). As part of this migration, all agents must be
reconfigured to use RTservers C and D.
When running the RTserver through a firewall, it is best to configure the RTserver
and the firewall to use a non-well-known port.
How many users can PATROL Central Operator Web Edition support?
How large a profile can PATROL Central Operator Microsoft Windows Edition support?
What versions of the PATROL Central infrastructure are supported by the PATROL Infrastructure Monitor?
Place the PATROL Infrastructure Monitor on the same computer as the PATROL
Console Server.
Make localized versions of rules and rulesets for agents that require unique
variations
Don't be afraid to duplicate specific rules into different rulesets when it improves
ease of use
Use an effective naming convention for rulesets. This will add consistency to the
way the tool is used by different users.
Once an Agent has been configured and tested, use "Get Configuration" to save its
configuration in the History ruleset folder. This consolidated ruleset will then be
available to apply to other systems.
Appendix B
Contact Phone/Email
Role/Interest
Deployment/Installation
Configuration/Change Management
PATROL Support
Operations/Event Management
Visualization Information
Typical number of Knowledge Modules used on each managed node:
Typical number of managed objects on each managed node:
Number of PATROL Central Operator Microsoft Windows Edition consoles:
Number of PATROL Central Operator Web Edition users (daily / infrequently):
Number of concurrent PATROL Central Operator console sessions:
Typical number of managed nodes in each management profile:
Will management profiles include nodes at other locations?
If so, indicate which other locations:
Typical number of managed objects in each management profile:
Estimated total number of concurrent managed objects:
Appendix C
This appendix includes forms and diagrams documenting the analysis and logical
design steps for a hypothetical PATROL Central implementation at MainCore, Inc.
This appendix presents the following topics:
Sample Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Example Stakeholder Roster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Example Location Analysis Worksheet for Houston . . . . . . . . . . . . . . . . . . . . . . . . 90
Example Location Analysis Worksheet for Austin . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Example Location Analysis Worksheet for New York . . . . . . . . . . . . . . . . . . . . . . . 92
Example Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Sample Scenario
MainCore has offices in Houston, Austin, and New York City. Their primary
operations center is in Houston, with a secondary center in New York. Their main
interest in PATROL is to monitor the operating system and Oracle databases on their
servers. They will also use PATROL for Microsoft Exchange Servers to monitor two
mail servers in Houston and two others in New York.
The sample Stakeholder Roster includes contacts in the Systems Administration,
Oracle DBA, and Accounting groups. Participation by the system administrators and
DBAs will be needed for the actual implementation, and the Accounting group is
interested because the Oracle databases being monitored support financial
operations.
The analysis of this environment resulted in the following conclusions that were used
to complete the logical design:
The network links between offices have sufficient bandwidth and reliability to
permit those offices to share infrastructure components.
The managed node counts in both Houston and Austin are sufficiently small to
permit one RTserver in each location to service both (a secondary is provided in
each RT cloud for failover).
The New York office does not need any infrastructure components installed
locally.
The managed object counts were estimated using guidelines for OS and Oracle
KMs. PATROL for Microsoft Exchange Servers was not used because there are
only four instances of it in the enterprise (it is not used on a "typical" node).
The estimated total concurrent managed object count is 1,440,000, which is within
the capacity of a single PATROL Console Server.
The estimated number of concurrent console sessions is 10, which is within the
capacity of a single PATROL Console Server.
Contact Phone/Email
Role/Interest
John Smith
Deployment/Installation
John Smith
Configuration/Change Management
Janet Green
PATROL Support
John Smith
Operations/Event Management
Bill Landry
Judy Jones
Operations/Event Management
Mary Ward
Director of IT
Sam Leonard
Accounting Manager
Visualization Information
Typical number of Knowledge Modules used on each managed node: 3 (OS, Oracle, KM for EM)
Typical number of managed objects on each managed node: 1800 (2 * 900)
Number of PATROL Central Operator Microsoft Windows Edition consoles: 3
Number of PATROL Central Operator Web Edition users (daily / infrequently): 0
Number of concurrent PATROL Central Operator console sessions: 3
Typical number of managed nodes in each management profile: 80
Will management profiles include nodes at other locations? Yes
If so, indicate which other locations: Austin, New York
Typical number of managed objects in each management profile: 144,000 (1800 * 80)
Estimated total number of concurrent managed objects: 432,000 (144,000 * 3)
Visualization Information
Typical number of Knowledge Modules used on each managed node: 3 (OS, Oracle, KM for EM)
Typical number of managed objects on each managed node: 1800 (2 * 900)
Number of PATROL Central Operator Microsoft Windows Edition consoles: 0
Number of PATROL Central Operator Web Edition users (daily / infrequently): 0 / 2
Number of concurrent PATROL Central Operator console sessions: 2
Typical number of managed nodes in each management profile: 80
Will management profiles include nodes at other locations? Yes
If so, indicate which other locations: Houston, New York
Typical number of managed objects in each management profile: 144,000 (1800 * 80)
Estimated total number of concurrent managed objects: 288,000 (144,000 * 2)
Visualization Information
Typical number of Knowledge Modules used on each managed node: 3 (OS, Oracle, KM for EM)
Typical number of managed objects on each managed node: 1800 (2 * 900)
Number of PATROL Central Operator Microsoft Windows Edition consoles: 3
Number of PATROL Central Operator Web Edition users (daily / infrequently): 2 / 0
Number of concurrent PATROL Central Operator console sessions: 5
Typical number of managed nodes in each management profile: 80
Will management profiles include nodes at other locations? Yes
If so, indicate which other locations: Houston, Austin
Typical number of managed objects in each management profile: 144,000 (1800 * 80)
Estimated total number of concurrent managed objects: 720,000 (144,000 * 5)
Example Diagrams
Figure 9
Figure 10
Figure 11
Figure 12
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Glossary
A
Agent
Agents are the active management components of the PATROL infrastructure. An agent controls
the monitoring and management of a host computer, its resources, and its applications.
PATROL Agents work autonomously from one another and from other components, but
communicate with PATROL client applications, common infrastructure components,
integration products, and third-party applications developed to run and cooperate with
PATROL.
Agent RT Cloud
An agent RT cloud is a defined combination of PATROL agents and the RTserver that services
them (including a secondary RTserver if one is present). It is an RT cloud intended primarily to
service PATROL agents.
Availability Checker
The Availability Checker is a PATROL Knowledge Module that performs PATROL Agent and
host availability checking. This KM does not scale well to large numbers of managed nodes and
should not be deployed on an enterprise scale; it should only be used to troubleshoot problems.
B
Blackout
Blackout is a generic term for predefining a change in PATROL's behavior based on date and
time. In various contexts it may refer to interrupting data collection, preventing parameter
threshold checking, suppressing event generation, or blocking event propagation. When using
the term, establish the context to indicate what action is being discussed.
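The date-and-time aspect of a blackout can be illustrated with a minimal window check. This is generic Python, not PATROL configuration syntax, and the window values are assumptions.

```python
from datetime import datetime, time

def in_blackout(now: datetime, start: time, end: time) -> bool:
    """True if `now` falls inside the daily blackout window [start, end)."""
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end   # window crosses midnight, e.g. 22:00-02:00

# Example: suppress threshold checking during a nightly 22:00-02:00 window
suppress = in_blackout(datetime(2005, 1, 1, 23, 30), time(22, 0), time(2, 0))
```

Which action is suppressed (collection, thresholds, event generation, or propagation) depends on the context in which the blackout is defined, as noted above.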
C
Common Connect
Common Connect is a generic event API that allows integration between PATROL Central
infrastructure and other event management infrastructure such as PEM.
Console RT Cloud
A console RT cloud is a defined combination of PATROL Central consoles, a PATROL Console
Server, and the RTserver that services them (including a secondary RTserver if one is present). It
is an RT cloud intended primarily to service PATROL Central consoles.
Console Server
The PATROL Console Server is the component of PATROL Central infrastructure that provides
non-GUI console services, including authentication, user impersonation, management profile
definition, and online help.
D
Distribution Server
The Distribution Server is used to install, reinstall, deploy, and uninstall BMC Software
distributed system products, upgrades, and patches from a central location.
F
Failover
Failover is the process by which functioning components identify and reconfigure themselves to
use a backup service following the failure of a primary one.
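The pattern can be sketched generically: a client tries the primary endpoint and, on failure, reconfigures itself to use the backup. This is an illustrative sketch under assumed endpoint names, not PATROL's actual failover mechanism.

```python
def connect_with_failover(endpoints, try_connect):
    """Return (endpoint, connection) for the first endpoint that responds."""
    last_error = None
    for endpoint in endpoints:
        try:
            return endpoint, try_connect(endpoint)
        except ConnectionError as err:
            last_error = err          # remember the failure, try the next (backup) endpoint
    raise last_error

# Usage with a stand-in connector; the RTserver names are hypothetical.
def fake_connect(endpoint):
    if endpoint == "rtserver-primary":
        raise ConnectionError("primary down")
    return f"session:{endpoint}"

chosen, session = connect_with_failover(
    ["rtserver-primary", "rtserver-secondary"], fake_connect)
```

The key design point is that failover is driven by the surviving components: each client holds an ordered list of services and walks it until one answers.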
H
Hub RTserver
A hub RTserver is an RTserver that connects to other RTservers rather than directly to
managed nodes. Starting with version 7.3.00 of the PATROL Console Server, hub RTservers
are generally no longer recommended and should be used only in special circumstances.
K
KeepAlive
A KeepAlive is a message sent by an RTserver process to another RTserver process. By
responding to a KeepAlive, an RTserver process signals that it is available to exchange
information.
Knowledge Module
Knowledge modules are loadable libraries of specialized knowledge and functionality that tell
the PATROL Agent how to manage various aspects of server operation. Multiple Knowledge
Modules may be loaded in a single agent, expanding the agent's ability to manage a server's
resources and applications. Knowledge Modules provide the domain-specific intelligence on
every managed server.
Knowledge Module for Event Management
The Knowledge Module for Event Management is the managed node component of PATROL
Configuration Manager that is responsible for interpreting PCM rules and applying them to
KMs and other components on the managed node. Also see Notification Server.
Knowledge Module for Log Management
The Knowledge Module for Log Management is a PATROL Knowledge Module that is
responsible for scraping text log files and generating events or initiating recovery actions in
response to the presence of selected text or text patterns.
L
Location
In the context of this document a "location" is a geographic site or network segment separated
from other similar areas by a network firewall or by a network link with less than 10 Mb of bandwidth.
M
Managed Node
A managed node is a logical or physical computer system where a PATROL Agent is executing
for the purpose of monitoring and managing the business workload running on that system.
Managed Node RTserver
A managed node RTserver is an RTserver which connects directly to and services managed
nodes.
Managed Object
A managed object is an item whose attributes and status are maintained by a PATROL Console
Server, such as an Application Class instance or parameter.
Management Profile
A management profile is a defined collection of managed nodes and Knowledge Modules to be
displayed in a PATROL Central console session.
Messaging Domain
See RT cloud.
N
Notification Server
The Notification Server is a configuration of the PATROL Knowledge Module for Event
Management that provides message rewording, event consolidation and forwarding, and
centralized recovery in response to events forwarded from other managed nodes.
P
PATROL 7 (P7) Knowledge Module
See PATROL Infrastructure Monitor.
PATROL Central Administrator
PATROL Central Administrator is a console module optionally installed into the PATROL
Central console application. It provides the functionality required for administrators of
PATROL infrastructure to manage the access, privileges, and rights of those in the organization
who will view and interact with PATROL Agents installed throughout the enterprise. PATROL
Central Administrator is generally only installed into PATROL Central client applications
responsible for administering PATROL infrastructure and other PATROL users.
PATROL Central Operator
PATROL Central Operator is a console module installed into the PATROL Central Console
infrastructure component. It provides the functionality needed by end users (operators) of the
information gathered by PATROL Agents. PATROL Central Operator is generally installed in
all PATROL Central client applications.
PATROL Central Operator Microsoft Windows Edition
PATROL Central Operator Microsoft Windows Edition provides management console
infrastructure that can be extended by installing console modules. Once console modules are
installed, PATROL Central Operator Microsoft Windows Edition provides a native Windows
interface to all modules regardless of the functionality they provide.
PATROL Central Operator Web Edition
PATROL Central Operator Web Edition provides management console infrastructure that can
be extended by installing console modules. Once console modules are installed, PATROL
Central Operator Web Edition provides a Web browser-based interface to all modules
regardless of the functionality they provide.
PATROL Configuration Manager
PATROL Configuration Manager is used to perform PATROL Agent and Knowledge Module
configuration and customization from a centralized location. PCM implements a "rule and
ruleset" model, where customizations and configuration specifics are encapsulated in one or
more rules. Logically related rules may be aggregated together in rulesets, and both rules and
rulesets can be deployed to one or more PATROL Agents in a single action.
PATROL Infrastructure Knowledge Module
See PATROL Infrastructure Monitor.
PATROL Infrastructure Monitor
The PATROL Infrastructure Monitor (formerly called the PATROL Infrastructure Knowledge
Module or the PATROL 7 Knowledge Module) is a PATROL Knowledge Module that is
responsible for monitoring the RT cloud and the general health of the RT infrastructure.
R
RT Cloud
An RT cloud is a collection of managed nodes and the RTserver which services them; in the case
where Hub RTservers are used, the RT cloud extends to include the Hub and all other RTservers
and Hubs which are configured to function together. Sometimes referred to as an "RTserver
cloud" or a "messaging domain".
RTserver
Real time servers are the components of PATROL Central infrastructure that provide
communication services between PATROL components. RTservers communicate data between
PATROL Agents, common service components such as PATROL Console Servers, and PATROL
Central client applications such as PATROL Central Operator Microsoft Windows Edition and
PATROL Central Operator Web Edition.
S
Stakeholder
A stakeholder is a person with an interest in the success of a project; a stakeholder may or may
not actively participate in the execution of the project.
W
Web Server
In the context of PATROL Central infrastructure, a Web server is a special instance of a Tomcat
or Apache Web server running a proprietary plug-in that allows it to communicate with and
serve content from a PATROL Central Console Server.