• Why does netca always create a listener that listens on the public IP and not only
on the VIP?
• A customer wants to use rconfig to convert a single instance to RAC but is using
raw devices in RAC. Does rconfig support raw devices?
• Can we designate the place of archive logs on both ASM disk and regular file
system, when we use SE RAC?
• WARNING: No cluster interconnect has been specified. I get this error starting
my RAC database, what do I do?
• Is it supported to install CRS and RAC as different users?
• I have changed my spfile with alter system set <parameter_name> =....
scope=spfile. The spfile is on ASM storage and the database will not start.
• Is it difficult to transition from Single Instance to RAC?
• What are the dependencies between OCFS and ASM in Oracle Database 10g ?
• What is Cache Fusion and how does this affect applications?
• Do we have to have Oracle RDBMS on all nodes?
• What software is necessary for RAC? Does it have a separate installation CD to
order?
• What kind of HW components do you recommend for the interconnect?
• Is rcp and/or rsh required for normal RAC operation ?
• Are there any suggested roadmaps for implementing a new RAC installation?
• Are there any issues for the interconnect when sharing the same switch as the
public network by using VLAN to separate the network?
• Can my customer use Veritas Agents to manage their RAC database on Unix with
SFRAC installed?
• How do I check for network problems on my interconnect?
• Is there a need to renice LMS processes in Oracle RAC 10g Release 2?
• I had a 3-node RAC cluster. One of the nodes had to be completely rebuilt as a
result of a problem. As there are no backups, what is the proper procedure to
remove the 3rd node from the cluster so it can be added back in?
• Where can I find a list of supported solutions to ensure NIC availability (for the
interconnect) per platform?
• What combinations of Oracle Clusterware, RAC and ASM versions can I use?
• Is relink required for CRS_HOME after OS upgrade?
• Does Oracle Clusterware or Real Application Clusters support heterogeneous
platforms?
• Is Infiniband supported for the RAC interconnect?
• Can I run more than one clustered database on a single RAC cluster?
• What is Standard Edition RAC?
• Can I run 9i RAC and RAC 10g in the same cluster?
• I could not get the user equivalence check to work on my Solaris 10 server when
trying to install 10.2.0.1 Oracle Clusterware. The install ran fine without issue. <<
Message: Result: User equivalence check failed for user "oracle". >>
• Does changing uid or gid of the Oracle User affect Oracle Clusterware?
• How many NICs do I need to implement RAC?
• Can we output the backupset onto regular file system directly (not onto flash
recovery area) using RMAN command, when we use SE RAC?
• Should the SCSI-3 reservation bit be set for our Oracle Clusterware only
installation?
• A client is a new RAC user and is using it in conjunction with BEA WebLogic.
Can they use Connection Load Balancing and Services? What about FCF, FAN,
and RCLB?
• Why is validateUserEquiv failing during install (or cluvfy run)?
• How can a NAS storage vendor certify their storage solution for RAC ?
• What are the restrictions on the SID with a RAC database? Is it limited to 5
characters?
• What storage is supported with Standard Edition RAC?
• Can I use iSCSI storage with my RAC cluster?
• What would you recommend to a customer, Oracle Clusterware or vendor
clusterware (e.g. MC/ServiceGuard, HACMP, Sun Cluster, Veritas, etc.) with
Oracle Database 10g Real Application Clusters?
• Can I use RAC in a distributed transaction processing environment?
• Is it a good idea to add anti-virus software to my RAC cluster?
• When configuring the NIC cards and switch for a GigE Interconnect should it be
set to FULL or Half duplex in RAC?
RAC Assistance
High Availability
• Can I use the 10.2 JDBC driver with 10.1 database for FCF?
• What clients provide integration with FAN through FCF?
• Can I use TAF and FAN/FCF?
• How does the datasource properties initialLimit, minLimit, and maxLimit affect
Fast Connection Failover processing with JDBC?
• Do I need to install ONS on all my mid-tier servers in order to enable JDBC
Fast Connection Failover (FCF)?
• Will FAN/OCI work with Instant Client?
• What type of callbacks are supported with OCI when using FAN/FCF?
• Does FCF for OCI react to FAN HA UP events?
• Can I use FAN/OCI with Pro*C?
• Do I have to link my OCI application with a thread library? Why?
Scalability
• I am seeing the wait events 'ges remote message', 'gcs remote message', and/or
'gcs for action'. What should I do about these?
• What are the changes in memory requirements from moving from single instance
to RAC?
• How can I validate the scalability of my shared storage? (Tightly related to RAC /
Application scalability)
• How many nodes are supported in a RAC Database?
• How do I measure the bandwidth utilization of my NIC or my interconnect?
• Does Database blocksize or tablespace blocksize affect how the data is passed
across the interconnect?
• What are my options for setting the Load Balancing Advisory GOAL on a
Service?
• What is the Load Balancing Advisory?
• What is Runtime Connection Load Balancing?
• How do I enable the load balancing advisory?
Manageability
• I found in 10.2 that the EM "Convert to Cluster Database" wizard would always
fall over on the last step where it runs emca and needs to log into the new cluster
database as dbsnmp to create the cluster database targets etc. I changed the
password for the dbsnmp account to be dbsnmp (same as username) and it worked
OK. Is this a known issue?
• What storage option should I use for RAC 10g on Linux? ASM / OCFS / Raw
Devices / Block Devices / Ext3 ?
• How do I stop the GSD?
• What is the purpose of the gsd service in Oracle 9i RAC?
• How should I deal with space management? Do I need to set free lists and free list
groups?
• I was installing RAC and my Oracle files did not get copied to the remote node(s).
What went wrong?
• I have 2 clusters named "crs" (the default), how do I get Grid Control to recognize
them as targets?
• If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need
Oracle Clusterware to run Oracle RAC 10g?
• If using PL/SQL native code, plsql_native_library_dir needs to be defined. In a
RAC environment, must the directory be on shared storage?
• How do I determine whether or not an OneOff patch is "rolling upgradeable"?
• Does RAC work with NTP (Network Time Protocol)?
• What is the Cluster Verification Utility (cluvfy)?
• What versions of the database can I use the cluster verification utility (cluvfy)
with?
• What are the implications of using srvctl disable for an instance in my RAC
cluster? I want to have it available to start if I need it, but at this time I do not
want to run this extra instance for this database.
Platform Specific
• A client is running a Veritas cluster on SunOS 2.9. When we ran the 10.2.0
installer it did not discover the nodes, but with 9i it was able to discover both
nodes. Is there anything specific to be done for a 10.2.0 database install?
• Can I configure IPMP in Active/Active mode to increase the bandwidth of my interconnect?
• Does Oracle Support RAC with Solaris 10 Containers (aka Zones)?
• Does Sun Solaris have a multipathing solution ?
• Does the Oracle Cluster File System (OCFS) support network access through
NFS or Windows Network Shares?
• My customer has a failsafe cluster installed, what are the benefits of moving their
system to RAC?
• When I try to login to the +ASM2 on node2 with asmcmd (after setting
ORACLE_HOME and ORACLE_SID correctly) I get: ORA-01031: insufficient
privileges (DBD ERROR: OCI SessionBegin). When I try to login to +ASM2
using sqlplus (connect / as sysdba) I get the same ORA-01031: insufficient
privileges. When I try to login to +ASM2 using sqlplus (connect sys/passwd as
sysdba) I get connected successfully.
• Can I run my 9i RAC and RAC 10g on the same Windows cluster?
• My customer wants to understand what type of disk caching they can use with
their Windows RAC Cluster, the install guide tells them to disable disk caching?
• Is HACMP needed for RAC on AIX 5.2 using GPFS file system?
• Do I need HACMP/GPFS to store my OCR/voting file on a shared device?
• Is VIO supported with RAC on IBM AIX?
• Can I run Oracle RAC 10g on my IBM Mainframe Sysplex environment (z/OS)?
Diagnosability
• How do I gather all relevant Oracle and OS log/trace files in a RAC cluster to
provide to Support?
• What are the cdmp directories in the background_dump_dest used for?
• What is the optimal migration path to be used while migrating the E-Business
suite to RAC?
• Is the Oracle E-Business Suite (Oracle Applications) certified against RAC?
• Can I use TAF with e-Business in a RAC environment?
• How to configure concurrent manager in a RAC environment?
• Should functional partitioning be used with Oracle Applications?
• Which e-Business version is preferable?
• Can I use Automatic Undo Management with Oracle Applications?
Oracle Clusterware
• A customer is hitting bug 4462367 with an error message about a low open file
descriptor. How do I work around this until the fix ships in the Oracle
Clusterware Bundle for 10.2.0.3 or in 10.2.0.4?
• In the course of failure testing in an extended RAC environment, we find entries in
the cssd logfile which indicate actions like 'diskShortTimeout set to (value)' and
'diskLongTimeout set to (value)'.
Can anyone please explain the meaning of these two timeouts in addition to
diskTimeout?
• Can I run a 10.1.0.x database with Oracle Clusterware 10.2 ?
• Is it supported to rerun root.sh from the Oracle Clusterware installation ?
• My customer has noticed many log files generated under
$CRS_HOME/log/<hostname>/client. Is there any automated way we can set up
through Oracle Clusterware to prevent/minimize/remove those aggressively generated files?
• Can I change the public hostname in my Oracle Database 10g Cluster using
Oracle Clusterware?
• Can I set up failover of the VIP to another card in the same machine, and what do I
do if I have different network interfaces on different nodes in my cluster (e.g. eth0
on nodes 1 and 2, eth1 on nodes 3 and 4)?
• Is it possible to use ASM for the OCR and voting disk?
• Is it supported to allow 3rd Party Clusterware to manage Oracle resources
(instances, listeners, etc) and turn off Oracle Clusterware management of these?
• What is the High Availability API?
• How to move the OCR location ?
• During Oracle Clusterware installation, I am asked to define a private node name,
and then on the next screen asked to define which interfaces should be used as
private and public interfaces. What information is required to answer these
questions?
• Can I change the name of my cluster after I have created it when I am using
Oracle Clusterware?
• Which processes access the OCR ?
• What happens if I lose my voting disk(s)?
• Why does Oracle Clusterware use an additional 'heartbeat' via the voting disk,
when other cluster software products do not?
• Why does Oracle still use the voting disks when other cluster software is present?
• How do I identify the voting file location ?
• How much I/O activity should the voting disk have?
• What is the voting disk used for?
• How do I use multiple network interfaces to provide High Availability and/or
Load Balancing for my interconnect with Oracle Clusterware?
• Does Oracle Clusterware have to be the same or higher release than all instances
running on the cluster?
• Can I use Oracle Clusterware to monitor my EM Agent?
• Can the Network Interface Card (NIC) device names be different on the nodes in
a cluster, for both public and private?
• Can I configure HP's Autoport aggregation for NIC Bonding after the install? (i.e.
not present beforehand)
• When a customer runs the command 'onsctl start', they receive the message
"Unable to open libhasgen10.so". Any idea why?
• What are the IP requirements for the private interconnect?
• How to Restore a Lost Voting Disk used by Oracle Clusterware 10g
• How can I register the listener with Oracle Clusterware in RAC 10g Release 2?
• How is the voting disk used by Oracle Clusterware?
• Does Oracle Clusterware support application vips?
• Why is the home for Oracle Clusterware not recommended to be a subdirectory of
the Oracle base directory?
• Can I use Oracle Clusterware to provide cold failover of my 9i or 10g single
instance Oracle Databases?
• How do I put my application under the control of Oracle Clusterware to achieve
higher availability?
• How do I protect the OCR and Voting in case of media failure?
• With Oracle Clusterware 10g, how do you backup the OCR?
• Does the hostname have to match the public name or can it be anything else?
• Is it a requirement to have the public interface on eth0, or does it only need to be
on an interface numbered lower than the private one (e.g. public on eth1, private
on eth2)?
• How do I restore OCR from a backup? On Windows, can I use ocopy?
• What should the permissions be set to for the voting disk and ocr when doing a
RAC Install?
• What are the default values for the command line arguments?
• How do I check the Oracle Clusterware stack and other sub-components of it?
• Is there a way to verify that the Oracle Clusterware is working properly before
proceeding with RAC install?
• At what point is cluvfy usable? Can I use cluvfy before installing Oracle
Clusterware?
• What is CVU? What are its objectives and features?
• What is a stage?
• What is a component?
• What is nodelist?
• Do I have to be root to use CVU?
• What about discovery? Does CVU discover installed components?
• How do I report a bug (or many bugs)?
• What are the requirements for CVU?
• How do I install 'cvuqdisk' package?
• How do I know about cluvfy commands? The usage text of cluvfy does not show
individual commands.
• Do I have to type the nodelist every time for the CVU commands? Is there any
shortcut?
• How do I get detail output of a check?
• How do I check network or node connectivity related issues?
• Can I check if the storage is shared among the nodes?
• How do I check whether OCFS is properly configured?
• How do I check user accounts and administrative permissions related issues?
• How do I check minimal system requirements on the nodes?
• Is there a way to compare nodes?
• Why does the peer comparison with -refnode say "passed" when the group or user
does not exist?
• How do I turn on tracing?
• Where can I find the CVU trace files?
• Why does cluvfy report "unknown" on a particular node?
• What are the known issues with this release?
• When I run 10.2 cluvfy on a system where RAC 10g Release 1 is running, I get the
following output:
Answers
I have changed my spfile with alter system set <parameter_name> =....
scope=spfile. The spfile is on ASM storage and the database will not
start.
How to recover:
In $ORACLE_HOME/dbs
. oraenv <instance_name>
startup nomount
create pfile='recoversp' from spfile;
(edit the generated pfile to correct the bad parameter; the file name is an example)
Then:
shutdown immediate
startup pfile='recoversp'
quit
Yes, CRS and RAC can be installed as different users. The CRS user and the RAC user
must both have "oinstall" as their primary group, and the RAC user should be a member
of the OSDBA group.
It simply means you do not have the cluster_interconnects parameter set and nothing was
set in the OCR, so the private interconnect is picked at random by the database, hence the
warning.
You can either set the cluster_interconnects parameter in the init.ora to the private
interconnect IP, or use oifcfg getif and setif (type oifcfg without arguments for the help
message):
$ oifcfg getif
eth0 138.2.236.0 global public
eth2 138.2.238.0 global cluster_interconnect
Note that if the hardware is not identical you'll have to provide each node with its own
correct value; if it is identical hardware you can use the -global switch.
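Continuing the example above, a hedged sketch of registering the private network with oifcfg (the interface name and subnet are the example values shown; substitute your own):

```shell
# Register eth2 as the cluster interconnect in the OCR (example values):
oifcfg setif -global eth2/138.2.238.0:cluster_interconnect

# Verify what is now registered:
oifcfg getif
```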
A customer wants to use rconfig to convert a single instance to RAC but is using
raw devices in RAC. Does rconfig support raw devices?
Can we designate the place of archive logs on both ASM disk and regular
file system, when we use SE RAC?
Yes, - customers may want to create a standby database for their SE RAC database so
placing the archive logs additionally outside ASM is OK.
Why does netca always create a listener that listens on the public IP and
not only on the VIP?
This is for backward compatibility with existing clients: consider a pre-10g to 10g server
upgrade. If the upgraded listener listened only on the VIP, clients that have not been
upgraded would no longer be able to reach it.
Each node of a cluster that is being used for a clustered database will typically have the
RDBMS and RAC software loaded on it, but not actual datafiles (these need to be
available via shared disk). For example, if you wish to run RAC on 2 nodes of a 4-node
cluster, you would need to install the clusterware on all nodes, RAC on 2 nodes and it
would only need to be licensed on the two nodes running the RAC database. Note that
using a clustered file system, or NAS storage can provide a configuration that does not
necessarily require the Oracle binaries to be installed on all nodes.
Yes. Oracle Support provides a best-practices roadmap to successfully implement Real
Application Clusters.
Cache Fusion is a new parallel database architecture for exploiting clustered computers to
achieve scalability of all types of applications. Cache Fusion is a shared cache
architecture that uses high speed low latency interconnects available today on clustered
systems to maintain database cache coherency. Database blocks are shipped across the
interconnect to the node where access to the data is needed. This is accomplished
transparently to the application and users of the system. Because Cache Fusion uses at most
a three-point protocol, it scales easily to clusters with a large number of nodes.
For more information about cache fusion see the following links:
Datafiles will need to be moved to either a clustered file system (CFS) or raw devices so
that all nodes can access them. Also, the MAXINSTANCES parameter in the control file
must be greater than or equal to the number of instances you will start in the cluster.
For more detailed information, please see Migrating from single-instance to RAC in the
Oracle Documentation
What are the dependencies between OCFS and ASM in Oracle Database
10g ?
"rcp" and "rsh" are not required for normal RAC operation. However, "rsh" and
"rcp" need to be enabled for RAC and patchset installation. In future releases, ssh
will be used for these operations.
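As a quick sanity check of user equivalence before an install (the hostname node2 is a placeholder), each of these should print the remote date without prompting for a password:

```shell
# ssh-based equivalence (used by the 10g installers where available):
ssh node2 date

# rsh-based equivalence, only if rsh/rcp is the configured mechanism:
rsh node2 date
```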
Real Application Clusters is an option of Oracle Database and therefore part of the Oracle
Database CD. With Oracle 9i, RAC is part of Oracle9i Enterprise Edition. If you install 9i
EE onto a cluster, and the Oracle Universal Installer (OUI) recognizes the cluster, you
will be provided the option of installing RAC. Most UNIX platforms require an OSD
installation for the necessary clusterware. For Intel platforms (Linux and Windows),
Oracle provides the OSD software within the Oracle9i Enterprise Edition release.
With Oracle Database 10g, RAC is an option of EE and available as part of SE. Oracle
provides Oracle Clusterware on its own CD included in the database CD pack.
Please check the certification matrix (Note 184875.1) or with the appropriate platform
vendor for more information.
With Oracle Database 10g, a customer who has purchased Standard Edition is allowed to
use the RAC option within the limitations of Standard Edition (SE). For licensing
restrictions you should read the Oracle Database 10g License Doc. At a high level this
means that you can have a maximum of 4 CPUs in the cluster and you must use ASM for
all database files. Oracle Cluster File System (OCFS) is not supported for use with SE
RAC. NOTE: OCFS2 is not supported for any database-related files with SE RAC.
Can I use iSCSI storage with my RAC cluster?
For iSCSI, Oracle has made the statement that, as a block protocol, this technology does
not require validation for single instance database. There are many early adopter
customers of iSCSI running Oracle9i and Oracle Database 10g. As for RAC, Oracle has
chosen to validate the iSCSI technology (not each vendor's targets) for the 10g platforms
- this has been completed for Linux, Unix and Windows. For Windows we have tested up
to 4 nodes - Any Windows iSCSI products that are supported by the host and storage
device are supported by Oracle. We don't support NAS devices for Windows; however,
some NAS devices (e.g. NetApp) can also present themselves as iSCSI devices. If this is
the case then a customer can use this iSCSI device with Windows as long as the iSCSI
device vendor supports Windows as an initiator OS. No vendor-specific information will
be posted on Certify.
You will be installing and using Oracle Clusterware whether or not you use the Vendor
Clusterware. The question you need to ask is whether the Vendor Clusterware gives you
something that Oracle Clusterware does not. Is the RAC database on the same server as
the application server? Are there any other processes on the same server as the database
that you require Vendor Clusterware to fail over to another server in the cluster if the
server it is running on fails? If this is the case, you may want the vendor clusterware; if
not, why spend the extra money when Oracle Clusterware supplies everything you need
for the clustered database, included with your RAC license. Note: With Oracle
Database 10g Release 2, Oracle Clusterware can be used to manage application processes
in the cluster (start, stop, check, relocate)
When configuring the NIC cards and switch for a GigE Interconnect
should it be set to FULL or Half duplex in RAC?
You must use full duplex, regardless of RAC, for all network communication. Half
duplex means you can only either send or receive at any one time.
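On Linux, a sketch of verifying the negotiated setting with ethtool (the interface name eth2 is an example; ethtool generally requires root):

```shell
# Show the negotiated speed and duplex for the interconnect NIC:
ethtool eth2 | grep -E 'Speed|Duplex'
# A healthy GigE interconnect should report 1000Mb/s and Duplex: Full.
```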
Is it a good idea to add anti-virus software to my RAC cluster?
For customers who choose to run anti-virus (AV) software on their database servers, they
should be aware that the nature of AV software is that disk IO bandwidth is reduced
slightly as most AV software checks disk writes/reads. Also, as the AV software runs, it
will use CPU cycles that would normally be consumed by other server processes (e.g
your database instance). As such, databases will have faster performance when not using
AV software. As some AV software is known to lock files while it scans, it is a good
idea to exclude the Oracle datafiles/controlfiles/logfiles from regular AV scans.
Yes. Best practice is that all tightly coupled branches of a distributed transaction
running on a RAC database run on the same instance. Between transactions and
between services, transactions can be load balanced across all of the database instances.
You can use services to manage DTP environments. By defining the DTP property of a
service, the service is guaranteed to run on one instance at a time in a RAC database. All
global distributed transactions performed through the DTP service are ensured to have
their tightly-coupled branches running on a single RAC instance.
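A minimal sketch of defining such a DTP service in 10g Release 2 (the database name RACDB, instance names, and service name xa_svc are hypothetical):

```shell
# Create a service for the XA workload with two preferred instances:
srvctl add service -d RACDB -s xa_svc -r RACDB1,RACDB2

# Set the DTP property so tightly-coupled branches stay on one instance:
sqlplus -s / as sysdba <<'EOF'
EXECUTE dbms_service.modify_service(service_name => 'xa_svc', dtp => TRUE);
EOF

srvctl start service -d RACDB -s xa_svc
```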
How can a NAS storage vendor certify their storage solution for RAC ?
They should obtain an OCE test kit and complete the required RAC tests. They can
submit the request for an OCE kit to ocesup_ie@oracle.com.
The list of certified NAS vendors/solutions is posted on OTN under the OSCP program
Yes. However, Oracle Clusterware (CRS) will not support a 9i RAC database, so you will
have to leave the current configuration in place. You can install Oracle Clusterware and
RAC 10g into the same cluster. On Windows and Linux, you must run the 9i Cluster
Manager for the 9i Database and the Oracle Clusterware for the 10g Database. When you
install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Both 9i
RAC and 10g will use the OCR. Do not restart the 9i gsd after you have installed Oracle
Clusterware. Remember to check certify for details of what vendor clusterware can be
run with Oracle Clusterware.
For example on Solaris, your 9i RAC will be using Sun Cluster. You can install Oracle
Clusterware and RAC 10g in the same cluster that is running Sun Cluster and 9i RAC.
Today IP over IB is supported, and RDS on Linux is supported with 10.2.0.3 forward.
Qlogic (formerly SilverStorm) is the supported RDS vendor. Watch certify for updates.
As other platforms adopt RDS, we will expand support. There are no plans to support
uDAPL or ITAPI protocols.
See Note: 337737.1 for detailed support matrix. Basically the Clusterware version must
be at least the highest release of ASM or RAC. ASM must be at least 10.1.0.3 to work
with 10.2 database.
As per the licensing documentation, you must use ASM for all database files with SE
RAC. There is no support for CFS or NFS.
From Oracle Database 10g Release 2 Licensing Doc:
Oracle Standard Edition and Real Application Clusters (RAC) When used with Oracle
Real Application Clusters in a clustered server environment, Oracle Database Standard
Edition requires the use of Oracle Clusterware. Third-party clusterware management
solutions are not supported. In addition, Automatic Storage Management (ASM) must be
used to manage all database-related files, including datafiles, online logs, archive logs,
control file, spfiles, and the flash recovery area. Third-party volume managers and file
systems are not supported for this purpose.
Should the SCSI-3 reservation bit be set for our Oracle Clusterware only
installation?
If you are using only Oracle Clusterware (no Veritas CM), then you don't need SCSI-3
PGR enabled, since Oracle Clusterware does not require it for I/O fencing. If the
reservation is set, you will get inconsistent results, so ask your storage vendor to
disable the reservation. Veritas RAC requires that the storage array support SCSI-3 PGR,
since this is how Veritas handles I/O fencing. SCSI-3 PGR is set at the array level;
for example, at the EMC hypervolume level.
What are the restrictions on the SID with a RAC database? Is it limited to
5 characters?
The SID prefix in 10g Release 1 and prior versions was restricted to five characters by
the install/config tools, so that an ORACLE_SID of up to a maximum of 5+3=8
characters can be supported in a RAC environment. The SID prefix limit is relaxed to 8
characters in 10g Release 2; see bug 4024251 for more information.
SSH must be set up as per the pre-installation tasks. It is also necessary to have file
permissions set as described below for features such as Public Key Authorization to
work. If your permissions are not correct, public key authentication will fail, and will
fallback to password authentication with no helpful message as to why. The following
server configuration files and/or directories must be owned by the account owner or by
root and GROUP and WORLD WRITE permission must be disabled.
$HOME
$HOME/.rhosts
$HOME/.shosts
$HOME/.ssh
$HOME/.ssh/authorized_keys
$HOME/.ssh/authorized_keys2 # OpenSSH-specific for the ssh2 protocol.
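To set the permissions described above (assuming the standard OpenSSH file names; run as the software owner), a sketch:

```shell
# Disable group/world write on $HOME and lock down the .ssh directory:
chmod go-w "$HOME"
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/authorized_keys"
```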
SSH (from OUI) will also fail if you have not connected to each machine in your cluster
as per the note in the installation guide:
The first time you use SSH to connect to a node from a particular system, you may see a
message similar to the following:
Enter "yes" at the prompt to continue. You should not see this message again when you
connect from this system to that node. Answering yes causes an entry to be added to a
known_hosts file in the .ssh directory, which is why subsequent connection requests do
not ask again.
This is known to work on Solaris and Linux but may work on other platforms as well.
Follow the documentation for removing a node but you can skip all the steps in the node-
removal doc that need to be run on the node being removed, like steps 4, 6 and 7 (See
Chapter 10 of the RAC Admin and Deployment Guide). Make sure that you remove any
database instances that were configured on the failed node with srvctl, and the listener
resources as well; otherwise rootdeletenode.sh will have trouble removing the nodeapps.
Just running rootdeletenode.sh isn't really enough, because you need to update the
installer inventory as well, otherwise you won't be able to add back the node using
addNode.sh. And if you don't remove the instances and listeners you'll also have
problems adding the node and instance back again.
Probably a better alternative to the generic documentation (bug 5929611 filed) for
removing a node is Note 269320.1.
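A sketch of the cleanup from a surviving node (the database name RACDB, instance RACDB3, and node name node3 are placeholders; follow Note 269320.1 for the authoritative steps):

```shell
# Remove the dead node's instance definition from the OCR:
srvctl remove instance -d RACDB -i RACDB3
# (listener resources in 10g are removed with netca from a surviving node)

# Then, as root, remove the node's clusterware registration:
$CRS_HOME/install/rootdeletenode.sh node3,3   # <node name>,<node number>
```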
A client is a new RAC user and is using it in conjunction with BEA
WebLogic. Can they use Connection Load Balancing and Services?
What about FCF, FAN, RCLB?
The key item here is whether or not they are using XA. If they are using XA (Tuxedo for
example), then they should use the DTP service with 10g Release 2. Have the customer
review the Best Practices for using XA with RAC on OTN .
If it is not XA then services and Net Service Connection Load Balancing should work
fine. They can tune aspects of the recovery such as instance recovery time. Using BEA,
they do not get the advanced features such as Fast Connection Failover (FCF) and
Runtime Connection Load Balancing . To understand services, FCF, RCLB, read the
RAC Admin and Deployment Guide for 10g Release 2 Chapter 6.
At minimum you need 2: external (public) and interconnect (private). When storage for
RAC is provided by Ethernet-based networks (e.g. NAS/NFS or iSCSI), you will need a
third interface for I/O, so a minimum of 3. Anything else will cause performance and
stability problems under load. From an HA perspective, you want these to be redundant,
thus needing a total of 6.
Can we output the backupset onto regular file system directly (not onto
flash recovery area) using RMAN command, when we use SE RAC?
Yes, - customers might want to backup their database to offline storage so this is also
supported.
Does changing uid or gid of the Oracle User affect Oracle Clusterware?
There are a lot of files in the Oracle Clusterware home and outside of the Oracle
Clusterware home that are chgrp'ed to the appropriate groups for security and appropriate
access. The filesystem records the numeric uid and gid (not the names), so if you change
them, the files end up owned by the wrong user or group.
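One way to spot the fallout after changing the uid/gid is to look for files whose numeric owner no longer maps to a known name (the path is an example):

```shell
# List files under the Clusterware home with an unknown owner or group:
find $CRS_HOME -nouser -o -nogroup
```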
I could not get the user equivalence check to work on my Solaris 10 server
when trying to install 10.2.0.1 Oracle Clusterware. The install ran
fine without issue. << Message: Result: User equivalence check
failed for user "oracle". >>
For details on the support of SFRAC and Veritas Agents with RAC 10g, please see
Metalink Note 397460.1 and Metalink Note 332257.1
Can I run more than one clustered database on a single RAC cluster?
You can run multiple databases in a RAC cluster, either one instance per node (w/
different databases having different subsets of nodes in a cluster), or multiple instances
per node (all databases running across all nodes) or some combination in between.
Running multiple instances per node does cause memory and resource fragmentation, but
this is no different from running multiple instances on a single node in a single instance
environment which is quite common. It does provide the flexibility of being able to share
CPU on the node, but the Oracle Resource Manager will not currently limit resources
between multiple instances on one node. You will need to use an OS level resource
manager to do this.
More information: Metalink Note 296874.1 and Auto Port Aggregation (APA)
Support Guide
• Bonding
• Teaming
On Windows teaming solutions used to ensure NIC availability are usually part of
the network card driver.
Thus, they depend on the network card used. Please contact the respective
hardware vendor for more information.
$ chrt -p 31193
pid 31193's current scheduling policy: SCHED_OTHER
pid 31193's current scheduling priority: 0
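The chrt output above shows SCHED_OTHER, i.e. the time-share class. To check every LMS process of an instance at once (the pattern assumes Linux process names like ora_lms0_<SID>; the SID rac1 is a placeholder), a sketch:

```shell
# Print the scheduling policy/priority of each LMS background process:
for pid in $(pgrep -f 'ora_lms.*rac1'); do
  chrt -p "$pid"
done
```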
How do I check for network problems on my interconnect?
1. Confirm that full duplex is set correctly for all interconnect links on all interfaces on
both ends. Do not rely on auto-negotiation.
2. ifconfig -a will give you an indication of collisions/errors/overruns and dropped packets.
3. netstat -s will give you a listing of receive packet discards, fragmentation and
reassembly errors for IP and UDP.
4. Set the UDP buffers correctly.
5. Check your cabling.
Note: If you are seeing issues with RAC, RAC uses UDP as the protocol. Oracle
Clusterware uses TCP/IP.
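A quick way to automate step 3 on Linux is to parse the UDP counters the kernel exposes in /proc/net/snmp. The sketch below (Python) uses a made-up sample string rather than reading the live file, and flags receive errors and receive-buffer overflows:

```python
# Hypothetical helper: parse the "Udp:" header/value line pair from
# /proc/net/snmp into a dict of counter name -> value.
def udp_stats(snmp_text):
    lines = [l.split() for l in snmp_text.splitlines() if l.startswith("Udp:")]
    header, values = lines[0][1:], [int(v) for v in lines[1][1:]]
    return dict(zip(header, values))

# Sample content in the /proc/net/snmp format (values are invented).
sample = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 184321 12 0 181204 0 0\n"
)
stats = udp_stats(sample)
if stats["InErrors"] or stats["RcvbufErrors"]:
    print("UDP receive problems detected - check udp buffer sizes")
else:
    print("no UDP receive errors in this sample")
```

On a real system you would read `open("/proc/net/snmp").read()` instead of the sample, and watch whether the error counters grow over time rather than looking at a single snapshot.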
Are there any issues for the interconnect when sharing the same switch as
the public network by using VLAN to separate the network?
RAC and Clusterware deployment best practices suggest that the interconnect be
deployed on a stand-alone, physically separate, dedicated switch. Many customers have
consolidated these stand-alone switches into larger managed switches. A consequence of
this consolidation is a merging of IP networks on a single shared switch, segmented by
VLANs. There are caveats associated with such deployments. RAC cache fusion
exercises the IP network more rigorously than non-RAC Oracle databases; the latency,
bandwidth, and availability requirements of the RAC/Clusterware interconnect IP
network are more in line with high-performance computing. Deploying the
RAC/Clusterware interconnect on a shared switch, segmented by VLAN, may expose the
interconnect links to congestion and instability in the larger IP network topology. If
deploying the interconnect on a VLAN, there should be a 1:1 mapping of VLAN to
non-routable subnet, and the VLAN should not span multiple switches or be trunked
(tagged) with other VLANs. Deployment concerns in this environment include Spanning
Tree loops when the larger IP network topology changes, asymmetric routing that may
cause packet flooding, and lack of fine-grained monitoring of the VLAN/port.
If I already have an ASM instance/diskgroup then the following creates a RAC database
on that diskgroup:
su oracle -c "$ORACLE_HOME/bin/dbca -silent -createDatabase -templateName
General_Purpose.dbc -gdbName $SID -sid $SID -sysPassword $PASSWORD
-systemPassword $PASSWORD -sysmanPassword $PASSWORD -dbsnmpPassword
$PASSWORD -emConfiguration LOCAL -storageType ASM -diskGroupName
$ASMGROUPNAME -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates
-nodeinfo $NODE1,$NODE2 -characterset WE8ISO8859P1 -obfuscatedPasswords false
-sampleSchema false -oratabLocation /etc/oratab"
No, not in the traditional Oracle Net Services load balancing. We have written a
document that explains the best practices for 9i, 10g Release 1 and 10g Release 2.
With the 10g Services, life gets easier. To understand services, read the RAC Admin
and Deployment Guide for 10g Release 2, Chapter 6.
OCR is the Oracle Cluster Registry; it holds all the cluster-related information such as
instances and services. The OCR file format is binary, and starting with 10.2 it is possible
to mirror it. The file location(s) are recorded in /etc/oracle/ocr.loc, in the ocrconfig_loc
and ocrmirrorconfig_loc variables.
Obviously, if you have only one copy of the OCR and it is lost or corrupted, you must
restore a recent backup; see the ocrconfig utility for details, specifically the -showbackup
and -restore flags. Until a valid backup is restored, Oracle Clusterware will not start up
because of the corrupt/missing OCR file.
The interesting question is what happens if the OCR is mirrored and one of the
copies becomes corrupt. You would expect everything to continue working seamlessly.
Almost: the real answer depends on when the corruption takes place.
If the corruption happens while the Oracle Clusterware stack is up and running, the
corruption is tolerated and Oracle Clusterware continues to function without interruption,
despite the corrupt copy. The DBA is advised to repair the hardware/software problem
that prevents OCR from accessing the device as soon as possible; alternatively, the DBA
can replace the failed device with a healthy one using the ocrconfig utility with the
-replace flag.
If, however, the corruption happens while the Oracle Clusterware stack is down, it will
not be possible to start the stack until the failed device comes back online or
administrative action is taken with the ocrconfig utility's -overwrite flag. When the
Clusterware attempts to start, you will see messages to this effect. To recover:
a) Fix whatever problem (hardware/software) prevents OCR from accessing the device.
b) Issue "ocrconfig -overwrite" on any one of the nodes in the cluster. This command
overrides the vote check built into OCR at startup. Basically, if the OCR is configured
with a mirror, OCR assigns each device one vote, and the rule is that more than 50% of
the total votes (a quorum) must be present to safely ensure the available devices contain
the latest data. With 2-way mirroring the total vote count is 2, so 2 votes are required for
quorum; if only one device (one vote) is available, there are not enough votes to start.
(In the earlier example, where a device failed while OCR was running, OCR assigned
2 votes to the surviving device, which is why that device, now with two votes, can start
the stack after the cluster goes down.) See the warning below.
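The vote arithmetic described above can be modeled in a few lines (a toy model, not Oracle code):

```python
# Toy model of the OCR vote rule: each mirrored device normally holds one
# vote, and startup requires a strict majority (> 50%) of the total votes.
def can_start(votes_available, votes_total):
    return votes_available > votes_total / 2.0

# 2-way mirror, both devices healthy: 2 of 2 votes, quorum met.
assert can_start(2, 2)
# One device lost while the stack is DOWN: 1 of 2 votes, no quorum, so
# Clusterware will not start until the device returns or
# "ocrconfig -overwrite" is issued.
assert not can_start(1, 2)
print("quorum rules hold")
```

If the device fails while the stack is up, OCR reassigns both votes to the surviving device, which is why that device alone can later satisfy the majority check.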
Why do we have a Virtual IP (VIP) in 10g? Why does it just return a dead
connection when its primary node fails?
If I use Services with Oracle Database 10g, do I still need to set up Load
Balancing ?
Yes. Services allow granular definition of workload, and the DBA can dynamically
define which instances provide a service. Connection load balancing still needs to be
set up so that user connections are balanced across all instances providing a service.
Link Aggregation (GLDv3) is bundled in the OS as of Solaris 10. IPMP is available for
Solaris 10 and Solaris 9. Neither requires Sun Cluster to be installed. For the interconnect
and switch redundancy, as a best practice, avoid VLAN trunking across the switches. For
ease of configuration (e.g. fewer IP address requirements), use IPMP with link-mode
failure detection in a primary/standby configuration. This gives you a single failover IP,
which you define in the cluster_interconnects init.ora parameter. Remove any interfaces
for the interconnect from the OCR using `oifcfg delif`. AND TEST THIS
RIGOROUSLY. For now, as Link Aggregation (GLDv3) cannot span multiple switches
from a single host, you will need to configure the switch redundancy and the host NICs
with IPMP. When configuring IPMP for the interconnect with multiple switches
available, configure IPMP as active/standby and *not* active/active, to avoid potential
latencies in switch failure detection/failover which may impact the availability of the
RDBMS. Note that IPMP spreads/load-balances outbound packets across the bonded
interfaces, but inbound packets are received on a single interface; in an active/active
configuration this makes send/receive problems difficult to diagnose. Both Link
Aggregation (GLDv3) and IPMP are core OS packages (SUNWcsu and SUNWcsr
respectively) and do not require Sun Cluster.
Absolutely. RMAN can be configured to connect to all nodes within the cluster to
parallelize the backup of the database files and archive logs. If files need to be restored,
using set AUTOLOCATE ON alerts RMAN to search for backed up files and archive
logs on all nodes.
This error can occur when problems are detected on the cluster:
For more information on troubleshooting this error, see the following Metalink note:
Note 219361.1
Troubleshooting ORA-29740 in a RAC Environment
What does the Virtual IP service do? I understand it is for failover but do
we need a separate network card? Can we use the existing
private/public cards? What would happen if we used the public ip?
The 10g Virtual IP address (VIP) exists on every RAC node for public network
communication. All client communication should use the VIPs in their TNS connection
descriptions: the TNS ADDRESS_LIST entry should direct clients to VIPs rather than
hostnames. During normal runtime the behaviour is the same as with hostnames;
however, when the node goes down or is shut down, the VIP is hosted elsewhere on the
cluster and does not accept connection requests. This results in an immediate TCP/IP
error, and the client fails over immediately to the next TNS address. If the network
interface fails within the node, the VIP can be configured to use alternate interfaces in
the same node. The VIP must use the public interface cards; there is no requirement to
purchase additional public interface cards (unless you want to take advantage of
within-node card failover).
What do the VIP resources do once they detect a node has failed/gone
down? Are the VIPs automatically acquired, and published, or is
manual intervention required? Are VIPs mandatory?
When a node fails, the VIP associated with the failed node is automatically failed over to
one of the other nodes in the cluster. When this occurs, two things happen:
1. The new node re-arps the world indicating a new MAC address for this IP
address. For directly connected clients, this usually causes them to see errors on
their connections to the old address;
2. Subsequent packets sent to the VIP go to the new node, which will send error
RST packets back to the clients. This results in the clients getting errors
immediately.
In the case of existing SQL connections, errors will typically take the form of ORA-3113
errors, while a new connection using an address list will select the next entry in the list.
Without VIPs, clients connected to a node that died will often wait for a TCP/IP
timeout period, which can be as long as 10 minutes or more, before getting an error. As a
result, you don't really have a good HA solution without using VIPs.
What are my options for load balancing with RAC? Why do I get an
uneven number of connections on my instances?
All of the load balancing types currently available (9i to 10g) occur at connect time.
This means it is very important how you balance connections and what those
connections do over the long term.
Since establishing connections can be very expensive for your application, it is good
programming practice to connect once and stay connected, so you need to be careful
which option you use. Oracle Net Services provides load balancing, or you can use
external methods such as hardware-based or clusterware solutions.
The following options exist prior to Oracle RAC 10g Release 2 (for 10g Release 2, see
the Load Balancing Advisory):
Random
Either client-side load balancing or hardware-based methods will randomize the
connections to the instances.
On the negative side, this method is unaware of the load on each instance, or even
whether it is up, so connections may hang on TCP/IP timeouts.
Load Based
Server-side load balancing (by the listener) redirects connections by default depending on
the run-queue length of each instance. This is great for short-lived connections but
terrible for persistent connections or login storms. Do not use this method for
connections from connection pools or application servers.
Session Based
Server-side load balancing can also be used to balance the number of connections to each
instance. Session-count balancing is the method used when you set the listener parameter
prefer_least_loaded_node_<listener_name>=OFF. Note that the listener name is the
actual name of the listener, which is different on each node in your cluster and by default
is listener_<nodename>.
Session-based load balancing takes into account the number of sessions connected to
each node and then distributes new connections to balance the number of sessions across
the different nodes.
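The session-count rule above can be sketched as follows (illustrative Python, not listener code; the instance names and counts are made up):

```python
# With session-count balancing, the listener directs each new connection
# to the instance that currently has the fewest sessions.
def pick_instance(session_counts):
    """session_counts: dict of instance name -> current session count."""
    return min(session_counts, key=session_counts.get)

sessions = {"rac1": 120, "rac2": 95, "rac3": 110}
target = pick_instance(sessions)
print(target)  # rac2 has the fewest sessions
```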
Can our 10g VIP fail over from NIC to NIC as well as from node to node ?
Yes. The 10g VIP implementation is capable of failing over within a node from NIC to
NIC, and back again once the failed NIC is back online; it also fails over between nodes.
The NIC-to-NIC failover is fully redundant if redundant switches are installed.
Oracle Database 10g Release 2 introduces server-side TAF when using services. After
you create a service, you can use the dbms_service.modify_service PL/SQL procedure to
define the TAF policy for the service. Only the BASIC method is supported. Note this is
different from the TAF policy (traditional client TAF) that is supported by srvctl and the
EM Services page. If your service has a server-side TAF policy defined, then you do not
have to encode TAF in the client connection string. If the instance where a client is
connected fails, the connection is failed over to another instance in the cluster that is
supporting the service. All restrictions of TAF still apply.
NOTE: both the client and server must be 10.2, and aq_ha_notifications must be set to
true for the service.
Sample code to modify service:
No. The listener is a subscriber to all FAN events (both from the load balancing advisory
and the HA events). Therefore server-side connection load balancing leverages FAN HA
events as well as load balancing advisory events.
With the Oracle JDBC driver 10g Release 2, if you enable Fast Connection Failover, you
also enable Runtime Connection Load Balancing (one knob for both).
The combination of Server Side load balancing and Services allows you to easily mask
cluster database configuration changes. As long as all instances register with all listeners
(use the LOCAL_LISTENER and REMOTE_LISTENER parameters), server side load
balancing will allow clients to connect to the service on currently available instances at
connect time.
The load balancing advisory (enabled by setting a goal on the service) gives advice as to
how many connections to send to each instance currently providing a service. When a
service is enabled on an instance, as soon as the instance registers with the listeners,
clients can start getting connections to the service, and the load balancing advisory will
include that instance in its advice.
Is it possible to use srvctl start database with a user account other than
oracle (that is, other than the owner of the Oracle software)?
YES. When you create a RAC database as a user other than the home/software owner
(oracle), the database creation assistant sets the correct permissions/ACLs on the CRS
resources that control the database, instances, etc., assuming that the user is a member of
the home's dba group (find it using $ORACLE_HOME/bin/osdbagrp), is part of the CRS
home owner's primary group (usually oinstall), and the oracle_home has group write
permission.
The client gets this error message in Production in the ons.log file every
minute or so: 06/11/10 10:11:14 [2] Connection 0,129.86.186.58,6200
SSL handshake failed 06/11/10 10:11:14 [2] Handshake for
0,129.86.186.58,6200: nz error = 29049 interval = 0 (180 max)
These annoying messages in ons.log are telling you that you have a configuration
mismatch for ONS somewhere in the farm. RAC has its own ONS server for which SSL
is disabled by default. You must either enable SSL for RAC ONS, or disable it for OID
ONS(OPMN). You need to create a wallet for each RAC ONS server, or copy one of the
wallets from OPMN on the OID instances.
In ons.conf you need to specify the wallet file and password:
walletfile=
walletpassword=
ONS only uses SSL between servers, and so ONS clients will not be affected. You
specify the wallet password when you create the wallet. If you copy a wallet from an
OPMN instance, then use the same password configured in opmn.xml. If there is no
wallet password configured in opmn.xml, then you don't need to specify a wallet
password in ons.conf either.
How do I configure FCF with BPEL so I can use RAC 10g in the backend?
Note 372456.1 describes the procedure to set up BPEL with an Oracle RAC 10g
Release 1 database.
If you are using SSL, ensure the SSL enable attribute of ONS in the opmn.xml file has
the same value, either true or false, for all OPMN servers in the Farm. To troubleshoot
OPMN at the application server level, see Appendix A in the Oracle Process Manager
and Notification Server Administrator's Guide.
I am using shared servers, with the following set in the init.ora:
dispatchers=(protocol=TCP)(listener=listeners_nl01)(con=500)(serv=oltp).
I stopped my service with srvctl stop service, but it is still registered
with the listener and accepting connections. Is this expected?
YES. This is by design: dispatchers are part of Oracle Net Services, and if you specify
the service attribute of the dispatchers init.ora parameter, the service specified there
cannot be managed by the DBA (stopping it with srvctl will not deregister it from the
listener).
With Oracle Database 10g Release 1, JDBC clients (both thick and thin driver) are
integrated with FAN by providing FCF. With Oracle Database 10g Release 2, we have
added ODP.NET and OCI. Other applications can integrate with FAN by using the API
to subscribe to the FAN events.
Note: if you are using a third-party application server, you can only use FCF with the
Oracle driver and (except for OCI) its connection pool. If you use the third-party
application server's own connection pool, you do not get FCF. Your customer can
subscribe directly to FAN events, but that is a development project for the customer. See
the white paper Workload Management with Oracle RAC 10g on OTN.
With Oracle Database 10g Release 1, NO. With Oracle Database 10g Release 2, the
answer is YES for OCI and ODP.NET, and it is recommended. For JDBC, you should
not use TAF with FCF, even with the thick JDBC driver.
The initialLimit property on the Implicit Connection Cache is effective only when the
cache is first created. For example, if initialLimit is set to 10, you'll have 10
connections pre-created and available when the connection cache is first created. Please
do not confuse minLimit with initialLimit. The current behavior is that after a DOWN
event, once the affected connections are cleaned up, it is possible for the number of
connections in the cache to be lower than minLimit.
An UP event is processed both for (a) a new instance joining and (b) a down followed
by an instance coming back up. This has no relevance to initialLimit, or even minLimit.
When an UP event comes into the JDBC Implicit Connection Cache, some new
connections are created. Assuming your listener load balancing is set up properly, those
connections should go to the instance that was just started. When your application
requests a connection from the pool, it is given an idle connection; if you are running
10.2 and have the load balancing advisory turned on for the service, the session is
allocated based on the defined goal to provide the best service level.
MaxLimit, when set, defines the upper boundary limit for the connection cache. By
default, maxLimit is unbounded - your database sets the limit.
With 10g Release 1, the middle tier must have ONS running (started by the same user as
the application). ONS is not included on the Client CD; however, it is part of the Oracle
Database 10g CD.
With 10g Release 2, you do not need to install ONS on the middle tier. The JDBC
driver allows the use of remote ONS (i.e., it uses the ONS running in the RAC cluster).
Just use the datasource parameter
ods.setONSConfiguration("nodes=racnode1:4200,racnode2:4200");
Yes, FAN/OCI will work with Instant Client. Both client and server must be Oracle
Database 10g Release 2.
What type of callbacks are supported with OCI when using FAN/FCF?
There are two separate callbacks supported. The HA Events (FAN) callback is called
when an event occurs. When a down event occurs, for example, you can clean up a
custom connection pool. i.e. purge stale connections. When the failover occurs, the TAF
callback is invoked. At failover time you can customize the newly created database
session. Both FAN and TAF are client-side callbacks. FAN also has a separate server side
callout that should not be confused with the OCI client callback.
Does FCF for OCI react to FAN HA UP events?
OCI does not perform any implicit actions on an UP event; however, if an HA event
callback is present, it is invoked, and you can take any required action at that time.
Since Pro*C (sqllib) is built on top of OCI, it supports HA events. You need to
precompile the application with the option EVENTS=TRUE and make sure you link the
application with a thread library. The database connection must use a service that has
been enabled for AQ events. Use dbms_service.modify_service to enable the service for
events (aq_ha_notifications => true), or use the EM Cluster Database Services page.
YES, you must link the application to a threads library. This is required because the AQ
notifications occur asynchronously, over an implicitly spawned thread.
Can I use the 10.2 JDBC driver with 10.1 database for FCF?
Yes. With the patch for Bug 5657975 on 10.2.0.3, the 10.2 JDBC driver will work with a
10.1 database. The fix will be part of the 10.2.0.4 patchset. If you do not have the patch,
then for FCF use the 10.2 JDBC driver with a 10.2 database; if the database is 10.1, use
the 10.1 JDBC driver.
I am seeing the wait events 'ges remote message', 'gcs remote message',
and/or 'gcs for action'. What should I do about these?
These are idle wait events and can be safely ignored. The 'ges remote message' event
might show up in a 9.0.1 Statspack report as one of the top wait events. To stop it from
showing up, you can add this event to the PERFSTAT.STATS$IDLE_EVENT table so
that it is not listed in Statspack reports.
What are the changes in memory requirements from moving from single
instance to RAC?
If you keep the workload requirements per instance the same, then about 10%
more buffer cache and 15% more shared pool are needed. The additional memory
requirement is due to data structures for coherency management. The values are heuristic
and are mostly upper bounds. Actual resource usage can be monitored by querying the
current and maximum utilization columns for the gcs resource/locks and ges
resource/locks entries in V$RESOURCE_LIMIT.
But in general, please take into consideration that memory requirements per instance are
reduced when the same user population is distributed over multiple nodes. Assuming the
same user population, N nodes, and a buffer cache of M for the original single system,
each instance needs roughly:
(M / N) + ((M / N) * 0.10)
For example, with M = 2G, N = 2, and no extra memory reserved for failed-over users:
= (2G / 2) + ((2G / 2) * 0.10)
= 1G + 100M
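The rule of thumb above, as a small function (Python; the 10% coherency overhead is the heuristic from the text, not a fixed Oracle constant):

```python
# Per-instance buffer cache estimate: with the same user population spread
# over n_nodes nodes, each instance needs roughly 1/N of the original
# single-instance buffer cache, plus ~10% coherency overhead.
def per_instance_cache(m_bytes, n_nodes, overhead=0.10):
    share = m_bytes / n_nodes
    return share + share * overhead

G = 1024 ** 3
# Worked example from the text: M = 2G, N = 2 -> about 1G + 100M each.
print(round(per_instance_cache(2 * G, 2) / G, 2))  # 1.1
```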
The load balancing advisory requires the use of services and Oracle Net connection load
balancing.
To enable it, on the server, set a goal (SERVICE_TIME or THROUGHPUT) and
CLB_GOAL on your service; for ODP.NET, also set AQ_HA_NOTIFICATIONS=>true.
On the client, you must be using the connection pool.
For JDBC, enable the datasource parameter FastConnectionFailoverEnabled.
For ODP.NET, enable the datasource parameter Load Balancing=true.
What are my options for setting the Load Balancing Advisory GOAL on a
Service?
The load balancing advisory is enabled by setting the GOAL on your service either
through PL/SQL DBMS_SERVICE package or EM DBControl Clustered Database
Services page. There are 3 options for GOAL:
NONE: the default setting; the advisory is turned off.
THROUGHPUT: work requests are directed based on throughput. Use this when the
work in a service completes at homogeneous rates; an example is a trading system where
work requests are of similar length.
SERVICE_TIME: work requests are directed based on response time. Use this when the
work in a service completes at varying rates; an example is an internet shopping system
where work requests are of various lengths.
A more reliable, interactive way on Linux is to use the iptraf utility (prebuilt RPMs are
available from Red Hat or Novell/SuSE); another option on Linux is Netperf. On other
Unix platforms: "snoop -S -tr -s 64 -d hme0"; AIX's topas can show this as well. Try to
look for the peak (not average) usage and see if that is acceptably fast.
Remember that NIC bandwidth is measured in Mbps or Gbps (which is BITS per second)
while output from the above utilities can come in BYTES per second, so for comparison
do the proper conversion: divide the bps value by 8 to get bytes/sec, or multiply the
bytes value by 8 to get the bps value.
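The conversion can be captured in two helpers (Python):

```python
# NIC speeds are quoted in bits per second; most OS utilities report
# bytes per second. These helpers convert between the two.
def bps_to_bytes(bits_per_sec):
    return bits_per_sec / 8.0

def bytes_to_bps(bytes_per_sec):
    return bytes_per_sec * 8.0

# A "gigabit" NIC tops out at 125 MB/s before protocol overhead.
print(bps_to_bytes(1_000_000_000) / 1_000_000)  # 125.0
```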
Additionally, you can't expect a network device to run at full capacity with 100%
efficiency, due to concurrency, collisions, and retransmits, which happen more frequently
as utilization gets higher. If you are reaching high levels, consider a faster interconnect or
NIC bonding (multiple NICs all servicing the same IP address).
Finally, the above measures bandwidth utilization (how much), not latency (how fast):
you may still be suffering from a high-latency connection (a slow link) even though there
is plenty of bandwidth to spare. Most experts agree that low latency is far more important
than high bandwidth with respect to the private interconnect in RAC. Latency is best
measured by the actual user of the network link (RAC in this case); review Statspack for
latency statistics. Also, in 10g Release 2 Grid Control you can view Global Cache Block
Access Latency, and you can drill down to the Cluster Cache Coherency page to see the
cluster cache coherency metrics for the entire cluster database.
Keep in mind that RAC uses the private interconnect as it was never used before: to
synchronize memory regions (SGAs) of multiple nodes (remember, since 9i entire data
blocks are shipped across the interconnect). If the network is utilized at 50% bandwidth,
it is busy 50% of the time and unavailable to potential users. In this case, delays (due to
collisions and concurrency) will increase the latency even though the bandwidth might
look "reasonable", hiding the real issue.
Oracle ships database block buffers: blocks in a tablespace configured for 16K result in
16K data buffers being shipped, blocks residing in a tablespace with the base block size
(8K) are shipped as 8K buffers, and so on; the data buffers are broken down into packets
of MTU size.
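The fragmentation arithmetic implied above can be sketched as follows (Python; the 28 bytes of IP+UDP header overhead per fragment is a simplifying assumption, not an exact model of Linux IP fragmentation):

```python
import math

# Number of MTU-sized fragments needed to ship a data buffer of
# block_size bytes, assuming a fixed per-fragment header overhead.
def fragments(block_size, mtu=1500, headers=28):
    payload = mtu - headers
    return math.ceil(block_size / payload)

print(fragments(8192))   # an 8K block
print(fragments(16384))  # a 16K block needs roughly twice as many fragments
```

This is one reason jumbo frames (MTU 9000) are often discussed for the interconnect: fewer fragments per shipped block means less per-packet overhead.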
How can I validate the scalability of my shared storage? (Tightly related to
RAC / Application scalability)
Storage vendors tend to focus their sales pitch mainly on the storage unit's capacity in
terabytes (1000 GB) or petabytes (1000 TB); however, for RAC scalability it is critical to
also look at the storage unit's ability to process I/Os per second (throughput) in a scalable
fashion, specifically from multiple sources (nodes). If that criterion is not met, RAC /
Application scalability will most probably suffer, as it partly depends on storage
scalability as well as on a solid and capable interconnect (for network traffic between
nodes).
Storage vendors may sometimes discourage such testing, boasting about their amazing
front-end or back-end battery-backed memory caches that "eliminate" all I/O bottlenecks.
This is all great, and you should take advantage of such caches as much as possible;
however, there is no substitute for a real-world test. You may uncover that the HBA
(Host Bus Adapter) firmware or the driver versions are outdated (before you claim poor
RAC / Application scalability).
It is highly recommended to test this storage scalability early on so that expectations are
set accordingly. On Linux there is a freely available tool released on OTN called ORION
(Oracle I/O test tool) which simulates Oracle I/O.
On other Unix platforms (as well as Linux) one can use IOzone; if a prebuilt binary is
not available you should build from source. Make sure to use version 3.271 or later, and
if testing raw/block devices add the "-I" flag.
In a basic read test you will try to demonstrate that a certain I/O throughput can be
maintained as nodes are added. Try to simulate your database I/O patterns as closely as
possible, i.e. block size, number of simultaneous readers, rates, etc.
For example, on a 4-node cluster, from node 1 you measure 20MB/s; then you start a
read stream on node 2 and see another 20MB/s while the first node shows no decrease.
You then run another stream on node 3 and get another 20MB/s; in the end you run 4
streams on 4 nodes and get an aggregate of 80MB/s, or close to it. This will prove that
the shared storage is scalable. Obviously, if you see poor scalability in this phase, it will
carry over and be observed, or interpreted, as poor RAC / Application scalability.
In many cases RAC / Application scalability is blamed for no real reason when the
underlying I/O subsystem is simply not scalable.
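One way to quantify the read test above (Python sketch; the first throughput list is the 4-node example from the text, the second is a made-up poorly scaling case):

```python
# Scaling efficiency: aggregate throughput divided by the ideal
# (number of nodes times the single-node throughput). 1.0 means the
# storage scales linearly; well below 1.0 points at the I/O subsystem.
def scaling_efficiency(per_node_mb_s):
    single = per_node_mb_s[0]
    aggregate = sum(per_node_mb_s)
    return aggregate / (len(per_node_mb_s) * single)

good = [20.0, 20.0, 20.0, 20.0]  # 80 MB/s aggregate, as in the example
bad = [20.0, 14.0, 10.0, 8.0]    # throughput collapses as nodes are added
print(round(scaling_efficiency(good), 2))  # 1.0
print(round(scaling_efficiency(bad), 2))   # 0.65
```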
The conversion to cluster completes successfully, but the EM monitoring credentials for
the converted database are not properly set due to this bug. This is resolved in the next
patchset. In the interim, the user can set the monitoring password from the "Monitoring
Configuration" screen for the RAC database in the GC console and proceed.
This issue has been fixed in the 10.2.0.3 database; to get the complete functionality you
will also need the 10.2.0.2 Grid Control patch, as the fix is spread between the two
pieces of software. For now you can proceed by setting the dbsnmp user's password to
the same as the sys user's.
What storage option should I use for RAC 10g on Linux? ASM / OCFS /
Raw Devices / Block Devices / Ext3 ?
EXT3 is out of the question, since its data structures are not cluster-aware: if you
mount an ext3 filesystem from multiple nodes, it will quickly become corrupted.
Other options are NFS and iSCSI; both are outside the scope of this FAQ but are
mentioned for completeness.
If for any reason the above options (ASM/OCFS) are not good enough and you insist on
using 'raw devices' or 'block devices', here are the details on the two (this information is
still very useful to know in the context of ASM and OCFS).
Block devices (e.g. /dev/sde9) are **BUFFERED** devices! Unless you explicitly open
them with O_DIRECT, you will get buffered (Linux buffer cache) I/O.
This is not a typo: block devices on Linux do buffered I/O by default (cached in the
Linux buffer cache), which means RAC cannot operate on them (unless they are opened
with O_DIRECT), since the I/Os would not be immediately visible to the other nodes.
You may check if a device is block or character device by the first letter printed with the
"ls -l" command:
crw-rw---- 1 root disk 162, 1 Jan 23 19:53 /dev/raw/raw1
brw-rw---- 1 root disk 8, 112 Jan 23 14:51 /dev/sdh
Above, "c" stands for character device, and "b" for block devices.
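The check above can be expressed as a tiny parser (Python): the first character of the mode string in `ls -l` output identifies the device type.

```python
# Classify an "ls -l" line by its first character:
# 'b' = block device, 'c' = character (raw) device.
def device_type(ls_line):
    return {"b": "block", "c": "character"}.get(ls_line[0], "other")

print(device_type("crw-rw---- 1 root disk 162, 1 Jan 23 19:53 /dev/raw/raw1"))
print(device_type("brw-rw---- 1 root disk 8, 112 Jan 23 14:51 /dev/sdh"))
```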
Starting with Oracle 10.1, an RDBMS fix added the O_DIRECT flag to the open call
(the O_DIRECT flag tells the Linux kernel to bypass the Linux buffer cache and write
directly to disk). In the case of a block device, this meant that a create datafile on
'/dev/sde9' would succeed (you need to set filesystemio_options=directio in the init.ora).
This enhancement was well received, and shortly afterwards bug 4309443 was fixed (by
adding the O_DIRECT flag to the OCR file open call), meaning that starting with 10.2
(there are several 10.1 backports available) the Oracle OCR file can also access block
devices directly. For the voting disk to be opened with O_DIRECT you need the fix for
bug 4466428 (5021707 is a duplicate). This means that both voting disks and OCR files
can live on block devices. However, due to OUI bug 5005148, there is still a need to
configure raw devices for the voting or OCR files during installation of RAC; not such a
big deal, since it is just 5 files in most cases.
By using block devices you no longer have to live with the limitation of 255 raw devices
per node; you can access as many block devices as the system can support. Also, block
devices carry persistent permissions across reboots, while with raw devices one would
have to customize this after installation; otherwise the Clusterware stack or database
would fail to start up due to permission issues.
ASM or ASMLib can be given raw devices (/dev/raw/raw2), as was done in initial
10g Release 1 deployments, or, the more recommended way, ASM/ASMLib can be
given the block devices directly (e.g. /dev/sde9).
Since raw devices are being phased out of Linux in the long term, it is recommended
that everyone switch to using block devices (that is, pass the block devices to ASM,
OCFS/OCFS2, or Oracle Clusterware).
$ gsdctl stop
How should I deal with space management? Do I need to set free lists and
free list groups?
Automatic Segment Space Management is NOT the default; you need to set it explicitly.
I was installing RAC and my Oracle files did not get copied to the remote
node(s). What went wrong?
First make sure the cluster is running and is available on all nodes. You should be able to
see all nodes when running an 'lsnodes -v' command.
If lsnodes shows that all members of the cluster are available, then you may have an
rcp/rsh problem on Unix or shares have not been configured on Windows.
You can test rcp/rsh on Unix by issuing the following from each node:
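The exact commands are not reproduced here; a typical check (the remote node name is a
placeholder) runs a remote command and a copy from each node against every other node,
and should complete without a password prompt:

```shell
# From each node, against every other node in the cluster:
rsh <remote_node> hostname          # should print the remote hostname
rcp /etc/hosts <remote_node>:/tmp   # should copy without prompting
```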
On Windows, ensure that each node has administrative access to all these directories
within the Windows environment by running the following at the command prompt:
More information can be found in the Step-by-Step RAC notes available on Metalink. To
find these search Metalink for 'Step-by-Step Installation of RAC'.
What are the implications of using srvctl disable for an instance in my
RAC cluster? I want to have it available to start if I need it, but at
this time I do not want to run this extra instance for this database.
During node reboot, any disabled resources will not be started by the Clusterware;
therefore this instance will not be restarted. It is recommended that you leave the VIP,
ONS and GSD enabled on that node. For example, the VIP address for this node is present
in the address list of database services, so a client connecting to these services will still
reach some other database instance providing that service via listener redirection. Just be
aware that by disabling an instance on a node, all that means is that the instance itself is
not starting. However, if the database was originally created with 3 instances, there are 3
threads of redo. So while the instance itself is disabled, the redo thread is still enabled
and will occasionally cause log switches. The archived logs for this 'disabled' instance
would still be needed in any potential database recovery scenario. So if you are going to
disable the instance through srvctl, you may also want to consider disabling the redo
thread for that instance.
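A sketch of both steps, with illustrative database, instance, and thread names (srvctl
syntax as in 10g; verify the options against your release):

```shell
# Stop and disable the third instance (names are examples)
srvctl stop instance -d MYDB -i MYDB3
srvctl disable instance -d MYDB -i MYDB3

# Then, from a surviving instance, disable its redo thread
sqlplus / as sysdba <<'EOF'
ALTER DATABASE DISABLE THREAD 3;
EOF
```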
The Cluster Verification Utility (CVU) is a validation tool that you can use to check all
the important components that need to be verified at different stages of deployment in a
RAC environment. The wide domain of deployment of CVU ranges from initial hardware
setup through a fully operational cluster for RAC deployment, and covers all the
intermediate stages of installation and configuration of the various components. Cluvfy
does not take any corrective action following the failure of a verification task, does not
enter into areas of performance tuning or monitoring, does not perform any cluster or
RAC operation, and does not attempt to verify the internals of the cluster database or
cluster elements.
What versions of the database can I use the cluster verification utility
(cluvfy) with?
The cluster verification utility is released with Oracle Database 10g Release 2 but can
also be used with Oracle Database 10g Release 1.
YES! NTP and RAC are compatible; as a matter of fact, it is recommended to set up NTP
in a RAC cluster, for Oracle 8i/9i and 10g.
Each machine has a different clock frequency and, as a result, a slightly different time
drift. NTP computes this time drift within about 15 minutes and stores the information
in a "drift" file; it then adjusts the system clock based on this known drift, as well as
comparing it to a given time server the sysadmins set up. This is the recommended
approach.
• Minor changes in time (in the seconds range) are harmless for RAC and the
Oracle Clusterware. If you intend to make large time changes, it is best to
shut down the instances on that node to avoid a false eviction, especially if you are
using the 10g low-brownout patches, which allow really low misscount settings.
Apart from these issues, the Oracle server is immune to time changes, i.e. they will
not affect transaction/read consistency operations.
On Linux the "-x" flag can be added to the ntpd daemon to prevent the clock from
going backwards.
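On Red Hat style systems the flag is typically passed through the ntpd options file; a
sketch (the file location and variable name vary by distribution):

```shell
# /etc/sysconfig/ntpd -- the -x option makes ntpd slew (gradually adjust)
# the clock rather than step it backwards
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
```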
After you have downloaded a patch, you can go into the directory where you unpacked
the patch:
> pwd
/ora/install/4933522
Then use the following OPatch command:
> opatch query -is_rolling
...
Query ...
Please enter the patch location:
/ora/install/4933522
---------- Query starts ------------------
Patch ID: 4933522
....
Rolling Patch: True.
---------- Query ends -------------------
In RAC configuration, this parameter must be set in each instance. The instances are not
required to have a shared file system. On each instance the plsql_native_library_dir can
be set to point to an instance local directory. Alternately, if the RAC configuration
supports a shared (cluster) file system, you can use a common directory (on the shared
file system) for all instances. You can also check out the PL/SQL Native Compilation
FAQ on OTN: www.oracle.com/technology/tech/pl_sql/htdocs/ncomp_faq.html
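For example, each instance can be pointed at a local directory with the SID clause of
ALTER SYSTEM (the paths and SIDs below are illustrative):

```shell
sqlplus / as sysdba <<'EOF'
-- Instance-local directories for natively compiled PL/SQL units
ALTER SYSTEM SET plsql_native_library_dir='/u01/app/oracle/ncomp'
  SCOPE=spfile SID='RAC1';
ALTER SYSTEM SET plsql_native_library_dir='/u02/app/oracle/ncomp'
  SCOPE=spfile SID='RAC2';
EOF
```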
I have 2 clusters named "crs" (the default), how do I get Grid Control to
recognize them as targets?
b) Prior to performing the Grid Control agent install, just set the CLUSTER_NAME
environment variable and run the install. This variable needs to be set only for that install
session; there is no need to set it every time the agent starts.
Yes. When certified, you can use vendor clusterware; however, you must still install and
use Oracle Clusterware for RAC. Best practice is to leave Oracle Clusterware to manage
RAC. For details see Metalink Note 332257.1, and for Veritas SFRAC see Note 397460.1.
Sun: 8
HP UX: 16
HP Tru64: 8
IBM AIX:
* 8 nodes for Physical Shared (CLVM) SSA disk
There are also Step-by-Step notes available for each platform available on the Metalink
'Top Tech Docs' for RAC:
Note 184875.1
How To Check The Certification Matrix for Real Application Clusters
Please note that certifications for Real Application Clusters are performed against the
Operating System and Clusterware versions. The corresponding system hardware is
offered by System vendors and specialized Technology vendors. Some system vendors
offer pre-installed, pre-configured RAC clusters. These are included below under the
corresponding OS platform selection within the certification matrix.
DBCA can be used to create databases on raw devices in 9i RAC Release 1 and 9i
Release 2. Standard database creation scripts using SQL commands will work with file
system and raw.
DBCA cannot be used to create databases on file systems on Oracle 9i Release 1. The
user can choose to set up a database on raw devices, and have DBCA output a script. The
script can then be modified to use cluster file systems instead.
With Oracle 9i RAC Release 2 (Oracle 9.2), DBCA can be used to create databases on a
cluster filesystem. If the ORACLE_HOME is stored on the cluster filesystem, the tool
will work directly. If ORACLE_HOME is on local drives on each system, and the
customer wishes to place database files onto a cluster file system, they must invoke
DBCA as follows: dbca -datafileDestination /oradata where /oradata is on the CFS
filesystem. See 9iR2 README and bug 2300874 for more info.
Please check the certification matrix available through Metalink for your specific release.
Detailed Reasons:
1) cross-cabling limits the expansion of RAC to two nodes
2) cross-cabling is unstable:
a) Some NIC cards do not work properly with it. They are not able
to negotiate the DTE/DCE clocking, and will thus not function.
These NICs were made cheaper by assuming that the switch would
have the clock. Unfortunately there is no way to know which NICs
do not have that clock.
b) Media sense behaviour on various OS's (most notably Windows) will
bring a NIC down when a cable is disconnected.
Either of these issues can lead to cluster instability and to ORA-29740
errors (node evictions).
Veritas Storage Foundation 4.0 is certified on AIX, Solaris and HP-UX for 9i RAC and
Oracle RAC 10g. Veritas is also in production on Linux, but it is not certified by Oracle
there. If customers choose Veritas on Linux with Oracle 9i, Oracle will support the
Oracle products in the stack.
Veritas Storage Foundation is currently not certified with 10g Release 2 on any platform.
Check Certify for the latest information.
No. We do not support RAC on VMware. Aside from the support restrictions for the
database on VMware outlined in Metalink Note 249212.1, there is a technical issue with
VMware periodically resynchronizing its system clock with the underlying OS. This can
disrupt the underlying clusterware services.
No, Oracle RAC 10g does not support 3rd Party clusterware on Linux. This means that if
a cluster file system requires a 3rd party clusterware, the cluster file system is not
supported.
No, there should be only one Oracle Cluster Manager (ORACM) running on each node.
All RAC databases should run out of the $ORACLE_HOME that ORACM is installed in.
Please carefully read the following new information about configuring Oracle Cluster
Management on Linux, provided as part of the patch README:
[5000(msec) is hardcoded]
If CPU utilization in your system is high and you experience unexpected node reboots,
check the wdd.log file. If there are any 'ping came too late' messages, increase the value
of the above parameters.
Yes, OCFS (Oracle Cluster Filesystem) is now available for Linux. The following
Metalink note has information for obtaining the latest version of OCFS:
Note 238278.1 - How to find the current OCFS version for Linux
Can RAC 10g and 9i RAC be installed and run on the same physical Linux
cluster?
Yes - However Oracle Clusterware (CRS) will not support a 9i RAC database so you will
have to leave the current configuration in place. You can install Oracle Clusterware and
RAC 10g into the same cluster. On Windows and Linux, you must run the 9i Cluster
Manager for the 9i Database and the Oracle Clusterware for the 10g Database. When you
install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Both 9i
RAC and 10g will use the OCR. Do not restart the 9i gsd after you have installed Oracle
Clusterware. Remember to check certify for details of what vendor clusterware can be
run with Oracle Clusterware.
as root user:
/sbin/lsmod | grep hangcheck
(Note that in 9i, the recommended values for tick and margin were 30
and 180, respectively).
To ensure the module is loaded every time the system reboots, verify
that the local system startup file (/etc/rc.d/rc.local) contains the
command above.
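The load command referenced above is not reproduced in this text; a commonly cited
form is the following (the parameter values shown are the usual 10g recommendations,
an assumption you should verify against the documentation for your release):

```shell
# Load hangcheck-timer (values shown are commonly cited 10g settings;
# 9i used hangcheck_tick=30 and hangcheck_margin=180, as noted above)
/sbin/insmod hangcheck-timer hangcheck_tick=1 hangcheck_margin=10

# Confirm it is loaded
/sbin/lsmod | grep hangcheck
```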
For additional information please review the Oracle RAC Install and
Configuration Guide (5-41).
After a successful installation of Oracle Clusterware, a simple reboot can leave the
Clusterware unable to start. This is because the permissions on the raw devices for the
OCR and voting disks, e.g. /dev/raw/raw{x}, revert to their default values (root:disk) and
are inaccessible to Oracle. This change of behavior started with the 2.6 kernel: in
RHEL4, OEL4, RHEL5, OEL5, SLES9 and SLES10. In RHEL3 the raw devices
maintained their permissions across reboots, so this symptom was not seen.
Note that this applies to all raw device files; here just the voting and OCR devices were
specified.
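A common workaround on these kernels is a udev rule that restores the ownership at
boot; a sketch (the file name, device list, user and group are examples, and the rule
syntax differs between udev versions):

```shell
# /etc/udev/rules.d/99-oracle.rules (RHEL5/OEL5-style syntax)
# Give the OCR and voting raw devices back to the oracle user at boot
KERNEL=="raw[1-5]", OWNER="oracle", GROUP="oinstall", MODE="0640"
```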
Yes. See Certify to find out which platforms are currently certified.
Customer did not load the hangcheck-timer before installing RAC, Can
the customer just load the hangcheck-timer ?
YES. Customer can install the hangcheck timer and load it. No need to reboot the nodes.
It is an informational message. Generally for such scripts, you can issue echo $? to
ensure that it returns a zero value. The message is basically saying it did not find an
oracm. If the customer were installing 10g on an existing 9i cluster (which would have
oracm), this message would have been serious. But since the customer is installing this
on a fresh new box, they can continue the install.
The configuration takes place below Oracle. You need to talk to your Infiniband vendor.
Check certify for what is currently available as this will change as vendors adopt the
technology. The database must be at least 10.2.0.3. If you want to switch a database
running with IP over IB, you will need to relink Oracle.
$ cd $ORACLE_HOME/rdbms/lib $ make -f ins_rdbms.mk ipc_rds ioracle
You can check your interconnect through the alert log at startup. Check for the string
"cluster interconnect IPC version: Oracle RDS/IP (generic)" in the alert.log file.
Client is running a Veritas cluster on SunOS 2.9. When we ran the
10.2.0 installer it did not discover the nodes, but with 9i it was able to
discover both nodes. Is there anything specific to be done for the
10.2.0 db install?
You have no idea what a wild ride you are in for. It is imperative that you follow the
Symantec install guide, all the way through the root.sh patch step of the Oracle
Clusterware install. The guide is on the install media. If you don't have it, you can pull it
from Jack Connelly's staging area.
It is an NFS mount. Just mount jacksun1.us.oracle.com:/stage to a local UNIX box,
cd to Veritas/docs, and copy sfrac_install.pdf to any local directory and you will have
what you need.
This install isn't too bad unless you are on Solaris 10. Give yourself lots of time to
implement; a month to six weeks is not unrealistic, and your mileage will vary depending
on Veritas/Solaris knowledge. Thanks to Jack Connelly in VOS Support.
Sun Solaris includes an inherent Multipathing tool: MPXIO - this is part of Solaris. You
need to have the SanFoundation Kit installed (newest version). Please, be aware that the
machines are installed following the EIS-standard. This is a quality assurance standard
introduced by Sun that mainly takes care that you always have the newest patches.
MPXIO is free of charge and comes with Solaris 8,9,10. BTW, if you have a Sun LVM, it
would use this feature indirectly. Therefore, Sun confirmed that MPXIO will work with
RAWs.
No. RAC is currently not supported with Solaris 10 Local Containers. You can use a
Global container but remember 1 global container per system or per domain. So, in case
your hardware is capable of being split up in domains, you may have more than 1 global
container on the whole system (hardware), that is per domain.
In local containers, you cannot manipulate hardware in any way, shape or form. You can't
plumb and unplumb network interfaces .... nothing ... even as the local container root
user. You can only do this in the global container. We rely on the uadmin command to
quickly bring down a node if an urgent condition is detected. As I recall, you can't do this
from the local container either. CRS has to maintain the ability to manipulate hardware
and this just is not going to happen in a local container.
The answer is the same if you are using Vendor Clusterware such as Veritas SF RAC or
Sun Cluster.
For IPMP active/active configurations, please follow the Sun doc instructions:
http://docs.sun.com/app/docs/doc/816-4554/6maoq027i?a=view. IPMP active/active is
known to load-balance on transmit but serialize on a single interface for receive, so you
are likely not to get the throughput you might have expected. Unless you experience
explicit bandwidth limitations that require active/active, it is a recommended best practice
to configure for maximum availability, as described in webiv note 283107.1.
Please note too that debugging active/active interfaces at the network layer is
cumbersome and time consuming. In an active/active configuration, if the switch-side
link fails, you are likely to lose both interconnect connections, whereas with
active/standby you would fail over.
- 10g RAC + HMP + Itanium, "Oracle has no plans and will likely never
support RAC over HMP on IPF."
Does the Oracle Cluster File System (OCFS) support network access
through NFS or Windows Network Shares?
No, in the current release the Oracle Cluster File System (OCFS) is not
supported for use by network access approaches like NFS or Windows
Network Shares.
My customer wants to understand what type of disk caching they can use
with their Windows RAC Cluster, the install guide tells them to
disable disk caching?
If the write cache identified is local to the node then that is bad for RAC. If the cache is
visible to all nodes as a 'single cache', typically in the storage array, and is also 'battery
backed' then that is OK.
Can I run my 9i RAC and RAC 10g on the same Windows cluster?
Yes, but the 9i RAC database must have the 9i Cluster Manager, and you must run Oracle
Clusterware for the Oracle Database 10g. 9i Cluster Manager can coexist with Oracle
Clusterware 10g.
Be sure to use the same 'cluster name' in the appropriate OUI field for both 9i and 10g
when you install both together in the same cluster.
The OracleCMService9i service will remain intact during the Oracle Clusterware 10g
install; as a 9i RAC database requires OracleCMService9i, it should be left running. The
information for the 9i database will get migrated to the OCR during the Oracle
Clusterware installation. Then, for future database management, you would use the 9i
srvctl to manage the 9i database, and the 10g srvctl to manage any new 10g databases.
Both srvctl commands will use the OCR.
When I try to login to the +ASM2 on node2 with asmcmd (after setting
ORACLE_HOME and ORACLE_SID correctly) I get: ORA-01031:
insufficient privileges (DBD ERROR: OCI SessionBegin). When I
try to login to +ASM2 using sqlplus (connect / as sysdba) I get the
same ORA-01031: insufficient privileges. When I try to login to
+ASM2 using sqlplus (connect sys/passwd as sysdba) I get connected
successfully.
This sounds like the ORA_DBA group on Node2 is empty, or else does not have the
correct username in it. Double-check what user account you are using to logon to Node2
as ( a 'set' command will show you the USERNAME and USERDOMAIN values) and
then make sure that this account is part of ORA_DBA.
The other issue to check is that SQLNET.AUTHENTICATION_SERVICES=(NTS) is
set in the SQLNET.ORA
Fail Safe development is continuing. Most work on the product will be around
accommodating changes in the supported resources (new releases of the RDBMS, AS,
etc.) and the underlying Microsoft Cluster Services and Windows operating system.
A Fail Safe protected instance is an active/passive instance and, as such, does not benefit
much at all from adding more nodes to a cluster. Microsoft has a limit on the number of
nodes in an MSCS cluster (typically 8 nodes, but it does vary). RAC is active/active, so
you get the dual benefits of increased scalability and availability every time you add a
node to a cluster. We have a limit of 100 nodes in a RAC cluster (we don't use MSCS).
Your customer should really consider more than 2 nodes (because of the aggregate
compute power on node failure). If the choice is 2 4-CPU nodes or 4 2-CPU nodes, I
would go for the 2-CPU nodes. Customers are using both Windows Itanium RAC and
Windows X64 RAC; Windows X64 seems more popular.
Keep in mind, though, that for Fail Safe, if the server is 64-Bit, regardless of flavor, Fail
Safe Manager must be installed on a 32-Bit client, which will complicate things just a bit.
There is no such restriction for RAC, as all management for RAC can be done via Grid
Control or Database Control.
For EE RAC you can implement an 'extended cluster' where there is a distance between
the nodes in the cluster (usually less than 20 KM).
"If you are not using HACMP, you must use a GPFS file system to store the
Oracle CRS files" ==> this is a documentation bug and this will be fixed with
10.1.0.3
-----
in order to allow AIX to access the devices from more than one node
simultaneously.
Use the /dev/rhdisk devices (character special) for the CRS and voting disks, and change
the reserve attribute with the appropriate chdev command
(for ESS, EMC, HDS, CLARiiON, and MPIO-capable devices you have to run
chdev -l hdiskn -a reserve_policy=no_reserve)
Is HACMP needed for RAC on AIX 5.2 using GPFS file system?
The newest version of GPFS can be used without HACMP, if it is available for AIX 5.2
then you do not need HACMP.
Is VIO supported with RAC on IBM AIX?
VIO is not supported for storage. IBM is still working to improve the shared disk
capability to use it with RAC, so currently if your customer wants RAC, they must attach
all shared disks storing the database via direct attachments. But VIO does deliver network
features usable with RAC; for example, if the customer is planning to use several LPARs
to host RAC instances, a VLAN could be implemented for the RAC interconnect, and a
Virtual Ethernet could also be used for the VIP configuration. In conclusion, you should
discuss the global architecture with your customer and check which parts of VIO could
be used; you should also analyze the performance aspects (keep in mind that using a
shared resource can impact performance).
YES! There is no separate documentation for RAC on z/OS. What you would call
"clusterware" is built into the OS, and the native file systems are global. IBM z/OS
documentation explains how to set up a Sysplex Cluster; once the customer has done that,
it is trivial to set up a RAC database. The few steps involved are covered in Chapter 14
of the Oracle for z/OS System Admin Guide, which you can read here. There is also an
Install Guide for Oracle on z/OS (here), but I don't think there are any RAC-specific steps
in the installation. By the way, RAC on z/OS does not use Oracle's clusterware
(CSS/CRS/OCR).
Yes, For detailed information on the integration with the various releases of Application
Server 10g,
http://www.oracle.com/technology/tech/java/newsletter/articles/oc4j_data_sources/oc4j_
ds.htm
Can I use Oracle Clusterware for failover of the SAP Enqueue and VIP
services when running SAP in a RAC environment?
Oracle has created SAPCTL to do this, and it is available for certain platforms. SAPCTL
will be available for download on the SAP Service Marketplace for AIX and Linux. For
Solaris, it will not be available in 2007; use Veritas or Sun Cluster.
How do I gather all relevant Oracle and OS log/trace files in a RAC cluster
to provide to Support?
Use RAC-DDT (RAC Diagnostic Data Tool), User Guide is in Metalink note# 301138.1.
Quote from the User Guide:
Newer versions of RDA (Remote Diagnostic Agent) have the RAC-DDT functionality,
so going forward RDA is the tool of choice. The RDA User Guide is in Metalink note#
314422.1
These directories are produced by the diagnosability daemon process (DIAG). DIAG is a
process related to RAC which, as one of its tasks, performs crash dumping. The DIAG
process dumps out tracing to file when it discovers the death of an essential process
(foreground or background) in the local instance. A dump directory named something
like cdmp_ is created in the bdump or background_dump_dest directory, and all the trace
dump files DIAG creates are placed in this directory.
Is the Oracle E-Business Suite (Oracle Applications) certified against
RAC?
Following is the recommended and most optimal path to migrate your E-Business Suite
to a RAC environment:
2. Use Clustered File System for all data base files or migrate all database files to raw
devices. (Use dd for Unix or ocopy for NT)
5. In step 4, install RAC option while installing Oracle9i and use Installer to perform
install for all the nodes.
Reference Documents:
Oracle E-Business Suite Release 11i with 9i RAC: Installation and Configuration :
Metalink Note# 279956.1
E-Business Suite 11i on RAC : Configuring Database Load balancing & Failover:
Metalink Note# 294652.1
Oracle E-Business Suite 11i and Database - FAQ : Metalink# 285267.1
Large clients commonly put the concurrent manager on a separate server now (in the
middle tier) to reduce the load on the database server. The concurrent manager programs
can be tied to a specific middle tier (e.g., you can have CMs running on more than one
middle tier box). It is advisable to use specialized CMs. CM middle tiers are set up to
point to the appropriate database instance based on the product module being used.
If your processing requirements are extreme and your testing proves you must partition
your workload in order to reduce internode communications, you can use Profile Options
to designate that sessions for certain applications Responsibilities are created on a
specific middle tier server. That middle tier server would then be configured to connect to
a specific database instance.
To determine the correct partitioning for your installation you would need to consider
several factors like number of concurrent users, batch users, modules used, workload
characteristics etc.
Versions 11.5.5 onwards are certified with Oracle9i and hence with Oracle9i RAC.
However we recommend the latest available version.
TAF itself does not work with e-Business suite due to Forms/TAF limitations, but you
can configure the tns failover clause. On instance failure, when the user logs back into the
system, their session will be directed to a surviving instance, and the user will be taken to
the navigator tab. Their committed work will be available; any uncommitted work must
be re-started.
We also recommend you configure the forms error URL to identify a fallback middle tier
server for Forms processes, if no router is available to accomplish switching across
servers.
It is not supported to use OCFS with Standard Edition RAC. All database files must use
ASM (redo logs, recovery area, datafiles, control files etc.). You cannot place binaries on
OCFS under the SE RAC terms. We recommend that the binaries and trace files (non-
ASM-supported files) be replicated on all nodes; this is done automatically by the install.
Oracle 9iRAC on Linux, using OCFS for datafiles, can scale to a maximum of 32 nodes.
For optimal performance, you should only put the following files on Linux OCFS:
- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- Shared Configuration File (OCR)
- Quorum / Voting File
- SPFILE
Sun Cluster - Sun StorEdge QFS (9.2.0.5 and higher, 10g and 10gR2):
No restrictions on placement of files on QFS: Oracle binary executables, database data
files, archive logs, the Oracle Cluster Registry (OCR), the Oracle Cluster Ready Services
voting disk and the recovery area can all be placed on QFS.
Solaris Volume Manager for Sun Cluster can be used for host-based mirroring.
Supports up to 8 nodes.
Is Red Hat GFS (Global File System) certified by Oracle for use with
Real Application Clusters?
Sistina Cluster Filesystem is not part of the standard Red Hat kernel and therefore is not
certified by Oracle; it falls under a kernel extension. This, however, does not mean that
Oracle RAC is not certified with it. In fact, Oracle RAC does not certify against a
filesystem per se, but against an operating system. If, as is the case with the Sistina
filesystem, the filesystem is certified with the operating system, it only means that Oracle
does not provide direct support for, or fix, the filesystem in case of an error. The
customer will have to contact the filesystem provider for support.
Yes. Oracle Clusterware 10.2 will support both 10.1 and 10.2 databases (and ASM too!).
A detailed matrix is available in Metalink Note 337737.1
Having a short and a long disktimeout, rather than just one disktimeout, is due to the
patch for bug 4748797 (included in 10.2.0.2). The long disktimeout is 200 sec by default
unless set differently via 'crsctl set css disktimeout', and applies to time outside a
reconfiguration. The short disktimeout is in effect during a reconfiguration and is
misscount minus 3 seconds. The point is that we can tolerate a long disktimeout when all
nodes are running fine, but have to revert to a short disktimeout if there is a
reconfiguration.
Customer is hitting bug 4462367 with an error message saying low open
file descriptor, how do I work around this until the fix is released
with the Oracle Clusterware Bundle for 10.2.0.3 or 10.2.0.4 is
released?
The fix for the "low open file descriptor" problem is to increase the ulimit for Oracle
Clusterware. Please be careful when you make this type of change, and make a
backup copy of init.crsd before you start! To do this, you can modify init.crsd as
follows while you wait for the patch:
1. Stop Oracle Clusterware on the node (crsctl stop crs)
2. Copy /etc/init.d/init.crsd
3. Modify the file, changing:
# Allow the daemon to drop a diagnostic core file/
ulimit -c unlimited
ulimit -n unlimited
to
# Allow the daemon to drop a diagnostic core file/
ulimit -c unlimited
ulimit -n 65536
- Stop the CRS stack on all nodes using "init.crs stop"
- Edit /var/opt/oracle/ocr.loc on all nodes and set ocrconfig_loc=<new OCR device>
- Restore from one of the automatic physical backups using ocrconfig -restore.
- Run ocrcheck to verify.
- Reboot to restart the CRS stack.
Additional information can be found at
http://st-doc.us.oracle.com/10/101/rac.101/b10765/storage.htm#i1016535
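The steps above can be sketched as a command sequence (the backup file path is
illustrative; use the path reported by ocrconfig -showbackup):

```shell
# On all nodes: stop the CRS stack
init.crs stop

# (First edit /var/opt/oracle/ocr.loc on all nodes, pointing
#  ocrconfig_loc at the new OCR device.)

# Restore from an automatic physical backup, then verify
ocrconfig -restore /u01/app/oracle/crs/cdata/crs/backup00.ocr  # example path
ocrcheck

# Reboot each node to restart the CRS stack
```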
Rerunning root.sh after the initial successful install of the Oracle Clusterware is expressly
discouraged and unsupported. We strongly recommend not doing it.
In cases where root.sh fails to execute on an initial install (or for a new node joining an
existing cluster), it is OK to re-run root.sh after the cause of the failure is corrected
(permissions, paths, etc.). In this case, please run rootdelete.sh to undo the local effects
of root.sh before re-running it.
Hostname changes are not supported in Oracle Clusterware (CRS), unless you want to
perform a deletenode followed by a new addnode operation.
The hostname is used to store, among other things, the flag files, and the CRS stack will
not start if the hostname is changed.
Is it supported to allow 3rd Party Clusterware to manage Oracle resources
(instances, listeners, etc) and turn off Oracle Clusterware
management of these?
In 10g we do not support using 3rd Party Clusterware for failover and restart of Oracle
resources. Oracle Clusterware resources should not be disabled.
No, the OCR and voting disk must be on raw or CFS (cluster filesystem).
Can I set up failover of the VIP to another card in the same machine or
what do I do if I have different network interfaces on different nodes
in my cluster (I.E. eth0 on node1,2 and eth1 on node 3,4)?
With srvctl, you can modify the nodeapp for the VIP to list the NICs it can use. Then VIP
will try to start on eth0 interface and if it fails, try eth1 interface.
./srvctl modify nodeapps -n <node> -A <vip>/<netmask>/eth0\|eth1
Note how the interfaces are a list separated by the '|' symbol, and how you need to
escape it with a '\' character or the Unix shell will interpret it as a pipe. So on a node
called ukdh364 with a VIP address of ukdh364vip and a netmask of (say) 255.255.255.0
we have:
./srvctl modify nodeapps -n ukdh364 -A ukdh364vip/255.255.255.0/eth0\|eth1
To check which interfaces are configured as public or private use oifcfg getif
example output:
eth0 138.2.238.0 global public
eth1 138.2.240.0 global public
eth2 138.2.236.0 global cluster_interconnect
An ifconfig on your machine will show the hardware names of the installed interface
cards.
Check Note.5187351.8 You can either apply the patchset if it is available for your
platform or have a cron job that removes these files until the patch is available.
The private names on the first screen determine which private interconnect will be used
by CSS.
Provide exactly one name that maps to a private IP address, or just the IP address itself.
If a logical name is used, the IP address it maps to can be changed subsequently, but if
an IP address is specified, CSS will always use that IP address. CSS cannot use multiple
private interconnects for its communication, hence only one name or IP address can be
specified.
The private interconnect enforcement page determines which private interconnect will be
used by the RAC instances.
It's equivalent to setting the CLUSTER_INTERCONNECTS init.ora parameter, but is
more convenient because it is a cluster-wide setting that does not have to be adjusted
every time you add nodes or instances. RAC will use all of the interconnects listed as
private in this screen, and they all have to be up, just as their IP addresses have to be
when specified in the init.ora parameter. RAC does not fail over between cluster
interconnects; if one is down then the instances using it will not start.
Can I change the name of my cluster after I have created it when I am
using Oracle Clusterware?
No, you must properly deinstall Oracle Clusterware and then re-install. To properly
deinstall Oracle Clusterware, you MUST follow the directions in the Installation Guide
Chapter 10. This will ensure the OCR gets cleaned out.
What should the permissions be set to for the voting disk and ocr when
doing a RAC Install?
The Oracle Real Application Clusters install guide is correct. It describes the PRE
INSTALL ownership/permission requirements for ocr and voting disk. This step is
needed to make sure that the CRS install succeeds. Please don't use those values to
determine what the ownership/permissions should be POST INSTALL. The root script
will change the ownership/permissions of the OCR and voting disk as part of the install.
The POST INSTALL permissions will end up being: OCR - root:oinstall - 640; Voting
Disk - oracle:oinstall - 644.
Oracle Cluster Registry (OCR) is used to store the cluster configuration information
among other things. OCR needs to be accessible from all nodes in the cluster. If OCR
became inaccessible the CSS daemon would soon fail, and take down the node. PMON
never needs to write to OCR. To confirm if OCR is accessible, try ocrcheck from your
ORACLE_HOME and ORA_CRS_HOME.
The only recommended way to restore an OCR from a backup is "ocrconfig -restore
<file_name>". The ocopy command will not be able to perform the restore action for the OCR.
Does the hostname have to match the public name or can it be anything
else?
When there is no vendor clusterware, only CRS, then the public node name must match
the host name. When vendor clusterware is present, it determines the public node names,
and the installer doesn't present an opportunity to change them. So, when you have a
choice, always choose the hostname.
There is no requirement for interface name ordering. You could have - public on ETH2 -
private on ETH0 Just make sure you choose the correct public interface in VIPCA, and in
the installer's interconnect classification screen.
As long as you can confirm via the CSS daemon logfile that it thinks the voting disk is
bad, you can restore the voting disk from backup while the cluster is online. This is the
backup that you took with dd (by the manual's request) after the most recent addnode,
deletenode, or install operation. If by accident you restore a voting disk that the CSS
daemon thinks is NOT bad, then the entire cluster will probably go down.
crsctl add css votedisk <path> - adds a new voting disk
crsctl delete css votedisk <path> - removes a voting disk
Note: the cluster has to be down. You can also restore the backup via dd when the cluster
is down.
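A sketch of the dd backup/restore cycle described above; the device and backup paths are illustrative assumptions, not values from any particular system:

```shell
# Assumed paths, for illustration only.
VOTEDISK=/dev/raw/raw3
BACKUP=/u01/backup/votedisk.bak

# Back up the voting disk after each install/addnode/deletenode operation:
dd if=$VOTEDISK of=$BACKUP bs=4k

# Restore it (with the cluster down, or online only if CSS reports it bad):
dd if=$BACKUP of=$VOTEDISK bs=4k
```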
The automatic backup mechanism keeps copies up to about a week old. So, if
you want to retain a backup copy longer than that, you should copy
that "backup" file to some other name.
Unfortunately there are a couple of bugs regarding backup file
manipulation and changing the default backup directory on Windows. These will be
fixed in 10.1.0.4. Automatic OCR backups on Windows are absent; the only file in the
backup directory is temp.ocr, which would be the last backup. You can restore this most
recent backup by using the command ocrconfig -restore temp.ocr
If you want to take a logical copy of the OCR at any time, use ocrconfig
-export <file_name>, and use the -import option to restore the contents back.
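Putting the physical and logical mechanisms together, a sketch (the file names and paths are illustrative, not fixed Oracle locations):

```shell
# Physical restore from an automatic backup (run as root):
ocrconfig -restore /u01/crs/cdata/crs/backup00.ocr

# Logical export at any time, and the matching import to restore it:
ocrconfig -export /u01/backup/ocr_export.dmp
ocrconfig -import /u01/backup/ocr_export.dmp
```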
In Oracle Database 10g Release 1 the OCR and Voting device are not mirrored within
Oracle, hence both must be mirrored via a storage vendor method, like RAID 1.
Starting with Oracle Database 10g Release 2 Oracle Clusterware will multiplex the OCR
and Voting Disk (two for the OCR and three for the Voting).
Please read Note:279793.1 and Note:268937.1 regarding backup and restore a lost
Voting/OCR and FAQ 6238 regarding OCR backup.
This needs to be done externally to Oracle Clusterware, usually by some OS-provided
NIC bonding, which gives Oracle Clusterware a single IP address for the interconnect but
provides failover (High Availability) and/or load balancing across multiple NIC cards.
These solutions are provided externally to Oracle at a much lower level than the Oracle
Clusterware, hence Oracle supports using them, the solutions are OS dependent and
therefore the best source of information is from your OS Vendor. However, there are
several articles in Metalink on how to do this. For example for Sun Solaris search for
IPMP (IP network MultiPathing).
So the current Linux implementation supports either failover (HA) or load balancing, but
not both. Third party vendors may be able to provide custom tailored solutions for Linux
that (would probably fall outside the scope of Unbreakable support from Oracle but) will
provide both failover and load balancing.
First, write a control agent. It must accept 3 different parameters: start (the control agent
should start the application), check (the control agent should check the application), and
stop (the control agent should stop the application). Secondly, you must create a profile for
your application using crs_profile. Thirdly, you must register your application as a
resource with Oracle Clusterware (crs_register). See the RAC Admin and Deployment
Guide for details.
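A minimal action-script skeleton for those three entry points might look like the following sketch; the echoed messages are placeholders standing in for your application's real start/check/stop commands:

```shell
#!/bin/sh
# Sketch of an Oracle Clusterware action script; the case bodies are
# placeholders for the application's own control commands.
app_control() {
  case "$1" in
    start) echo "starting application"; return 0 ;;   # launch the app here
    check) echo "application is running"; return 0 ;; # return 0 = healthy
    stop)  echo "stopping application"; return 0 ;;
    *)     echo "usage: {start|check|stop}" >&2; return 1 ;;
  esac
}
# The real script would end with:  app_control "$1"
```

The check entry point is what Oracle Clusterware polls, so its exit status (0 for healthy, non-zero for down) is what drives restart and failover decisions.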
Oracle does not provide the necessary wrappers to fail over single-instance databases
using Oracle Clusterware 10g Release 2. But since it's possible for customers to use
Oracle Clusterware to wrap arbitrary applications, it'd be possible for them to wrap
single-instance databases this way.
Yes, with Oracle Database 10g Release 2, Oracle Clusterware now supports an
"application" vip. This is to support putting applications under the control of Oracle
Clusterware using the new high availability API and allow the user to use the same URL
or connection string regardless of which node in the cluster the application is running on.
The application vip is a new resource defined to Oracle Clusterware and is a functional
vip. It is defined as a dependent resource to the application. There can be many vips
defined, typically one per user application under the control of Oracle Clusterware. You
must first create a profile (crs_profile), then register it with Oracle Clusterware
(crs_register). The usrvip script must run as root.
If anyone other than root has write permissions to the parent directories of the CRS home,
then they can give themselves root escalations. This is a security issue. The CRS home
itself is a mix of root and non-root permissions, as appropriate to the security
requirements. Please follow the install docs about who is your primary group and what
other groups you need to create and be a member of.
The voting disk is accessed exclusively by CSS (one of the Oracle Clusterware daemons).
This is totally different from a database file. The database looks at the database files and
interacts with the CSS daemon (at a significantly higher level conceptually than any
notion of "voting disk").
As far as voting disks are concerned, a node must be able to access strictly more than half
of the voting disks at any time. So if you want to be able to tolerate a failure of n voting
disks, you must have at least 2n+1 configured. (n=1 means 3 voting disks). You can
configure up to 32 voting disks, providing protection against 15 simultaneous disk
failures, however it's unlikely that any customer would have enough disk systems with
statistically independent failure characteristics that such a configuration is meaningful. At
any rate, configuring multiple voting disks increases the system's tolerance of disk
failures (i.e. increases reliability).
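The majority arithmetic above can be sketched as a quick shell calculation (illustrative only, not an Oracle utility):

```shell
# A node must access a strict majority of the V configured voting disks,
# so the cluster tolerates f = (V - 1) / 2 simultaneous voting-disk
# failures (integer division), and tolerating n failures needs
# V = 2n + 1 disks.
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }
disks_needed()       { echo $(( 2 * $1 + 1 )); }

tolerated_failures 3    # -> 1
disks_needed 1          # -> 3
tolerated_failures 32   # -> 15, the maximum configuration mentioned above
```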
Configuring a smaller number of voting disks on some kind of RAID system can allow a
customer to use some other means of reliability than the CSS's multiple voting disk
mechanisms. However there seem to be quite a few RAID systems that decide that 30-60
second (or 45 minutes in the case of Veritas) IO latencies are acceptable, and Oracle has
to wait for at least the longest IO latency before it can declare a node dead and
allow the database to reassign database blocks. So while using an independent RAID
system for the voting disk may appear appealing, sometimes there are failover latency
consequences.
If you lose 1/2 or more of all of your voting disks, then nodes get evicted from the
cluster, or nodes kick themselves out of the cluster. It doesn't threaten database
corruption. For this reason we recommend that customers use 3 or more voting disks
in 10g Release 2 (always in an odd number). Restoring corrupted voting disks is easy
since there isn't any significant persistent data stored in the voting disk. See the RAC
Admin and Deployment Guide for information on backup and restore of voting disks.
How can I register the listener with Oracle Clusterware in RAC 10g
Release 2?
NetCA is the only tool that configures the listener, and you should always use it. It will
register the listener with Oracle Clusterware. There are no other supported alternatives.
Can the Network Interface Card (NIC) device names be different on the
nodes in a cluster, for both public and private?
The private NICs can be different across nodes but the public NICs must be the same (ER
5439875 filed). If the private NIC names are different, you can configure them
using oifcfg setif -node (rather than -global) for each node, in which case all RAC
instances on the node will use the specified one. Alternatively, if you want to use the
CLUSTER_INTERCONNECTS init.ora parameter, set it for each instance to the IP
address(es) you want that instance to use.
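For example, if node rac1 uses eth1 and node rac2 uses eth3 for the private network, a per-node classification could look like this sketch (the node, interface and subnet names are illustrative assumptions):

```shell
# Per-node classification instead of -global; values are examples only.
oifcfg setif -node rac1 eth1/10.0.0.0:cluster_interconnect
oifcfg setif -node rac2 eth3/10.0.0.0:cluster_interconnect
```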
Can I configure HP's Autoport aggregation for NIC Bonding after the
install? (i.e. not present beforehand)
You are able to add NIC bonding after the installation although this is more complicated
than the other way round.
There are several notes on webiv regarding this.
Note.271121.1 Ext/Pub How to change VIP and VIP/Hostname in 10g
Note.276434.1 Ext/Pub Modifying the VIP of a Cluster Node
Regarding the private interconnect, please use oifcfg delif / setif to modify this.
For customers on Linux, there is more information on NIC bonding, please read
Configure Redundant Network Cards / Switches for Oracle Database 10g Release 1 Real
Application Cluster on Linux
The install guide will tell you that the private IP address must satisfy the following
requirements:
1. Must be separate from the public network
2. Must be accessible on the same network interface on each node
3. Must have a unique address on each node
4. Must be specified in the /etc/hosts file on each node
The Best Practices recommendation is to use the TCP/IP standard for non-routable
networks. Reserved address ranges for private (non-routed) use (see TCP/IP RFC 1918):
* 10.0.0.0 -> 10.255.255.255
* 172.16.0.0 -> 172.31.255.255
* 192.168.0.0 -> 192.168.255.255
Cluvfy will give you an error if you do not have your private interconnect in the ranges
above.
You should not ignore this error. If you are using an IP address in the range used for the
public network for the private network interfaces, you are pretty much messing up the IP
addressing, and possibly the routing tables, for the rest of the corporation. IP addresses
are a sparse commodity; use them wisely. If you use public-range addresses on a non-routable
network, there is nothing to prevent someone else from using them in the normal
corporate network, and then when those RAC nodes find out that there is another path to
that address range (through RIP), they just might start sending traffic to those other IP
addresses instead of the interconnect. This is just a bad idea.
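As a sketch, the RFC 1918 range check that cluvfy performs can be approximated with a small shell function (the function name is ours, not cluvfy's):

```shell
# Returns success if the dotted-quad address is in one of the RFC 1918
# private ranges listed above; illustrative only.
is_rfc1918() {
  IFS=. read -r a b c d <<EOF
$1
EOF
  [ "$a" -eq 10 ] && return 0
  [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ] && return 0
  [ "$a" -eq 192 ] && [ "$b" -eq 168 ] && return 0
  return 1
}

is_rfc1918 192.168.0.1 && echo "private"      # suitable for the interconnect
is_rfc1918 138.2.238.1 || echo "routable"     # public range, do not use
```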
When the customer runs the command 'onsctl start', they receive the message
"Unable to open libhasgen10.so". Why?
Most likely you are trying to start ONS from the ORACLE_HOME instead of the
CRS_HOME. Please try to start it from the ORA_CRS_HOME.
Does Oracle Clusterware have to be the same or higher release than all
instances running on the cluster?
Yes - Oracle Clusterware must be the same or a higher release with regards to the
RDBMS or ASM Homes.
Please refer to Note#337737.1
Check out Chapter 3 of the EM advanced configuration guide, specifically the section on
active passive configuration of agents. You should be able to model those to your
requirements. There is nothing special about the commands, but you do need to follow
the startup/shutdown sequence to avoid any discontinuity of monitoring. The agent does
start a watchdog that monitors the health of the actual monitoring process. This is done
automatically at agent start. Therefore you could use Oracle Clusterware but you should
not need to.
Why does Oracle Clusterware use an additional 'heartbeat' via the voting
disk, when other cluster software products do not?
Oracle uses this implementation because Oracle clusters always have access to a shared
disk environment. This is different from classical clustering which assumes shared
nothing architectures, and changes the decision of what strategies are optimal when
compared to other environments. Oracle also supports a wide variety of storage types,
instead of limiting it to a specific storage type (like SCSI), allowing the customer quite a
lot of flexibility in configuration.
Why does Oracle still use the voting disks when other cluster sofware is
present?
Voting disks are still used when 3rd party vendor clusterware is present, because vendor
clusterware is not able to monitor/detect all failures that matter to Oracle Clusterware and
the database. For example one known case is when the vendor clusterware is set to have
its heartbeat go over a different network than RAC traffic. Continuing to use the voting
disks allows CSS to resolve situations which would otherwise end up in cluster hangs.
This support is for 10gR2 onwards and has the following limitations:
1. As in any extended RAC environments, the additional latency induced by distance will
affect I/O and cache fusion performance. This effect will vary by distance and the
customer is responsible for ensuring that the impact attained in their environment is
acceptable for their application.
2. OCR must be mirrored across both sites using Oracle provided mechanisms.
3. Voting Disk redundancy must exist across both sites, and at a 3rd site to act as an
arbiter. This third site may be connected via a WAN.
4. Storage at each site must be set up as separate failure groups and use ASM mirroring,
to ensure at least one copy of the data at each site.
5. The customer must have a separate and dedicated test cluster, also in an extended
configuration using the same software and hardware components (it can have fewer or
smaller nodes).
6. The customer must be aware that in 10gR2 ASM does not provide partial resilvering.
Should a loss of connectivity between the sites occur, one of the failure groups will be
marked invalid. When the site rejoins the cluster, the failure groups will need to be
manually dropped and added.
Standard NFS is only supported for the tie-breaking voting disk in an extended cluster
environment. See platform and mount option restrictions at:
http://www.oracle.com/technology/products/database/clustering/pdf/thirdvoteonnfs.pdf
Otherwise just as with database files, we only support voting files on certified NAS
devices, with the appropriate mount options. Pls refer to Metalink Note 359515.1 for a
full description of the required mount options. For a complete list of supported NAS
vendors refer to OTN at:
http://www.oracle.com/technology/deploy/availability/htdocs/vendors_nfs.html
No. When using SE RAC the nodes must be co-located in the same room. This is a
license restriction rather than a technical one.
Necessary Connections
Interconnect, SAN, and IP Networking need to be kept on separate channels, each with
required redundancy. Redundant connections must not share the same Dark Fiber (if
used), switch, path, or even building entrances. Keep in mind that cables can be cut.
The SAN and Interconnect connections need to be on dedicated point-to-point
connections. No WAN or Shared connection allowed. Traditional cables are limited to
about 10 km if you are to avoid using repeaters. Dark Fiber networks allow the
communication to occur without repeaters. Since latency is limited, Dark Fiber networks
allow for a greater distance in separation between the nodes. The disadvantage of Dark
Fiber networks is that they can cost hundreds of thousands of dollars, so generally they are
only an option if they already exist between the two sites.
If direct connections are used (for short distances) this is generally done by just stringing
long cables from a switch. If a DWDM or CWDM is used, then these are directly
connected via a dedicated switch on either side.
Note of caution: Do not run the RAC interconnect over a WAN. This is the same as
running it over the public network, which is not supported, and other uses of the network
(i.e. large FTPs) can cause performance degradation or even node evictions.
For SAN networks make sure you are using SAN buffer credits if the distance is over
10km.
At the moment in Oracle 10g, if Oracle Clusterware is being used, we also require that a
single subnet be setup for the public connections so we can fail over VIPs from one side
to another.
What is a stage?
CVU supports the notion of Stage verification. It identifies all the important stages in
RAC deployment and provides each stage with its own entry and exit criteria. The entry
criteria for a stage define a specific set of verification tasks to be performed before
initiating that stage. This pre-check saves the user from entering into a stage unless its
pre-requisite conditions are met. The exit criteria for a stage define another specific set of
verification tasks to be performed after completion of the stage. The post-check ensures
that the activities for that stage have been completed successfully. It identifies any stage-specific
problem before it propagates to subsequent stages, where it would be more
difficult to trace to its root cause. An example of a stage is "pre-check of database
installation", which checks whether the system meets the criteria for a RAC install.
What is a component?
CVU supports the notion of Component verification. The verifications in this category
are not associated with any specific stage. The user can verify the correctness of a
specific cluster component. A component can range from a basic one, like free disk space
to a complex one like the CRS stack. The integrity check for the CRS stack will
transparently span verification of multiple sub-components associated with the CRS
stack. This encapsulation of a set of tasks within a specific component verification
greatly simplifies things for the user.
What is nodelist?
Nodelist is a comma separated list of hostnames without domain. Cluvfy will ignore any
domain while processing the nodelist. If duplicate entities after removing the domain
exist, cluvfy will eliminate the duplicate names while processing. Wherever supported,
you can use '-n all' to check all the cluster nodes. See "Do I have to type the nodelist
every time for the CVU commands?" below for more information on nodelist shortcuts.
Do I have to be root to use CVU?
No. CVU is intended for database and system administrators. CVU assumes the current
user is the oracle user.
Please refer to the known issue/README files before filing a bug. If the issue is not
covered in those documents, file a bug against product# 5, component: OPSM and sub-
component: CLUVFY. Please provide the relevant log file while filing a bug.
CVU requires:
1. An area with at least 30MB for containing software bits on the invocation node.
2. A Java 1.4.1 location on the invocation node.
3. A work directory with at least 25MB on all the nodes. CVU will attempt to copy the
necessary bits as required to this location. Make sure the location exists on all nodes and
has write permission for the CVU user. This directory is set through the CV_DESTLOC
environment variable. If this variable does not exist, CVU will use "/tmp" as the work
directory.
4. On RedHat Linux 3.0, an optional package 'cvuqdisk' is required on all the nodes. This
assists CVU in finding SCSI disks and helps CVU to perform storage checks on disks.
Please refer to "How do I install 'cvuqdisk' package?" for details. Note that this package
should be installed only on the RedHat Linux 3.0 distribution.
How do I install 'cvuqdisk' package?
Here are the steps to install the cvuqdisk package:
1. Become the root user.
2. Copy the rpm (cvuqdisk-1.0.1-1.i386.rpm; the current version is 1.0.1) to a local
directory. You can find the rpm on Oracle's OTN site.
3. Set the environment variable to the group that should own this binary. Typically it is
the "dba" group: export CVUQDISK_GRP=dba
4. Erase any existing package: rpm -e cvuqdisk
5. Install the rpm: rpm -iv cvuqdisk-1.0.1-1.i386.rpm
How do I know about cluvfy commands? The usage text of cluvfy does not
show individual commands.
Cluvfy has context sensitive help built into it. Cluvfy shows the most appropriate usage
text based on the cluvfy command line arguments. If you type 'cluvfy' on the command
prompt, cluvfy displays the high level generic usage text, which talks about valid stage
and component syntax. If you type 'cluvfy comp -list', cluvfy will show valid components
with a brief description of each of them. If you type 'cluvfy comp -help', cluvfy will show
the detailed syntax for each of the valid components. Similarly, 'cluvfy stage -list' and
'cluvfy stage -help' will list valid stages and their syntax respectively. If you type an invalid
command, cluvfy will show the appropriate usage for that particular command. For
example, if you type 'cluvfy stage -pre dbinst', cluvfy will show the syntax for pre-check
of dbinst stage.
What are the default values for the command line arguments?
Here are the default values and behavior for different stage and component commands:
Do I have to type the nodelist every time for the CVU commands? Is there
any shortcut?
You do not have to type the nodelist every time for the CVU commands. Typing the
nodelist for a large cluster is painful and error prone. Here are a few shortcuts. To provide
all the nodes of the cluster, type '-n all'. Cluvfy will attempt to get the nodelist in the
following order: 1. If a vendor clusterware is available, it will pick all the configured
nodes from the vendor clusterware using the lsnodes utility. 2. If CRS is installed, it will
pick all the configured nodes from Oracle Clusterware using the olsnodes utility. 3. If
none of the above is available, it will look for the CV_NODE_ALL environment variable.
If this variable is not defined, it will complain. To provide a partial list (some of the nodes
of the cluster), you can set an environment variable and use it in the CVU command. For
example: 'setenv MYNODES node1,node3,node5' followed by 'cluvfy comp nodecon -n
$MYNODES'.
Cluvfy supports a verbose feature. By default, cluvfy runs in non-verbose mode and
reports only the summary of a test. To get detailed output of a check, use the flag
'-verbose' on the command line. This will produce detailed output of individual checks
and, where applicable, will show per-node results in a tabular fashion.
Use component verification commands like 'nodereach' or 'nodecon' for this purpose.
For the detailed syntax of these commands, type cluvfy comp -help on the command prompt.
If the 'cluvfy comp nodecon' command is invoked without -i, cluvfy will attempt to
discover all the available interfaces and the corresponding IP address and subnet. Then
cluvfy will try to verify the node connectivity per subnet. You can run this command in
verbose mode to find out the mappings between the interfaces, IP addresses and subnets.
You can check the connectivity among the nodes by specifying the interface name(s)
through -i argument.
Yes, you can use 'comp ssa' command to check the sharedness of the storage. Please refer
to the known issues section for the type of storage supported by cluvfy.
You can use the component command 'cfs' to check this. Provide the OCFS file system
you want to check through the -f argument. Note that the sharedness check for the file
system is supported for OCFS version 1.0.14 or higher.
Cluvfy provides commands to check a particular sub-component of the CRS stack as well
as the whole CRS stack. You can use the 'comp ocr' command to check the integrity of
OCR. Similarly, you can use 'comp crs' and 'comp clumgr' commands to check integrity
of the crs and clustermanager sub-components. To check the entire CRS stack, run the stage
command 'cluvfy stage -post crsinst'.
How do I check user accounts and administrative permissions related
issues?
Use the admprv component verification command. Refer to the usage text for detailed
instructions and the types of supported operations. To check whether the privileges are sufficient
for user equivalence, use '-o user_equiv' argument. Similarly, the '-o crs_inst' will verify
whether the user has the correct permissions for installing CRS. The '-o db_inst' will
check for permissions required for installing RAC and '-o db_config' will check for
permissions required for creating a RAC database or modifying a RAC database
configuration.
The component verification command sys is meant for that. To check the system
requirement for RAC, use '-p database' argument. To check the system requirement for
CRS, use '-p crs' argument.
You can use the peer comparison feature of cluvfy for this purpose. The command 'comp
peer' will list the values of different nodes for several pre-selected properties. You can
use the peer command with -refnode argument to compare those properties of other nodes
against the reference node.
Why does the peer comparison with -refnode say passed when the group or
user does not exist?
Peer comparison with the -refnode feature acts like a baseline feature. It compares the
system properties of other nodes against the reference node. If a value does not
match (is not equal to the reference node value), then it flags that as a deviation from the
reference node. If a group or user exists on neither the reference node nor the other
node, it will report this as 'matched', since there is no deviation from the reference node.
Similarly, it will report 'mismatched' for a node with higher total memory than the
reference node, for the same reason.
Yes. You can use the post-check command for cluster services setup (-post clusvc) to
verify CRS status. A more appropriate test would be to use the pre-check command for
database installation (-pre dbinst). This will check whether the current state of the system
is suitable for RAC install.
At what point is cluvfy usable? Can I use cluvfy before installing Oracle
Clusterware?
You can run cluvfy at any time, even before CRS installation. In fact, cluvfy is designed
to assist the user as soon as the hardware and OS are up. If you invoke a command which
requires CRS or RAC on the local node, cluvfy will report an error if those required
products are not yet installed.
Set the environmental variable SRVM_TRACE to true. For example, in tcsh "setenv
SRVM_TRACE true" will turn on tracing.
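For bash or ksh users, the equivalent sketch is (the cluvfy invocation is just an example command to trace):

```shell
# Turn on CVU tracing (bash/ksh syntax), then run any cluvfy command:
export SRVM_TRACE=true
cluvfy comp nodecon -n all -verbose
```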
CVU log files can be found under the $CV_HOME/cv/log directory. The log files are
automatically rotated and the latest log file has the name cvutrace.log.0. It is a good idea
to clean up unwanted log files or archive them to reclaim disk space. Note that no trace
files will be generated if tracing has not been turned on.
Why does cluvfy report "unknown" on a particular node?
Cluvfy reports "unknown" when it cannot conclude for sure whether the check passed or
failed. A common cause of this type of reporting is a non-existent location set for the
CV_DESTLOC variable. Please make sure the directory this variable points to exists
on all nodes and is writable by the user.
1. Shared storage accessibility (ssa) check: The current release of cluvfy has the
following limitations on Linux regarding the shared storage accessibility check.
a. Currently NAS storage (r/w, no attribute caching), OCFS (version 1.0.14 or higher)
and SCSI disks (if the cvuqdisk package is installed) are supported. Note that the
'cvuqdisk' package should be installed only on the RedHat Linux 3.0 distribution.
Discovery of SCSI disks for RedHat Linux 2.1 is not supported.
b. For the sharedness check on NAS, cluvfy requires the user to have write permission
on the specified path. If the cluvfy user does not have write permission, cluvfy reports
the path as not shared.
2. What database version is supported by CVU? The current CVU release supports only
10g RAC and CRS and is not backward compatible. In other words, CVU cannot check
or verify pre-10g products.
3. What Linux distributions are supported? This release supports only the RedHat 3.0
Update 2 and RedHat 2.1AS distributions. Note that the CVU distributions for RedHat
3.0 Update 2 and RedHat 2.1AS are different; they are not binary compatible. In other
words, the CVU bits for RedHat 3.0 and RedHat 2.1 are not the same.
4. The component check for node applications (cluvfy comp nodeapp ...) reports a node
app creation error if the local CRS stack is down. This is a known issue and will be
addressed shortly.
5. CVU does not recognize disk bindings (e.g. /dev/raw/raw1) as valid storage paths or
identifiers. Please use the underlying disk (e.g. /dev/sdm etc.) for the storage path or
identifier.
6. The current version of CVU for RedHat 2.1 complains about the missing cvuqdisk
package. This will be corrected in a future release. Users should ignore this error. Note
that the 'cvuqdisk' package should be installed only on the RedHat Linux 3.0
distribution. Discovery of SCSI disks for RedHat Linux 2.1 is not supported.
CVU requires root privilege to gather information about the SCSI disks during discovery.
A small binary uses the setuid mechanism to query disk information as root. Note that
this process is purely read-only, with no adverse impact on the system. To keep this
secure, the binary is packaged in the cvuqdisk rpm and needs root privilege to be
installed on a machine. If this package is installed on all the nodes, CVU will be able to
perform discovery and shared storage accessibility checks for SCSI disks. Otherwise, it
complains about the missing package 'cvuqdisk'. Note that this package should be
installed only on the RedHat Linux 3.0 distribution. Discovery of SCSI disks for RedHat
Linux 2.1 is not supported.