This article provides step-by-step instructions for installing Oracle Database 10g with Real Application Cluster (RAC) on Red Hat
Enterprise Linux Advanced Server 3. The primary objective of this article is to
demonstrate a quick installation of Oracle 10g with RAC on RH AS 3. This article covers
Oracle Cluster File System (OCFS), Oracle's Automatic Storage Management (ASM),
and FireWire-based Shared Storage. Note that OCFS is not required for 10g RAC. In fact,
I never use OCFS for RAC systems. However, this article covers OCFS since some
people want to know how to configure and use OCFS.
If you have never installed Oracle10g on Linux before, then I'd recommend that you first
try to install an Oracle Database 10g on Linux by following my other guide Installing
Oracle Database 10g on Red Hat Linux.
I welcome emails from any readers with comments, suggestions, or corrections. You can
find my email address at the bottom of this website.
Introduction
* General
* Important Notes
* Oracle 10g RAC Setup
* Shared Disk Storage
General
FireWire-based Shared Storage for Linux
* General
* Automating Authentication for oracle ssh Logins
* Checking OCFS and Oracle Environment Variables
Checking OCFS
Checking Oracle Environment Variables
* Installing Oracle 10g Cluster Ready Services (CRS) R1 (10.1.0.2)
* General
* Automating Authentication for oracle ssh Logins
* Checking Oracle Environment Variables
* Installing Oracle Database 10g Software R1 (10.1.0.2) with Real Application
Clusters (RAC)
Post-Installation Steps
Introduction
General
Oracle Real Application Cluster (RAC) is a cluster system at the application level. It
uses a shared disk architecture that provides scalability for all kinds of applications.
Applications can use a RAC database without any modifications.
Since the requests in a RAC cluster are spread evenly across the RAC instances, and
since all instances access the same shared storage, servers can be added without any
architectural changes. And the failure of a single RAC node results only in the loss of
scalability, not in the loss of data, since a single database image is utilized.
Important Notes
There are a few important notes that might be useful to know before installing Oracle 10g
RAC:
(*) If you want to install Oracle 10g with RAC using FireWire-based shared storage, make
sure to read FireWire-based Shared Storage for Linux first!
(*) See also Oracle 10g RAC Issues, Problems and Errors
For this documentation I used Oracle Cluster File System (OCFS) for Oracle's Cluster
Ready Services (CRS) since some people want to know how to configure and use OCFS.
However, OCFS is not required for 10g RAC. In fact, I never use OCFS for RAC
systems. CRS requires two files, the "Oracle Cluster Registry (OCR)" file and the "CRS
Voting Disk" file, which must be shared across all RAC nodes. You can also use raw
devices for these files. Note, however, that you cannot use ASM for the CRS files: these
files must be available before any RAC instance can start, and for ASM to become
available, the ASM instance needs to run first.
For Oracle's data files, control files, etc. I used Oracle's Automatic Storage Management
(ASM).
A requirement for Oracle Database 10g RAC cluster is a set of servers with shared disk
access and interconnect connectivity. Since each instance in a RAC system must have
access to the same database files, a shared storage is required that can be accessed from
all RAC nodes concurrently.
The shared storage space can be used as raw devices, or by using a cluster file system or
ASM. This article will address Oracle's Cluster File System (OCFS) and ASM. Note that
Oracle 10g RAC provides its own locking mechanisms and therefore does not rely on
other cluster software or on the operating system for handling locks.
Shared storage can be expensive. If you just want to check out the features of Oracle10g
RAC without spending too much money, I'd recommend buying an external FireWire-
based shared storage device for Oracle10g RAC.
NOTE: You can download a kernel from Oracle for FireWire-based shared storage for
Oracle10g RAC, but Oracle does not provide support if you have problems. It is intended
for testing and demonstration only! See Setting Up Linux with FireWire-based Shared
Storage for Oracle Database 10g RAC for more information.
NOTE: It is very important to get an external FireWire drive that allows concurrent
access for more than one server! Otherwise the disk(s) and partitions can only be seen by
one server at a time. Therefore, make sure the FireWire drive(s) have a chipset that
supports concurrent access for at least two servers or more. If you already have a
FireWire drive, you can check the maximum supported logins (concurrent access) by
following the steps at Configuring FireWire-based Shared Storage.
For test purposes I used external 250 GB and 200 GB Maxtor hard drives which support a
maximum of 3 concurrent logins. The technical specifications for these FireWire drives
are:
- Vendor: Maxtor
- Model: OneTouch
- Mfg. Part No. or KIT No.: A01A200 or A01A250
- Capacity: 200 GB or 250 GB
- Cache Buffer: 8 MB
- Spin Rate: 7200 RPM
- "Combo" Interface: IEEE 1394 and SPB-2 compliant (100 to 400
Mbits/sec) plus USB 2.0 and USB 1.1 compatible
The FireWire adapters I'm using are StarTech 4 Port IEEE-1394 PCI Firewire Cards.
Don't forget that you will also need a FireWire hub if you want to connect more than 2
RAC nodes to the FireWire drive(s).
The following steps need to be performed on all nodes of the RAC cluster unless stated
otherwise!
You cannot download the Red Hat Enterprise Linux Advanced Server binaries; you can
only download the source code. If you want to get the binary CDs, you can buy licenses
at http://www.redhat.com/software/rhel/.
You don't have to install all RPMs to run an Oracle Database 10g with RAC on Red Hat
Enterprise Linux Advanced Server. It is sufficient to select the Installation Type
"Advanced Server" without selecting the Package Group "Software Development". Only a
few other RPMs are required for installing Oracle 10g RAC, and these are covered in
this article.
It is recommended to use newer Red Hat Enterprise Linux kernels since newer kernels
might fix known database performance problems and other issues. Unless you are using
FireWire-based shared drives (see below), I recommend downloading the latest
RHEL AS 3 kernel from the Red Hat Network and using Upgrading the Linux Kernel as a
guide for upgrading the kernel. However, you also need to make sure that the OCFS and
ASM drivers are compatible with the kernel version!
You can download a kernel from Oracle for FireWire-Based Shared Storage for Oracle
Database 10g RAC, but Oracle does not support it. It is intended for testing and
demonstration only! See Setting Up Linux with FireWire-based Shared Storage for
Oracle10g RAC for more information.
There are two experimental kernels for FireWire shared drives, one for UP machines and
one for SMP machines. To install the kernel for a single CPU machine, run the following
command:
su - root
rpm -ivh kernel-2.4.21-15.ELorafw1.i686.rpm
Note that the above command installs the new kernel alongside your existing kernel
instead of upgrading it. This is the preferred method since you always want the option
to go back to the old kernel if the new kernel causes problems or doesn't come up.
To make sure that the right kernel is booted, check the /etc/grub.conf file if you use
GRUB, and change the "default" attribute if necessary. Here is an example:
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Red Hat Enterprise Linux AS (2.4.21-15.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-15.ELorafw1 ro root=LABEL=/
initrd /initrd-2.4.21-15.ELorafw1.img
title Red Hat Enterprise Linux AS (2.4.21-4.EL)
root (hd0,0)
kernel /vmlinuz-2.4.21-4.EL ro root=LABEL=/
initrd /initrd-2.4.21-4.EL.img
In this example, the "default" attribute is set to "0", which means that the
experimental FireWire kernel 2.4.21-15.ELorafw1 will be booted. If the "default"
attribute were set to "1", the 2.4.21-4.EL kernel would be booted.
Each RAC node should have at least one static IP address for the public network and one
static IP address for the private cluster interconnect.
The private networks are critical components of a RAC cluster. The private networks
should only be used by Oracle to carry Cluster Manager and Cache Fusion inter-node
traffic. A RAC database does not require a separate private network, but using the
public network can degrade database performance (high latency, low bandwidth).
Therefore the private network should use high-speed NICs (preferably one gigabit or
more) and it should only be used by Oracle.
You might want to manage the network addresses using the /etc/hosts file. This avoids
the problem of making DNS, NIS, etc. a single point of failure for the database cluster.
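For example, the /etc/hosts file for the cluster used in this article might look like
the following on every node. The IP addresses and the "vip" names here are examples
and must be adapted to your environment; the VIP addresses are configured by VIPCA
later:
127.0.0.1       localhost.localdomain localhost
# Public network
192.168.1.1     rac1pub
192.168.1.2     rac2pub
192.168.1.3     rac3pub
# Public virtual IPs (configured by VIPCA)
192.168.1.101   rac1vip
192.168.1.102   rac2vip
192.168.1.103   rac3vip
# Private cluster interconnect
192.168.2.1     rac1prv
192.168.2.2     rac2prv
192.168.2.3     rac3prv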
Make sure that no firewall is running, or at least that it does not interfere with RAC
traffic.
The public virtual IP addresses are configured automatically by Oracle when you run
OUI, which starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA),
see Installing Oracle Database 10g Software R1 (10.1.0.2) with Real Application
Clusters (RAC).
NOTE:
Make sure that the name of the RAC node is not listed for the loopback address in the
/etc/hosts file similar to this example:
127.0.0.1 rac1pub localhost.localdomain localhost
The entry should look like this:
127.0.0.1 localhost.localdomain localhost
If the RAC node is listed for the loopback address, you might later get the following
errors:
ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation
For more information, see Oracle 10g RAC Issues, Problems and Errors.
To configure the network interfaces (in this example eth0 and eth1), run the following
command on each node.
su - root
redhat-config-network
NOTE: You do not have to configure the network alias names for the public VIP. This
will be done by Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA).
NOTE: When the network configuration is done, it is important to make sure that the
name of the public RAC nodes is displayed when you execute the following command:
$ hostname
rac1pub
You can verify the new configured NICs by running the command:
/sbin/ifconfig
For instructions on how to setup a shared storage device on Red Hat Advanced Server,
see the installation instructions of the manufacturer.
First make sure that the experimental kernel for FireWire drives was installed and that the
server was rebooted (see Upgrading the Linux Kernel for FireWire Shared Disks Only):
# uname -r
2.4.21-15.ELorafw1
To load the kernel modules/drivers for the FireWire drive(s), add the following entries
to the /etc/modules.conf file:
alias ieee1394-controller ohci1394
post-install ohci1394 modprobe sd_mod
The alias directive ieee1394-controller is used by Red Hat during the boot process.
When you check the /etc/rc.d/rc.sysinit file, which is invoked by /etc/inittab
during the boot process, you will find the following code that searches for the ieee1394-
controller stanza in /etc/modules.conf:
if ! strstr "$cmdline" nofirewire ; then
    aliases=`/sbin/modprobe -c | awk '/^alias ieee1394-controller/ { print $3 }'`
    if [ -n "$aliases" -a "$aliases" != "off" ]; then
        for alias in $aliases ; do
            [ "$alias" = "off" ] && continue
            action $"Initializing firewire controller ($alias): " modprobe $alias
        done
        LC_ALL=C grep -q "SBP2" /proc/bus/ieee1394/devices 2>/dev/null && \
            modprobe sbp2 >/dev/null 2>&1
    fi
fi
This means that all the kernel modules for the FireWire drive(s) will be loaded
automatically during the next reboot and your drive(s) should be ready for use.
To load the modules or the firewire stack right away without rebooting the server, execute
the following commands:
su - root
modprobe ieee1394-controller; modprobe sd_mod
If everything worked fine, the following modules should be loaded:
su - root
# lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
sbp2 19724 0
ohci1394 28008 0 (unused)
ieee1394 62884 0 [sbp2 ohci1394]
sd_mod 13424 0
scsi_mod 104616 5 [sbp2 sd_mod sg sr_mod ide-scsi]
#
And when you run dmesg, you should see entries similar to this example:
# dmesg
...
ohci1394_0: OHCI-1394 1.0 (PCI): IRQ=[11] MMIO=[f2000000-f20007ff]
Max Packet=[2048]
ieee1394: Device added: Node[00:1023] GUID[0010b9f70089de1c] [Maxtor]
scsi1 : SCSI emulation for IEEE-1394 SBP-2 Devices
blk: queue cf172e14, I/O limit 4095Mb (mask 0xffffffff)
ieee1394: ConfigROM quadlet transaction error for node 01:1023
ieee1394: Host added: Node[02:1023] GUID[00110600000032a0] [Linux
OHCI-1394]
ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 3
ieee1394: sbp2: Number of active logins: 0
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node[00:1023]: Max speed [S400] - Max payload [2048]
Vendor: Maxtor Model: OneTouch Rev: 0200
Type: Direct-Access ANSI SCSI revision: 06
blk: queue cd0fb014, I/O limit 4095Mb (mask 0xffffffff)
Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
SCSI device sda: 398295040 512-byte hdwr sectors (203927 MB)
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12
sda13 >
In this example, the kernel reported that the FireWire drive can be shared concurrently by
3 servers (see "Maximum concurrent logins supported:"). It is very important that you
have a drive with a chipset and firmware that supports concurrent access for the nodes.
The "Number of active logins:" shows how many servers are already sharing/using the
drive before this server added the drive to its system.
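To display these two lines again later without scrolling through the whole boot log,
you can simply grep the kernel messages:
su - root
dmesg | grep -i "logins"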
If everything worked fine, you should be able to see now your FireWire drive(s):
su - root
# fdisk -l
If everything worked fine without any errors or problems, I would recommend to reboot
all RAC nodes to verify that all FireWire drive(s) are automatically added to the system
during the next boot process:
su - root
reboot
And after the reboot, execute the fdisk command again to verify that the FireWire
drive(s) were added to the system:
su - root
# fdisk -l
PROBLEMS:
Note that if you have a USB device attached, the system might not be able to recognize
your FireWire drive!
If the ieee1394 module was not loaded, then your FireWire adapter might not be
supported. I'm using the StarTech 4 Port IEEE-1394 PCI Firewire Card which works fine:
# lspci
...
00:14.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host
Controller (rev 46)
...
For more information on the "oinstall" group account, see When to use "OINSTALL"
group during install of oracle.
Note: When you set the Oracle environment variables for the RAC nodes, make sure to
assign each RAC node a unique Oracle SID! In my test setup, the database name is "orcl"
and the Oracle SIDs are "orcl1" for RAC node one, "orcl2" for RAC node two, and so on.
If you use bash which is the default shell on Red Hat Linux (to verify your shell run:
echo $SHELL), execute the following commands:
# Oracle Environment
export ORACLE_BASE=/u01/app/oracle
export ORACLE_SID=orcl1    # Each RAC node must have a unique Oracle SID! E.g. orcl1, orcl2,...
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
NOTE: If ORACLE_BASE is used, then Oracle recommends that you don't set the
ORACLE_HOME environment variable but that you choose the default path suggested by the
OUI. You can set and use ORACLE_HOME after you finished installing the Oracle Database
10g Software with RAC, see Installing Oracle Database 10g Software R1 (10.1.0.2) with
Real Application Clusters (RAC).
The environment variables ORACLE_HOME and TNS_ADMIN should not be set. If you
already set these environment variables, you can unset them by executing the following
commands:
unset ORACLE_HOME
unset TNS_ADMIN
To have these environment variables set automatically each time you log in as oracle,
you can add them to the ~oracle/.bash_profile file, the user startup file for the Bash
shell on Red Hat Linux. To do this you could simply copy/paste the following commands.
Note the quoted "EOF": it ensures that $ORACLE_HOME is written to the file literally
and expanded at login time, rather than (while still unset) at the time the file is
created:
su - oracle
cat >> ~oracle/.bash_profile << "EOF"
export ORACLE_BASE=/u01/app/oracle
export ORACLE_SID=orcl1    # Each RAC node must have a unique Oracle SID!
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
EOF
Creating Oracle Directories
At the time of this writing, OCFS only supports Oracle Datafiles and a few other files.
Therefore OCFS should not be used for Shared Oracle Home installs. See Installing and
Configuring Oracle Cluster File Systems (OCFS) for more information.
For Oracle10g you only need to create the directory for $ORACLE_BASE:
su - root
mkdir -p /u01/app/oracle
chown -R oracle.oinstall /u01
But if you want to comply with Oracle's Optimal Flexible Architecture (OFA), then you
don't want to place the database files in the /u01 directory but in another directory like
/u02. This is not a requirement but if you want to comply with OFA, then you might
want to create the following directories as well:
su - root
mkdir -p /u02/oradata/orcl
chown -R oracle.oinstall /u02
Here I would recommend taking a quick look at Oracle's new Optimal Flexible
Architecture (OFA).
NOTE: In my example I will not place the database files into the OCFS directory
/u02/oradata/orcl since I will use Automatic Storage Management (ASM). However,
I will use /u02/oradata/orcl for the cluster manager files, see Installing Cluster Ready
Services (CRS).
General
Note that it is important for the Redo Log files to be on the shared disks as well.
After you finished creating the partitions, inform the kernel of the partition table
changes:
su - root
partprobe
If you use OCFS for database files and other Oracle files, you can create several
partitions on your shared storage for the OCFS filesystems. If you use a FireWire disk,
you could create one large partition on the disk which should make things easier.
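As a minimal sketch, creating one large partition with fdisk might look like this,
assuming the FireWire disk shows up as /dev/sda (double-check the device name with
"fdisk -l" before writing a partition table!):
su - root
fdisk /dev/sda
# n - create a new partition
# p - make it a primary partition, number 1
#     accept the defaults for the first and last cylinder
# w - write the partition table and exit
partprobe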
For more information on how to install OCFS and how to mount OCFS filesystems on
partitions, see Installing and Configuring Oracle Cluster File Systems (OCFS).
If you want to use raw devices, see Creating Partitions for Raw Devices for more
information. This article does not cover raw devices.
The Oracle Cluster File System (OCFS) was developed by Oracle to overcome the limits
of Raw Devices and Partitions. It also eases administration of database files because it
looks and feels just like a regular file system.
At the time of this writing, OCFS only supports Oracle Datafiles and a few other files:
- Redo Log files
- Archive log files
- Control files
- Database datafiles
- Shared quorum disk file for the cluster manager
- Shared init file (srv)
Oracle says that they will support Shared Oracle Home installs in the future. So don't
install the Oracle software on OCFS yet. See Oracle Cluster File System for more
information. In this article I'm creating a separate, individual ORACLE_HOME directory
on local server storage for each and every RAC node.
NOTE:
If files on the OCFS file system need to be moved, copied, tar'd, etc., or if directories
need to be created on OCFS, then the standard file system commands mv, cp, tar,...
that come with the OS should not be used. These OS commands can have a major OS
performance impact if they are being used on the OCFS file system. Therefore, Oracle's
patched file system commands should be used instead.
It is also important to note that some third-party backup tools make use of standard OS
commands like tar.
Installing OCFS
NOTE: In my example I will use OCFS only for the cluster manager files since I will use
ASM for datafiles.
To find out which OCFS driver you need for your server, run:
$ uname -a
Linux rac1pub 2.4.21-9.ELsmp #1 Thu Jan 8 17:24:12 EST 2004 i686 i686
i386 GNU/Linux
To install the OCFS RPMs for SMP kernels (including FireWire SMP kernels), execute:
su - root
rpm -Uvh ocfs-2.4.21-EL-smp-1.0.12-1.i686.rpm \
ocfs-tools-1.0.10-1.i386.rpm \
ocfs-support-1.0.10-1.i386.rpm
To install the OCFS RPMs for uniprocessor kernels (including FireWire UP kernels),
execute:
su - root
rpm -Uvh ocfs-2.4.21-EL-1.0.12-1.i686.rpm \
ocfs-tools-1.0.10-1.i386.rpm \
ocfs-support-1.0.10-1.i386.rpm
To generate the /etc/ocfs.conf file, you can run the ocfstool tool:
su - root
ocfstool
- Select "Task" - Select "Generate Config"
- Select the interconnect interface (private network interface)
In my example for rac1pub I selected: eth1, rac1prv
- Confirm the values displayed and exit
The generated /etc/ocfs.conf file will appear similar to the following example:
$ cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
node_name = rac1prv
ip_address = 192.168.2.1
ip_port = 7000
comm_voting = 1
guid = 84D43BC8FB7A2C1B88C3000D8821CC2C
The guid entry is the unique group user ID. This ID has to be unique for each node. You
can create the above file without the ocfstool tool by editing the /etc/ocfs.conf file
manually and by running ocfs_uid_gen -c to assign/update the guid value in this file.
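For example, to set up the file by hand on the second RAC node (the values here follow
the naming scheme used in this article), you could do the following:
su - root
# Create/edit /etc/ocfs.conf with the node-specific values, e.g.:
#   node_name = rac2prv
#   ip_address = 192.168.2.2
#   ip_port = 7000
#   comm_voting = 1
ocfs_uid_gen -c    # assigns/updates the guid entry in /etc/ocfs.conf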
To load the ocfs.o kernel module, execute:
su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1795
guid=84D43BC8FB7A2C1B88C3000D8821CC2C comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
#
To verify that the ocfs module was loaded, execute:
# /sbin/lsmod |grep ocfs
ocfs 305920 0 (unused)
Note that the load_ocfs command does not have to be executed again once everything
has been set up for the OCFS filesystems, see Configuring the OCFS File Systems to
Mount Automatically at Startup.
If you run load_ocfs on a system with the experimental FireWire kernel, you might get
the following error message:
su - root
# load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1843
guid=AA12637FAABFB354371C000D8821CC2C comm_voting=1 ip_port=7000
insmod: ocfs: no module by that name found
load_ocfs: insmod failed
#
The ocfs.o module for the "FireWire kernel" can be found here:
su - root
# rpm -ql ocfs-2.4.21-EL-1.0.12-1
/lib/modules/2.4.21-EL-ABI/ocfs
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
#
So for the experimental kernel for FireWire drives, I manually created a link for the
ocfs.o module file:
su - root
mkdir /lib/modules/`uname -r`/kernel/drivers/addon/ocfs
ln -s `rpm -qa | grep ocfs-2 | xargs rpm -ql | grep "/ocfs.o$"` \
/lib/modules/`uname -r`/kernel/drivers/addon/ocfs/ocfs.o
Now you should be able to load the OCFS module using the "FireWire kernel", and the
output should look similar to this example:
su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1843
guid=AA12637FAABFB354371C000D8821CC2C comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
Warning: kernel-module version mismatch
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel
version 2.4.21-4.EL
while this kernel is version 2.4.21-15.ELorafw1
Warning: loading /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o will taint the
kernel: forced load
See http://www.tux.org/lkml/#export-tainted for information about
tainted modules
Module ocfs loaded, with warnings
#
I would not worry about the above warning.
However, if you get the following error, then you have to upgrade the modutils RPM:
su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac2prv ip_address=192.168.2.2 cs=1761
guid=1815F1C57530339EA00E000D8825B058 comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o: kernel-module version mismatch
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel
version 2.4.21-4.EL
while this kernel is version 2.4.21-15.ELorafw1.
#
To remedy the "loading" problem, download the latest modutils RPM and enter e.g.:
rpm -Uvh modutils-2.4.25-11.EL.i386.rpm
Note that the load_ocfs command does not have to be executed again once everything
has been set up for the OCFS filesystems, see Configuring the OCFS File Systems to
Mount Automatically at Startup.
Before you continue with the next steps, make sure you've created all needed partitions
on your shared storage.
Under Creating Oracle Directories I created the /u02/oradata/orcl mount directory for
the cluster manager files. In the following example I will create one OCFS filesystem and
mount it on /u02/oradata/orcl.
The following steps for creating the OCFS filesystem(s) should only be executed on one
RAC node!
Alternatively to formatting the volume with the ocfstool GUI, you can execute the
"mkfs.ocfs" command to create the OCFS filesystems:
su - root
# mkfs.ocfs -F -b 128 -L /u02/oradata/orcl -m /u02/oradata/orcl \
      -u `id -u oracle` -g `id -g oracle` -p 0775 <device_name>
Cleared volume header sectors
Cleared node config sectors
Cleared publish sectors
Cleared vote sectors
Cleared bitmap sectors
Cleared data block
Wrote volume header
#
For SCSI disks (including FireWire disks), <device_name> stands for devices like
/dev/sda, /dev/sdb, /dev/sdc, dev/sdd, etc. Be careful to use the right device name!
For this article I created an OCFS filesystem on /dev/sda1.
mkfs.ocfs options used above:
-F force the format
-b block size in KB (128 KB in this example)
-L volume label
-m mount point for the device
-u UID for the root directory (the oracle user)
-g GID for the root directory (the oinstall group)
-p permissions for the root directory
As I mentioned previously, for this article I created one large OCFS filesystem on
/dev/sda1. To mount the OCFS filesystem, I executed:
su - root
# mount -t ocfs /dev/sda1 /u02/oradata/orcl
or
# mount -t ocfs -L /u02/oradata/orcl /u02/oradata/orcl
Now run the ls command on all RAC nodes to check the ownership:
# ls -ld /u02/oradata/orcl
drwxrwxr-x 1 oracle oinstall 131072 Jul 4 23:25
/u02/oradata/orcl
#
NOTE: If the above ls command does not display the same ownership on all RAC nodes
(oracle:oinstall), then the "oracle" UID and the "oinstall" GID are not the same
across the RAC nodes, see Creating Oracle User Accounts for more information.
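To compare the IDs, you can run the id command on each node. The numeric IDs shown
here are only examples; what matters is that they are identical on all RAC nodes:
$ id oracle
uid=500(oracle) gid=500(oinstall) groups=500(oinstall)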
To ensure the OCFS filesystems are mounted automatically during reboots, the OCFS
mount points need to be added to the /etc/fstab file.
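For example, an /etc/fstab entry for the OCFS filesystem created above might look like
this (the _netdev mount option delays the mount until networking is up, which OCFS
needs for its node communication):
/dev/sda1    /u02/oradata/orcl    ocfs    _netdev    0 0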
To make sure the ocfs.o kernel module is loaded and the OCFS file systems are
mounted during the boot process, enter:
su - root
# chkconfig --list ocfs
ocfs 0:off 1:off 2:off 3:on 4:on 5:on 6:off
If the flags for runlevels 3, 4, and 5 are not set to "on" as shown above, run the
following command:
su - root
# chkconfig ocfs on
You can also start the "ocfs" service manually by running:
su - root
# service ocfs start
When you run this command it will not only load the ocfs.o kernel module but it will
also mount the OCFS filesystems as configured in /etc/fstab.
At this point you might want to reboot all RAC nodes to ensure that the OCFS
filesystems are mounted automatically after reboots:
su - root
reboot
For information about what Automatic Storage Management is, see Configuring and
Using Automatic Storage Management.
See also Installing Oracle ASMLib for Linux.
Installing ASM
To load the ASM driver oracleasm.o and to mount the ASM driver filesystem, enter:
su - root
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
NOTE: Creating ASM disks is done on one RAC node! The following commands should
only be executed on one RAC node!
I executed the following commands to create my ASM disks (make sure to change the
device names!). In this example I used partitions (/dev/sda2, /dev/sda3, /dev/sda5)
instead of whole disks (/dev/sda, /dev/sdb, /dev/sdc,...):
su - root
# /etc/init.d/oracleasm createdisk VOL1 /dev/<sd??>
Marking disk "/dev/sda2" as an ASM disk                    [  OK  ]
# /etc/init.d/oracleasm createdisk VOL2 /dev/<sd??>
Marking disk "/dev/sda3" as an ASM disk                    [  OK  ]
# /etc/init.d/oracleasm createdisk VOL3 /dev/<sd??>
Marking disk "/dev/sda5" as an ASM disk                    [  OK  ]
#
# Replace "sd??" with the name of your device. I used /dev/sda2, /dev/sda3, and /dev/sda5.
To list all ASM disks, enter:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
#
On all other RAC nodes, you just need to notify the system about the new ASM disks:
su - root
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]
#
The hangcheck-timer module uses two parameters, hangcheck_tick and hangcheck_margin,
which determine how long a RAC node must hang before the hangcheck-timer module will
reset the system. A node reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
For example, with hangcheck_tick=30 and hangcheck_margin=180 as configured below, a
node is reset if it hangs for more than 210 seconds.
To load the module with the right parameter settings, add the following line to the
/etc/modules.conf file:
# su - root
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modules.conf
Now you can run modprobe to load the module with the configured parameters in
/etc/modules.conf:
# su - root
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages |tail -2
Jul 5 00:46:09 rac1pub kernel: Hangcheck: starting hangcheck timer
0.8.0 (tick is 30 seconds, margin is 180 seconds).
Jul 5 00:46:09 rac1pub kernel: Hangcheck: Using TSC.
#
Note: To ensure the hangcheck-timer module is loaded after each reboot, add the
modprobe command to the /etc/rc.local file.
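For example:
su - root
echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local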
The following procedure shows how ssh can be configured so that no password is
requested for oracle ssh logins.
To create an authentication key for oracle, enter the following command on all RAC
nodes (the ~/.ssh directory will be created automatically if it doesn't exist yet):
su - oracle
$ ssh-keygen -t dsa -b 1024
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):  <press ENTER>
Created directory '/home/oracle/.ssh'.
Enter passphrase (empty for no passphrase):  <enter a passphrase>
Enter same passphrase again:  <enter the passphrase again>
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
e0:71:b1:5b:31:b8:46:d3:a9:ae:df:6a:70:98:26:82 oracle@rac1pub
Copy the public key for oracle from each RAC node to all other RAC nodes.
For example, run the following commands on all RAC nodes:
su - oracle
ssh rac1pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh rac2pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh rac3pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Now verify that oracle on each RAC node can login to all other RAC nodes without a
password. Make sure that ssh only asks for the passphrase. Note, however, that the first
time you ssh to another server you will get a message stating that the authenticity of the
host cannot be established. Enter "yes" at the prompt to continue the connection.
For example, run the following commands on all RAC nodes to verify that no password
is requested:
su - oracle
ssh rac1pub hostname
ssh rac1prv hostname
ssh rac2pub hostname
ssh rac2prv hostname
ssh rac3pub hostname
ssh rac3prv hostname
And later, before runInstaller is launched, I will show how ssh can be configured so
that no passphrase has to be entered for oracle ssh logins.
Checking Packages (RPMs)
Some packages will be missing if you selected the Installation Type "Advanced
Server" during the Red Hat Advanced Server installation.
The default and maximum window size can be changed in the proc file system without
reboot:
su - root
sysctl -w net.core.rmem_default=262144  # default socket receive buffer size in bytes
sysctl -w net.core.wmem_default=262144  # default socket send buffer size in bytes
sysctl -w net.core.rmem_max=262144      # maximum socket receive buffer size (SO_RCVBUF)
sysctl -w net.core.wmem_max=262144      # maximum socket send buffer size (SO_SNDBUF)
To make the change permanent, add the following lines to the /etc/sysctl.conf file,
which is used during the boot process:
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
Setting Semaphores
It is recommended to follow the steps as outlined in Setting Semaphores.
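As a sketch, the typical kernel parameter values from Oracle's 10g installation guides
look like the following in /etc/sysctl.conf. Treat these as example values, not as a
replacement for the settings outlined in Setting Semaphores:
# semaphores: semmsl semmns semopm semmni
kernel.sem = 250 32000 100 128
# shared memory settings commonly recommended alongside the semaphores
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.shmall = 2097152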
Once CRS is running on all RAC nodes, OUI will automatically recognize all nodes on
the cluster. This means that you can run OUI on one RAC node to install the Oracle
software on all other RAC nodes.
Note that Automatic Storage Management (ASM) cannot be used for the "Oracle Cluster
Registry (OCR)" file or for the "CRS Voting Disk" file. These files must be accessible
before any Oracle instances are started. And for ASM to become available, the ASM
instance needs to run first.
In the following example I will use OCFS for the "Oracle Cluster Registry (OCR)" file
and for the "CRS Voting Disk" file. The Oracle Cluster Registry file has a size of about
100 MB, and the CRS Voting Disk file has a size of about 20 MB. These files must reside
on OCFS, on a shared raw device, or on any other clustered filesystem.
Open a new terminal for the RAC node where you will execute runInstaller and use
this terminal to login from your desktop using the following command:
$ ssh -X oracle@rac?pub
The "X11 forward" feature (-X option) of ssh will relink X to your local desktop. For
more information, see Installing Oracle10g on a Remote Linux Server.
Now configure ssh-agent to handle the authentication for the oracle account:
oracle$ ssh-agent $SHELL
oracle$ ssh-add
Enter passphrase for /home/oracle/.ssh/id_dsa:  <enter your passphrase>
Identity added: /home/oracle/.ssh/id_dsa (/home/oracle/.ssh/id_dsa)
oracle$
Now make sure the oracle user can ssh into each RAC node. It is very important that
NO text is displayed and that you are not asked for a passphrase. Only the server name of
the remote RAC node should be displayed:
oracle$ ssh rac1pub hostname
rac1pub
oracle$ ssh rac1prv hostname
rac1pub
oracle$ ssh rac2pub hostname
rac2pub
oracle$ ssh rac2prv hostname
rac2pub
oracle$ ssh rac3pub hostname
rac3pub
oracle$ ssh rac3prv hostname
rac3pub
NOTE: Keep this terminal open since this is the terminal that will be used for running
runInstaller!
Make sure the OCFS filesystem(s) are mounted on all RAC nodes:
oracle$ ssh rac1pub df |grep oradata
/dev/sda1 51205216 33888 51171328 1%
/u02/oradata/orcl
oracle$ ssh rac2pub df |grep oradata
/dev/sda1 51205216 33888 51171328 1%
/u02/oradata/orcl
oracle$ ssh rac3pub df |grep oradata
/dev/sda1 51205216 33888 51171328 1%
/u02/oradata/orcl
Use the oracle terminal that you prepared for ssh at Automating Authentication for
oracle ssh Logins and execute runInstaller:
oracle$ /mnt/cdrom/runInstaller
One way to verify the CRS installation is to display all the nodes where CRS was
installed:
oracle$ /u01/app/oracle/product/10.1.0/crs_1/bin/olsnodes -n
rac1pub 1
rac2pub 2
rac3pub 3
Note that the Oracle Database 10g R1 (10.1) OUI will not be able to discover disks
that are marked as Linux ASMLib disks. Therefore it is recommended to complete the
software installation first and then to use dbca to create the database, see
http://otn.oracle.com/tech/linux/asmlib/install.html#10gr1 for more information.
To install the RAC Database software, insert the Oracle Database 10g R1 (10.1.0.2) CD
(downloaded image name: "ship.db.cpio.gz"), and mount it on e.g. rac1pub:
su - root
mount /mnt/cdrom
Use the oracle terminal that you prepared for ssh at Automating Authentication for
oracle ssh Logins, and execute runInstaller:
oracle$ /mnt/cdrom/runInstaller
- End of Installation:
Click Exit
The following steps should now be performed on all RAC nodes! It is very important that
these environment variables are set permanently for oracle on all RAC nodes!
To make sure $ORACLE_HOME and $PATH are set automatically each time oracle logs in,
add these environment variables to the ~oracle/.bash_profile file, which is the user
startup file for the Bash shell on Red Hat Linux. To do this you could simply copy/paste
the following commands to make these settings permanent for the oracle Bash shell
(the path might differ on your system!). The quoted "EOF" ensures the variables are
written literally and expanded at login time, where ORACLE_BASE is already set:
su - oracle
cat >> ~oracle/.bash_profile << "EOF"
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
EOF
Use the oracle terminal that you prepared for ssh at Automating Authentication for
oracle ssh Logins, and execute dbca. But before you execute dbca, make sure that
$ORACLE_HOME and $PATH are set:
oracle$ . ~oracle/.bash_profile
oracle$ dbca
Your RAC cluster should now be up and running. To verify, try to connect to each
instance from one of the RAC nodes:
$ sqlplus system@orcl1
$ sqlplus system@orcl2
$ sqlplus system@orcl3
After you have connected to an instance, enter the following SQL command to verify
your connection:
SQL> select instance_name from v$instance;
Post-Installation Steps
Transparent Application Failover (TAF) is controlled by processes external to the
Oracle 10g RAC cluster. This means that the failover types and methods can be unique
for each Oracle Net client. The re-connection happens automatically within the OCI
library, which means that you do not need to change the client application to use TAF.
Setup
To test TAF on the new installed RAC cluster, configure the tnsnames.ora file for TAF
on a non-RAC server where you have either the Oracle database software or the Oracle
client software installed.
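A minimal tnsnames.ora entry for TAF might look like the following. The hostnames and
the service name follow this article's examples; the RETRIES and DELAY values are
arbitrary:
ORCLTAF =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1pub)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac2pub)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac3pub)(PORT = 1521))
      (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5)
      )
    )
  )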
The following SQL statement can be used to check the session's failover type, failover
method, and whether a failover has occurred:
select instance_name, host_name,
NULL AS failover_type,
NULL AS failover_method,
NULL AS failed_over
FROM v$instance
UNION
SELECT NULL, NULL, failover_type, failover_method, failed_over
FROM v$session
WHERE username = 'SYSTEM';
SQL>
The above SQL statement shows that I'm connected to "rac1pub" for instance "orcl1".
In this case, execute shutdown abort on "rac1pub" for instance "orcl1":
SQL> shutdown abort
ORACLE instance shut down.
SQL>
The SQL statement shows that the session has failed over to instance "orcl2". Note that
this can take a few seconds.
Run the following command to see which data files are in which disk group:
SQL> select name from v$datafile
2 union
3 select name from v$controlfile
4 union
5 select member from v$logfile;
NAME
--------------------------------------------------------------------------------
+ORCL_DATA1/orcl/controlfile/current.260.3
+ORCL_DATA1/orcl/datafile/sysaux.257.1
+ORCL_DATA1/orcl/datafile/system.256.1
+ORCL_DATA1/orcl/datafile/undotbs1.258.1
+ORCL_DATA1/orcl/datafile/undotbs2.264.1
+ORCL_DATA1/orcl/datafile/users.259.1
+ORCL_DATA1/orcl/onlinelog/group_1.261.1
+ORCL_DATA1/orcl/onlinelog/group_2.262.1
+ORCL_DATA1/orcl/onlinelog/group_3.265.1
+ORCL_DATA1/orcl/onlinelog/group_4.266.1
10 rows selected.
SQL>
Run the following command to see which ASM disk(s) belong to the disk group
'ORCL_DATA1':
(ORCL_DATA1 was specified in Installing Oracle Database 10g with Real Application
Cluster)
SQL> select path from v$asm_disk where group_number in
2 (select group_number from v$asm_diskgroup where name =
'ORCL_DATA1');
PATH
--------------------------------------------------------------------------------
ORCL:VOL1
ORCL:VOL2
SQL>
Oracle 10g RAC Issues, Problems and Errors
This section describes other issues, problems, and errors pertaining to installing
Oracle 10g with RAC that have not been covered so far.
• /u01/app/oracle/product/10.1.0/crs_1/bin/crs_stat.bin: error
  while loading shared libraries: libstdc++-libc6.2-2.so.3: cannot
  open shared object file: No such file or directory
• PRKR-1061 : Failed to run remote command to get node
  configuration for node rac1pup
The first error (crs_stat.bin) can come up when you run root.sh. To fix this error,
install the compat-libstdc++ RPM and rerun root.sh:
rpm -ivh compat-libstdc++-7.3-2.96.122.i386.rpm
For the PRKR-1061 error, make sure that the name of the RAC node is not listed for
the loopback address in the /etc/hosts file similar to this example:
127.0.0.1 rac1pub localhost.localdomain localhost
The entry should rather look like this:
127.0.0.1 localhost.localdomain localhost
References
Oracle's Linux Center
Oracle Database 10g Documentation
Installing Oracle9i Real Application Cluster (RAC) on Red Hat Linux Advanced Server
2.1
Project Documentation: OCFS
Installing Oracle ASMLib