
DBA/Sysadmin: Linux

Build Your Own Oracle RAC 10g Cluster on Linux and FireWire
by Jeffrey Hunter

Learn how to set up and configure an Oracle RAC 10g development cluster for less than US$1,800.

Contents

1. Introduction
2. Oracle RAC 10g Overview
3. Shared-Storage Overview
4. FireWire Technology
5. Hardware & Costs
6. Install the Linux Operating System
7. Network Configuration
8. Obtain and Install a Proper Linux Kernel
9. Create "oracle" User and Directories
10. Creating Partitions on the Shared FireWire Storage Device
11. Configure the Linux Servers
12. Configure the hangcheck-timer Kernel Module
13. Configure RAC Nodes for Remote Access
14. All Startup Commands for Each RAC Node
15. Check RPM Packages for Oracle 10g
16. Install and Configure Oracle Cluster File System
17. Install and Configure Automatic Storage Management and Disks
18. Download Oracle RAC 10g Software
19. Install Oracle Cluster Ready Services Software
20. Install Oracle Database 10g Software
21. Create TNS Listener Process
22. Create the Oracle Cluster Database
23. Verify TNS Networking Files
24. Create/Altering Tablespaces
25. Verify the RAC Cluster/Database Configuration
26. Starting & Stopping the Cluster
27. Transparent Application Failover
28. Conclusion
29. Acknowledgements

Downloads for this guide:


White Box Enterprise Linux 3 or Red Hat Enterprise Linux 3
Oracle Cluster File System
Oracle Database 10g EE and Cluster Ready Services
Precompiled FireWire Kernel for WBEL/RHEL
ASMLib Drivers

1. Introduction

One of the most efficient ways to become familiar with Oracle Real Application Clusters (RAC) 10g technology
is to have access to an actual Oracle RAC 10g cluster. There's no better way to understand its benefits—
including fault tolerance, security, load balancing, and scalability—than to experience them directly.

Unfortunately, for many shops, the price of the hardware required for a typical production RAC configuration
makes this goal impossible. A small two-node cluster can cost from US$10,000 to well over US$20,000. That
cost would not even include the heart of a production RAC environment—typically a storage area network—
which starts at US$8,000.
For those who want to become familiar with Oracle RAC 10g without a major cash outlay, this guide shows how
to configure a low-cost Oracle RAC 10g system using commercial off-the-shelf components and downloadable
software, at an estimated cost of US$1,200 to US$1,800. The system comprises a dual-node cluster (each node
with a single processor) running Linux (White Box Enterprise Linux 3.0 Respin 1 or Red Hat Enterprise Linux 3)
with shared disk storage based on IEEE1394 (FireWire) drive technology. (Of course, you could also consider
building a virtual cluster on a VMware virtual machine, but the experience won't quite be the same!)

This guide does not work (yet) for the latest Red Hat Enterprise Linux 4 release (Linux kernel 2.6). Although
Oracle's Linux Development Team provides a stable (patched) precompiled 2.6-compatible kernel available for
use with FireWire, a stable release of OCFS version 2—which is required for the 2.6 kernel—is not yet
available. When that release becomes available, I will update this guide to support RHEL4.

Please note that this is not the only way to build a low-cost Oracle RAC 10g system. I have seen other
solutions that use SCSI rather than FireWire for shared storage. In most cases, SCSI will cost more than our
FireWire solution: a typical SCSI card is priced around US$70, and an 80GB external SCSI drive will cost
US$700-US$1,000. Keep in mind that some motherboards may already include built-in SCSI controllers.

It is important to note that this configuration should never be run in a production environment and that it is
not supported by Oracle or any other vendor. In a production environment, Fibre Channel—the high-speed
serial-transfer interface that can connect systems and storage devices in either point-to-point or switched
topologies—is the technology of choice. FireWire offers a low-cost alternative to Fibre Channel for testing and
development, but it is not ready for production.

Although in past experience I have used raw partitions for storing files on shared storage, here we will make
use of the Oracle Cluster File System (OCFS) and Oracle Automatic Storage Management (ASM). The two
Linux servers will be configured as follows:

Oracle Database Files


RAC Node Name  Instance Name  Database Name  $ORACLE_BASE     File System
linux1         orcl1          orcl           /u01/app/oracle  ASM
linux2         orcl2          orcl           /u01/app/oracle  ASM

Oracle CRS Shared Files


File Type                File Name                  Partition  Mount Point        File System
Oracle Cluster Registry  /u02/oradata/orcl/OCRFile  /dev/sda1  /u02/oradata/orcl  OCFS
CRS Voting Disk          /u02/oradata/orcl/CSSFile  /dev/sda1  /u02/oradata/orcl  OCFS

The Oracle Cluster Ready Services (CRS) software will be installed to /u01/app/oracle/product/10.1.0/crs_1 on
each of the nodes that make up the RAC cluster. However, the CRS software requires that two of its files, the
Oracle Cluster Registry (OCR) file and the CRS Voting Disk file, be shared with all nodes in the cluster. These
two files will be installed on the shared storage using OCFS. It is also possible (but not recommended by
Oracle) to use raw devices for these files.

The Oracle Database 10g software will be installed into a separate Oracle Home; namely
/u01/app/oracle/product/10.1.0/db_1. All of the Oracle physical database files (data, online redo logs, control
files, archived redo logs) will be installed to different partitions of the shared drive being managed by ASM.
(The Oracle database files can just as easily be stored on OCFS. Using ASM, however, makes the article that
much more interesting!)

Note: For the previously published Oracle9i RAC version of this guide, click here.

2. Oracle RAC 10g Overview

Oracle RAC, introduced with Oracle9i, is the successor to Oracle Parallel Server (OPS). RAC allows multiple
instances to access the same database (storage) simultaneously. It provides fault tolerance, load balancing,
and performance benefits by allowing the system to scale out, and at the same time—because all nodes
access the same database—the failure of one instance will not cause the loss of access to the database.

At the heart of Oracle RAC is a shared disk subsystem. All nodes in the cluster must be able to access all of
the data, redo log files, control files and parameter files for all nodes in the cluster. The data disks must be
globally available to allow all nodes to access the database. Each node has its own redo log and control files
but the other nodes must be able to access them in order to recover that node in the event of a system failure.

One of the bigger differences between Oracle RAC and OPS is the presence of Cache Fusion technology. In
OPS, a request for data between nodes required the data to be written to disk first, and then the requesting
node could read that data. In RAC, data is passed along with locks.

Not all clustering solutions use shared storage. Some vendors use an approach known as a federated cluster,
in which data is spread across several machines rather than shared by all. With Oracle RAC 10g, however,
multiple nodes use the same set of disks for storing data. With Oracle RAC, the data files, redo log files,
control files, and archived log files reside on shared storage on raw-disk devices, a NAS, a SAN, ASM, or on a
clustered file system. Oracle's approach to clustering leverages the collective processing power of all the
nodes in the cluster and at the same time provides failover security.

For more background about Oracle RAC, visit the Oracle RAC Product Center on OTN.

3. Shared-Storage Overview

Fibre Channel is one of the most popular solutions for shared storage. As I mentioned previously, Fibre
Channel is a high-speed serial-transfer interface used to connect systems and storage devices in either point-
to-point or switched topologies. Protocols supported by Fibre Channel include SCSI and IP.

Fibre Channel configurations can support as many as 127 nodes and have a throughput of up to 2.12 gigabits
per second. Fibre Channel, however, is very expensive; the switch alone can cost as much as US$1,000 and
high-end drives can reach prices of US$300. Overall, a typical Fibre Channel setup (including cards for the
servers) costs roughly US$5,000.

A less expensive alternative to Fibre Channel is SCSI. SCSI technology provides acceptable performance for
shared storage, but for administrators and developers who are used to GPL-based Linux prices, even SCSI
can come in over budget at around US$1,000 to US$2,000 for a two-node cluster.

Another popular solution is the Sun NFS (Network File System) found on a NAS. It can be used for shared
storage but only if you are using a network appliance or something similar. Specifically, you need servers that
guarantee direct I/O over NFS, TCP as the transport protocol, and read/write block sizes of 32K.
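To make those requirements concrete, here is a hedged sketch of the mount options such an NFS configuration would typically use. The server name (nas1) and export path are hypothetical, and your NAS vendor's documentation should take precedence over these options:

```shell
# Hypothetical mount of a NAS export for Oracle shared storage over NFS.
# Forces TCP transport and 32K read/write sizes, per the requirements above.
mount -t nfs -o rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768 \
      nas1:/vol/oradata /u02/oradata
```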
4. FireWire Technology

Developed by Apple Computer and Texas Instruments, FireWire is a cross-platform implementation of a high-
speed serial data bus. With its high bandwidth, long distances (up to 100 meters in length) and high-powered
bus, FireWire is being used in applications such as digital video (DV), professional audio, hard drives, high-end
digital still cameras, and home entertainment devices. Today, FireWire operates at transfer rates of up to 800
megabits per second, while next-generation FireWire calls for a theoretical bit rate of 1,600 Mbps and
eventually a staggering 3,200 Mbps (3.2 gigabits per second). This speed will make FireWire
indispensable for transferring massive data files and for even the most demanding video applications, such as
working with uncompressed high-definition (HD) video or multiple standard-definition (SD) video streams.

The following chart shows speed comparisons of the various types of disk interface. For each interface, I
provide the maximum transfer rates in kilobits (kb), kilobytes (KB), megabits (Mb), and megabytes (MB) per
second. As you can see, the capabilities of IEEE1394 compare very favorably with other available disk
interface technologies.

Disk Interface Speed


Serial 115 kb/s - (.115 Mb/s)
Parallel (standard) 115 KB/s - (.115 MB/s)
USB 1.1 12 Mb/s - (1.5 MB/s)
Parallel (ECP/EPP) 3.0 MB/s
IDE 3.3 - 16.7 MB/s
ATA 3.3 - 66.6 MB/s
SCSI-1 5 MB/s
SCSI-2 (Fast SCSI/Fast Narrow SCSI) 10 MB/s
Fast Wide SCSI (Wide SCSI) 20 MB/s
Ultra SCSI (SCSI-3/Fast-20/Ultra Narrow) 20 MB/s
Ultra IDE 33 MB/s
Wide Ultra SCSI (Fast Wide 20) 40 MB/s
Ultra2 SCSI 40 MB/s
IEEE1394(b) 100 - 400Mb/s - (12.5 - 50 MB/s)
USB 2.x 480 Mb/s - (60 MB/s)
Wide Ultra2 SCSI 80 MB/s
Ultra3 SCSI 80 MB/s
Wide Ultra3 SCSI 160 MB/s
FC-AL Fiber Channel 100 - 400 MB/s

5. Hardware & Costs

The hardware we will use to build our example Oracle RAC 10g environment comprises two Linux servers and
components that you can purchase at any local computer store or over the Internet.

Server 1 - (linux1)
Dimension 2400 Series
- Intel Pentium 4 Processor at 2.80GHz
- 1GB DDR SDRAM (at 333MHz)
- 40GB 7200 RPM Internal Hard Drive
- Integrated Intel 3D AGP Graphics
- Integrated 10/100 Ethernet
- CDROM (48X Max Variable)
- 3.5" Floppy
- No monitor (Already had one)
- USB Mouse and Keyboard US$620

1 - Ethernet LAN Cards


- Linksys 10/100 Mbps - (Used for Interconnect to linux2)
Each Linux server should contain two NIC adapters. The Dell Dimension includes an
integrated 10/100 Ethernet adapter that will be used to connect to the public network. The
second NIC adapter will be used for the private interconnect. US$20

1 - FireWire Card
- SIIG, Inc. 3-Port 1394 I/O Card
Cards with chipsets made by VIA or TI are known to work. In addition to the SIIG, Inc. 3-Port
1394 I/O Card, I have also successfully used the Belkin FireWire 3-Port 1394 PCI Card and
StarTech 4 Port IEEE-1394 PCI Firewire Card I/O cards. US$30

Server 2 - (linux2)

Dimension 2400 Series


- Intel Pentium 4 Processor at 2.80GHz
- 1GB DDR SDRAM (at 333MHz)
- 40GB 7200 RPM Internal Hard Drive
- Integrated Intel 3D AGP Graphics
- Integrated 10/100 Ethernet
- CDROM (48X Max Variable)
- 3.5" Floppy
- No monitor (already had one)
- USB Mouse and Keyboard US$620

1 - Ethernet LAN Cards


- Linksys 10/100 Mbps - (Used for Interconnect to linux1)
Each Linux server should contain two NIC adapters. The Dell Dimension includes an
integrated 10/100 Ethernet adapter that will be used to connect to the public network. The
second NIC adapter will be used for the private interconnect. US$20

1 - FireWire Card
- SIIG, Inc. 3-Port 1394 I/O Card
Cards with chipsets made by VIA or TI are known to work. In addition to the SIIG, Inc. 3-Port
1394 I/O Card, I have also successfully used the Belkin FireWire 3-Port 1394 PCI Card and
StarTech 4 Port IEEE-1394 PCI Firewire Card I/O cards. US$30

Miscellaneous Components

FireWire Hard Drive


- Maxtor One Touch 250GB USB 2.0 / Firewire External Hard Drive
Ensure that the FireWire drive that you purchase supports multiple logins. If the drive has a
chipset that does not allow for concurrent access for more than one server, the disk and its
partitions can only be seen by one server at a time. Disks with the Oxford 911 chipset are
known to work. Here are the details about the disk that I purchased for this test:
Vendor: Maxtor
Model: OneTouch US$260
Mfg. Part No. or KIT No.: A01A200 or A01A250
Capacity: 200 GB or 250 GB
Cache Buffer: 8 MB
Spin Rate: 7200 RPM
"Combo" Interface: IEEE 1394 and SPB-2 compliant (100 to 400 Mbits/sec) plus USB 2.0
and USB 1.1 compatible

1 - Extra FireWire Cable


- Belkin 6-pin to 6-pin 1394 Cable US$15

1 - Ethernet hub or switch


- Linksys EtherFast 10/100 5-port Ethernet Switch
(Used for interconnect int-linux1 / int-linux2) US$30

4 - Network Cables
- Category 5e patch cable - (Connect linux1 to public network) US$5
- Category 5e patch cable - (Connect linux2 to public network) US$5
- Category 5e patch cable - (Connect linux1 to interconnect ethernet switch) US$5
- Category 5e patch cable - (Connect linux2 to interconnect ethernet switch) US$5

Total US$1,665

Note that the Maxtor OneTouch external drive does have two IEEE1394 (FireWire) ports, although it may not
appear so at first glance. Also note that although you may be tempted to substitute the Ethernet switch (used
for interconnect int-linux1/int-linux2) with a crossover CAT5 cable, I would not recommend this approach. I
have found that when using a crossover CAT5 cable for the interconnect, whenever I took one of the PCs
down the other PC would detect a "cable unplugged" error, and thus the Cache Fusion network would become
unavailable.

Now that we know the hardware that will be used in this example, let's take a conceptual look at what the
environment looks like:
Figure 1: Architecture
As we start to go into the details of the installation, keep in mind that most tasks will need to be performed on
both servers.

6. Install the Linux Operating System

This section provides a summary of the screens used to install the Linux operating system. This article was
designed to work with the Red Hat Enterprise Linux 3 (AS/ES) operating environment. An alternative, and
what I used for this article, is White Box Enterprise Linux (WBEL): a free and stable version of the RHEL3
operating environment.

For more detailed installation instructions, it is possible to use the manuals from Red Hat Linux. I would
suggest, however, that the instructions I have provided below be used for this configuration.

Before installing the Linux operating system on both nodes, you should have the FireWire and two NIC
interfaces (cards) installed.

Also, before starting the installation, ensure that the FireWire drive (our shared storage drive) is NOT
connected to either of the two servers.

Download the following ISO images for WBEL:

• liberation-respin1-binary-i386-1.iso (642,304 KB)
• liberation-respin1-binary-i386-2.iso (646,592 KB)
• liberation-respin1-binary-i386-3.iso (486,816 KB)

After downloading and burning the WBEL images (ISO files) to CD, insert WBEL Disk #1 into the first server
(linux1 in this example), power it on, and answer the installation screen prompts as noted below. After
completing the Linux installation on the first node, perform the same Linux installation on the second node
while substituting the node name linux1 for linux2 and the different IP addresses where appropriate.

Boot Screen
The first screen is the WBEL boot screen. At the boot: prompt, hit [Enter] to start the installation process.

Media Test
When asked to test the CD media, tab over to [Skip] and hit [Enter]. If there were any errors, the media burning
software would have warned us. After several seconds, the installer should then detect the video card, monitor,
and mouse. The installer then goes into GUI mode.

Welcome to White Box Enterprise Linux


At the welcome screen, click [Next] to continue.

Language / Keyboard / Mouse Selection


The next three screens prompt you for the Language, Keyboard, and Mouse settings. Make the appropriate
selections for your configuration.

Installation Type
Choose the [Custom] option and click [Next] to continue.

Disk Partitioning Setup


Select [Automatically partition] and click [Next] to continue.

If there was a previous installation of Linux on this machine, the next screen will ask whether you want to "remove" or
"keep" old partitions. Select the option to [Remove all partitions on this system]. Also, ensure that the [hda]
drive is selected for this installation. I also keep the checkbox [Review (and modify if needed) the partitions
created] selected. Click [Next] to continue.

You will then be prompted with a dialog window asking if you really want to remove all partitions. Click [Yes] to
acknowledge this warning.

Partitioning
The installer will then allow you to view (and modify if needed) the disk partitions it automatically selected. In
almost all cases, the installer will choose 100MB for /boot, double the amount of RAM for swap, and assign the
rest to the root (/) partition. I like to have a minimum of 1GB for swap. For the purpose of this install, I will
accept all automatically preferred sizes. (Including 2GB for swap, since I have 1GB of RAM installed.)

Boot Loader Configuration


The installer will use the GRUB boot loader by default. To use the GRUB boot loader, accept all default values
and click [Next] to continue.

Network Configuration
I made sure to install both NIC interfaces (cards) in each of the Linux machines before starting the operating
system installation. This screen should have successfully detected each of the network devices.

First, make sure that [Activate on boot] is checked for each of the network devices. The installer may choose
not to activate eth1.

Second, [Edit] both eth0 and eth1 as follows. You may choose different IP addresses for eth0 and eth1, and
that is OK. If possible, try to put eth1 (the interconnect) on a different subnet than eth0 (the public
network):

eth0:
- Uncheck the option [Configure using DHCP]
- Leave [Activate on boot] checked
- IP Address: 192.168.1.100
- Netmask: 255.255.255.0

eth1:
- Uncheck the option [Configure using DHCP]
- Leave [Activate on boot] checked
- IP Address: 192.168.2.100
- Netmask: 255.255.255.0

Continue by setting your hostname manually. I used "linux1" for the first node and "linux2" for the second one.
Finish this dialog off by supplying your gateway and DNS servers.

Firewall
On this screen, make sure to check [No firewall] and click [Next] to continue.

Additional Language Support/Time Zone


The next two screens allow you to select additional language support and time zone information. In almost all
cases, you can accept the defaults.

Set Root Password


Select a root password and click [Next] to continue.

Package Group Selection


Scroll down to the bottom of this screen and select [Everything] under the "Miscellaneous" section. Click [Next]
to continue.

About to Install
This screen is basically a confirmation screen. Click [Next] to start the installation. During the installation
process, you will be asked to switch disks to Disk #2 and then Disk #3.

Graphical Interface (X) Configuration


When the installation is complete, the installer will attempt to detect your video hardware. Ensure that the
installer has detected and selected the correct video hardware (graphics card and monitor) to properly use the
X Windows server. You will continue with the X configuration in the next three screens.

Congratulations
And that's it. You have successfully installed WBEL on the first node (linux1). The installer will eject the CD
from the CD-ROM drive. Take out the CD and click [Exit] to reboot the system.

When the system boots into Linux for the first time, it will prompt you with another Welcome screen. (No one
ever said Linux wasn't friendly!) The following wizard allows you to configure the date and time, add
additional users, test the sound card, and install any additional CDs. The only screen I care about is the
time and date. As for the others, simply run through them, as there is nothing additional that needs to be
installed (at this point anyway!). If everything was successful, you should now be presented with the login
screen.

Perform the same installation on the second node


After completing the Linux installation on the first node, repeat the above steps for the second node (linux2).
When configuring the machine name and networking, be sure to configure the proper values. For my
installation, this is what I configured for linux2:

First, make sure that [Activate on boot] is checked for each of the network devices. The installer will choose
not to activate eth1.

Second, [Edit] both eth0 and eth1 as follows:


eth0:
- Uncheck the option [Configure using DHCP]
- Leave [Activate on boot] checked
- IP Address: 192.168.1.101
- Netmask: 255.255.255.0

eth1:
- Uncheck the option [Configure using DHCP]
- Leave [Activate on boot] checked
- IP Address: 192.168.2.101
- Netmask: 255.255.255.0

Continue by setting your hostname manually. I used "linux2" for the second node. Finish this dialog off by
supplying your gateway and DNS servers.

7. Network Configuration


Perform the following network configuration on all nodes in the cluster!

Note: Although we configured several of the network settings during the Linux installation, it is important not to
skip this section, as it contains critical steps that are required for the RAC environment.

Introduction to Network Settings

During the Linux O/S install we already configured the IP address and host name for each of the nodes. We
now need to configure the /etc/hosts file as well as adjust several of the network settings for the
interconnect. I also include instructions for enabling Telnet and FTP services.

Each node should have one static IP address for the public network and one static IP address for the private
cluster interconnect. The private interconnect should only be used by Oracle to transfer Cluster Manager and
Cache Fusion related data. Although it is possible to use the public network for the interconnect, this is not
recommended as it may cause degraded database performance (reducing the amount of bandwidth for Cache
Fusion and Cluster Manager traffic). For a production RAC implementation, the interconnect should be at least
gigabit Ethernet and should be used only by Oracle.

Configuring Public and Private Network

In our two-node example, we need to configure the network on both nodes for access to the public network as
well as their private interconnect.

The easiest way to configure network settings in Red Hat Enterprise Linux 3 is with the Network Configuration
program. This application can be started from the command-line as the root user account as follows:

# su -
# /usr/bin/redhat-config-network &
Do not use DHCP naming for the public IP address or the interconnects - we need static IP addresses!

Using the Network Configuration application, you need to configure both NIC devices as well as the /etc/hosts
file. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts
settings are the same for both nodes.

Our example configuration will use the following settings:


Server 1 (linux1)
Device IP Address Subnet Purpose
eth0 192.168.1.100 255.255.255.0 Connects linux1 to the public network
eth1 192.168.2.100 255.255.255.0 Connects linux1 (interconnect) to linux2 (int-linux2)
/etc/hosts
127.0.0.1 localhost loopback

# Public Network - (eth0)


192.168.1.100 linux1
192.168.1.101 linux2

# Private Interconnect - (eth1)


192.168.2.100 int-linux1
192.168.2.101 int-linux2

# Public Virtual IP (VIP) addresses for - (eth0)


192.168.1.200 vip-linux1
192.168.1.201 vip-linux2
Server 2 (linux2)
Device IP Address Subnet Purpose
eth0 192.168.1.101 255.255.255.0 Connects linux2 to the public network
eth1 192.168.2.101 255.255.255.0 Connects linux2 (interconnect) to linux1 (int-linux1)
/etc/hosts
127.0.0.1 localhost loopback

# Public Network - (eth0)


192.168.1.100 linux1
192.168.1.101 linux2

# Private Interconnect - (eth1)


192.168.2.100 int-linux1
192.168.2.101 int-linux2

# Public Virtual IP (VIP) addresses for - (eth0)


192.168.1.200 vip-linux1
192.168.1.201 vip-linux2
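
Since the /etc/hosts contents above must be identical on both nodes, a quick way to verify them is a small grep helper. The script below is a hypothetical sketch (not part of any Oracle tooling), hard-coded to the host names assumed throughout this guide:

```shell
# check_hosts: confirm that every cluster host name used in this guide
# appears in the given hosts file (simple grep; no network lookups).
check_hosts() {
    hosts_file=$1
    for h in linux1 linux2 int-linux1 int-linux2 vip-linux1 vip-linux2; do
        if grep -qw "$h" "$hosts_file"; then
            echo "$h present"
        else
            echo "$h MISSING"
        fi
    done
}

# Example: check_hosts /etc/hosts
```

Run it against /etc/hosts on each node; every name should report "present".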

Note that the virtual IP addresses only need to be defined in the /etc/hosts file for both nodes. The public virtual
IP addresses will be configured automatically by Oracle when you run the Oracle Universal Installer, which
starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA). All virtual IP addresses will be
activated when the srvctl start nodeapps -n <node_name> command is run. This is the Host
Name/IP Address that will be configured in the client(s) tnsnames.ora file (more details later).
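
To make the role of the VIPs concrete, a client-side tnsnames.ora entry of the kind configured later in this guide might look like the sketch below; the alias, service name, and port here are illustrative assumptions, not the final configuration:

```
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
    )
  )
```

If the node behind vip-linux1 goes down, the client moves to the next address in the list rather than hanging on a TCP timeout.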

In the screenshots below, only node 1 (linux1) is shown. Be sure to apply the proper network settings to both
nodes.
Figure 2: Network Configuration Screen, Node 1 (linux1)
Figure 3: Ethernet Device Screen, eth0 (linux1)
Figure 4: Ethernet Device Screen, eth1 (linux1)
Figure 5: Network Configuration Screen, /etc/hosts (linux1)

When the network is configured, you can use the ifconfig command to verify everything is working. The
following example is from linux1:

$ /sbin/ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:0C:41:F1:6E:9A
inet addr:192.168.1.100 Bcast:192.168.1.255
Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:421591 errors:0 dropped:0 overruns:0 frame:0
TX packets:403861 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:78398254 (74.7 Mb) TX bytes:51064273 (48.6 Mb)
Interrupt:9 Base address:0x400

eth1 Link encap:Ethernet HWaddr 00:0D:56:FC:39:EC


inet addr:192.168.2.100 Bcast:192.168.2.255
Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1715352 errors:0 dropped:1 overruns:0 frame:0
TX packets:4257279 errors:0 dropped:0 overruns:0 carrier:4
collisions:0 txqueuelen:1000
RX bytes:802574993 (765.3 Mb) TX bytes:1236087657 (1178.8 Mb)
Interrupt:3

lo Link encap:Local Loopback


inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1273787 errors:0 dropped:0 overruns:0 frame:0
TX packets:1273787 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:246580081 (235.1 Mb) TX bytes:246580081 (235.1 Mb)

About Virtual IP

Why do we have a Virtual IP (VIP) in 10g? Why does it just return a dead connection when its primary node
fails?

It's all about availability of the application. When a node fails, the VIP associated with it is supposed to be
automatically failed over to some other node. When this occurs, two things happen.

1. The new node re-arps the world indicating a new MAC address for the address. For directly
connected clients, this usually causes them to see errors on their connections to the old address.
2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to
the clients. This results in the clients getting errors immediately.

This means that when the client issues SQL to the node that is now down, or traverses the address list while
connecting, rather than waiting on a very long TCP/IP time-out (~10 minutes), the client receives a TCP reset.
In the case of SQL, this is ORA-3113. In the case of connect, the next address in tnsnames is used.

Without using VIPs, clients connected to a node that died will often wait a 10-minute TCP timeout period before
getting an error. As a result, you don't really have a good HA solution without using VIPs (Source - Metalink
Note 220970.1) .

Confirm the RAC Node Name is Not Listed in Loopback Address

Ensure that none of the node names (linux1 or linux2) are included in the loopback address entry in the
/etc/hosts file. If the machine name is listed in the loopback address entry, as below:

127.0.0.1 linux1 localhost.localdomain localhost


it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost

If the RAC node name is listed for the loopback address, you will receive the following error during the RAC
installation:

ORA-00603: ORACLE server session terminated by fatal error


or
ORA-29702: error occurred in Cluster Group Service operation
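
This check is easy to script. The sketch below is a hypothetical helper (not an Oracle utility), hard-coded to the node names linux1/linux2 used throughout this guide:

```shell
# check_loopback: report whether a RAC node name appears on the
# 127.0.0.1 line of the given hosts file.
check_loopback() {
    hosts_file=$1
    if grep '^127\.0\.0\.1' "$hosts_file" | grep -qwE 'linux1|linux2'; then
        echo "FAIL: node name found on loopback line"
    else
        echo "OK: loopback line is clean"
    fi
}

# Usage: check_loopback /etc/hosts
```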

Adjusting Network Settings

With Oracle 9.2.0.1 and later, Oracle makes use of UDP as the default protocol on Linux for inter-process
communication (IPC), such as Cache Fusion and Cluster Manager buffer transfers between instances within
the RAC cluster.

Oracle strongly suggests adjusting the default and maximum send buffer size (SO_SNDBUF socket option) to
256KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256KB.

The receive buffers are used by TCP and UDP to hold received data until it is read by the application. The
receive buffer cannot overflow, because the peer is not allowed to send data beyond the buffer size window.
Datagrams that do not fit in the socket receive buffer, however, will be discarded, which can happen when a
fast sender overwhelms a slow receiver.

The default and maximum window size can be changed in the /proc file system without reboot:

# su - root

# sysctl -w net.core.rmem_default=262144
net.core.rmem_default = 262144

# sysctl -w net.core.wmem_default=262144
net.core.wmem_default = 262144

# sysctl -w net.core.rmem_max=262144
net.core.rmem_max = 262144

# sysctl -w net.core.wmem_max=262144
net.core.wmem_max = 262144

The above commands made the changes to the already running OS. You should now make the above
changes permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf file for each node in
your RAC cluster:

# Default setting in bytes of the socket receive buffer


net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer


net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using


# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using


# the SO_SNDBUF socket option
net.core.wmem_max=262144
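
To confirm that all four entries actually made it into /etc/sysctl.conf on each node, you could use a small sketch like the following (a hypothetical helper that greps for the exact key=value lines shown above):

```shell
# verify_sysctl_conf: confirm the four socket-buffer settings are present,
# with the expected value, in the given sysctl configuration file.
verify_sysctl_conf() {
    conf=$1
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max; do
        if grep -q "^${key}=262144$" "$conf"; then
            echo "${key} ok"
        else
            echo "${key} MISSING"
        fi
    done
}

# Usage: verify_sysctl_conf /etc/sysctl.conf
```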

Enabling Telnet and FTP Services

Linux is configured to run the Telnet and FTP servers, but by default these services are disabled. To enable
the Telnet service, log in to the server as the root user account and run the following commands:

# chkconfig telnet on
# service xinetd reload
Reloading configuration: [ OK ]

Starting with the Red Hat Enterprise Linux 3.0 release (and in WBEL), the FTP server (wu-ftpd) is no longer
available with xinetd. It has been replaced with vsftpd, which can be started from /etc/init.d/vsftpd as in the
following:

# /etc/init.d/vsftpd start
Starting vsftpd for vsftpd: [ OK ]
If you want the vsftpd service to start and stop when recycling (rebooting) the machine, you can create the
following symbolic links:
# ln -s /etc/init.d/vsftpd /etc/rc3.d/S56vsftpd
# ln -s /etc/init.d/vsftpd /etc/rc4.d/S56vsftpd
# ln -s /etc/init.d/vsftpd /etc/rc5.d/S56vsftpd

Allowing Root Logins to Telnet and FTP Services

Before getting into the details of how to configure Red Hat Linux for root logins, keep in mind that this is very
poor security. Never configure your production servers for this type of login.

To configure Telnet for root logins, simply edit the file /etc/securetty and add the following to the end of
the file:

pts/0
pts/1
pts/2
pts/3
pts/4
pts/5
pts/6
pts/7
pts/8
pts/9
This will allow up to 10 telnet sessions to the server as root. To configure FTP for root logins, edit the files
/etc/vsftpd.ftpusers and /etc/vsftpd.user_list and remove the 'root' line from each file.
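The edit to the vsftpd user lists can also be scripted. The sketch below works on a throwaway copy of a file in /etc/vsftpd.ftpusers format; on a real node you would run the same sed command against /etc/vsftpd.ftpusers and /etc/vsftpd.user_list in place.

```shell
# Throwaway copy that mimics /etc/vsftpd.ftpusers
f=$(mktemp)
printf 'root\nbin\ndaemon\n' > "$f"

# Delete only the line that is exactly "root" (anchored match, in-place edit)
sed -i '/^root$/d' "$f"

grep -c '^root$' "$f" || true   # root is no longer listed as a denied user
```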

8. Obtain and Install a Proper Linux Kernel

Perform the following kernel upgrade on all nodes in the cluster!

The next step is to obtain and install a new Linux kernel that supports the use of IEEE1394 devices with
multiple logins. In a previous version of this article, I included the steps to download a patched version of the
Linux kernel (source code) and then compile it. Thanks to Oracle's Linux Projects development group, this is
no longer a requirement. They provide a pre-compiled kernel for RHEL3 (which also works with WBEL!), that
can simply be downloaded and installed. The instructions for downloading and installing the kernel are
included in this section. Before going into the details of how to perform these actions, however, let's take a
moment to discuss the changes that are required in the new kernel.

While FireWire drivers already exist for Linux, they often do not support shared storage. Typically when you
logon to an OS, the OS associates the driver to a specific drive for that machine alone. This implementation
simply will not work for our RAC configuration. The shared storage (our FireWire hard drive) needs to be
accessed by more than one node. We need to enable the FireWire driver to provide nonexclusive access to
the drive so that multiple servers—the nodes that comprise the cluster—will be able to access the same
storage. This goal is accomplished by removing the bit mask that identifies the machine during login in the
source code, resulting in nonexclusive access to the FireWire hard drive. All other nodes in the cluster log in
to the same drive during their logon session using the same modified driver, so they too have nonexclusive
access to the drive.

Our implementation describes a dual node cluster (each with a single processor), each server running WBEL.
Keep in mind that the process of installing the patched Linux kernel will need to be performed on both Linux
nodes. White Box Enterprise Linux 3.0 (Respin 1) includes kernel 2.4.21-15.EL #1; we will need to download
the version hosted at http://oss.oracle.com/projects/firewire/files/, 2.4.21-27.0.2.ELorafw1.

Download one of the following files:

kernel-2.4.21-27.0.2.ELorafw1.i686.rpm - (for single processor)

or
kernel-smp-2.4.21-27.0.2.ELorafw1.i686.rpm - (for multiple processors)

Make a backup of your GRUB configuration file:

In most cases you will be using GRUB for the boot loader. Before actually installing the new kernel, backup a
copy of your /etc/grub.conf file:

# cp /etc/grub.conf /etc/grub.conf.original

Install the new kernel, as root:

# rpm -ivh --force kernel-2.4.21-27.0.2.ELorafw1.i686.rpm
(for single processor)

or

# rpm -ivh --force kernel-smp-2.4.21-27.0.2.ELorafw1.i686.rpm
(for multiple processors)

Note: Installing the new kernel using RPM will also update your GRUB (or LILO) configuration with the
appropriate stanza. There is no need to add any new stanza to your boot loader configuration unless you want
to have your old kernel image available.

The following is a listing of my /etc/grub.conf file before and after the kernel install. As you can see, the
install added another stanza for the 2.4.21-27.0.2.ELorafw1 kernel. If the installer keeps your original kernel
as the default (default=1), change the (default) entry in the new file to zero (default=0) so that the new
kernel boots by default.

Original File

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title White Box Enterprise Linux (2.4.21-15.EL)
root (hd0,0)
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/
initrd /initrd-2.4.21-15.EL.img

Newly Configured File After Kernel Install

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title White Box Enterprise Linux (2.4.21-27.0.2.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-27.0.2.ELorafw1 ro root=LABEL=/
initrd /initrd-2.4.21-27.0.2.ELorafw1.img
title White Box Enterprise Linux (2.4.21-15.EL)
root (hd0,0)
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/
initrd /initrd-2.4.21-15.EL.img

Add module options:

Add the following lines to /etc/modules.conf:

alias ieee1394-controller ohci1394
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-install sbp2 insmod ohci1394
post-remove sbp2 rmmod sd_mod
It is vital that the parameter sbp2_exclusive_login of the Serial Bus Protocol module (sbp2) be set to
zero to allow multiple hosts to log in to and access the FireWire disk concurrently. The post-install entry
ensures that the SCSI disk driver module (sd_mod) is loaded as well, since sbp2 requires the SCSI layer. The
core SCSI support module (scsi_mod) is loaded automatically whenever sd_mod is loaded, so there is no need
for a separate entry for it.
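A quick way to confirm that the critical option made it into the configuration is to grep for it. The sketch below checks a stand-in copy of the lines above so it is self-contained; on a real node you would grep /etc/modules.conf directly.

```shell
# Stand-in for /etc/modules.conf containing the lines added above
mc=$(mktemp)
cat > "$mc" <<'EOF'
alias ieee1394-controller ohci1394
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-install sbp2 insmod ohci1394
post-remove sbp2 rmmod sd_mod
EOF

# The critical line: without sbp2_exclusive_login=0, only one node at a
# time can hold a login to the FireWire disk.
if grep -q '^options sbp2 sbp2_exclusive_login=0' "$mc"; then
    result=ok
    echo "nonexclusive login enabled"
else
    result=missing
    echo "WARNING: sbp2_exclusive_login=0 not found"
fi
rm -f "$mc"
```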

Connect FireWire drive to each machine and boot into the new kernel:

After performing the above tasks on both nodes in the cluster, power down both Linux machines:

===============================

# hostname
linux1

# init 0

===============================

# hostname
linux2

# init 0

===============================
After both machines are powered down, connect each of them to the back of the FireWire drive. Power on the
FireWire drive. Finally, power on each Linux server, making sure to boot each machine into the new kernel.

Loading the FireWire stack:


In most cases, the loading of the FireWire stack will already be configured in the /etc/rc.sysinit file.
The commands that are contained within this file that are responsible for loading the FireWire stack are:

# modprobe sbp2
# modprobe ohci1394
In older versions of Red Hat, this was not the case and these commands would have to be manually run or put
within a startup file. With Red Hat Enterprise Linux 3 and later, these commands are already put within the
/etc/rc.sysinit file and run on each boot.

Check for SCSI Device:

After each machine has rebooted, the kernel should automatically detect the disk as a SCSI device
(/dev/sdXX). This section will provide several commands that should be run on all nodes in the cluster to
verify the FireWire drive was successfully detected and being shared by all nodes in the cluster.

For this configuration, I performed the above procedures on both nodes at the same time. When
complete, I shut down both machines, started linux1 first, and then linux2. The following commands and
results are from my linux2 machine. Again, make sure that you run the following commands on all nodes to
ensure that both machines can log in to the shared drive.

Let's first check to see that the FireWire adapter was successfully detected:

# lspci
00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Host-Hub Interface (rev 01)
00:02.0 VGA compatible controller: Intel Corp. 82845G/GL[Brookdale-G]/GE
Chipset Integrated Graphics Device (rev 01)
00:1d.0 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #3 (rev 01)
00:1d.7 USB Controller: Intel Corp. 82801DB (ICH4) USB2 EHCI Controller
(rev 01)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI
Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corp. 82801DB (ICH4) LPC Bridge (rev 01)
00:1f.1 IDE interface: Intel Corp. 82801DB (ICH4) Ultra ATA 100 Storage
Controller (rev 01)
00:1f.3 SMBus: Intel Corp. 82801DB/DBM (ICH4) SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corp. 82801DB (ICH4) AC'97
Audio Controller (rev 01)
01:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000
Controller (PHY/Link)
01:05.0 Modem: Intel Corp.: Unknown device 1080 (rev 04)
01:06.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
Ethernet 10/100 (rev 11)
01:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev
01)
Second, let's check to see that the modules are loaded:
# lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
sd_mod 13744 0
sbp2 19724 0
scsi_mod 106664 3 [sg sd_mod sbp2]
ohci1394 28008 0 (unused)
ieee1394 62884 0 [sbp2 ohci1394]
Third, let's make sure the disk was detected and an entry was made by the kernel:
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: Maxtor Model: OneTouch Rev: 0200
Type: Direct-Access ANSI SCSI revision: 06
Now let's verify that the FireWire drive is accessible for multiple logins and shows a valid login:
# dmesg | grep sbp2
ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 3
ieee1394: sbp2: Number of active logins: 1
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
From the above output, you can see that the FireWire drive I have can support concurrent logins by up to three
servers. It is vital that you have a drive where the chipset supports concurrent access for all nodes within the
RAC cluster.
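This check can also be scripted. The sketch below parses a sample of the dmesg output shown above and compares the drive's login capacity against the cluster size; the node count of 2 and the sample text are assumptions matching this article's setup (on a live node you would feed it `dmesg | grep sbp2`).

```shell
nodes=2   # number of RAC nodes in this article's cluster (assumption)

# Sample of the dmesg lines shown above
dmesg_sample='ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 3
ieee1394: sbp2: Number of active logins: 1'

# Extract the number after "Maximum concurrent logins supported:"
max_logins=$(printf '%s\n' "$dmesg_sample" \
    | awk -F': ' '/Maximum concurrent logins/ {print $NF}')

if [ "$max_logins" -ge "$nodes" ]; then
    echo "drive supports $max_logins concurrent logins: OK for $nodes nodes"
else
    echo "drive supports only $max_logins logins: NOT usable for this RAC cluster"
fi
```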

One other test I like to perform is to run a quick fdisk -l from each node in the cluster to verify that it is
really being picked up by the OS. It will show that the device does not contain a valid partition table, but this is
OK at this point of the RAC configuration.

# fdisk -l
Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/hda: 40.0 GB, 40000000000 bytes
255 heads, 63 sectors/track, 4863 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   83  Linux
/dev/hda2            14      4609  36917370   83  Linux
/dev/hda3          4610      4863   2040255   82  Linux swap
Rescan SCSI bus no longer required:

In older versions of the kernel, I would need to run the rescan-scsi-bus.sh script in order to detect the FireWire
drive. The purpose of this script was to create the SCSI entry for the node by using the following command:

echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi


With Red Hat Enterprise Linux 3, this step is no longer required and the disk should be detected automatically.

Troubleshooting SCSI Device Detection:

If you are having troubles with any of the procedures (above) in detecting the SCSI device, you can try the
following:

# modprobe -r sbp2
# modprobe -r sd_mod
# modprobe -r ohci1394
# modprobe ohci1394
# modprobe sd_mod
# modprobe sbp2
You may also want to unplug any USB devices connected to the server. The system may not be able to
recognize your FireWire drive if you have a USB device attached!
9. Create "oracle" User and Directories (both nodes)

Perform the following procedure on all nodes in the cluster!

I will be using the Oracle Cluster File System (OCFS) to store the files required to be shared for the Oracle
Cluster Ready Services (CRS). When using OCFS, the UID of the UNIX user oracle and GID of the UNIX
group dba must be identical on all machines in the cluster. If either the UID or GID are different, the files on
the OCFS file system will show up as "unowned" or may even be owned by a different user. For this article, I
will use 175 for the oracle UID and 115 for the dba GID.

Create Group and User for Oracle

Let's continue our example by creating the Unix dba group and oracle user account along with all
appropriate directories.

# mkdir -p /u01/app
# groupadd -g 115 dba
# useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" -p oracle oracle
# chown -R oracle:dba /u01
# passwd oracle
# su - oracle

Note: When you are setting the Oracle environment variables for each RAC node, be sure to assign each RAC
node a unique Oracle SID! For this example, I used:

• linux1 : ORACLE_SID=orcl1
• linux2 : ORACLE_SID=orcl2
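One way to avoid copy/paste mistakes between nodes is to derive ORACLE_SID from the node name inside the shared profile. This is only a sketch: the hostnames linux1/linux2 and SIDs orcl1/orcl2 are the ones used in this article, and the hard-coded node variable stands in for `$(hostname -s)` so the snippet runs anywhere.

```shell
# Derive the instance name from the node name so one .bash_profile
# can be copied to every RAC node unchanged.
node=linux1            # stand-in for: node=$(hostname -s)
case "$node" in
    linux1) ORACLE_SID=orcl1 ;;
    linux2) ORACLE_SID=orcl2 ;;
    *)      echo "unknown RAC node: $node"; ORACLE_SID= ;;
esac
export ORACLE_SID
echo "ORACLE_SID=$ORACLE_SID"
```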

After creating the "oracle" UNIX userid on both nodes, ensure that the environment is setup correctly by
using the following .bash_profile:
....................................

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

alias ls="ls -FA"

# User specific environment and startup programs

export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/product/10.1.0/crs_1

# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
export ORACLE_SID=orcl1

export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
export LD_ASSUME_KERNEL=2.4.1

....................................
Now, let's create the mount point for the Oracle Cluster File System (OCFS) that will be used to store files for
the Oracle Cluster Ready Service (CRS). These commands will need to be run as the "root" user account:
$ su -
# mkdir -p /u02/oradata/orcl
# chown -R oracle:dba /u02
Note: The Oracle Universal Installer (OUI) requires at least 400MB of free space in the /tmp directory.

You can check the available space in /tmp by running the following command:

# df -k /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda2 36337384 4691460 29800056 14% /
If for some reason you do not have enough space in /tmp, you can temporarily create space in another file
system and point your TEMP and TMPDIR to it for the duration of the install. Here are the steps to do this:
# su -
# mkdir /<AnotherFilesystem>/tmp
# chown root.root /<AnotherFilesystem>/tmp
# chmod 1777 /<AnotherFilesystem>/tmp
# export TEMP=/<AnotherFilesystem>/tmp # used by Oracle
# export TMPDIR=/<AnotherFilesystem>/tmp # used by Linux programs
# like the linker "ld"
When the installation of Oracle is complete, you can remove the temporary directory using the following:
# su -
# rmdir /<AnotherFilesystem>/tmp
# unset TEMP
# unset TMPDIR

10. Creating Partitions on the Shared FireWire Storage Device

Create the following partitions on only one node in the cluster!

The next step is to create the required partitions on the FireWire (shared) drive. As I mentioned previously, we
will use OCFS to store the two files to be shared for CRS. We will then use ASM for all physical database files
(data/index files, online redo log files, control files, SPFILE, and archived redo log files).
The following table lists the individual partitions that will be created on the FireWire (shared) drive and what
files will be contained on them.

Oracle Shared Drive Configuration

File System Type   Partition   Size      Mount Point         File Types
OCFS               /dev/sda1   300MB     /u02/oradata/orcl   Oracle Cluster Registry File (~100MB)
                                                             CRS Voting Disk (~20MB)
ASM                /dev/sda2   50GB      ORCL:VOL1           Oracle Database Files
ASM                /dev/sda3   50GB      ORCL:VOL2           Oracle Database Files
ASM                /dev/sda4   50GB      ORCL:VOL3           Oracle Database Files
                   Total       150.3GB

Create All Partitions on FireWire Shared Storage

As shown in the table above, my FireWire drive shows up as the SCSI device /dev/sda. The fdisk command
is used for creating (and removing) partitions. For this configuration, we will create four partitions: one for
CRS and three for ASM (to store all Oracle database files). Before creating the new partitions, it is
important to remove any existing partitions (if any exist) on the FireWire drive:

# fdisk /dev/sda
Command (m for help): p

Disk /dev/sda: 203.9 GB, 203927060480 bytes


255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System


/dev/sda1 1 24791 199133676 c Win95 FAT32 (LBA)

Command (m for help): d


Selected partition 1

Command (m for help): p

Disk /dev/sda: 203.9 GB, 203927060480 bytes


255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

Command (m for help): n


Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-24792, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-24792, default 24792):
+300M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (38-24792, default 38): 38
Using default value 38
Last cylinder or +size or +sizeM or +sizeK (38-24792, default 24792):
+50G

Command (m for help): n


Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (6118-24792, default 6118): 6118
Using default value 6118
Last cylinder or +size or +sizeM or +sizeK (6118-24792, default 24792):
+50G

Command (m for help): n


Command action
e extended
p primary partition (1-4)
p
Selected partition 4
First cylinder (12198-24792, default 12198): 12198
Using default value 12198
Last cylinder or +size or +sizeM or +sizeK (12198-24792, default 24792):
+50G

Command (m for help): p

Disk /dev/sda: 203.9 GB, 203927060480 bytes


255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System


/dev/sda1 1 37 297171 83 Linux
/dev/sda2 38 6117 48837600 83 Linux
/dev/sda3 6118 12197 48837600 83 Linux
/dev/sda4 12198 18277 48837600 83 Linux

Command (m for help): w


The partition table has been altered!

Calling ioctl() to re-read partition table.


Syncing disks.

After creating all required partitions, you should now inform the kernel of the partition changes using the
following syntax as the root user account:

# partprobe
# fdisk -l /dev/sda
Disk /dev/sda: 203.9 GB, 203927060480 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System


/dev/sda1 1 37 297171 83 Linux
/dev/sda2 38 6117 48837600 83 Linux
/dev/sda3 6118 12197 48837600 83 Linux
/dev/sda4 12198 18277 48837600 83 Linux
(Note: The FireWire drive and partitions created will be exposed as a SCSI device.)

Reboot All Nodes in RAC Cluster

After creating the partitions, it is recommended that you reboot all RAC nodes to ensure that the new
partitions are recognized by the kernel on each node.

# su -
# reboot
After each machine is back up, run the fdisk -l /dev/sda command on each machine in the cluster to
ensure that they both can see the partition table:
# fdisk -l /dev/sda

Disk /dev/sda: 203.9 GB, 203927060480 bytes


255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System


/dev/sda1 1 37 297171 83 Linux
/dev/sda2 38 6117 48837600 83 Linux
/dev/sda3 6118 12197 48837600 83 Linux
/dev/sda4 12198 18277 48837600 83 Linux

11. Configure the Linux Servers

Perform the following configuration procedures on all nodes in the cluster!

Several of the commands within this section will need to be performed on every node within the cluster every
time the machine is booted. This section provides very detailed information about setting shared memory,
semaphores, and file handle limits. Instructions for placing them in a startup script (/etc/rc.local) are included in
Section 14 ("All Startup Commands for Each RAC Node").

Overview

This section focuses on configuring both Linux servers: getting each one prepared for the Oracle RAC 10g
installation. This includes verifying enough swap space, setting shared memory and semaphores, and finally
how to set the maximum amount of file handles for the OS.

Throughout this section you will notice that there are several different ways to configure (set) these
parameters. For the purpose of this article, I will be making all changes permanent (through reboots) by placing
all commands in the /etc/rc.local file. The method that I use will echo the values directly into the appropriate
path of the /proc filesystem.

Swap Space Considerations

• Installing Oracle10g requires a minimum of 512MB of memory. (Note: An inadequate amount of swap
during the installation will cause the Oracle Universal Installer to either "hang" or "die")
• To check the amount of memory / swap you have allocated, type either:

# free

or

# cat /proc/swaps

or

# cat /proc/meminfo | grep MemTotal

• If you have less than 512MB of memory (between your RAM and SWAP), you can add temporary
swap space by creating a temporary swap file. This way you do not have to use a raw device or even
more drastic, rebuild your system.

As root, make a file that will act as additional swap space, let's say about 300MB:

# dd if=/dev/zero of=tempswap bs=1k count=300000

Now we should change the file permissions:

# chmod 600 tempswap


Finally, we format the file as swap and add it to the swap space (mkswap writes the swap signature
itself, so there is no need to create a filesystem on the file first):

# mkswap tempswap
# swapon tempswap

Setting Shared Memory

Shared memory allows processes to access common structures and data by placing them in a shared memory
segment. This is the fastest form of inter-process communications (IPC) available, mainly due to the fact that
no kernel involvement occurs when data is being passed between the processes. Data does not need to be
copied between processes.

Oracle makes use of shared memory for its System Global Area (SGA), which is an area of memory that is
shared by all Oracle background and foreground processes. Adequate sizing of the SGA is critical to Oracle
performance because it is responsible for holding the database buffer cache, shared SQL, access paths, and
much more.

To determine all shared memory limits, use the following:

# ipcs -lm

------ Shared Memory Limits --------


max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
Setting SHMMAX

The SHMMAX parameter defines the maximum size (in bytes) of a single shared memory segment. The Oracle SGA
is composed of shared memory, and setting SHMMAX incorrectly could limit the size of the
SGA. When setting SHMMAX, keep in mind that the size of the SGA should fit within one shared memory
segment. An inadequate SHMMAX setting could result in the following:

ORA-27123: unable to attach to shared memory segment


You can determine the value of SHMMAX by performing the following:
# cat /proc/sys/kernel/shmmax
33554432
The default value for SHMMAX is 32MB. This size is often too small to configure the Oracle SGA. I generally
set the SHMMAX parameter to 2GB using either of the following methods:

• You can alter the default setting for SHMMAX without rebooting the machine by making the
changes directly to the /proc file system. The following method can be used to dynamically
set the value of SHMMAX. This command can be made permanent by putting it into the
/etc/rc.local startup file:

# echo "2147483648" > /proc/sys/kernel/shmmax

• You can also use the sysctl command to change the value of SHMMAX:

# sysctl -w kernel.shmmax=2147483648

• Lastly, you can make this change permanent by inserting the kernel parameter in the
/etc/sysctl.conf startup file:

# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
Setting SHMMNI

We now look at the SHMMNI parameter. This kernel parameter is used to set the maximum number of shared
memory segments system wide. The default value for this parameter is 4096. This value is sufficient and
typically does not need to be changed.

You can determine the value of SHMMNI by performing the following:

# cat /proc/sys/kernel/shmmni
4096
Setting SHMALL

Finally, we look at the SHMALL shared memory kernel parameter. This parameter controls the total amount of
shared memory (in pages) that can be used at one time on the system. In short, the value of this parameter
should always be at least:

ceil(SHMMAX/PAGE_SIZE)
The default size of SHMALL is 2097152 and can be queried using the following command:
# cat /proc/sys/kernel/shmall
2097152
The default setting for SHMALL should be adequate for our Oracle RAC 10g installation.
(Note: The page size in Red Hat Linux on the i386 platform is 4,096 bytes. You can, however, use bigpages
which supports the configuration of larger memory page sizes.)
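The ceil() rule above can be checked with simple shell arithmetic. The sketch below uses the 2GB SHMMAX and the 4,096-byte i386 page size from this article:

```shell
SHMMAX=2147483648    # 2GB, as set earlier in this section
PAGE_SIZE=4096       # i386 page size (see note above)

# Integer ceiling of SHMMAX / PAGE_SIZE: (a + b - 1) / b
required_shmall=$(( (SHMMAX + PAGE_SIZE - 1) / PAGE_SIZE ))
default_shmall=2097152

echo "required SHMALL (pages): $required_shmall"
# The default of 2097152 pages is comfortably larger than what is required
[ "$default_shmall" -ge "$required_shmall" ] && echo "default SHMALL is sufficient"
```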

Setting Semaphores

Now that we have configured our shared memory settings, it is time to take care of configuring our
semaphores. The best way to describe a semaphore is as a counter that is used to provide synchronization
between processes (or threads within a process) for shared resources like shared memory. Semaphore sets
are supported in Unix System V where each one is a counting semaphore. When an application requests
semaphores, it does so using "sets."

To determine all semaphore limits, use the following:

# ipcs -ls

------ Semaphore Limits --------


max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
You can also use the following command:
# cat /proc/sys/kernel/sem
250 32000 32 128
Setting SEMMSL

The SEMMSL kernel parameter is used to control the maximum number of semaphores per semaphore set.

Oracle recommends setting SEMMSL to the largest PROCESSES instance parameter setting in the init.ora
file for all databases on the Linux system, plus 10. Also, Oracle recommends setting SEMMSL to a value of
no less than 100.

Setting SEMMNI
The SEMMNI kernel parameter is used to control the maximum number of semaphore sets in the entire Linux
system. Oracle recommends setting the SEMMNI to a value of no less than 100.

Setting SEMMNS

The SEMMNS kernel parameter is used to control the maximum number of semaphores (not semaphore sets)
in the entire Linux system.

Oracle recommends setting the SEMMNS to the sum of the PROCESSES instance parameter setting for each
database on the system, adding the largest PROCESSES twice, and then finally adding 10 for each Oracle
database on the system.

Use the following calculation to determine the maximum number of semaphores that can be allocated on a
Linux system. It will be the lesser of:

SEMMNS -or- (SEMMSL * SEMMNI)
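With the default values shown earlier (SEMMSL=250, SEMMNS=32000, SEMMNI=128), the "lesser of" rule works out as follows:

```shell
# Defaults from /proc/sys/kernel/sem: SEMMSL SEMMNS SEMOPM SEMMNI
SEMMSL=250
SEMMNS=32000
SEMMNI=128

# Maximum allocatable semaphores = min(SEMMNS, SEMMSL * SEMMNI)
product=$(( SEMMSL * SEMMNI ))
if [ "$SEMMNS" -le "$product" ]; then
    max_sems=$SEMMNS
else
    max_sems=$product
fi
echo "maximum allocatable semaphores: $max_sems"
```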


Setting SEMOPM

The SEMOPM kernel parameter is used to control the number of semaphore operations that can be performed
per semop system call.

The semop system call (function) provides the ability to perform operations on multiple semaphores with one
semop call. Because a semaphore set can have at most SEMMSL semaphores per set, it is recommended to
set SEMOPM equal to SEMMSL.

Oracle recommends setting the SEMOPM to a value of no less than 100.

Setting Semaphore Kernel Parameters

Finally, we see how to set all semaphore parameters using several methods. In the following, the only
parameter I care about changing (raising) is SEMOPM. All other default settings should be sufficient for our
example installation.

• You can alter the default setting for all semaphore settings without rebooting the machine by
making the changes directly to the /proc file system. This is the method that I use by
placing the following into the /etc/rc.local startup file:

# echo "250 32000 100 128" > /proc/sys/kernel/sem

• You can also use the sysctl command to change the value of all semaphore settings:

# sysctl -w kernel.sem="250 32000 100 128"

• Lastly, you can make this change permanent by inserting the kernel parameter in the
/etc/sysctl.conf startup file:

# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf

Setting File Handles

When configuring our Red Hat Linux server, it is critical to ensure that the maximum number of file handles is
large enough. The setting for file handles denotes the number of open files that you can have on the Linux
system.

Use the following command to determine the maximum number of file handles for the entire system:

# cat /proc/sys/fs/file-max
32768
Oracle recommends that the file handles for the entire system be set to at least 65536.
• You can alter the default setting for the maximum number of file handles without rebooting
the machine by making the changes directly to the /proc file system. This is the method
that I use by placing the following into the /etc/rc.local startup file:

# echo "65536" > /proc/sys/fs/file-max

• You can also use the sysctl command to change the value of fs.file-max:

# sysctl -w fs.file-max=65536

• Lastly, you can make this change permanent by inserting the kernel parameter in the
/etc/sysctl.conf startup file:

# echo "fs.file-max=65536" >> /etc/sysctl.conf


You can query the current usage of file handles by using the following:
# cat /proc/sys/fs/file-nr
613 95 32768
The file-nr file displays three values: the total number of allocated file handles, the number of allocated
but unused (free) file handles, and the maximum number of file handles that can be allocated.
(Note: If you need to increase the value in /proc/sys/fs/file-max, then make sure that the ulimit is set
properly. Usually for 2.4.20 it is set to unlimited. Verify the ulimit setting by issuing the ulimit command.)
# ulimit
unlimited
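For 2.4 kernels, the number of file handles actually in use can be derived from the file-nr values (allocated minus free). The sketch below parses the sample output shown earlier; on a live system you would read /proc/sys/fs/file-nr directly.

```shell
# Sample of the /proc/sys/fs/file-nr output shown above; on a live node:
#   read alloc unused max < /proc/sys/fs/file-nr
file_nr='613 95 32768'
set -- $file_nr
alloc=$1 unused=$2 max=$3

# In-use handles = allocated - allocated-but-unused
in_use=$(( alloc - unused ))
echo "file handles in use: $in_use (allocated: $alloc, ceiling: $max)"
```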

12. Configure the hangcheck-timer Kernel Module

Perform the following configuration procedures on all nodes in the cluster!

Oracle 9.0.1 and 9.2.0.1 used a userspace watchdog daemon called watchdogd to monitor the health of the
cluster and to restart a RAC node in case of a failure. Starting with Oracle 9.2.0.2, the watchdog daemon was
deprecated by a Linux kernel module named hangcheck-timer that addresses availability and reliability
problems much better. The hangcheck-timer module is loaded into the Linux kernel and checks whether the
system hangs: it sets a timer and checks it after a certain amount of time. If a configurable threshold is
exceeded, the machine is rebooted. Although the hangcheck-timer module is not
required for Oracle CRS, it is highly recommended by Oracle.

The hangcheck-timer.o Module


The hangcheck-timer module uses a kernel-based timer that periodically checks the system task scheduler to
catch delays in order to determine the health of the system. If the system hangs or pauses, the timer resets the
node. The hangcheck-timer module uses the Time Stamp Counter (TSC) CPU register, which is incremented
at each clock signal. The TSC offers much more accurate time measurements because this register is updated
by the hardware automatically.
Much more information about the hangcheck-timer project can be found here.
Installing the hangcheck-timer.o Module
The hangcheck-timer was originally shipped only by Oracle; however, this module is now included with Red
Hat Linux starting with kernel versions 2.4.9-e.12 and higher. If you followed the steps in Section 8 ("Obtain
and Install a Proper Linux Kernel"), then the hangcheck-timer is already included for you. Use the following to
confirm:
# find /lib/modules -name "hangcheck-timer.o"
/lib/modules/2.4.21-15.ELorafw1/kernel/drivers/char/hangcheck-timer.o
/lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/char/hangcheck-timer.o
In the above output, we care about the hangcheck timer object (hangcheck-timer.o) in the
/lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/char directory.
Configuring and Loading the hangcheck-timer Module
There are two key parameters to the hangcheck-timer module:
• hangcheck-tick: This parameter defines the period of time between checks of system health.
The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
• hangcheck-margin: This parameter defines the maximum hang delay that should be tolerated
before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default
value is 180 seconds; Oracle recommends setting it to 180 seconds.

NOTE: The two hangcheck-timer module parameters indicate how long a RAC node must hang before it
will reset the system. A node reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
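In other words, with the values recommended in this article (hangcheck_tick=30 and hangcheck_margin=180), a node would be reset after hanging for more than 210 seconds. A trivial sketch of the arithmetic:

```shell
# Compute the hang threshold implied by the two recommended module
# parameters (the variable values mirror the settings above).
hangcheck_tick=30      # seconds between health checks
hangcheck_margin=180   # tolerated hang delay, in seconds

reset_after=$((hangcheck_tick + hangcheck_margin))
echo "node resets after a hang of more than ${reset_after} seconds"
# prints: node resets after a hang of more than 210 seconds
```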
Configuring Hangcheck Kernel Module Parameters

Each time the hangcheck-timer kernel module is loaded (manually or by Oracle), it needs to know what value
to use for each of the two parameters we just discussed: hangcheck-tick and hangcheck-margin. These
values need to be available after each reboot of the Linux server. To do that, make an entry with the correct
values to the /etc/modules.conf file as follows:

# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" \
>> /etc/modules.conf
Each time the hangcheck-timer kernel module gets loaded, it will use the values defined by the entry I made in
the /etc/modules.conf file.

Manually Loading the Hangcheck Kernel Module for Testing

Oracle is responsible for loading the hangcheck-timer kernel module when required. For that reason, it is not
required to perform a modprobe or insmod of the hangcheck-timer kernel module in any of the startup files
(i.e. /etc/rc.local).

It is only out of pure habit that I continue to include a modprobe of the hangcheck-timer kernel module in the
/etc/rc.local file. Someday I will get over it, but realize that it does not hurt to include a modprobe of the
hangcheck-timer kernel module during startup.

So to keep myself sane and able to sleep at night, I always configure the loading of the hangcheck-timer kernel
module on each startup as follows:

# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local

(Note: You don't have to manually load the hangcheck-timer kernel module using modprobe or insmod
after each reboot. The hangcheck-timer module will be loaded by Oracle automatically when needed.)

Now, to test the hangcheck-timer kernel module to verify it is picking up the correct parameters we defined in
the /etc/modules.conf file, use the modprobe command. Although you could load the hangcheck-
timer kernel module by passing it the appropriate parameters (e.g. insmod hangcheck-timer
hangcheck_tick=30 hangcheck_margin=180), we want to verify that it is picking up the options we
set in the /etc/modules.conf file.

To manually load the hangcheck-timer kernel module and verify it is using the correct values defined in the
/etc/modules.conf file, run the following command:

# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Jan 30 22:11:33 linux1 kernel: Hangcheck: starting hangcheck timer 0.8.0
(tick is 30 seconds, margin is 180 seconds).
Jan 30 22:11:33 linux1 kernel: Hangcheck: Using TSC.
I also like to verify that the correct hangcheck-timer kernel module is being loaded. To confirm, I typically
remove the kernel module (if it was loaded) and then re-load it using the following:
# su -
# rmmod hangcheck-timer
# insmod hangcheck-timer
Using /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/char/hangcheck-
timer.o
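If you would rather script the verification above than eyeball /var/log/messages, the tick and margin values can be pulled out of the kernel's startup line with sed. A small sketch, run here against a copy of the sample log line (the variable names are mine):

```shell
# Pull the tick and margin values out of the hangcheck-timer startup
# message (the line logged to /var/log/messages when the module loads).
line='Jan 30 22:11:33 linux1 kernel: Hangcheck: starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).'

tick=$(echo "$line"   | sed -n 's/.*tick is \([0-9]*\) seconds.*/\1/p')
margin=$(echo "$line" | sed -n 's/.*margin is \([0-9]*\) seconds.*/\1/p')

echo "tick=${tick} margin=${margin}"
# prints: tick=30 margin=180
```

On a live node you would set line from the real log instead, e.g. line=$(grep Hangcheck /var/log/messages | tail -1).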

13. Configure RAC Nodes for Remote Access

Perform the following configuration procedures on all nodes in the cluster!

When running the Oracle Universal Installer on a RAC node, it will use the rsh (or ssh) command to copy the
Oracle software to all other nodes within the RAC cluster. The oracle UNIX account on the node running the
Oracle Installer (runInstaller) must be trusted by all other nodes in your RAC cluster. Therefore you
should be able to run r* commands like rsh, rcp, and rlogin on the Linux server you will be running the
Oracle installer from, against all other Linux servers in the cluster without a password. The rsh daemon
validates users using the /etc/hosts.equiv file or the .rhosts file found in the user's (oracle's) home
directory. (The use of rcp and rsh is not required for normal RAC operation. However, rcp and rsh
should be enabled for RAC and patchset installation.)

Oracle added support in 10g for using the Secure Shell (SSH) tool suite for setting up user equivalence. This
article, however, uses the older method of rcp for copying the Oracle software to the other nodes in the
cluster. When using the SSH tool suite, the scp (as opposed to the rcp) command would be used to copy the
software in a very secure manner.

First, let's make sure that we have the rsh RPMs installed on each node in the RAC cluster:

# rpm -q rsh rsh-server


rsh-0.17-17
rsh-server-0.17-17
From the above, we can see that we have the rsh and rsh-server installed. Were rsh not installed, we
would run the following command from the CD where the RPM is located:
# su -
# rpm -ivh rsh-0.17-17.i386.rpm rsh-server-0.17-17.i386.rpm
To enable the "rsh" service, the "disable" attribute in the /etc/xinetd.d/rsh file must be set to "no" and
xinetd must be reloaded. Do that by running the following commands on all nodes in the cluster:
# su -
# chkconfig rsh on
# chkconfig rlogin on
# service xinetd reload
Reloading configuration: [ OK ]
To allow the "oracle" UNIX user account to be trusted among the RAC nodes, create the /etc/hosts.equiv file
on all nodes in the cluster:
# su -
# touch /etc/hosts.equiv
# chmod 600 /etc/hosts.equiv
# chown root.root /etc/hosts.equiv
Now add all RAC nodes to the /etc/hosts.equiv file similar to the following example for all nodes in the
cluster:
# cat /etc/hosts.equiv
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle
(Note: In the above example, the second field permits only the oracle user account to run rsh commands
on the specified nodes. For security reasons, the /etc/hosts.equiv file should be owned by root and
the permissions should be set to 600. In fact, some systems will only honor the content of this file if the owner
is root and the permissions are set to 600.)
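Those ownership and permission rules are easy to check mechanically with stat. The following sketch (the check_equiv_perms name is mine) prints the owner and mode so you can compare them against the expected "root 600"; it is demonstrated against a scratch file, since touching /etc/hosts.equiv itself requires root:

```shell
# Print "<owner> <mode>" for a file so it can be compared against the
# expected "root 600". (Helper name is illustrative, not from Oracle.)
check_equiv_perms() {
    stat -c '%U %a' "$1"
}

# A scratch file stands in for /etc/hosts.equiv:
tmpfile=$(mktemp)
chmod 600 "$tmpfile"
check_equiv_perms "$tmpfile"    # prints the owner, then "600"
rm -f "$tmpfile"
```

On a RAC node you would run check_equiv_perms /etc/hosts.equiv on each machine.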

Before attempting to test your rsh command, ensure that you are using the correct version of rsh. By default,
Red Hat Linux puts /usr/kerberos/sbin at the head of the $PATH variable. This will cause the
Kerberos version of rsh to be executed.

I will typically rename the Kerberos version of rsh so that the normal rsh command is being used. Use the
following:

# su -

# which rsh
/usr/kerberos/bin/rsh

# cd /usr/kerberos/bin
# mv rsh rsh.original

# which rsh
/usr/bin/rsh
You should now test your connections and run the rsh command from the node that will be performing the
Oracle CRS and 10g RAC installation. We will use the node linux1 to perform the install, so run the
following commands from that node:
# su - oracle

$ rsh linux1 ls -l /etc/hosts.equiv


-rw------- 1 root root 68 Jan 31 00:39 /etc/hosts.equiv

$ rsh int-linux1 ls -l /etc/hosts.equiv


-rw------- 1 root root 68 Jan 31 00:39 /etc/hosts.equiv

$ rsh linux2 ls -l /etc/hosts.equiv


-rw------- 1 root root 68 Jan 31 00:25 /etc/hosts.equiv

$ rsh int-linux2 ls -l /etc/hosts.equiv


-rw------- 1 root root 68 Jan 31 00:25 /etc/hosts.equiv

14. All Startup Commands for Each RAC Node

Verify that the following startup commands are included on all nodes in the cluster!

Up to this point, we have examined in great detail the parameters and resources that need to be configured on
all nodes for the Oracle RAC 10g configuration. In this section we will take a "deep breath" and recap those
parameters, commands, and entries (in previous sections of this document) that you must include in the
startup scripts for each Linux node in the RAC cluster.

For each of the startup files below, the entries configured in previous sections should be included on every node.
/etc/modules.conf

(All parameters and values to be used by kernel modules.)

alias eth0 tulip


alias eth1 b44
alias sound-slot-0 i810_audio
post-install sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -L
>/dev/null 2>&1 || :
pre-remove sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -S >/dev/null
2>&1 || :
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias ieee1394-controller ohci1394
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-install sbp2 insmod ohci1394
post-remove sbp2 rmmod sd_mod
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

/etc/sysctl.conf

(We wanted to adjust the default and maximum send buffer size as well as the default and maximum receive
buffer size for the interconnect.)

# Kernel sysctl configuration file for Red Hat Linux


#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding


net.ipv4.ip_forward = 0

# Controls source route verification


net.ipv4.conf.default.rp_filter = 1

# Controls the System Request debugging functionality of the kernel


kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Default setting in bytes of the socket receive buffer


net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer


net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using


# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using


# the SO_SNDBUF socket option
net.core.wmem_max=262144
/etc/hosts

(All machine/IP entries for nodes in our RAC cluster.)

# Do not remove the following line, or various programs


# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
# Private Interconnect - (eth1)
192.168.2.100 int-linux1
192.168.2.101 int-linux2
# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.200 vip-linux1
192.168.1.201 vip-linux2
192.168.1.106 melody
192.168.1.102 alex
192.168.1.105 bartman

/etc/hosts.equiv

(Allow logins to each node as the oracle user account without the need for a password.)

+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle

/etc/grub.conf

(Determine which kernel to use when the node is booted.)


# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this
file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title White Box Enterprise Linux (2.4.21-27.0.2.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-27.0.2.ELorafw1 ro root=LABEL=/
initrd /initrd-2.4.21-27.0.2.ELorafw1.img
title White Box Enterprise Linux (2.4.21-15.EL)
root (hd0,0)
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/
initrd /initrd-2.4.21-15.EL.img

/etc/rc.local
(These commands are responsible for configuring shared memory, semaphores, and file handles for use by
the Oracle instance.)
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+

echo "2147483648" > /proc/sys/kernel/shmmax


echo "4096" > /proc/sys/kernel/shmmni

# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+

echo "256 32000 100 128" > /proc/sys/kernel/sem

# +---------------------------------------------------------+
# | FILE HANDLES |
# ----------------------------------------------------------+

echo "65536" > /proc/sys/fs/file-max

# +---------------------------------------------------------+
# | HANGCHECK TIMER |
# | (I do not believe this is required, but doesn't hurt) |
# ----------------------------------------------------------+

/sbin/modprobe hangcheck-timer

15. Check RPM Packages for Oracle 10g

Perform the following checks on all nodes in the cluster!

When installing the Linux O/S (RHEL 3 or WBEL), you should verify that all required RPMs are installed. If you
followed the instructions I used for installing Linux, you would have installed Everything, in which case you will
have all of the required RPM packages. However, if you performed another installation type (e.g. Advanced
Server), you may have some packages missing and will need to install them. All of the required RPMs are on
the Linux CDs/ISOs.
Check Required RPMs

The following packages (or higher versions) must be installed:

make-3.79.1
gcc-3.2.3-34
glibc-2.3.2-95.20
glibc-devel-2.3.2-95.20
glibc-headers-2.3.2-95.20
glibc-kernheaders-2.4-8.34
cpp-3.2.3-34
compat-db-4.0.14-5
compat-gcc-7.3-2.96.128
compat-gcc-c++-7.3-2.96.128
compat-libstdc++-7.3-2.96.128
compat-libstdc++-devel-7.3-2.96.128
openmotif-2.2.2-16
setarch-1.3-1

To query package information (gcc and glibc-devel for example), use the "rpm -q <PackageName> [,
<PackageName>]" command as follows:

# rpm -q gcc glibc-devel


gcc-3.2.3-34
glibc-devel-2.3.2-95.20
If you need to install any of the above packages, use "rpm -Uvh <PackageName.rpm>". For example,
to install the gcc-3.2.3-34 package, use:
# rpm -Uvh gcc-3.2.3-34.i386.rpm
Reboot the System

At this point, reboot all nodes in the cluster before attempting to install any of the Oracle components!!!

# init 6

16. Install and Configure OCFS

Most of the configuration procedures in this section should be performed on all nodes in the cluster! Creating
the OCFS filesystem, however, should be executed on only one node in the cluster.

It is now time to install the Oracle Cluster File System (OCFS). OCFS was developed by Oracle to remove the
burden of managing raw devices from DBAs and Sysadmins. It provides the same functionality and feel of a
normal filesystem.

In this guide, we will use OCFS version 1 to store the two files that are required to be shared by CRS.
(These will be the only two files stored on the OCFS.) This release of OCFS does NOT support using the
filesystem for a shared Oracle Home install (the Oracle Database software). This feature will be available in a
future release of OCFS, possibly version 2. Here, we will install the Oracle Database software to a separate
$ORACLE_HOME directory locally on each Oracle Linux server in the cluster.

In version 1, OCFS supports only the following types of files:

• Oracle database files


• Online Redo Log files
• Archived Redo Log files
• Control files
• Server Parameter file (SPFILE)
• Oracle Cluster Registry (OCR) file
• CRS Voting disk.

The Linux binaries used to manipulate files and directories (move, copy, tar, etc.) should not be used on
OCFS. These binaries are part of the standard system commands and come with the OS (i.e. mv, cp, tar,
etc.); they have a major performance impact when used on the OCFS filesystem. You should instead use
Oracle's patched version of these commands. Keep this in mind when using third-party backup tools that also
make use of the standard system commands (i.e. mv, tar, etc.).
See this document for more information on OCFS version 1 (including Installation Notes) for RHEL.

Downloading OCFS

First, download the OCFS files (driver, tools, support) from the Oracle Linux Projects Development Group web
site (http://oss.oracle.com/projects/ocfs/files/RedHat/RHEL3/i386/). This page will contain several releases of
the OCFS files for different versions of the Linux kernel. First, download the key OCFS drivers for either a
single processor or a multiple processor Linux server:

ocfs-2.4.21-EL-1.0.14-1.i686.rpm - (for single processor)

or

ocfs-2.4.21-EL-smp-1.0.14-1.i686.rpm - (for multiple processors)

You will also need to download the following two support files:

ocfs-support-1.0.10-1.i386.rpm - (1.0.10-1 support package)


ocfs-tools-1.0.10-1.i386.rpm - (1.0.10-1 tools package)

If you are unsure which OCFS driver release you need, use the OCFS release that matches your kernel
version. To determine your kernel release:

$ uname -a
Linux linux1 2.4.21-27.0.1.ELorafw1 #1 Tue Dec 28 16:58:59 PST 2004 i686
i686 i386 GNU/Linux
If the string "smp" does not appear after the string "ELorafw1", you are running a single-processor
(uniprocessor) machine. If "smp" does appear, you are running on a multi-processor
machine.
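That test of uname -r is easy to automate when scripting the download or install. A small sketch (the pick_ocfs_rpm helper is mine; the RPM file names are the ones listed above):

```shell
# Choose the OCFS driver RPM that matches the running kernel: the
# SMP build if "smp" appears in the kernel release string, the
# uniprocessor build otherwise.
pick_ocfs_rpm() {
    case "$1" in
        *smp*) echo "ocfs-2.4.21-EL-smp-1.0.14-1.i686.rpm" ;;
        *)     echo "ocfs-2.4.21-EL-1.0.14-1.i686.rpm" ;;
    esac
}

# On a live node: pick_ocfs_rpm "`uname -r`"
pick_ocfs_rpm "2.4.21-27.0.1.ELorafw1"
# prints: ocfs-2.4.21-EL-1.0.14-1.i686.rpm
```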

Installing OCFS

We will be installing the OCFS files onto two single-processor machines. The installation process is simply a
matter of running the following command on all nodes in the cluster as the root user account:

$ su -
# rpm -Uvh ocfs-2.4.21-EL-1.0.14-1.i686.rpm \
ocfs-support-1.0.10-1.i386.rpm \
ocfs-tools-1.0.10-1.i386.rpm
Preparing... ###########################################
[100%]
1:ocfs-support ########################################### [
33%]
2:ocfs-2.4.21-EL ########################################### [
67%]
Linking OCFS module into the module path [ OK ]
3:ocfs-tools ###########################################
[100%]
Configuring and Loading OCFS

The next step is to generate and configure the /etc/ocfs.conf file. The easiest way to accomplish that is
to run the GUI tool ocfstool. We will need to do that on all nodes in the cluster as the root user account:

$ su -
# ocfstool &
This will bring up the GUI as shown below:

Figure 6. ocfstool GUI

Using the ocfstool GUI tool, perform the following steps:

1. Select [Task] - [Generate Config]


2. In the "OCFS Generate Config" dialog, enter the interface and DNS Name for the private
interconnect. In our example, this would be eth1 and int-linux1 for the node linux1 and
eth1 and int-linux2 for the node linux2.
3. After verifying all values are correct on all nodes, exit the application.

The following dialog shows the settings I used for the node linux1:
Figure 7. ocfstool Settings

After exiting the ocfstool, you will have a /etc/ocfs.conf similar to the following:

#
# ocfs config
# Ensure this file exists in /etc
#

node_name = int-linux1
ip_address = 192.168.2.100
ip_port = 7000
comm_voting = 1
guid = 8CA1B5076EAF47BE6AA0000D56FC39EC
Notice the guid value. This is a globally unique identifier (GUID) that must be unique for every node in the
cluster. Also keep in mind that the /etc/ocfs.conf file could have been created manually or by simply running
the ocfs_uid_gen -c command, which will assign (or update) the GUID value in the file.
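If you want to confirm those settings from a script instead of reading the file, a key = value entry can be pulled out with sed. A sketch (the ocfs_conf_get name is mine), demonstrated against a scratch copy of the file shown above:

```shell
# Pull one "key = value" setting out of an /etc/ocfs.conf-style file.
ocfs_conf_get() {
    # $1 = key name, $2 = path to the config file
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2"
}

# Scratch copy of the file shown above:
cat > /tmp/ocfs.conf.sample <<'EOF'
node_name = int-linux1
ip_address = 192.168.2.100
ip_port = 7000
comm_voting = 1
guid = 8CA1B5076EAF47BE6AA0000D56FC39EC
EOF

ocfs_conf_get node_name /tmp/ocfs.conf.sample
# prints: int-linux1
```

On a live node you would point it at the real file, e.g. ocfs_conf_get guid /etc/ocfs.conf.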

The next step is to load the ocfs.o kernel module. Like all steps in this section, run the following command
on all nodes as the root user account:

$ su -
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=int-linux1 ip_address=192.168.2.100 cs=1891
guid=8CA1B5076EAF47BE6AA0000D56FC39EC comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
Warning: kernel-module version mismatch
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel
version 2.4.21-27.EL
while this kernel is version 2.4.21-27.0.2.ELorafw1
Warning: loading /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o will taint the
kernel: forced load
See http://www.tux.org/lkml/#export-tainted for information about
tainted modules
Module ocfs loaded, with warnings

The two warnings (above) can safely be ignored! To verify that the kernel module was loaded, run the
following:

# /sbin/lsmod |grep ocfs


ocfs 299072 0 (unused)
(Note: The ocfs module will stay loaded until the machine is cycled. I will provide instructions for how to load
the module automatically later.)
Many types of errors can occur while attempting to load the ocfs module. I have not run into any of these
problems, so I include them here only for documentation purposes!

One common error looks like this:

# /sbin/load_ocfs
/sbin/insmod ocfs node_name=int-linux1 \
ip_address=192.168.2.100 \
cs=1891 \
guid=8CA1B5076EAF47BE6AA0000D56FC39EC \
comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o: kernel-module version mismatch
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel
version 2.4.21-4.EL
while this kernel is version 2.4.21-15.ELorafw1.
This usually means you have the wrong version of the modutils RPM. Get the latest version of modutils
and use the following command to update your system:
rpm -Uvh modutils-devel-2.4.25-12.EL.i386.rpm
Other problems can occur when using FireWire. If you are still having troubles loading and verifying the loading
of the ocfs module, try the following on all nodes that are having the error as the "root" user account:
$ su -
# mkdir -p /lib/modules/`uname -r`/kernel/drivers/addon/ocfs
# ln -s `rpm -qa | grep ocfs-2 | xargs rpm -ql | grep "/ocfs.o$"` \
/lib/modules/`uname -r`/kernel/drivers/addon/ocfs/ocfs.o

Thanks to Werner Puschitz for coming up with the above solutions!

Creating an OCFS Filesystem

(Note: Unlike the other tasks in this section, creating the OCFS filesystem should be executed only on one
node. We will be executing all commands in this section from linux1 only.)

Finally, we can start making use of those partitions we created in Section 10 ("Create Partitions on the Shared
FireWire Storage Device"). Well, at least the first partition!

To create the file system, we use the Oracle executable /sbin/mkfs.ocfs. For the purpose of this
example, I run the following command only from linux1 as the root user account:

$ su -
# mkfs.ocfs -F -b 128 -L /u02/oradata/orcl -m /u02/oradata/orcl -u '175'
-g '115' -p 0775 /dev/sda1
Cleared volume header sectors
Cleared node config sectors
Cleared publish sectors
Cleared vote sectors
Cleared bitmap sectors
Cleared data block
Wrote volume header
The following should be noted with the above command:

• The -u argument is the User ID for the oracle user. This can be obtained using the command id
-u oracle and should be the same on all nodes.
• The -g argument is the Group ID for the oracle:dba user:group. This can be obtained using the
command id -g oracle and should be the same on all nodes.
• /dev/sda1 is the device name (or partition) to use for this filesystem. We created the /dev/sda1
for storing the Cluster Manager files.

The following is a list of the options available with the mkfs.ocfs command:
usage: mkfs.ocfs -b block-size [-C] [-F]
[-g gid] [-h] -L volume-label
-m mount-path [-n] [-p permissions]
[-q] [-u uid] [-V] device

-b Block size in kilobytes


-C Clear all data blocks
-F Force format existing OCFS volume
-g GID for the root directory
-h Help
-L Volume label
-m Path where this device will be mounted
-n Query only
-p Permissions for the root directory
-q Quiet execution
-u UID for the root directory
-V Print version and exit
One final note about creating OCFS filesystems: You can use the GUI tool ocfstool to perform the same
task as the command-line mkfs.ocfs. From the ocfstool utility, use the menu [Tasks] -
[Format] .

Mounting the OCFS Filesystem

Now that the file system is created, we can mount it. Let's first do it using the command line, then I'll show how
to include it in the /etc/fstab to have it mount on each boot. We will need to mount the filesystem on all
nodes as the root user account.

First, here is how to manually mount the OCFS filesystem from the command line. Remember to do this as
root:

$ su -
# mount -t ocfs /dev/sda1 /u02/oradata/orcl
If the mount was successful, you will simply get your prompt back. We should, however, run the following
checks to ensure the filesystem is mounted correctly with the right permissions. You should run these manual
checks on all nodes.

First, let's use the mount command to ensure that the new filesystem is really mounted. This step should be
performed on all nodes:

# mount
/dev/hda2 on / type ext3 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/hda1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
/dev/sda1 on /u02/oradata/orcl type ocfs (rw)
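The same check can be scripted by filtering the mount listing. A sketch (the is_ocfs_mounted helper is mine), demonstrated against the sample output line above:

```shell
# Report whether an OCFS filesystem is mounted at the given mount
# point, reading "mount"-style output on stdin.
is_ocfs_mounted() {
    if grep -q " on $1 type ocfs "; then
        echo "mounted"
    else
        echo "not mounted"
    fi
}

# On a live node: mount | is_ocfs_mounted /u02/oradata/orcl
echo "/dev/sda1 on /u02/oradata/orcl type ocfs (rw)" | is_ocfs_mounted /u02/oradata/orcl
# prints: mounted
```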

Next, use the ls command to check ownership. The permissions should be set to 0775 with owner oracle
and group dba. If this is not the case for all nodes in the cluster, then it is likely that the oracle UID (175 in
this example) and/or the dba GID (115 in this example) are not consistent across all nodes.
# ls -ld /u02/oradata/orcl
drwxrwxr-x 1 oracle dba 131072 Feb 2 18:02 /u02/oradata/orcl
Configuring OCFS to Mount Automatically at Startup

Let's take a look at what we have done so far. We downloaded and installed the OCFS that will be used to
store the files needed by Cluster Manager files. After going through the install, we loaded the OCFS module
into the kernel and then created the cluster filesystem. Finally, we mounted the newly created filesystem. This
section walks through the steps responsible for loading the OCFS module and ensuring the filesystem(s) are
mounted each time the machine(s) are booted.

We start by adding the following line to the /etc/fstab file on all nodes:

/dev/sda1 /u02/oradata/orcl ocfs _netdev 0 0


(Notice the _netdev option for mounting this filesystem. This option prevents the OCFS from being mounted
until all networking services are enabled.)
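To guard against the _netdev option being dropped from the entry later, the line can be validated with awk. A sketch (the fstab_has_netdev helper is mine), run here against the sample entry rather than the real /etc/fstab:

```shell
# Check an fstab-style listing (on stdin) for a device mounted as
# ocfs with the _netdev option present.
fstab_has_netdev() {
    awk -v dev="$1" '
        $1 == dev && $3 == "ocfs" && $4 ~ /_netdev/ { found = 1 }
        END { print (found ? "ok" : "missing") }'
}

# On a live node: fstab_has_netdev /dev/sda1 < /etc/fstab
echo "/dev/sda1 /u02/oradata/orcl ocfs _netdev 0 0" | fstab_has_netdev /dev/sda1
# prints: ok
```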

Now, let's make sure that the ocfs.o kernel module is being loaded and that the filesystem will be mounted
during the boot process.

If you have been following along with the examples in this guide, the actions to load the kernel module and
mount the OCFS filesystem should already be enabled. However, we should still check those options by
running the following on all nodes as root:

$ su -
# chkconfig --list ocfs
ocfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off
The flags for runlevels 2, 3, 4, and 5 should all be set to on. If for some reason any of them are set to off, you
can use the following command to enable them:
$ su -
# chkconfig ocfs on
(Note that loading the ocfs.o kernel module will also mount the OCFS filesystem(s) configured in
/etc/fstab!)
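The chkconfig output above can also be verified from a script. The following sketch (the ocfs_runlevels_ok name is mine) checks that runlevels 2 through 5 are all on:

```shell
# Inspect one "chkconfig --list" line (on stdin) and confirm the
# service is on in runlevels 2 through 5.
ocfs_runlevels_ok() {
    awk '{
        ok = 1
        for (i = 2; i <= NF; i++) {
            split($i, a, ":")
            if (a[1] >= 2 && a[1] <= 5 && a[2] != "on") ok = 0
        }
        print (ok ? "ok" : "fix with: chkconfig ocfs on")
    }'
}

# On a live node: chkconfig --list ocfs | ocfs_runlevels_ok
echo "ocfs  0:off 1:off 2:on 3:on 4:on 5:on 6:off" | ocfs_runlevels_ok
# prints: ok
```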

Before starting the next section, this would be a good place to reboot all the nodes in the cluster. When the
machines come up, ensure that the ocfs.o kernel module is being loaded and that the filesystem we created
is being mounted.

17. Install and Configure Automatic Storage Management and Disks

Most of the installation and configuration procedures should be performed on all nodes. Creating the ASM
disks, however, will only need to be performed on a single node within the cluster.

In this section, we will configure Automatic Storage Management (ASM) to be used as the filesystem/volume
manager for all Oracle physical database files (data, online redo logs, control files, archived redo logs).

ASM was introduced in Oracle Database 10g and relieves the DBA from having to manage individual files and
drives. ASM is built into the Oracle kernel and provides the DBA with a way to manage thousands of disk
drives 24x7 for single as well as clustered instances. All the files and directories to be used for Oracle will be
contained in a disk group. ASM automatically performs load balancing in parallel across all available disk
drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns.
First we'll discuss the ASMLib libraries and the associated driver for Linux, plus other methods for configuring
ASM with Linux. Next, I will provide instructions for downloading the ASM drivers (ASMLib Release 1.0)
specific to our Linux kernel. Finally, we will install and configure the ASM drivers while finishing off the section
with a demonstration of how we created the ASM disks.

If you would like to learn more about the ASMLib, visit


www.oracle.com/technology/tech/linux/asmlib/install.html.

Methods for Configuring ASM with Linux (For Reference Only)

When I first started this guide, I wanted to focus on using ASM for all database files. I was curious to see how
well ASM works with this test RAC configuration with regard to load balancing and fault tolerance.

There are two different methods to configure ASM on Linux:

• ASM with ASMLib I/O: This method creates all Oracle database files on raw block devices managed
by ASM using ASMLib calls. Raw devices are not required with this method as ASMLib works with
block devices.
• ASM with Standard Linux I/O: This method creates all Oracle database files on raw character
devices managed by ASM using standard Linux I/O system calls. You will be required to create raw
devices for all disk partitions used by ASM.

We will examine the "ASM with ASMLib I/O" method here.

Before discussing the installation and configuration details of ASMLib, however, I thought it would be
interesting to talk briefly about the second method, "ASM with Standard Linux I/O". If you were to use this
method (which is a perfectly valid solution, just not the method we will be implementing), you should be aware
that Linux does not use raw devices by default. Every Linux raw device you want to use must be bound to the
corresponding block device using the raw driver. For example, if you wanted to use the partitions we've
created, (/dev/sda2, /dev/sda3, and /dev/sda4), you would need to perform the following tasks:

1. Edit the file /etc/sysconfig/rawdevices as follows:

   # raw device bindings
   # format: <rawdev> <major> <minor>
   #         <rawdev> <blockdev>
   # example: /dev/raw/raw1 /dev/sda1
   #          /dev/raw/raw2 8 5
   /dev/raw/raw2 /dev/sda2
   /dev/raw/raw3 /dev/sda3
   /dev/raw/raw4 /dev/sda4

   The raw device bindings will be created on each reboot.

2. You would then want to change ownership of all raw devices to the "oracle" user account:

   # chown oracle:dba /dev/raw/raw2; chmod 660 /dev/raw/raw2
   # chown oracle:dba /dev/raw/raw3; chmod 660 /dev/raw/raw3
   # chown oracle:dba /dev/raw/raw4; chmod 660 /dev/raw/raw4

3. The last step is to reboot the server to bind the devices or simply restart the rawdevices service:

   # service rawdevices restart

As I mentioned earlier, the above example was just to demonstrate that there is more than one method for
using ASM with Linux. Now let's move on to the method that will be used for this article, "ASM with ASMLib
I/O."

Downloading the ASMLib Packages


As with OCFS, we need to download the version for the Linux kernel and number of processors on the
machine. We are using kernel 2.4.21 and the machines I am using are both single-processor machines:

# uname -a
Linux linux1 2.4.21-27.0.2.ELorafw1 #1 Tue Dec 28 16:58:59 PST 2004 i686
i686 i386 GNU/Linux
Oracle ASMLib Downloads

• oracleasm-2.4.21-EL-1.0.3-1.i686.rpm - (Driver for "up" kernels)


• oracleasmlib-1.0.0-1.i386.rpm - (Userspace library)
• oracleasm-support-1.0.3-1.i386.rpm - (Driver support files)

Installing ASMLib Packages

This installation needs to be performed on all nodes as the root user account:

$ su -
# rpm -Uvh oracleasm-2.4.21-EL-1.0.3-1.i686.rpm \
oracleasmlib-1.0.0-1.i386.rpm \
oracleasm-support-1.0.3-1.i386.rpm
Preparing...                ########################################### [100%]
   1:oracleasm-support      ########################################### [ 33%]
   2:oracleasm-2.4.21-EL    ########################################### [ 67%]
Linking module oracleasm.o into the module path                         [ OK ]
   3:oracleasmlib           ########################################### [100%]
Configuring and Loading the ASMLib Packages

Now that we have downloaded and installed the ASMLib packages for Linux, we need to configure and load the
ASM kernel module. This task needs to be run on all nodes as root:

$ su -
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library driver. The following questions will
determine whether the driver is loaded on boot and what permissions it will have. The current values will be
shown in brackets ('[]'). Hitting <ENTER> without typing an answer will keep that current value. Ctrl-C will
abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration [ OK ]
Creating /dev/oracleasm mount point [ OK ]
Loading module "oracleasm" [ OK ]
Mounting ASMlib driver filesystem [ OK ]
Scanning system for ASM disks [ OK ]

Creating ASM Disks for Oracle

In Section 8, we created three Linux partitions to be used for storing Oracle database files such as online redo
logs, database files, control files, SPFILEs, and archived redo log files.
Here is a list of the partitions we created:

Oracle ASM Partitions Created

Filesystem Type   Partition    Size    Mount Point   File Types
ASM               /dev/sda2    50GB    ORCL:VOL1     Oracle Database Files
ASM               /dev/sda3    50GB    ORCL:VOL2     Oracle Database Files
ASM               /dev/sda4    50GB    ORCL:VOL3     Oracle Database Files
Total                          150GB

The last task in this section is to create the ASM disks. Creating the ASM disks only needs to be done on one
node as the root user account. I will be running these commands on linux1. On the other nodes, you will
need to perform a scandisk to recognize the new volumes. When that is complete, you should then run the
oracleasm listdisks command on all nodes to verify that all ASM disks were created and are available.

$ su -
# /etc/init.d/oracleasm createdisk VOL1 /dev/sda2
Marking disk "/dev/sda2" as an ASM disk [ OK ]

# /etc/init.d/oracleasm createdisk VOL2 /dev/sda3


Marking disk "/dev/sda3" as an ASM disk [ OK ]

# /etc/init.d/oracleasm createdisk VOL3 /dev/sda4


Marking disk "/dev/sda4" as an ASM disk [ OK ]
Note: If you are repeating this guide using the same hardware (actually, the same shared drive), you may get a
failure when attempting to create the ASM disks. If you do receive a failure, try listing all ASM disks using:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
As you can see, the results show that I have three volumes already defined. If you have the three volumes
already defined from a previous run, go ahead and remove them using the following commands and then
create them again using the above (oracleasm createdisk) commands:
# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL2
Removing ASM disk "VOL2" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL3
Removing ASM disk "VOL3" [ OK ]
On all other nodes in the cluster, you must perform a scandisk to recognize the new volumes:
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]

We can now test that the ASM disks were successfully created by using the following command on all nodes
as the root user account:

# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
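The createdisk/scandisks/listdisks sequence above can be sketched as a loop. This is a dry-run illustration: with DRYRUN=1 it only prints the commands, since the real ones require root and the shared FireWire disk attached.

```shell
# Hedged sketch: the three createdisk calls above as a loop.
# DRYRUN=1 prints each command instead of executing it.
DRYRUN=1
run() { if [ "$DRYRUN" = 1 ]; then echo "$@"; else "$@"; fi; }

commands=$(
  i=0
  for part in /dev/sda2 /dev/sda3 /dev/sda4; do
    i=$((i + 1))
    run /etc/init.d/oracleasm createdisk "VOL$i" "$part"
  done
)
printf '%s\n' "$commands"

# Then on every other node in the cluster:
run /etc/init.d/oracleasm scandisks
# And on all nodes, verify:
run /etc/init.d/oracleasm listdisks
```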

18. Download Oracle RAC 10g Software


The following download procedures only need to be performed on one node in the cluster!

The next logical step is to install Oracle CRS (10.1.0.3.0) and the Oracle Database 10g (10.1.0.3.0) software.
However, we must first download and extract the required Oracle software packages from OTN.

We will be downloading and extracting the required software from Oracle to only one of the Linux nodes in the
cluster—namely, linux1. We will perform all installs from this machine. The Oracle installer will copy the
required software packages to all other nodes in the RAC configuration we set up in Section 13.

Login to one of the nodes in the Linux RAC cluster as the oracle user account. In this example, we will be
downloading the required Oracle software to linux1 and saving them to /u01/app/oracle/orainstall/crs and
/u01/app/oracle/orainstall/db.

Downloading and Extracting the Software

First, download the Oracle Database 10g (10.1.0.3 or later) and Oracle CRS (10.1.0.3 or later) software for
Linux x86. Both downloads are available from the same page.

As the oracle user account, extract the two packages you downloaded to a temporary directory. In this
example, we will use /u01/app/oracle/orainstall/crs and /u01/app/oracle/orainstall/db.

Extract the CRS package as follows:

# su - oracle
$ cd ~oracle/orainstall/crs
$ gunzip ship.crs.lnx32.cpio.gz
$ cpio -idmv < ship.crs.lnx32.cpio
Then extract the Oracle Database Software:
$ cd ~oracle/orainstall/db
$ gunzip ship.db.lnx32.cpio.gz
$ cpio -idmv < ship.db.lnx32.cpio
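The gunzip/cpio extraction pattern above can be wrapped in a small reusable function. The sketch below demonstrates it on a tiny throwaway archive that stands in for the real ship.*.cpio.gz files:

```shell
# extract_ship: the gunzip + cpio -idm pattern used above, as a function.
extract_ship() {
  archive=$1 dest=$2
  mkdir -p "$dest"
  gunzip -c "$archive" | (cd "$dest" && cpio -idm 2>/dev/null)
}

# Demo with a tiny throwaway archive (stand-in for ship.db.lnx32.cpio.gz).
workdir=$(mktemp -d)
echo "payload" > "$workdir/file.txt"
(cd "$workdir" && echo file.txt | cpio -o 2>/dev/null | gzip > ship.demo.cpio.gz)

extract_ship "$workdir/ship.demo.cpio.gz" "$workdir/out"
cat "$workdir/out/file.txt"
```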

19. Install Oracle Cluster Ready Services Software

Perform the following installation procedures on only one node in the cluster! The Oracle CRS software will be
installed to all other nodes in the cluster by the Oracle Universal Installer.

We are ready to install the "cluster" part of the environment: the CRS software. In the last section, we
downloaded and extracted the install files for CRS to linux1 in the directory
/u01/app/oracle/orainstall/crs/Disk1. This is the only node we need to perform the install from.

During the installation of CRS, you will be asked to identify the nodes to configure in the RAC cluster.
When the actual installation starts, it will copy the required software to all nodes using the remote access we
configured in Section 13.

So, what exactly is the Oracle CRS responsible for?

The Oracle CRS contains all the cluster and database configuration metadata along with several system
management features for RAC. It allows the DBA to register and invite an Oracle instance (or instances) to the
cluster. During normal operation, CRS will send messages (via a special ping operation) to all nodes
configured in the cluster—often called the "heartbeat." If the heartbeat fails for any of the nodes, it checks with
the CRS configuration files (on the shared disk) to distinguish between a real node failure and a network
failure.

After installing CRS, the Oracle Universal Installer (OUI) used to install the Oracle Database 10g software
(next section) will automatically recognize these nodes. Like the CRS install we will be performing in this
section, the Oracle Database 10g software install only needs to be run from one node. The OUI will copy the
software packages to all nodes configured in the RAC cluster.

The excellent Metalink Note "CRS and 10g Real Application Clusters - (Note: 259301.1)" provides some key
facts about CRS and Oracle RAC 10g to consider before installing both software components:

• CRS must be installed and running prior to installing RAC 10g.


• CRS can either run on top of the vendor clusterware (such as Sun Cluster, HP Serviceguard, IBM
HACMP, TruCluster, Veritas Cluster, Fujitsu Primecluster, etc.) or without it. (Vendor clusterware was
required in Oracle9i RAC.)
• The CRS HOME and ORACLE_HOME must be installed in different locations.
• Shared Location(s) or devices for the Voting File and Oracle Configuration Repository (OCR) file must
be available prior to installing CRS. The voting file should be at least 20MB and the OCR file should
be at least 100MB.
• CRS and RAC require that the following network interfaces be configured prior to installing CRS or
RAC:
o Public Interface
o Private Interface
o Virtual (Public) Interface
o For more information on this, see Note 264847.1
• The root.sh script at the end of the CRS installation starts the CRS stack. If your CRS stack does not
start, see Note: 240001.1.
• Only one set of CRS daemons can be running per RAC node.
• On Unix, the CRS stack is run from entries in /etc/inittab with "respawn".
• If there is a network split (nodes lose communication with each other), one or more nodes may
reboot automatically to prevent data corruption.
• The supported method to start CRS is booting the machine.
• The supported method to stop CRS is shutting down the machine or using init.crs stop.
• Killing CRS daemons is not supported unless you are removing the CRS installation via Note:
239998.1 because flag files can become mismatched.
• For maintenance, go to single user mode at the OS.
• When the stack is started, you should be able to see all of the daemon processes with a ps -ef
command:

$ ps -ef | grep crs
root    4661    1  0 14:18 ?  00:00:00 /bin/su -l oracle -c exec /u01/app/oracle/product/10.1.0/crs_1/bin/evmd
root    4664    1  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/crsd.bin
root    4862 4663  0 14:18 ?  00:00:00 /bin/su -l oracle -c /u01/app/oracle/product/10.1.0/crs_1/bin/ocssd || exit 137
oracle  4864 4862  0 14:18 ?  00:00:00 -bash -c /u01/app/oracle/product/10.1.0/crs_1/bin/ocssd || exit 137
oracle  4898 4864  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/ocssd.bin
oracle  4901 4661  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
root    4908 4664  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/crsd.bin -1
oracle  4947 4901  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
oracle  4949 4898  0 14:18 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/ocssd.bin
...
oracle  4958 4949  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/ocssd.bin
oracle  4959 4947  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
...
oracle  4983 4947  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
oracle  4984 4983  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmlogger.bin \
                                         -o /u01/app/oracle/product/10.1.0/crs_1/evm/log/evmlogger.info \
                                         -l /u01/app/oracle/product/10.1.0/crs_1/evm/log/evmlogger.log
oracle  4985 4947  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
...
oracle  4990 4947  0 14:19 ?  00:00:00 /u01/app/oracle/product/10.1.0/crs_1/bin/evmd.bin
CRS Shared Files

The two shared files used by CRS will be stored on the OCFS we created earlier. The two shared CRS files
are:

• Oracle Cluster Registry (OCR)
  o Location: /u02/oradata/orcl/OCRFile
  o Size: ~100MB
• CRS Voting Disk
  o Location: /u02/oradata/orcl/CSSFile
  o Size: ~20MB

Note: For our installation here, it is not possible to use ASM for the two CRS files, OCR or CRS Voting Disk.
These files need to be in place and accessible before any Oracle instances can be started. For ASM to be
available, the ASM instance would need to be run first. However, the two shared files could be stored on the
OCFS, shared raw devices, or another vendor's clustered filesystem.

Verifying Environment Variables

Before starting the OUI, you should first run the xhost command as root from the console to allow X Server
connections. Then unset the Oracle-related environment variables and verify that each of the nodes in the RAC
cluster defines a unique ORACLE_SID. Also verify that you are logged in as the oracle user account:

Login as oracle

# xhost +
access control disabled, clients can connect from any host

# su - oracle
Unset the Oracle environment variables
$ unset ORA_CRS_HOME
$ unset ORACLE_HOME
$ unset ORA_NLS33
$ unset TNS_ADMIN
Verify Environment Variables on linux1
$ env | grep ORA
ORACLE_SID=orcl1
ORACLE_BASE=/u01/app/oracle
ORACLE_TERM=xterm
Verify Environment Variables on linux2
$ env | grep ORA
ORACLE_SID=orcl2
ORACLE_BASE=/u01/app/oracle
ORACLE_TERM=xterm
Installing Cluster Ready Services

Perform the following tasks to install the Oracle CRS:

$ cd ~oracle
$ ./orainstall/crs/Disk1/runInstaller -ignoreSysPrereqs
Welcome Screen
    Click Next.

Specify Inventory directory and credentials
    Accept the default values:
       Inventory directory: /u01/app/oracle/oraInventory
       Operating System group name: dba

Root Script Window - Run orainstRoot.sh
    Open a new console window on the node you are performing the install from as the "root" user account.
    Navigate to the /u01/app/oracle/oraInventory directory and run orainstRoot.sh.
    Go back to the OUI and acknowledge the dialog window.

Specify File Locations
    Leave the default value for the Source directory. Set the destination for the ORACLE_HOME name and
    location as follows:
       Name: OraCrs10g_home1
       Location: /u01/app/oracle/product/10.1.0/crs_1

Language Selection
    I accepted the default value:
       Selected Languages: English

Cluster Configuration
    Cluster Name: crs
    Public Node Name: linux1    Private Node Name: int-linux1
    Public Node Name: linux2    Private Node Name: int-linux2

Specify Network Interface Usage
    Interface Name: eth0    Subnet: 192.168.1.0    Interface Type: Public
    Interface Name: eth1    Subnet: 192.168.2.0    Interface Type: Private

Oracle Cluster Registry
    Specify OCR Location: /u02/oradata/orcl/OCRFile

Voting Disk
    Enter voting disk file name: /u02/oradata/orcl/CSSFile

Root Script Window - Run orainstRoot.sh
    Open a new console window on each node in the RAC cluster as the "root" user account.
    Navigate to the /u01/app/oracle/oraInventory directory and run orainstRoot.sh on all nodes in the
    RAC cluster.
    Go back to the OUI and acknowledge the dialog window.

Summary
    For some reason, the OUI fails to create a "$ORACLE_HOME/log" directory before starting the
    installation. You should manually create this directory before clicking the "Install" button.
    For this installation, manually create the directory /u01/app/oracle/product/10.1.0/crs_1/log on all
    nodes in the cluster. The OUI will log errors to a file in this directory only if it exists.
    Click Install to start the installation!


Root Script Window - Run root.sh
    After the installation has completed, you will be prompted to run the root.sh script.
    Open a new console window on each node in the RAC cluster as the "root" user account.
    Navigate to the /u01/app/oracle/product/10.1.0/crs_1 directory and run root.sh on all nodes in the
    RAC cluster, one at a time.
    You will receive several warnings while running the root.sh script on all nodes. These warnings can
    be safely ignored.
    The root.sh script may take awhile to run. When running root.sh on the last node, the output should
    look like:

    ...
    CSS is active on these nodes.
    linux1
    linux2
    CSS is active on all nodes.
    Oracle CRS stack installed and running under init(1M)

    Go back to the OUI and acknowledge the dialog window.

End of installation
    At the end of the installation, exit from the OUI.

Verifying CRS Installation

After installing CRS, we can run through several tests to verify the install was successful. Run the following
commands on all nodes in the RAC cluster.

Check cluster nodes

$ /u01/app/oracle/product/10.1.0/crs_1/bin/olsnodes -n
linux1 1
linux2 2
Check CRS Auto-Start Scripts
$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1207 Feb 5 19:41 /etc/init.d/init.crs*
-r-xr-xr-x 1 root root 5492 Feb 5 19:41 /etc/init.d/init.crsd*
-r-xr-xr-x 1 root root 18617 Feb 5 19:41 /etc/init.d/init.cssd*
-r-xr-xr-x 1 root root 4553 Feb 5 19:41 /etc/init.d/init.evmd*
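These checks can also be scripted. Below is a minimal sketch that uses the olsnodes output shown above as inlined sample data; on a live cluster you would capture the output of olsnodes -n instead.

```shell
# Hedged sketch: a scripted check over olsnodes -n output. Sample data
# is inlined here; on a real cluster you would run:
#   olsnodes_out=$(/u01/app/oracle/product/10.1.0/crs_1/bin/olsnodes -n)
olsnodes_out="linux1 1
linux2 2"

node_count=$(printf '%s\n' "$olsnodes_out" | wc -l)
echo "CRS sees $node_count node(s):"
printf '%s\n' "$olsnodes_out" | awk '{print " - " $1 " (node number " $2 ")"}'
```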

20. Install Oracle Database 10g Software

Perform the following installation procedures on only one node in the cluster! The Oracle database software
will be installed to all other nodes in the cluster by the Oracle Universal Installer.

After successfully installing the Oracle CRS software, the next step is to install Oracle Database 10g
(10.1.0.3 or later) with RAC. Note: At the time of this writing, the OUI for Oracle 10g was unable to discover
disks/volumes that were marked as Linux ASMLib. As a result, we will forgo the "Create Database" option
when installing the Oracle 10g software and will instead create the database using the Database Configuration
Assistant (DBCA) after the Oracle Database 10g install.

Verifying Environment Variables

Before starting the OUI, you should first run the xhost command as root from the console to allow X Server
connections. Then unset the ORACLE_HOME variable and verify that each of the nodes in the RAC cluster
defines a unique ORACLE_SID. We also should verify that we are logged in as the oracle user account:

Login as oracle

# xhost +
access control disabled, clients can connect from any host

# su - oracle

Unset the Oracle environment variables

$ unset ORA_CRS_HOME
$ unset ORACLE_HOME
$ unset ORA_NLS33
$ unset TNS_ADMIN
Verify Environment Variables on linux1
$ env | grep ORA
ORACLE_SID=orcl1
ORACLE_BASE=/u01/app/oracle
ORACLE_TERM=xterm
Verify Environment Variables on linux2
$ env | grep ORA
ORACLE_SID=orcl2
ORACLE_BASE=/u01/app/oracle
ORACLE_TERM=xterm
Installing Oracle Database 10g Software

Install the Oracle Database 10g software with the following:

$ cd ~oracle
$ /u01/app/oracle/orainstall/db/Disk1/runInstaller -ignoreSysPrereqs
Welcome Screen
    Click Next.

Specify File Locations
    Ensure that the "Source Path:" is pointing to the products.xml file for the product installation files
    (.../db/Disk1/stage/products.xml).
    Set the destination for the ORACLE_HOME name and location as follows:
       Name: OraDb10g_home1
       Location: /u01/app/oracle/product/10.1.0/db_1

Specify Hardware Cluster Installation Mode
    Select the Cluster Installation option, then select all nodes available. Click Select All to select all
    servers: linux1 and linux2.
    If the installation stops here and the status of any of the RAC nodes is "Node not reachable",
    perform the following checks:
       • Ensure CRS is running on the node in question.
       • Ensure you are able to reach the node in question from the node you are performing
         the installation from.

Select Installation Type
    I selected the Enterprise Edition option.

Select Database Configuration
    Select the option to "Do not create a starter database".
    Remember that the OUI for 10g R1 (10.1) cannot discover disks that were marked as Linux
    ASMLib. We will create the clustered database as a separate step using dbca.

Summary
    For some reason, the OUI fails to create a "$ORACLE_HOME/log" directory before starting the
    installation. You should manually create this directory before clicking the "Install" button.
    For this installation, manually create the directory /u01/app/oracle/product/10.1.0/db_1/log on the
    node you are performing the installation from. The OUI will log errors to a file in this directory
    only if it exists.
    Click Install to start the installation!

Root Script Window - Run root.sh
    When the installation is complete, you will be prompted to run the root.sh script. It is important to
    keep in mind that the root.sh script will need to be run on all nodes in the RAC cluster, one at a
    time, starting with the node you are running the database installation from.

    First, open a new console window on the node from which you are installing the Oracle 10g
    database software. For me, this was linux1. Before running the root.sh script on the first Linux
    server, ensure that the console window you are using can run a GUI utility. (Set your $DISPLAY
    environment variable before running the root.sh script!)

    Navigate to the /u01/app/oracle/product/10.1.0/db_1 directory and run root.sh.

    At the end of the root.sh script, it will bring up the GUI installer named VIP Configuration
    Assistant (VIPCA). The VIPCA will only come up on the first node you run root.sh from. You
    still, however, need to continue running the root.sh script on all nodes in the cluster, one at a
    time.

    When the VIPCA appears, answer the screen prompts as follows:

       Welcome: Click Next
       Network interfaces: Select both interfaces - eth0 and eth1
       Virtual IPs for cluster nodes:
          Node Name: linux1
          IP Alias Name: vip-linux1
          IP Address: 192.168.1.200
          Subnet Mask: 255.255.255.0

          Node Name: linux2
          IP Alias Name: vip-linux2
          IP Address: 192.168.1.201
          Subnet Mask: 255.255.255.0
       Summary: Click Finish
       Configuration Assistant Progress Dialog: Click OK after configuration is complete.
       Configuration Results: Click Exit

    When running the root.sh script on the remaining nodes, the end of the script will display "CRS
    resources are already configured".

    Go back to the OUI and acknowledge the dialog window.

End of installation
    At the end of the installation, exit from the OUI.

21. Create the TNS Listener Process

Perform the following configuration procedures on only one node in the cluster! The Network Configuration
Assistant will set up the TNS listener in a clustered configuration on all nodes in the cluster.

The DBCA requires the Oracle TNS Listener process to be configured and running on all nodes in the RAC
cluster before it can create the clustered database.

The process of creating the TNS listener only needs to be performed on one node in the cluster. All changes
will be made and replicated to all nodes in the cluster. On one of the nodes (I will be using linux1) bring up
the Network Configuration Assistant (NETCA) and run through the process of creating a new TNS listener
process and also configure the node for local access.

Before running the NETCA, make sure to re-login as the oracle user and verify that the $ORACLE_HOME
environment variable is set to the proper location. If you attempt to use the console window used in the previous
section, remember that we unset the $ORACLE_HOME environment variable there; this will result in a failure
when attempting to run netca.
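A small guard function can catch this mistake before netca is launched. This is an illustrative sketch only; both the failure case and the success case are simulated in subshells, with an assumed ORACLE_HOME path taken from this guide's layout.

```shell
# Hedged sketch: fail fast if ORACLE_HOME is unset before running netca.
check_oracle_home() {
  if [ -n "$ORACLE_HOME" ]; then
    echo "OK: ORACLE_HOME=$ORACLE_HOME"
  else
    echo "ERROR: ORACLE_HOME is not set -- re-login as oracle first"
    return 1
  fi
}

( unset ORACLE_HOME; check_oracle_home )              # the failure case
( export ORACLE_HOME=/u01/app/oracle/product/10.1.0/db_1
  check_oracle_home )                                 # the good case
```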

To start the NETCA, run the following GUI utility as the oracle user account:

$ netca &
The following screenshots walk you through the process of creating a new Oracle listener for our RAC
environment.
Select the Type of Oracle Net Services Configuration
    Select Cluster Configuration.

Select the nodes to configure
    Select all of the nodes: linux1 and linux2.

Type of Configuration
    Select Listener configuration.

Listener Configuration - Next 6 Screens
    The following screens are now like any other normal listener configuration. You can
    simply accept the default parameters for the next six screens:
       What do you want to do: Add
       Listener name: LISTENER
       Selected protocols: TCP
       Port number: 1521
       Configure another listener: No
       Listener configuration complete! [ Next ]
    You will be returned to the Welcome (Type of Configuration) screen.

Type of Configuration
    Select Naming Methods configuration.

Naming Methods Configuration
    The following screens are:
       Selected Naming Methods: Local Naming
       Naming Methods configuration complete! [ Next ]
    You will be returned to the Welcome (Type of Configuration) screen.

Type of Configuration
    Click Finish to exit the NETCA.

The Oracle TNS listener process should now be running on all nodes in the RAC cluster:

$ hostname
linux1

$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX1

=====================

$ hostname
linux2

$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX2

22. Create the Oracle Cluster Database

The database creation process should only be performed from one node in the cluster!
We will use the DBCA to create the clustered database.

Before executing the DBCA, make sure that $ORACLE_HOME and $PATH are set appropriately for the
$ORACLE_BASE/product/10.1.0/db_1 environment.

You should also verify that all services we have installed up to this point (Oracle TNS listener, CRS processes,
etc.) are running before attempting to start the clustered database creation process.

Creating the Clustered Database

To start the database creation process, run the following:

# xhost +
access control disabled, clients can connect from any host

# su - oracle
$ dbca &
Welcome Screen
    Select Oracle Real Application Clusters database.

Operations
    Select Create a Database.

Node Selection
    Click the Select All button to select all servers: linux1 and linux2.

Database Templates
    Select Custom Database.

Database Identification
    Select:
       Global Database Name: orcl.idevelopment.info
       SID Prefix: orcl
    I used idevelopment.info for the database domain. You may use any domain. Keep in mind that this
    domain does not have to be a valid DNS domain.

Management Option
    Leave the default options here, which is to "Configure the Database with Enterprise Manager."

Database Credentials
    I selected to Use the Same Password for All Accounts. Enter the password (twice) and make sure
    the password does not start with a digit.

Storage Options
    For this guide, we will select to use ASM.

Create ASM Instance
    Other than supplying the SYS password I wanted to use for this instance, all other options I used
    were the defaults. This includes the defaults for all ASM parameters and the use of the default
    parameter file (IFILE): {ORACLE_BASE}/admin/+ASM/pfile/init.ora.
    You will then be prompted with a dialog box asking if you want to create and start the ASM
    instance. Select the OK button to acknowledge this dialog.
    The DBCA will now create and start the ASM instance on all nodes in the RAC cluster.

ASM Disk Groups
    To start, click the Create New button. This will bring up the "Create Disk Group" window with the
    three volumes we configured earlier using ASMLib.
    If the volumes we created earlier in this article (ORCL:VOL1, ORCL:VOL2, and ORCL:VOL3) do not
    show up in the "Select Member Disks" window, then click on the "Change Disk Discovery Path"
    button and input "ORCL:VOL*".
    For the "Disk Group Name", I used the string ORCL_DATA1.
    Select all of the ASM volumes in the "Select Member Disks" window. All three volumes should have
    a status of "PROVISIONED".
    After verifying all values in this window are correct, click the OK button. This will present the "ASM
    Disk Group Creation" dialog.
    When the ASM Disk Group Creation process is finished, you will be returned to the "ASM Disk
    Groups" window. Select the checkbox next to the newly created Disk Group Name ORCL_DATA1
    and click [Next] to continue.

Database File Locations
    I selected to use the default, which is to use Oracle Managed Files:
       Database Area: +ORCL_DATA1

Recovery Configuration
    Using recovery options like Flash Recovery Area is out of scope for this article. I did not select any
    recovery options.

Database Content
    I left all of the Database Components (and destination tablespaces) set to their default values.

Database Services
    For this test configuration, click Add, and enter orcltest as the "Service Name." Leave both
    instances set to Preferred and for the "TAF Policy" select Basic.

Initialization Parameters
    Change any parameters for your environment. I left them all at their default settings.

Database Storage
    Change any parameters for your environment. I left them all at their default settings.

Creation Options
    Keep the default option Create Database selected and click Finish to start the database creation
    process.
    Click OK on the "Summary" screen.
    You may receive an error message during the install.

End of Database Creation
    At the end of the database creation, exit from the DBCA.
    When exiting the DBCA, another dialog will come up indicating that it is starting all Oracle instances
    and the HA service "orcltest". This may take several minutes to complete. When finished, all windows
    and dialog boxes will disappear.

When the DBCA has completed, you will have a fully functional Oracle RAC cluster running!

Creating the orcltest Service

During the creation of the Oracle clustered database, we added a service named orcltest that will be used
to connect to the database with TAF enabled. During several of my installs, the service was added to the
tnsnames.ora, but was never updated as a service for each Oracle instance.

Use the following to verify the orcltest service was successfully added:

SQL> show parameter service

NAME            TYPE        VALUE
--------------- ----------- --------------------------------
service_names   string      orcl.idevelopment.info, orcltest

If the only service defined was for orcl.idevelopment.info, then you will need to manually add the
service to both instances:

SQL> show parameter service

NAME            TYPE        VALUE
--------------- ----------- --------------------------
service_names   string      orcl.idevelopment.info

SQL> alter system set service_names =
  2  'orcl.idevelopment.info, orcltest.idevelopment.info' scope=both;

23. Verify the TNS Networking Files

Ensure that the TNS networking files are configured on all nodes in the cluster!

listener.ora

We already covered how to create a TNS listener configuration file (listener.ora) for a clustered environment in
Section 21. The listener.ora file should be properly configured and no modifications should be needed.

For clarity, I have included a copy of the listener.ora file from my node linux1 in this guide's support files.
I've also included a copy of my tnsnames.ora file that was configured by Oracle and can be used for testing the
Transparent Application Failover (TAF). This file should already be configured on each node in the RAC
cluster.

You can include any of these entries on other client machines that need access to the clustered database.

Connecting to Clustered Database From an External Client

This is an optional step, but I like to perform it in order to verify that my TNS files are configured correctly. Use
another machine (e.g., a Windows machine connected to the network) that has Oracle installed (either 9i or 10g)
and add the TNS entries (in the tnsnames.ora) from either of the nodes in the cluster that were created for the
clustered database.

Then try to connect to the clustered database using all available service names defined in the
tnsnames.ora file:

C:\> sqlplus system/manager@orcl2


C:\> sqlplus system/manager@orcl1
C:\> sqlplus system/manager@orcltest
C:\> sqlplus system/manager@orcl
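For reference, a TAF-enabled tnsnames.ora entry for the orcltest service typically looks like the following. This is a sketch based on the names used in this guide (vip-linux1, vip-linux2, port 1521); the FAILOVER_MODE retry values are illustrative, and your generated file may differ:

```
ORCLTEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVICE_NAME = orcltest.idevelopment.info)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )
```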

24. Creating/Altering Tablespaces


When creating the clustered database, we left all tablespaces set to their default size. If you are using a large
drive for the shared storage, you may want to make a sizable testing database.
Below are several optional SQL commands for modifying and creating all tablespaces for the test database.
Please keep in mind that the database file names (OMF files) used in this example may differ from what Oracle
creates for your environment.

$ sqlplus "/ as sysdba"

SQL> create user scott identified by tiger default tablespace users;
SQL> grant dba, resource, connect to scott;

SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/users.264.1' resize 1024m;
SQL> alter tablespace users add datafile '+ORCL_DATA1' size 1024m autoextend off;

SQL> create tablespace indx datafile '+ORCL_DATA1' size 1024m
  2  autoextend on next 50m maxsize unlimited
  3  extent management local autoallocate
  4  segment space management auto;

SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/system.259.1' resize 800m;
SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/sysaux.261.1' resize 500m;

SQL> alter tablespace undotbs1 add datafile '+ORCL_DATA1' size 1024m
  2  autoextend on next 50m maxsize 2048m;

SQL> alter tablespace undotbs2 add datafile '+ORCL_DATA1' size 1024m
  2  autoextend on next 50m maxsize 2048m;

SQL> alter database tempfile '+ORCL_DATA1/orcl/tempfile/temp.262.1' resize 1024m;
Here is a snapshot of the tablespaces I have defined for my test database environment:
Status  Tablespace Name  TS Type    Ext. Mgt. Seg. Mgt.  Tablespace Size  Used (in bytes)  Pct. Used
------- ---------------- ---------- --------- --------- ---------------- ---------------- ---------
ONLINE  INDX             PERMANENT  LOCAL     AUTO         1,073,741,824           65,536         0
ONLINE  SYSAUX           PERMANENT  LOCAL     AUTO           524,288,000      227,803,136        43
ONLINE  SYSTEM           PERMANENT  LOCAL     MANUAL         838,860,800      449,380,352        54
ONLINE  UNDOTBS1         UNDO       LOCAL     MANUAL       1,283,457,024      184,745,984        14
ONLINE  UNDOTBS2         UNDO       LOCAL     MANUAL       1,283,457,024        4,194,304         0
ONLINE  USERS            PERMANENT  LOCAL     AUTO         2,147,483,648          131,072         0
ONLINE  TEMP             TEMPORARY  LOCAL     MANUAL       1,073,741,824       22,020,096         2
                                                        ---------------- ---------------- ---------
avg                                                                                             16
sum                                                        8,225,030,144      888,340,480

7 rows selected.
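The snapshot above was produced by a reporting script; a minimal sketch of an equivalent query might look like the following. This is only an approximation: it assumes the standard dba_tablespaces, dba_data_files, and dba_segments dictionary views, covers permanent tablespaces only (temp usage would need dba_temp_files), and omits the COLUMN formatting directives.

```sql
-- Sketch of a tablespace usage report (permanent tablespaces only;
-- TEMP would require dba_temp_files instead of dba_data_files).
select d.status,
       d.tablespace_name,
       d.contents                 as ts_type,
       d.extent_management        as ext_mgt,
       d.segment_space_management as seg_mgt,
       nvl(f.bytes, 0)            as tablespace_size,
       nvl(s.bytes, 0)            as used_bytes,
       round(nvl(s.bytes, 0) / nullif(f.bytes, 0) * 100) as pct_used
from   dba_tablespaces d
       left join (select tablespace_name, sum(bytes) bytes
                    from dba_data_files
                   group by tablespace_name) f
         on f.tablespace_name = d.tablespace_name
       left join (select tablespace_name, sum(bytes) bytes
                    from dba_segments
                   group by tablespace_name) s
         on s.tablespace_name = d.tablespace_name
order by d.tablespace_name;
```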

25. Verify the RAC Cluster/Database Configuration

The following RAC verification checks should be performed on all nodes in the cluster! For this guide, we will
perform these checks only from linux1.

This section provides several srvctl commands and SQL queries you can use to validate your Oracle RAC
10g configuration.

There are five node-level tasks defined for SRVCTL:

• Adding and deleting node-level applications
• Setting and unsetting the environment for node-level applications
• Administering node applications
• Administering ASM instances
• Starting and stopping a group of programs that includes virtual IP addresses, listeners, Oracle
Notification Services, and Oracle Enterprise Manager agents (for maintenance purposes)

Status of all instances and services

$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

Status of a single instance

$ srvctl status instance -d orcl -i orcl2
Instance orcl2 is running on node linux2

Status of a named service globally across the database

$ srvctl status service -d orcl -s orcltest
Service orcltest is running on instance(s) orcl2, orcl1

Status of node applications on a particular node

$ srvctl status nodeapps -n linux1
VIP is running on node: linux1
GSD is running on node: linux1
Listener is running on node: linux1
ONS daemon is running on node: linux1

Status of an ASM instance

$ srvctl status asm -n linux1
ASM instance +ASM1 is running on node linux1.

List all configured databases

$ srvctl config database
orcl

Display configuration for our RAC database

$ srvctl config database -d orcl
linux1 orcl1 /u01/app/oracle/product/10.1.0/db_1
linux2 orcl2 /u01/app/oracle/product/10.1.0/db_1

Display all services for the specified cluster database

$ srvctl config service -d orcl
orcltest PREF: orcl2 orcl1 AVAIL:

Display the configuration for node applications (VIP, GSD, ONS, Listener)

$ srvctl config nodeapps -n linux1 -a -g -s -l
VIP exists.: /vip-linux1/192.168.1.200/255.255.255.0/eth0:eth1
GSD exists.
ONS daemon exists.
Listener exists.

Display the configuration for the ASM instance(s)

$ srvctl config asm -n linux1
+ASM1 /u01/app/oracle/product/10.1.0/db_1

All running instances in the cluster
SELECT
inst_id
, instance_number inst_no
, instance_name inst_name
, parallel
, status
, database_status db_status
, active_state state
, host_name host
FROM gv$instance
ORDER BY inst_id;

INST_ID INST_NO INST_NAME PAR STATUS DB_STATUS STATE  HOST
------- ------- --------- --- ------ --------- ------ ------
      1       1 orcl1     YES OPEN   ACTIVE    NORMAL linux1
      2       2 orcl2     YES OPEN   ACTIVE    NORMAL linux2

All data files in the disk group
select name from v$datafile
union
select member from v$logfile
union
select name from v$controlfile
union
select name from v$tempfile;

NAME
-------------------------------------------
+ORCL_DATA1/orcl/controlfile/current.256.1
+ORCL_DATA1/orcl/datafile/indx.269.1
+ORCL_DATA1/orcl/datafile/sysaux.261.1
+ORCL_DATA1/orcl/datafile/system.259.1
+ORCL_DATA1/orcl/datafile/undotbs1.260.1
+ORCL_DATA1/orcl/datafile/undotbs1.270.1
+ORCL_DATA1/orcl/datafile/undotbs2.263.1
+ORCL_DATA1/orcl/datafile/undotbs2.271.1
+ORCL_DATA1/orcl/datafile/users.264.1
+ORCL_DATA1/orcl/datafile/users.268.1
+ORCL_DATA1/orcl/onlinelog/group_1.257.1
+ORCL_DATA1/orcl/onlinelog/group_2.258.1
+ORCL_DATA1/orcl/onlinelog/group_3.265.1
+ORCL_DATA1/orcl/onlinelog/group_4.266.1
+ORCL_DATA1/orcl/tempfile/temp.262.1

15 rows selected.

All ASM disks that belong to the 'ORCL_DATA1' disk group

SELECT path
FROM v$asm_disk
WHERE group_number IN (select group_number
from v$asm_diskgroup
where name = 'ORCL_DATA1');

PATH
----------------------------------
ORCL:VOL1
ORCL:VOL2
ORCL:VOL3

26. Starting & Stopping the Cluster

At this point, we have Oracle RAC 10g completely installed and configured, along with a fully functional
clustered database.

After all the work done up to this point, you may well ask, "OK, so how do I start and stop services?" If you
have followed the instructions in this guide, all services—including CRS, all Oracle instances, Enterprise
Manager Database Console, and so on—should start automatically on each reboot of the Linux nodes.

There are times, however, when you might want to shut down a node and manually start it back up. Or you
may find that Enterprise Manager is not running and need to start it. This section provides the commands
(using SRVCTL) responsible for starting and stopping the cluster environment.

Ensure that you are logged in as the oracle UNIX user. We will run all commands in this section from
linux1:

# su - oracle

$ hostname
linux1
Stopping the Oracle RAC 10g Environment

The first step is to stop the Oracle instance. Once the instance (and related services) is down, bring down
the ASM instance. Finally, shut down the node applications (Virtual IP, GSD, TNS Listener, and ONS).

$ export ORACLE_SID=orcl1
$ emctl stop dbconsole
$ srvctl stop instance -d orcl -i orcl1
$ srvctl stop asm -n linux1
$ srvctl stop nodeapps -n linux1
Starting the Oracle RAC 10g Environment

The first step is to start the node applications (Virtual IP, GSD, TNS Listener, and ONS). Once the node
applications are successfully started, bring up the ASM instance. Finally, bring up the Oracle instance
(and related services) and the Enterprise Manager Database Console.

$ export ORACLE_SID=orcl1
$ srvctl start nodeapps -n linux1
$ srvctl start asm -n linux1
$ srvctl start instance -d orcl -i orcl1
$ emctl start dbconsole
Start/Stop All Instances with SRVCTL
Start/stop all the instances and their enabled services. I have included this step just for fun, as a way to
bring all instances up or down with a single command!

$ srvctl start database -d orcl

$ srvctl stop database -d orcl

27. Managing Transparent Application Failover

It is not uncommon for businesses to demand 99.99% (or even 99.999%) availability for their enterprise
applications. Think about what it would take to guarantee no more than half an hour of downtime—or even no
downtime at all—during the year. To meet these high-availability requirements, businesses invest in
mechanisms that provide automatic failover when a participating system fails. When considering the
availability of the Oracle database, Oracle RAC 10g provides a superior solution with its advanced failover
mechanisms. Oracle RAC 10g includes the required components, all working within a clustered configuration,
that are responsible for providing continuous availability; when one of the participating systems fails within
the cluster, users are automatically migrated to the surviving systems.
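As a quick sanity check on what those availability figures actually mean, the yearly downtime budget can be computed directly. This is throwaway shell arithmetic only; availability_downtime is a hypothetical helper, not part of any Oracle tooling:

```shell
# Maximum yearly downtime (in minutes) permitted by a given
# availability percentage; a year has 525,600 minutes.
availability_downtime() {
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", 525600 * (100 - pct) / 100 }'
}

availability_downtime 99.99    # prints 52.6 (minutes per year)
availability_downtime 99.999   # prints 5.3  (minutes per year)
```

In other words, "four nines" leaves less than an hour of slack per year, which is why automatic failover rather than manual intervention becomes a requirement.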

A major component of Oracle RAC 10g responsible for failover processing is the Transparent
Application Failover (TAF) option. Database connections (and processes) that lose their connection are
reconnected to another node within the cluster. The failover is completely transparent to the user.

This final section provides a short demonstration on how TAF works in Oracle RAC 10g. Please note that a
complete discussion of failover in Oracle RAC 10g would require an article in itself; my intention here is to
present only a brief overview.

One important note is that TAF happens automatically within the OCI libraries. Thus your application (client)
code does not need to change in order to take advantage of TAF. Certain configuration steps, however, will
need to be done on the Oracle TNS file tnsnames.ora. (Keep in mind that as of this writing, the Java thin client
will not be able to participate in TAF because it never reads tnsnames.ora.)

Set Up the tnsnames.ora File

Before demonstrating TAF, we need to verify that a valid entry exists in the tnsnames.ora file on a non-RAC
client machine (if you have a Windows machine lying around). Ensure that you have the Oracle RDBMS
software installed. (Actually, you only need a client install of the Oracle software.)

During the creation of the clustered database in this guide, we created a new service named ORCLTEST that
will be used for testing TAF. It provides all the necessary configuration parameters for load balancing and
failover. You can copy the contents of this entry to the %ORACLE_HOME%\network\admin\tnsnames.ora file
on the client machine (my Windows laptop is being used in this example) in order to connect to the new Oracle
clustered database:

...
ORCLTEST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcltest.idevelopment.info)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
)
...
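Note the FAILOVER_MODE settings in the entry above: with RETRIES=180 and DELAY=5, the OCI client retries the reconnect up to 180 times, sleeping 5 seconds between attempts. A quick back-of-the-envelope check of the worst-case reconnect window (plain shell arithmetic, nothing Oracle-specific):

```shell
# Worst-case time TAF keeps retrying a reconnect, derived from the
# RETRIES and DELAY values in the tnsnames.ora entry above.
retries=180
delay=5
max_wait=$(( retries * delay ))
echo "up to ${max_wait} seconds ($(( max_wait / 60 )) minutes)"
# prints: up to 900 seconds (15 minutes)
```

Fifteen minutes is generous for the two-node cluster built in this guide; the surviving instance typically accepts the failed-over session within the first few retries.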
SQL Query to Check the Session's Failover Information

The following SQL query can be used to check a session's failover type, failover method, and whether a failover
has occurred. We will be using this query throughout this example.

COLUMN instance_name FORMAT a13
COLUMN host_name FORMAT a9
COLUMN failover_method FORMAT a15
COLUMN failed_over FORMAT a11

SELECT
instance_name
, host_name
, NULL AS failover_type
, NULL AS failover_method
, NULL AS failed_over
FROM v$instance
UNION
SELECT
NULL
, NULL
, failover_type
, failover_method
, failed_over
FROM v$session
WHERE username = 'SYSTEM';
TAF Demo

From a Windows machine (or other non-RAC client machine), login to the clustered database using the
orcltest service as the SYSTEM user:

C:\> sqlplus system/manager@orcltest

COLUMN instance_name FORMAT a13
COLUMN host_name FORMAT a9
COLUMN failover_method FORMAT a15
COLUMN failed_over FORMAT a11

SELECT
instance_name
, host_name
, NULL AS failover_type
, NULL AS failover_method
, NULL AS failed_over
FROM v$instance
UNION
SELECT
NULL
, NULL
, failover_type
, failover_method
, failed_over
FROM v$session
WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl1         linux1
                        SELECT        BASIC           NO

DO NOT log out of the above SQL*Plus session!

Now that we have run the query above, we will shut down the orcl1 instance on linux1 using the
abort option. To perform this operation, we can use the srvctl command-line utility as follows:

# su - oracle
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

$ srvctl stop instance -d orcl -i orcl1 -o abort

$ srvctl status database -d orcl
Instance orcl1 is not running on node linux1
Instance orcl2 is running on node linux2

Now let's go back to our SQL session and rerun the SQL statement in the buffer:

COLUMN instance_name FORMAT a13
COLUMN host_name FORMAT a9
COLUMN failover_method FORMAT a15
COLUMN failed_over FORMAT a11

SELECT
instance_name
, host_name
, NULL AS failover_type
, NULL AS failover_method
, NULL AS failed_over
FROM v$instance
UNION
SELECT
NULL
, NULL
, failover_type
, failover_method
, failed_over
FROM v$session
WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl2         linux2
                        SELECT        BASIC           YES

SQL> exit
From the above demonstration, we can see that the session has transparently failed over to instance
orcl2 on linux2.

28. Conclusion

Hopefully this guide has provided an economical solution for setting up and configuring an inexpensive Oracle
RAC 10g cluster using White Box Enterprise Linux (or Red Hat Enterprise Linux 3) and FireWire technology.
The RAC solution presented here can be put together for around US$1,800 and provides the DBA with a
fully functional development Oracle RAC cluster.

Remember, although this solution should be stable enough for testing and development, it should never be
considered for a production environment.

29. Acknowledgements

An article of this magnitude and complexity is generally not the work of one person alone. Although I was able
to author and successfully demonstrate the validity of the components that make up this configuration, there
are several other individuals who deserve credit in making this article a success.

First, I would like to thank Werner Puschitz for his outstanding work on "Installing Oracle Database 10g with
Real Application Clusters (RAC) on Red Hat Enterprise Linux Advanced Server 3." This article, along with
several others he has authored, provided information on Oracle RAC 10g that could not be found in any other
Oracle documentation. Without his hard work and research into issues like configuring and installing the
hangcheck-timer kernel module, properly configuring Unix shared memory, and configuring ASMLib, this guide
may have never come to fruition. If you are interested in examining technical articles on Linux internals and in-
depth Oracle configurations written by Werner Puschitz, please visit his excellent website at
www.puschitz.com.

Next I would like to thank Wim Coekaerts, Manish Singh, and the entire team at Oracle's Linux Projects
Development Group. The professionals in this group made the job of upgrading the Linux kernel to support
IEEE1394 devices with multiple logins (and several other significant modifications) a seamless task. The group
provides a pre-compiled kernel for Red Hat Enterprise Linux 3.0 (which also works with White Box Enterprise
Linux) along with many other useful tools and documentation at oss.oracle.com.

Jeffrey Hunter (www.idevelopment.info) has been a senior DBA and software engineer for over 11 years. He is
an Oracle Certified Professional, Java Development Certified Professional, and author and currently works for
The DBA Zone, Inc. Jeff's work includes advanced performance tuning, Java programming, capacity planning,
database security, and physical/logical database design in Unix, Linux, and Windows NT environments. Jeff's
other interests include mathematical encryption theory, programming language processors (compilers and
interpreters) in Java and C, LDAP, writing web-based database administration tools, and of course Linux.