
WHAT ARE CLUSTERS

A computer cluster is a group of linked computers working together so closely that in many respects they form a single computer. Clusters are generally connected by a fast Local Area Network. A parallel program launched on one of the nodes uses the processing power of all the nodes and produces the result. Clusters are generally tightly coupled, i.e. all the motherboards are stacked into a single cabinet and connected by an interconnection network, sharing RAM, hard disks and other peripherals. The operating system runs on one of the nodes and controls the activities of the other nodes.

WHAT IS A BEOWULF CLUSTER


Beowulf clusters are cheap clusters created from off-the-shelf components. You can create a Beowulf cluster with just a few spare computers and an Ethernet segment in your backyard. Although they don't give you top-notch performance, their performance is many-fold better than that of a single computer. A variant of the Beowulf cluster allows an operating system to run on every node while still allowing parallel processing, and this is exactly what we're going to do here.

KICK START YOUR CLUSTER


PREREQUISITES
1. At least two computers with a Linux distribution installed (I'll use Ubuntu 8.04 here). Make sure that your systems have GCC installed.
2. A network connection between them. If you have just two computers, you can connect them directly with an Ethernet cable. Make sure that IP addresses are assigned to them; if you don't have a router to assign IPs, you can assign static IP addresses.
The rest of this document assumes two computers with host names node0 and node1. Let node0 be the master node.
The following steps are to be done on every node.
Step 1:- Add the nodes to the /etc/hosts file. Open this file using your favourite text editor and add each node's IP address followed by its host name, one node per line. For example:

1.1.1.10 node0
1.1.1.20 node1
Step 2:- Create a new user on both nodes. Let us call this new user mpiuser. You can create a new user through the GUI by going to System->Administration->Users and Groups and clicking "Add User". Create a new user called mpiuser, give it a password, and give the user administrative privileges. Make sure that you create the same user on all nodes. Although the same password on all nodes is not strictly necessary, it is recommended, because it eliminates the need to remember a separate password for every node. The same thing can be done from the terminal, as sketched below.
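A minimal command-line equivalent (a sketch; on Ubuntu 8.04 the sudo-capable group is called admin, so adjust the group name if your distribution differs):

sudo adduser mpiuser          # prompts for the password and user details
sudo adduser mpiuser admin    # grant administrative (sudo) privileges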
Step 3:- Now download and install an SSH server on every node. Execute the command sudo apt-get install openssh-server on every machine.
Step 4:- Now log out from your session and log in as mpiuser.
Step 5:- Open a terminal and type ssh-keygen -t dsa. This command will generate a new SSH key pair. On executing this command, it'll ask for a passphrase. Leave it blank, as we want passwordless SSH (assuming that you have a trusted LAN with no security issues).
Step 6:- A folder called .ssh will be created in your home directory; it's a hidden folder. This folder contains a file id_dsa.pub with your public key. Now append this key to a file called authorized_keys in the same directory. Execute the following in the terminal:

cd /home/mpiuser/.ssh
cat id_dsa.pub >> authorized_keys
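Note that for MPICH to launch processes on the other nodes, the master's public key must also be present in every other node's authorized_keys, not just in its own. One way to do this from node0, and to verify the passwordless login at the same time (a sketch using the host names assumed above):

cat ~/.ssh/id_dsa.pub | ssh mpiuser@node1 'cat >> ~/.ssh/authorized_keys'
ssh mpiuser@node1 hostname    # should print node1 without asking for a password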
Step 7:- Now download MPICH from the MPICH website. Please download an MPICH 1.x version; do not download the MPICH 2 version. I was unable to get MPICH 2 to work in the cluster.
Step 8:- Untar the archive and navigate into the directory in the terminal. Execute the following commands:

mkdir /home/mpiuser/mpich1
./configure --prefix=/home/mpiuser/mpich1
make
make install

Step 9:- Open the file .bashrc in your home directory. If the file does not exist, create it. Add the following lines to that file:

export PATH=/home/mpiuser/mpich1/bin:$PATH
export LD_LIBRARY_PATH=/home/mpiuser/mpich1/lib:$LD_LIBRARY_PATH

Step 10:- Now we'll make the MPICH path visible to SSH sessions by adding it to /etc/environment. Note that sudo echo ... >> /etc/environment fails, because the redirection is performed by your unprivileged shell; use tee instead:

echo /home/mpiuser/mpich1/bin | sudo tee -a /etc/environment

Step 11:- Now log out and log back in as mpiuser.
Step 12:- In the folder mpich1, within the sub-directory share or util/machines/, a file called machines.LINUX will be found. Open that file and add the host names of all nodes except the home node, i.e. if you're editing the machines.LINUX file of node0, then that file will contain the host names of all nodes except node0 (by default MPICH executes one copy of the program on the home node). The machines.LINUX file for the machine node0 is therefore simply:

node1
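The original article stops here, but a quick way to check that the cluster actually works is to compile and run a small MPI program. The following is an illustrative sketch (not from the original article); compile it with mpicc hello_mpi.c -o hello_mpi and launch it with mpirun -np 2 ./hello_mpi:

/* hello_mpi.c -- minimal MPI sanity check (illustrative example) */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                  /* start the MPI runtime      */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* id of this process         */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes  */
    MPI_Get_processor_name(name, &len);      /* host this copy runs on     */

    printf("Hello from rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();                          /* shut down the runtime      */
    return 0;
}

If everything is set up correctly, one line is printed from node0 and one from node1. Since we are not using a shared filesystem, the compiled binary must exist at the same path on every node (copy it over with scp, for example).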

How to Configure a Linux Cluster with 2 Nodes on RedHat and CentOS

In an active-standby Linux cluster configuration, all the critical services, including the IP address and filesystem, fail over from one node to the other node in the cluster.

The following are the high-level steps involved in configuring a Linux cluster on RedHat or CentOS:

Install and start RICCI cluster service

Create cluster on active node

Add a node to cluster

Add fencing to cluster

Configure failover domain

Add resources to cluster

Sync cluster configuration across nodes

Start the cluster

Verify failover by shutting down an active node

Required Cluster Packages


First make sure the following cluster packages are installed. If you don't have these packages, install them using yum, as shown after the package list below.
rpm -qa | egrep -i "ricci|luci|cluster|ccs|cman"
modcluster-0.16.2-28.el6.x86_64
luci-0.26.0-48.el6.x86_64
ccs-0.16.2-69.el6.x86_64
ricci-0.16.2-69.el6.x86_64
cman-3.0.12.1-59.el6.x86_64
clusterlib-3.0.12.1-59.el6.x86_64
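A hedged install command, using the package names from the rpm query above (luci is only needed on the node that will host the web management interface):

yum install -y ricci luci ccs cman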

Start RICCI service and Assign Password


Next, start the ricci service on both nodes:
service ricci start
Starting oddjobd:                                          [  OK  ]
generating SSL certificates... done
Generating NSS database... done
Starting ricci:                                            [  OK  ]
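To have ricci come back up after a reboot as well (standard RHEL 6 service management, not shown in the original article):

chkconfig ricci on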

You also need to assign a password for the ricci user on both nodes.
passwd ricci
Changing password for user ricci.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

Also, if you are running the iptables firewall, keep in mind that you need appropriate firewall rules on both nodes so that they can talk to each other; see the example below.
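For instance, ricci listens on TCP port 11111, so both nodes need at least a rule along these lines (a sketch; other cluster daemons such as cman and modclusterd need their ports opened as well, per the Red Hat cluster documentation):

iptables -I INPUT -p tcp --dport 11111 -j ACCEPT    # allow ricci agent traffic
service iptables save                               # persist the rule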

Create Cluster on Active Node


From the active node, run the following command to create a new cluster.

The command creates the cluster configuration file /etc/cluster/cluster.conf. If the file already exists, it replaces the existing cluster.conf with the newly created one.
ccs -h rh1.mydomain.net --createcluster mycluster
rh1.mydomain.net password:
[root@rh1 ~]# ls -l /etc/cluster/cluster.conf
-rw-r-----. 1 root root 188 Sep 26 17:40 /etc/cluster/cluster.conf

Also keep in mind that we are running these commands from only one node of the cluster, and we are not yet propagating the changes to the other node.

Initial Plain cluster.conf File


After creating the cluster, the cluster.conf file will look like the following:
cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="1" name="mycluster">
  <fence_daemon/>
  <clusternodes/>
  <cman/>
  <fencedevices/>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Add a Node to the Cluster


Once the cluster is created, we need to add the participating nodes to the cluster using the ccs command, as shown below.
First, add the first node rh1 to the cluster:
ccs -h rh1.mydomain.net --addnode rh1.mydomain.net
Node rh1.mydomain.net added.

Next, add the second node rh2 to the cluster as shown below.
ccs -h rh1.mydomain.net --addnode rh2.mydomain.net
Node rh2.mydomain.net added.

Once the nodes are added, you can use the following command to view all the available nodes in the cluster. This also displays the node id of each node.
ccs -h rh1 --lsnodes
rh1.mydomain.net: nodeid=1
rh2.mydomain.net: nodeid=2

cluster.conf File After Adding Nodes

The above will also add the nodes to the cluster.conf file, as shown below.
cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="3" name="mycluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="rh1.mydomain.net" nodeid="1"/>
    <clusternode name="rh2.mydomain.net" nodeid="2"/>
  </clusternodes>
  <cman/>
  <fencedevices/>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Add Fencing to Cluster


Fencing is the disconnection of a node from shared storage. Fencing cuts off I/O from shared storage, thus ensuring data integrity.
A fence device is a hardware device that can be used to cut a node off from shared storage. This can be accomplished in a variety of ways: powering off the node via a remote power switch, disabling a Fibre Channel switch port, or revoking a host's SCSI-3 reservations.
A fence agent is a software program that connects to a fence device in order to ask the fence device to cut off access to a node's shared storage (by powering off the node or removing its access to the shared storage by other means).
Execute the following commands to configure the fence daemon:
[root@rh1 ~]# ccs -h rh1 --setfencedaemon post_fail_delay=0
[root@rh1 ~]# ccs -h rh1 --setfencedaemon post_join_delay=25

Next, add a fence device. There are different types of fencing devices available. If you are using virtual machines to build the cluster, use the fence_virt device, as shown below.
[root@rh1 ~]# ccs -h rh1 --addfencedev myfence agent=fence_virt

Next, add a fencing method. After creating the fencing device, you need to create the fencing method and add the hosts to it.
[root@rh1 ~]# ccs -h rh1 --addmethod mthd1 rh1.mydomain.net
Method mthd1 added to rh1.mydomain.net.

[root@rh1 ~]# ccs -h rh1 --addmethod mthd1 rh2.mydomain.net
Method mthd1 added to rh2.mydomain.net.

Finally, associate the fence device with the methods created above, as shown below:
[root@rh1 ~]# ccs -h rh1 --addfenceinst myfence rh1.mydomain.net mthd1
[root@rh1 ~]# ccs -h rh1 --addfenceinst myfence rh2.mydomain.net mthd1

cluster.conf File after Fencing


Your cluster.conf will look like the following after the fencing device and methods are added.
cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="10" name="mycluster">
  <fence_daemon post_join_delay="25"/>
  <clusternodes>
    <clusternode name="rh1.mydomain.net" nodeid="1">
      <fence>
        <method name="mthd1">
          <device name="myfence"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh2.mydomain.net" nodeid="2">
      <fence>
        <method name="mthd1">
          <device name="myfence"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_virt" name="myfence"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Types of Failover Domain


A failover domain is an ordered subset of cluster members to which a resource group or service may be bound.
The following are the different types of failover domains:

Restricted failover domain: Resource groups or services bound to the domain may only run on cluster members which are also members of the failover domain. If no member of the failover domain is available, the resource group or service is placed in the stopped state.

Unrestricted failover domain: Resource groups bound to this domain may run on all cluster members, but will run on a member of the domain whenever one is available. This means that if a resource group is running outside the domain and a member of the domain comes online, the resource group or service will migrate to that cluster member.

Ordered domain: Nodes in the ordered domain are assigned a priority level from 1 to 100, with priority 1 being the highest and 100 the lowest. The node with the highest priority runs the resource group; for example, a resource group running on node 2 will migrate to node 1 when node 1 comes online.

Unordered domain: Members of the domain have no order of preference; any member may run the resource group. The resource group will always migrate to members of its failover domain whenever possible.

Add a Failover Domain


To add a failover domain, execute the following command. In this example, I create an ordered domain named webserverdomain.
[root@rh1 ~]# ccs -h rh1 --addfailoverdomain webserverdomain ordered

Once the failover domain is created, add both the nodes to the failover
domain as shown below:
[root@rh1 ~]# ccs -h rh1 --addfailoverdomainnode webserverdomain rh1.mydomain.net priority=1
[root@rh1 ~]# ccs -h rh1 --addfailoverdomainnode webserverdomain rh2.mydomain.net priority=2

You can view all the nodes in the failover domain using the following
command.
[root@rh1 ~]# ccs -h rh1 --lsfailoverdomain
webserverdomain: restricted=0, ordered=1, nofailback=0
rh1.mydomain.net: 1
rh2.mydomain.net: 2

Add Resources to Cluster


Now it is time to add resources. These are the services that should fail over, together with the IP address and filesystem, when a node fails. For example, the Apache webserver can be part of the failover in the RedHat Linux cluster.
When you are ready to add resources, there are two ways you can do this: add them as global resources, or add a resource directly to a resource group or service.

The advantage of adding a global resource is that if you want to add the resource to more than one service group, you can just reference the global resource from your service or resource group.
In this example, we add the filesystem on shared storage as a global resource and reference it from the service.
[root@rh1 ~]# ccs -h rh1 --addresource fs name=web_fs device=/dev/cluster_vg/vol01 mountpoint=/var/www fstype=ext4

To add a service to the cluster, create a service and add the resource to
the service.
[root@rh1 ~]# ccs -h rh1 --addservice webservice1 domain=webserverdomain recovery=relocate autostart=1

Now add the following lines to cluster.conf to attach the resource references to the service. In this example, we also add a failover IP address to our service; see the sketch after these two lines.
<fs ref="web_fs"/>
<ip address="192.168.1.12" monitor_link="yes" sleeptime="10"/>
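For context, the <rm> section of cluster.conf would then look roughly like this (a sketch assembled from the commands above, not output captured from a live system):

<rm>
  <failoverdomains>
    <failoverdomain name="webserverdomain" nofailback="0" ordered="1" restricted="0">
      <failoverdomainnode name="rh1.mydomain.net" priority="1"/>
      <failoverdomainnode name="rh2.mydomain.net" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <resources>
    <fs device="/dev/cluster_vg/vol01" fstype="ext4" mountpoint="/var/www" name="web_fs"/>
  </resources>
  <service autostart="1" domain="webserverdomain" name="webservice1" recovery="relocate">
    <fs ref="web_fs"/>
    <ip address="192.168.1.12" monitor_link="yes" sleeptime="10"/>
  </service>
</rm>

The outline at the top of this article also lists syncing the configuration and starting the cluster; with the ccs tooling on RHEL 6 that is done roughly as follows (verify the flags against your ccs man page):

[root@rh1 ~]# ccs -h rh1.mydomain.net --sync --activate    # push cluster.conf to all nodes
[root@rh1 ~]# ccs -h rh1.mydomain.net --startall           # start cluster services on every node

Failover can then be verified by shutting down the active node and confirming that the service, IP address and filesystem move to the standby node.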

CentOS 7 cluster docs (Pacemaker)


http://clusterlabs.org/doc/en-US/Pacemaker/1.1pcs/html/Clusters_from_Scratch/_login_remotely.html

http://clusterlabs.org/doc/en-US/Pacemaker/1.1pcs/html/Clusters_from_Scratch/_configure_the_os.html
