
Clustering is when two or more servers are linked together - with the Network Load Balancing
protocol, in Windows - to allow for faster response times and greater reliability. This way, if one
server goes down, another can pick up the slack without any interruption in network
performance - in theory, anyway! :-)

In a computer system, a cluster is a group of servers and other resources that act like a single
system and enable high availability and, in some cases, load balancing and parallel processing.

In computers, clustering is the use of multiple computers, typically PCs or UNIX workstations,
multiple storage devices, and redundant interconnections, to form what appears to users as a
single highly available system. Cluster computing can be used for load balancing as well as for
high availability.

A common use of cluster computing is to load balance traffic on high-traffic Web sites. A Web
page request is sent to a "manager" server, which then determines which of several identical or
very similar Web servers to forward the request to for handling. Having a Web farm (as such a
configuration is sometimes called) allows traffic to be handled more quickly.
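The "manager" server's dispatch decision can be sketched as simple round-robin in Python (the backend names here are invented placeholders, not part of any real product):

```python
from itertools import cycle

# Hypothetical backends in the Web farm; a real manager would use server IPs.
backends = ["web1", "web2", "web3"]
chooser = cycle(backends)

def dispatch(request):
    """Forward the request to the next backend in round-robin order."""
    server = next(chooser)
    return server, request

# Three consecutive requests land on three different servers.
picks = [dispatch(f"GET /page{i}")[0] for i in range(3)]
print(picks)  # -> ['web1', 'web2', 'web3']
```

Because the servers are identical or very similar, any of them can handle any request, which is what makes the blind rotation safe.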

Clustering has been available since the 1980s, when it was first used in DEC's VMS systems.

High Availability

In information technology, high availability refers to a system or component that is continuously
operational for a desirably long length of time. Availability can be measured relative to "100%
operational" or "never failing."

High-availability (HA) clusters

High-availability clusters (also known as failover clusters) are implemented primarily
for the purpose of improving the availability of services that the cluster provides. They
operate by having redundant nodes, which are then used to provide service when system
components fail. The most common size for an HA cluster is two nodes, which is the
minimum requirement to provide redundancy. HA cluster implementations attempt to use
redundancy of cluster components to eliminate single points of failure.
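The two-node failover behavior described above can be sketched as a toy model in which a standby node promotes itself once heartbeats from the active node stop (the class, timeout value, and method names are all invented for illustration):

```python
import time

# Toy two-node failover model; not any real HA product's API.
HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before the standby takes over

class StandbyNode:
    def __init__(self):
        self.last_heartbeat = time.monotonic()
        self.active = False

    def on_heartbeat(self):
        """Called whenever the active peer's heartbeat arrives."""
        self.last_heartbeat = time.monotonic()

    def check(self, now=None):
        """Promote to active if the peer has been silent too long."""
        now = time.monotonic() if now is None else now
        if not self.active and now - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            self.active = True  # in real HA this is where the service IP moves
        return self.active

node = StandbyNode()
node.on_heartbeat()
print(node.check(node.last_heartbeat + 1.0))  # peer still alive -> False
print(node.check(node.last_heartbeat + 5.0))  # peer silent -> True, failover
```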
Load Balancing
Load balancing is dividing the amount of work that a computer has to do between two or more
computers so that more work gets done in the same amount of time and, in general, all users get
served faster. Load balancing can be implemented with hardware, software, or a combination of
both. Typically, load balancing is the main reason for computer server clustering.

Load balancing is when multiple computers are linked together to share computational workload
or function as a single virtual computer. Logically, from the user's side, they are multiple
machines, but they function as a single virtual machine. Requests initiated by the user are
managed by, and distributed among, all the standalone computers that form the cluster.

Load balancing can also be considered as distributing items into buckets:

• data to memory locations
• files to disks
• tasks to processors
• packets to network interfaces
• requests to servers
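A minimal sketch of this items-into-buckets view, assuming a stable hash (CRC32) so each item always lands in the same bucket; the file and disk names are invented:

```python
from zlib import crc32

def bucket_for(item, buckets):
    """Deterministically assign an item to one of the buckets."""
    return buckets[crc32(item.encode()) % len(buckets)]

disks = ["disk0", "disk1", "disk2"]
placement = {f: bucket_for(f, disks) for f in ["a.txt", "b.txt", "c.txt"]}
# Every file always maps to the same disk; the same call with servers,
# processors, or interfaces as buckets covers the other cases above.
print(placement)
```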

• Layer-2 Load Balancing
• Layer-4 Load Balancing
• Layer-7 Load Balancing
• MPLS Load Balancing
• DNS Load Balancing
• Link Load Balancing
• Database Load Balancing
• Computing Load Balancing

Layer-2 load balancing

Layer-2 load balancing - also known as link aggregation, port aggregation, etherchannel, or
gigabit etherchannel port bundling - bonds two or more links into a single, higher-bandwidth
logical link.

Layer-4 load balancing

Layer-4 load balancing distributes requests to servers at the transport layer, using protocols
such as TCP, UDP, and SCTP. The load balancer distributes network connections from
clients, which know only a single IP address for a service, to a set of servers that actually
perform the work.
Since a connection must be established between client and server in connection-oriented
transports before the request content is sent, the load balancer usually selects a server without
looking at the content of the request.
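Because the server is chosen when the connection arrives, before any request bytes exist to inspect, a layer-4 scheduler works from connection state alone. A sketch of least-connections scheduling, one common strategy (the real-server IPs are invented):

```python
# Layer-4 sketch: pick the real server with the fewest active connections
# at the moment the TCP connection arrives; nothing about the request
# content is available yet.

conns = {"10.0.0.1": 0, "10.0.0.2": 0, "10.0.0.3": 0}  # server -> open conns

def accept_connection():
    server = min(conns, key=conns.get)  # fewest active connections wins
    conns[server] += 1
    return server

def close_connection(server):
    conns[server] -= 1

first = accept_connection()   # all idle -> "10.0.0.1"
second = accept_connection()  # -> "10.0.0.2"
print(first, second)
```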

IPVS is an implementation of layer-4 load balancing for the Linux kernel. It supports three IP
load balancing technologies: LVS/NAT, LVS/TUN, and LVS/DR.

Layer-7 load balancing

Layer-7 load balancing, also known as application-level load balancing, parses requests at the
application layer and distributes them to servers based on the type of request content,
so that it can meet quality-of-service requirements for different types of content and improve
overall cluster performance.
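A sketch of the content-based decision a layer-7 balancer makes, routing by request path so static files and dynamic pages can go to different server pools (the pool and server names are invented for illustration):

```python
# Layer-7 sketch: read the HTTP request line and route by content type.
pools = {"static": ["img1", "img2"], "dynamic": ["app1", "app2"]}

def route(request_line):
    path = request_line.split()[1]  # e.g. "GET /logo.png HTTP/1.1" -> "/logo.png"
    kind = "static" if path.endswith((".png", ".css", ".js")) else "dynamic"
    return pools[kind][0]  # a real balancer also schedules within the pool

print(route("GET /logo.png HTTP/1.1"))  # -> img1
print(route("GET /cart HTTP/1.1"))      # -> app1
```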

MPLS Load Balancing

MPLS load balancing balances network services based on Multiprotocol Label
Switching (MPLS) label information.

DNS Load Balancing

DNS load balancing distributes requests to different servers by resolving the domain
name to the IP addresses of different servers. When a DNS request comes to the DNS server to
resolve the domain name, it gives out one of the server IP addresses based on a scheduling
strategy, such as simple round-robin scheduling or geographical scheduling.
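The round-robin case can be sketched as a toy resolver that hands out the next server address on each lookup, which is essentially all DNS round-robin does (the domain and the addresses are illustrative):

```python
from itertools import cycle

# Toy DNS table: one name, several A records rotated on each query.
records = {"www.example.com": cycle(["192.0.2.10", "192.0.2.11"])}

def resolve(name):
    """Return the next address for the name, round-robin style."""
    return next(records[name])

a = resolve("www.example.com")  # "192.0.2.10"
b = resolve("www.example.com")  # "192.0.2.11"
print(a, b)
```

One known limitation this sketch shares with real DNS round-robin: the resolver cannot tell whether the address it hands out belongs to a server that is actually up.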

Link Load Balancing

Link load balancing balances traffic among multiple links from different ISPs, or from one
ISP, for better scalability and availability of Internet connectivity, as well as cost savings.

http://www.loadbalancer.org/load_balancing_methods.php#nat
http://lcic.org/
http://lcic.org/documentation.html
Grid computing

Main article: Grid computing

Grids are usually computer clusters, but are more focused on throughput, like a computing
utility, rather than on running fewer, tightly coupled jobs. Often, grids incorporate
heterogeneous collections of computers, possibly distributed geographically and sometimes
administered by unrelated organizations.

Grid computing (or the use of computational grids) is the combination of computer
resources from multiple administrative domains applied to a common task, usually to a
scientific, technical or business problem that requires a great number of computer
processing cycles or the need to process large amounts of data.

One of the main strategies of grid computing is using software to divide and apportion
pieces of a program among several computers, sometimes up to many thousands.
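The divide-and-apportion strategy can be sketched by splitting one job into independent pieces and combining the partial results; here plain functions stand in for machines in separate administrative domains:

```python
# Grid sketch: divide a job into independent pieces, let each "machine"
# compute its share, then combine the partial results.

def split(job, n):
    """Divide a list of work items into n roughly equal pieces."""
    return [job[i::n] for i in range(n)]

def worker(piece):
    """One machine's share of the work: sum of squares of its items."""
    return sum(x * x for x in piece)

job = list(range(100))
partials = [worker(p) for p in split(job, 4)]
total = sum(partials)  # combine results from all workers
print(total == sum(x * x for x in job))  # -> True
```

The pieces are independent, so in a real grid each could run on a different computer with no coordination beyond the final combination step.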

IP Traffic: iproute2

iproute2 is a collection of utilities for controlling TCP/IP networking and traffic control in
Linux. It is currently maintained by Stephen Hemminger <shemminger@osdl.org>. The original
author, Alexey Kuznetsov, is well known for the QoS implementation in the Linux kernel.
http://freshmeat.net/articles/linux-clustering-software

Software for building and using clusters

• High Performance Computing Software (Beowulf/Scyld, OSCAR, openMosix...)
• High Availability Software (Kimberlite, Heartbeat...)
• Load Balancing Software (Linux Virtual Server, Ultra Monkey...)

Software used on, and for using, clusters

• File Systems (InterMezzo, ClusterNFS, DRBD...)

Beowulf

The Beowulf project is also known these days as Scyld. Scyld contains an enhanced kernel and
some tools and libraries that are used to present the cluster as a "Single System Image". This
idea of a single system image means that processes running on slave nodes in the
cluster are visible and manageable from the master node, giving the impression that the
cluster is just a single system.

http://freshmeat.net/projects/beowulf/

openMosix Cluster for Linux

openMosix is a set of extensions to the standard Linux kernel that lets you build a
cluster out of off-the-shelf PC hardware. openMosix scales up to thousands
of nodes. You do not need to modify your applications to benefit from your cluster
(unlike with PVM, MPI, Linda, etc.). Processes in openMosix migrate transparently between
nodes, and the cluster always auto-balances itself.
HPC

There are other HPC clustering solutions that do not change the way the kernel functions. These
use other means to run jobs and to report information about them. Cplant, the Ka
Clustering Toolkit, and OSCAR all allow you to build, use, and manage your cluster in this
manner.

Filesystems Used

The Global File System (GFS)

The Global File System (GFS) is a 64-bit shared disk cluster file system for Linux. GFS
cluster nodes physically share the same storage by means of Fibre Channel or shared
SCSI devices. The file system appears to be local on each node and GFS synchronizes
file access across the cluster. GFS is fully symmetric, meaning that all nodes are equal
and there is no server which may be a bottleneck or single point of failure. GFS uses read
and write caching while maintaining full UNIX file system semantics. GFS supports
journaling, recovery from client failures, and many other features.

OpenAFS
AFS is a distributed filesystem which offers a client-server architecture, transparent data
migration abilities, scalability, a single namespace, and integrated ancillary subsystems.
High Availability Software

Kimberlite

Kimberlite specializes in shared data storage and maintaining data integrity.

Piranha (a.k.a. the Red Hat High Availability Server Project) can serve in one of two ways: it
can be a two-node high availability failover solution or a multi-node load balancing solution.

Heartbeat
One of the better-known projects in this space is probably the High Availability Linux Project,
also known as Linux-HA. The heart of Linux-HA is Heartbeat, which provides a heartbeat,
monitoring, and IP takeover functionality. It can run heartbeats over serial ports or UDP
broadcast or multicast, and can re-allocate IP addresses and other resources to various members
of the cluster when a node goes down, and restore them when the node comes back up.

Linux-HA

Heartbeat is a full-function high-availability system for Linux and other POSIX-like
OSes. It monitors services and restarts them on errors. When managing a cluster (more
than one machine), it will also monitor the members of the cluster and begin recovery of
lost services in less than a second. It runs over serial ports and UDP broadcast/multicast,
as well as OpenAIS multicast. It is easily adapted to different interconnect media and
protocols. When used in a cluster, it can operate using shared disks, data replication, or
no data sharing.

Load Balancing Software

One of the best-known projects in this area is the Linux Virtual Server project. It uses
load balancers to pass requests along to the servers, and can "virtualize" almost any TCP
or UDP service, such as HTTP(S), DNS, SSH, POP, IMAP, and SMTP.

Many load balancing projects are based on LVS.

Ultra Monkey incorporates LVS, a heartbeat, and service monitoring to provide highly
available and load balanced services.

Piranha has a load balancing mode, which it refers to in its documentation as LVS mode.

Keepalived adds a strong and robust keepalive facility to LVS. It monitors the server
pools, and when one of the servers goes down, it tells the kernel and has the server
removed from the LVS topology.
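What such a health-check facility does can be sketched as a sweep that probes each real server and drops unreachable ones from the active pool (the probe function and server names are stand-ins for illustration, not Keepalived's real checkers):

```python
# Health-check sketch: probe each real server; unhealthy ones leave the pool.
pool = {"rs1": True, "rs2": True, "rs3": True}  # server -> currently healthy?

def probe(server):
    # Assumption for the demo: rs2 has gone down. A real checker would
    # attempt a TCP connect or an application-level request instead.
    return server != "rs2"

def health_sweep():
    """Re-probe every server and return the ones still in the topology."""
    for server in pool:
        pool[server] = probe(server)
    return [s for s, ok in pool.items() if ok]

active = health_sweep()
print(active)  # -> ['rs1', 'rs3']
```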

The Zeus Load Balancer is not based on LVS, but offers similar functionality. It
combines content-aware traffic management, site health monitoring, and failover services
in its Web site load balancing.

Pen, which is not based on LVS, is a simple load balancer for TCP-based protocols like HTTP
or SMTP.

Turbolinux Cluster Server is the last of the load balancing projects I will talk about. It is from
the folks at Turbolinux, and its load balancing and monitoring software allows detection of, and
recovery from, hardware and software failures (if recovery is possible).

LVS

An LVS is a group of servers with a director that appears to the outside world (a client on
the internet) as one server. The LVS can offer more services, or services of higher
capacity/throughput, or redundant services (where individual servers can be brought
down for maintenance) than is available from a single server. A service is defined here as
a connection to a single port, e.g. telnet, http, https, nntp, ntp, nfs, ssh, smtp, pop, or
databases.

In the computer bestiary, an LVS is a layer-4 switch. Standard client-server semantics
are preserved. Each client thinks that it has connected directly with the realserver. Each
realserver thinks that it has connected directly to the client. Neither the client nor the
realservers have any way of telling that a director has intervened in the connection.

An LVS is not a beowulf - a beowulf is a group of machines each of which
cooperatively calculates a small part of a larger problem. Nor is it a cluster in the usual
sense - a cluster of machines is a set of machines which cooperatively distribute processing.
The realservers in an LVS do not cooperate - they have no knowledge of any other
realservers in the LVS. All a realserver knows is that it gets connections from
a client.

http://www.linuxtopia.org/online_books/linux_system_administration/redhat_cluster_configuration_and_management/s1-lvs-block-diagram.html

• Piranha
• Keepalived
• Ultra Monkey
• surealived
• Linux-HA heartbeat package
• Mon
• ipvsman
• Net-SNMP-LVS-Module
• LVSM
• lvs-kiss
• SCOP
• LVS webmin module
• iptoip
• lvs-snmp

http://www.linuxvirtualserver.org/software/index.html
Here's a typical LVS-NAT setup.

              ________
             |        |
             | client |    (local or on internet)
             |________|
                 |
              (router)
            DIRECTOR_GW
                 |
             Virtual IP
              ____|_____
             |          |   (director can have 1 or 2 NICs)
             | director |
             |__________|
                DIP
                 |
      -----------+-----------
      |          |          |
     RIP1       RIP2       RIP3
 ____________  ____________  ____________
|            ||            ||            |
| realserver || realserver || realserver |
|____________||____________||____________|

(The director plus the realservers together form the Linux Virtual Server.)

http://www.austintek.com/LVS/LVS-HOWTO/mini-HOWTO/LVS-mini-HOWTO.html#what

There are some tools to help you configure an LVS.

Ultra Monkey, by Horms, handles director failover, but has to be set up by hand. In Apr 2005,
Horms released UltraMonkey v3 (http://www.ultramonkey.org/download/3).

keepalived, by Alexandre Cassen, sets everything up for you and is available at the
keepalived site. It handles director and realserver failure.

There is also a PHP-based web interface to lvs/ldirectord, and lvs-kiss, a CLI-controlled LVS
daemon which uses ipvsadm for load balancing and failover.

UltraMonkey

Ultra Monkey is a project to create load balanced and highly available network services -
for example, a cluster of web servers that appears as a single web server to end users. The
service may be for end users across the world connected via the internet, or for enterprise
users connected via an intranet.

Ultra Monkey makes use of the Linux operating system to provide a flexible solution that
can be tailored to a wide range of needs, from small clusters of only two nodes to large
systems serving thousands of connections per second.

UltraMonkey - heartbeat

http://www.ultramonkey.org/

http://www.ultramonkey.org/3/topologies/ha-lb-eg.html

http://www.ultramonkey.org/about.shtml

http://www.linuxvirtualserver.org/docs/ha/ultramonkey.html
High Availability

Using Piranha to build highly available LVS systems

Using Keepalived to build highly available LVS systems

Using UltraMonkey to build highly available LVS systems

Using heartbeat+mon+coda to build highly available LVS systems

Using heartbeat+ldirectord to build highly available LVS systems

Ref -

http://www.linuxvirtualserver.org/HighAvailability.html

HAProxy + Heartbeat

http://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-heartbeat-on-debian-lenny

http://www.webhostingtalk.com/showthread.php?t=627783

http://haproxy.1wt.eu/download/1.2/doc/haproxy-en.txt
