
CLUSTER COMPUTING

ABSTRACT:
A computer cluster is a group of loosely coupled computers that work together so closely that, in many respects, they can be viewed as a single computer. Clusters are commonly connected through fast local area networks. They are usually deployed to improve speed and/or reliability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or reliability. Cluster computing has emerged as a result of the convergence of several trends, including the availability of inexpensive high performance microprocessors and high speed networks, and the development of standard software tools for high performance distributed computing. Clusters have evolved to support applications ranging from e-commerce to high performance database applications. Clustering has been available since the 1980s, when it was used in DEC's VMS systems. IBM's Sysplex is a cluster approach for mainframe systems. Microsoft, Sun Microsystems, and other leading hardware and software companies offer clustering packages that are said to offer scalability as well as availability. Cluster computing can also be used as a relatively low-cost form of parallel processing for scientific and other applications that lend themselves to parallel operations.

INTRODUCTION:
Today, a wide range of applications is hungry for higher computing power, and even though single-processor PCs and workstations can now provide extremely fast processing, the even faster execution that multiple processors can achieve by working concurrently is still needed. Now, finally, costs are falling as well. Networked clusters of commodity PCs and workstations, using off-the-shelf processors and communication platforms such as Myrinet, Fast Ethernet, and Gigabit Ethernet, are becoming increasingly cost-effective and popular. This concept is known as cluster computing. Clusters built using commodity-off-the-shelf (COTS) hardware components and free, or commonly used, software are playing a major role in solving large-scale science, engineering, and commercial applications. Cluster computing has emerged as a result of the convergence of several trends, including the availability of inexpensive high performance microprocessors and high speed networks, the development of standard software tools for high performance distributed computing, and the increasing need for computing power in computational science and commercial applications.

COMPARING OLD AND NEW (BENEFITS):


Today, open standards-based HPC systems are being used to solve problems ranging from high-end, floating-point intensive scientific and engineering problems to data intensive tasks in industry. Some of the reasons why HPC clusters outperform RISC-based systems include:

Collaboration: Scientists can collaborate in real time across dispersed locations, bridging isolated islands of scientific research and discovery, when HPC clusters are based on open source and building block technology.

Scalability: HPC clusters can grow in overall capacity because processors and nodes can be added as demand increases.

Availability: Because single points of failure can be eliminated, the system as a whole, or the solution (multiple systems), stays highly available even if any one system component goes down.

Ease of technology refresh: Processor, memory, disk, and operating system (OS) technology can be easily updated, and new processors and nodes can be added or upgraded as needed.

Affordable service and support: Compared to proprietary systems, the total cost of ownership can be much lower. This includes service, support, and training.

Vendor lock-in: The age-old problem of proprietary versus open systems that use industry-accepted standards is eliminated.

System manageability: The installation, configuration, and monitoring of key elements of proprietary systems is usually accomplished with proprietary technologies, complicating system management. The servers of an HPC cluster can be easily managed from a single point using readily available network infrastructure and enterprise management software.

Reusability of components: Commercial components can be reused, preserving the investment. For example, older nodes can be deployed as file/print servers, web servers, or other infrastructure servers.

Disaster recovery: Large SMPs are monolithic entities located in one facility. HPC systems can be co-located or geographically dispersed to make them less susceptible to disaster.

CLUSTERING CONCEPTS:
Clusters are in fact quite simple: they are a bunch of computers tied together with a network, working on a large problem that has been broken down into smaller pieces.

Parallelism: Parallelism is the quality that allows something to be done in parts that work independently, rather than as a task with so many interlocking dependencies that it cannot be broken down further. Parallelism operates at two levels: hardware parallelism and software parallelism.
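To make the idea concrete, here is a minimal sketch (in Python, using the standard multiprocessing module) of breaking a large problem into self-contained pieces; the chunking helper and the choice of four workers are illustrative assumptions standing in for cluster nodes, not part of any particular cluster toolkit.

# Minimal sketch of dividing a large problem into independent pieces.
# The "large problem" here is summing a big list of numbers; each chunk
# can be processed on its own, so the work parallelizes cleanly.
from multiprocessing import Pool

def sum_chunk(chunk):
    # A self-contained piece of work with no dependency on other chunks.
    return sum(chunk)

def split(data, parts):
    # Break the data into roughly equal, independent chunks.
    size = (len(data) + parts - 1) // parts
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = split(data, parts=4)
    with Pool(processes=4) as pool:      # four workers stand in for four nodes
        partial_sums = pool.map(sum_chunk, chunks)
    print(sum(partial_sums))             # combine the independent results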

Hardware Parallelism: At one level, hardware parallelism deals with the CPU of an individual system and how performance can be squeezed out of sub-components of the CPU to speed up code.

Software Parallelism: Software parallelism is the ability to find well-defined areas in a problem we want to solve that can be broken down into self-contained parts. These parts are the program elements that can be distributed, and they give us the speedup we want from a high performance computing system.

System-Level Middleware: System-level middleware offers a Single System Image (SSI) and a high availability infrastructure for processes, memory, storage, I/O, and networking. The single system image illusion can be implemented using hardware or software infrastructure.

Application-Level Middleware: Application-level middleware is the layer of software between the operating system and applications. Middleware provides various services required by an application to function correctly.

Single System Image: A single system image is the illusion, created by software or hardware, that presents a collection of resources as one, more powerful resource. SSI makes the cluster appear like a single machine to the user, to applications, and to the network. A cluster without an SSI is not a cluster. Every SSI has a boundary, and SSI support can exist at different levels within a system, with one level able to be built on another.

Single System Image Benefits:

Provide a simple, straightforward view of all system resources and activities, from any node of the cluster

Free the operator from having to know where a resource is located

Reduce the risk of operator errors, with the result that end users see improved reliability and higher availability of the system

Greatly simplify system management

Provide location-independent message communication

Provide transparent process migration and load balancing across nodes

Improved system response time and performance

High Speed Networks: The network is the most critical part of a cluster; its capabilities and performance directly influence the applicability of the whole system for HPC. Cluster interconnects range from local and wide area network (LAN/WAN) technologies such as Fast Ethernet and ATM to system area networks (SAN) such as Myrinet and Memory Channel.

Example: Fast Ethernet provides 100 Mbps over UTP or fiber-optic cable and uses CSMA/CD as its MAC protocol.
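To illustrate why the interconnect is so critical, the sketch below estimates how long a 1 MB message takes to cross two common networks using the simple model transfer time = startup latency + message size / bandwidth; the latency figures are rough assumptions for illustration, not measured or vendor-specified values.

# Rough estimate of the time to move one message across a cluster interconnect:
#   transfer_time = startup_latency + message_size / bandwidth
# The latency values below are illustrative assumptions, not vendor specs.

def transfer_time(message_bytes, bandwidth_bps, latency_s):
    # Approximate time in seconds to send a single message.
    return latency_s + (message_bytes * 8) / bandwidth_bps

MESSAGE = 1_000_000  # 1 MB payload

links = {
    "Fast Ethernet (100 Mbps)":  (100e6, 100e-6),
    "Gigabit Ethernet (1 Gbps)": (1e9,   50e-6),
}

for name, (bw, lat) in links.items():
    t = transfer_time(MESSAGE, bw, lat)
    print(f"{name}: ~{t * 1000:.1f} ms per 1 MB message")

With these assumed figures, the same message takes roughly ten times longer on Fast Ethernet than on Gigabit Ethernet, which is why faster interconnects such as Myrinet matter for communication-heavy workloads.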

COMPONENTS OF A CLUSTER COMPUTER:


Multiple High Performance Computers
State-of-the-art Operating Systems
High Performance Networks/Switches
Network Interface Cards
Fast Communication Protocols and Services
Cluster Middleware (Hardware, Operating System Kernel/Gluing Layers, Applications and Subsystems)

Parallel Programming Environments and Tools
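As one example of such an environment, message-passing libraries in the MPI family are widely used on clusters; the following minimal sketch uses the mpi4py Python bindings (assuming an MPI implementation and mpi4py are installed) and is a generic illustration, not a specific tool prescribed by this paper.

# Minimal MPI "hello world" sketch using the mpi4py bindings.
# Typically launched across nodes with something like:
#   mpirun -np 4 python hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD              # communicator spanning all launched processes
rank = comm.Get_rank()             # this process's id within the communicator
size = comm.Get_size()             # total number of processes in the job
node = MPI.Get_processor_name()    # hostname of the node running this process

print(f"Hello from rank {rank} of {size} on node {node}")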

CLUSTER CLASSIFICATIONS: Clusters can be classified into several categories based on factors such as 1) application target, 2) node ownership, 3) node hardware, 4) node operating system, and 5) node configuration. Clusters based on application target are again classified into two:

High Performance (HP) Clusters
High Availability (HA) Clusters

Clusters based on Node Ownership are again classified into two:


Dedicated clusters
Non-dedicated clusters

Clusters based on Node Hardware are again classified into three:


Clusters of PCs (CoPs)
Clusters of Workstations (COWs)
Clusters of SMPs (CLUMPs)

Clusters based on Node Operating System are again classified into:


Linux Clusters (e.g., Beowulf)
Solaris Clusters (e.g., Berkeley NOW)
Digital VMS Clusters
HP-UX Clusters
Microsoft Wolfpack Clusters

Clusters based on Node Configuration are again classified into:

Homogeneous Clusters - All nodes have similar architectures and run the same OS

Heterogeneous Clusters - Nodes have different architectures and run different OSs

Advantages:
Management (software and hardware) of the cluster is done by a central department
The job always runs on the least loaded machine (sketched below)
High probability of being allotted more compute power than was purchased
In short, it is a win-win situation
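The "least loaded machine" point above is essentially a scheduling policy; the toy sketch below shows the idea of picking the node with the lowest reported load. The node names, load figures, and helper function are made-up examples, not the API of any real cluster scheduler.

# Toy illustration of "the job runs on the least loaded machine":
# pick the node whose reported load is currently the smallest.

def pick_least_loaded(node_loads):
    # node_loads maps node name -> current load (lower means less busy).
    return min(node_loads, key=node_loads.get)

current_loads = {"node01": 0.82, "node02": 0.15, "node03": 0.47}
print("Dispatching job to", pick_least_loaded(current_loads))  # -> node02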

Disadvantages:
Dependency on the cluster head
Adding a node is costly
A larger number of nodes is needed for processing

Applications:
Scientific computing
Movie making
Commercial servers (web, database, etc.)

CONCLUSION: Clusters are promising:

They solve the parallel processing paradox
They offer incremental growth and match funding patterns
New trends in hardware and software technologies are likely to make clusters even more promising and to fill the SSI gap

Cluster-based supercomputers (Linux-based clusters) can be seen everywhere!
