However, selecting the right products, technologies, and partners can be critical to
success. The book High Performance Computing for Dummies gives a brief
introduction to HPC and explains how HPC helps enterprises gain a competitive
edge.
Supercomputer
A supercomputer is a computer that is at the frontline of current processing capacity,
particularly speed of calculation. Supercomputers were introduced in the 1960s and
were designed primarily by Seymour Cray at Control Data Corporation (CDC), which
led the market into the 1970s until Cray left to form his own company, Cray
Research. He then took over the supercomputer market with his new designs, holding
the top spot in supercomputing for five years (1985–1990). In the 1980s a large
number of smaller competitors entered the market, in parallel to the creation of the
minicomputer market a decade earlier, but many of these disappeared in the mid-
1990s "supercomputer market crash".
The term supercomputer itself is rather fluid, and today's supercomputer tends to
become tomorrow's ordinary computer. CDC's early machines were simply very fast
scalar processors, some ten times the speed of the fastest machines offered by other
companies. In the 1970s most supercomputers were dedicated to running a vector
processor, and many of the newer players developed their own such processors at a
lower price to enter the market. The early and mid-1980s saw machines with a modest
number of vector processors working in parallel to become the standard. Typical
numbers of processors were in the range of four to sixteen. In the later 1980s and
1990s, attention turned from vector processors to massive parallel processing systems
with thousands of "ordinary" CPUs, some being off the shelf units and others being
custom designs. Today, parallel designs are based on "off the shelf" server-class
microprocessors, such as the PowerPC, Opteron, or Xeon, and most modern
supercomputers are now highly-tuned computer clusters using commodity processors
combined with custom interconnects.
Contents
• 1 Hardware and software design
o 1.1 Supercomputer challenges, technologies
o 1.2 Processing techniques
o 1.3 Operating systems
o 1.4 Programming
o 1.5 Software tools
• 2 Modern supercomputer architecture
• 3 Special-purpose supercomputers
• 4 The fastest supercomputers today
o 4.1 Measuring supercomputer speed
o 4.2 The Top500 list
o 4.3 Current fastest supercomputer system
o 4.4 Quasi-supercomputing
• 5 Research and development
• 6 Timeline of supercomputers
• 7 See also
• 8 Notes
• 9 External links
Hardware and software design
Supercomputer challenges, technologies
As with all highly parallel systems, Amdahl's law applies, and supercomputer designs
devote great effort to eliminating software serialization and to using hardware to
address the remaining bottlenecks. Technologies developed for supercomputers
include:
• Vector processing
• Liquid cooling
• Non-Uniform Memory Access (NUMA)
• Striped disks (the first instance of what was later called RAID)
• Parallel filesystems
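The Amdahl's law point above can be made concrete with a small illustrative calculation (plain Python; `amdahl_speedup` is our own name, not a library function):

```python
# Amdahl's law: if a fraction s of a workload is inherently serial, the
# best possible speedup on n processors is 1 / (s + (1 - s) / n).
def amdahl_speedup(serial_fraction, n_procs):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# Even 5% serialization caps the speedup near 20x regardless of CPU count,
# which is why supercomputer designs work so hard to eliminate it.
print(round(amdahl_speedup(0.05, 1024), 1))       # ~19.6
print(round(amdahl_speedup(0.05, 1_000_000), 1))  # still only ~20.0
```

Doubling the processor count helps less and less as the serial fraction comes to dominate, which motivates both the software effort and the hardware bottleneck work listed above.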
Processing techniques
Vector processing techniques were first developed for supercomputers and continue to
be used in specialist high-performance applications. Vector processing techniques
have trickled down to the mass market in DSP architectures and SIMD (Single
Instruction Multiple Data) processing instructions for general-purpose computers.
Modern video game consoles in particular use SIMD extensively and this is the basis
for some manufacturers' claim that their game machines are themselves
supercomputers. Indeed, some graphics cards have the computing power of several
teraFLOPS. The applications to which this power could be applied were initially
limited by the special-purpose nature of early video processing. As video processing has become
more sophisticated, graphics processing units (GPUs) have evolved to become more
useful as general-purpose vector processors, and an entire computer science sub-
discipline has arisen to exploit this capability: General-Purpose Computing on
Graphics Processing Units (GPGPU).
Operating systems
Programming
The parallel architectures of supercomputers often dictate the use of special
programming techniques to exploit their speed. The base language of supercomputer
code is, in general, Fortran or C, using special libraries to share data between nodes.
In the most common scenario, environments such as PVM and MPI for loosely
connected clusters and OpenMP for tightly coordinated shared memory machines are
used. Significant effort is required to optimize a problem for the interconnect
characteristics of the machine it will be run on; the aim is to prevent any of the CPUs
from wasting time waiting on data from other nodes.
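The pattern described above, splitting a problem across processing elements and then combining the pieces, can be sketched in plain Python with a stdlib thread pool standing in for MPI ranks. This is an illustration of the decomposition idea only, not real MPI (which would be Fortran or C using MPI_Send/MPI_Reduce and friends):

```python
# Illustrative only: domain decomposition in the spirit of MPI, with a
# stdlib thread pool standing in for compute nodes. A real supercomputer
# code would be Fortran/C using MPI (distributed memory) or OpenMP
# (shared memory) rather than Python threads.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each "rank" computes on its own slice of the data.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_ranks=4):
    if not data:
        return 0
    # Split the domain into roughly equal chunks, one per rank.
    size = (len(data) + n_ranks - 1) // n_ranks
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_ranks) as pool:
        # A reduction combines the partial results (like MPI_Reduce).
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum_of_squares(list(range(1000))))  # 332833500
```

The coarse-grained structure, where each worker touches only its own chunk and communicates just one number back, is exactly the property that keeps CPUs from idling while waiting on data from other nodes.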
Software tools
Software tools for distributed processing include standard APIs such as MPI and
PVM, VTL, and open source-based software solutions such as Beowulf, WareWulf,
and openMosix, which facilitate the creation of a supercomputer from a collection of
ordinary workstations or servers. Technology like ZeroConf (Rendezvous/Bonjour)
can be used to create ad hoc computer clusters for specialized software such as
Apple's Shake compositing application. An easy programming language for
supercomputers remains an open research topic in computer science. Several utilities
that would once have cost several thousands of dollars are now completely free thanks
to the open source community that often creates disruptive technology.
As of November 2009 the fastest supercomputer in the world is the Cray XT5 Jaguar
system at the National Center for Computational Sciences, with more than 19,000
computers and 224,000 processing elements based on standard AMD processors. The
fastest heterogeneous machine is IBM Roadrunner, a cluster of 3,240 computers, each
with 40 processing cores, combining both AMD and Cell processors. By contrast,
Columbia is a cluster of 20 machines, each with 512 processors, each of which
processes two data streams concurrently.
Moore's Law and economies of scale are the dominant factors in supercomputer
design: a single modern desktop PC is now more powerful than a ten-year-old
supercomputer, and the design concepts that allowed past supercomputers to out-
perform contemporaneous desktop machines have now been incorporated into
commodity PCs. Furthermore, the costs of chip development and production make it
uneconomical to design custom chips for a small run and favor mass-produced chips
that have enough demand to recoup the cost of production. A current model quad-core
Xeon workstation running at 2.66 GHz will outperform a multimillion dollar Cray
C90 supercomputer used in the early 1990s; most workloads requiring such a
supercomputer in the 1990s can now be done on workstations costing less than 4,000
US dollars. Supercomputing is also moving toward greater density: desktop
supercomputers are becoming available, packing the computing power that in 1998
required a large room into less than a desktop footprint.
In addition, many problems carried out by supercomputers are particularly suitable for
parallelization (in essence, splitting up into smaller parts to be worked on
simultaneously) and, in particular, fairly coarse-grained parallelization that limits the
amount of information that needs to be transferred between independent processing
units. For this reason, traditional supercomputers can be replaced, for many
applications, by "clusters" of computers of standard design, which can be
programmed to act as one large computer.
Special-purpose supercomputers
Special-purpose supercomputers are high-performance computing devices with a
hardware architecture dedicated to a single problem. This allows the use of specially
programmed FPGA chips or even custom VLSI chips, allowing higher
price/performance ratios by sacrificing generality. They are used for applications such
as astrophysics computation and brute-force codebreaking. Historically a new special-
purpose supercomputer has occasionally been faster than the world's fastest general-
purpose supercomputer, by some measure. For example, GRAPE-6 was faster than
the Earth Simulator in 2002 for a particular special set of problems.
"Petascale" supercomputers can process one quadrilion (1015) (1000 trillion) FLOPS.
Exascale is computing performance in the exaflops range. An exaflop is one
quintillion (1018) FLOPS (one million teraflops).
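The prefixes are easy to mix up, so here is the arithmetic written out (plain Python, illustrative only):

```python
# FLOPS prefixes used above: tera = 10^12, peta = 10^15, exa = 10^18.
tera, peta, exa = 10**12, 10**15, 10**18

assert peta == 1000 * tera   # a petascale machine does 1000+ teraflops
assert exa == 1000 * peta    # exascale is 1000x petascale...
assert exa == 10**6 * tera   # ...i.e. one million teraflops

# Jaguar's 1.759 PFLOPS, expressed in teraflops:
print(round(1.759 * peta / tera))  # 1759
```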
Since 1993, the fastest supercomputers have been ranked on the Top500 list according
to their LINPACK benchmark results. The list does not claim to be unbiased or
definitive, but it is a widely cited current definition of the "fastest" supercomputer
available at any given time.
In November 2009, the AMD Opteron-based Cray XT5 Jaguar at the Oak Ridge
National Laboratory was announced as the fastest operational supercomputer, with a
sustained processing rate of 1.759 PFLOPS.[4] [5]
Quasi-supercomputing
The BOINC platform hosts a number of distributed computing projects. As of
December 2009, BOINC recorded a
processing power of over 4.725 petaflops through over 609,000 active computers on
the network.[7] The most active project (measured by computational power),
MilkyWay@home, reports processing power of over 1.4 petaflops through over
30,000 active computers.[8]
The PlayStation 3 Gravity Grid uses a network of 16 machines, and exploits the Cell
processor for the intended application, which is binary black hole coalescence using
perturbation theory.[13][14] The Cell processor has a main CPU and 6 floating-point
vector processors, giving the machine a net of 16 general-purpose machines and 96
vector processors. The machine has a one-time cost of $9,000 to build and is adequate
for black-hole simulations, which would otherwise cost $6,000 per run on a
conventional supercomputer. The black hole calculations are not memory-intensive
and are highly localized, and so are well-suited to this architecture.
Other notable computer clusters are the flash mob cluster and the Beowulf cluster.
The flash mob cluster allows the use of any computer in the network, while the
Beowulf cluster still requires uniform architecture.
In May 2008 a collaboration was announced between NASA, SGI and Intel to build a
1 petaflops computer, Pleiades, in 2009, scaling up to 10 PFLOPs by 2012.[18]
Meanwhile, IBM is constructing a 20 PFLOPs supercomputer at Lawrence Livermore
National Laboratory, named Sequoia, which is scheduled to go online in 2011.
Given the current speed of progress, supercomputers are projected to reach 1 exaflops
(10^18 FLOPS, one quintillion FLOPS) in 2019.[19]
Timeline of supercomputers
This is a list of the record-holders for fastest general-purpose supercomputer in the
world, and the year each one set the record. For entries prior to 1993, this list refers to
various sources[22][citation needed]. From 1993 to present, the list reflects the Top500
listing[23], and the "Peak speed" is given as the "Rmax" rating.
Year | Supercomputer | Peak speed (Rmax) | Location
1938 | Zuse Z1 | 1 OPS | Konrad Zuse, Berlin, Germany
1941 | Zuse Z3 | 20 OPS | Konrad Zuse, Berlin, Germany
1943 | Colossus 1 | 5 kOPS | Post Office Research Station, Bletchley Park, UK
1944 | Colossus 2 (Single Processor) | 25 kOPS | Post Office Research Station, Bletchley Park, UK
1946 | Colossus 2 (Parallel Processor) | 50 kOPS | Post Office Research Station, Bletchley Park, UK
1946 | UPenn ENIAC (before 1948+ modifications) | 5 kOPS | Department of War, Aberdeen Proving Ground, Maryland, USA
1954 | IBM NORC | 67 kOPS | Department of Defense, U.S. Naval Proving Ground, Dahlgren, Virginia, USA
1956 | MIT TX-0 | 83 kOPS | Massachusetts Inst. of Technology, Lexington, Massachusetts, USA
1958 | IBM AN/FSQ-7 | 400 kOPS | 25 U.S. Air Force sites across the continental USA and 1 site in Canada (52 computers)
1960 | UNIVAC LARC | 250 kFLOPS | Atomic Energy Commission (AEC), Lawrence Livermore National Laboratory, California, USA
1961 | IBM 7030 "Stretch" | 1.2 MFLOPS | AEC-Los Alamos National Laboratory, New Mexico, USA
1964 | CDC 6600 | 3 MFLOPS | AEC-Lawrence Livermore National Laboratory, California, USA
1969 | CDC 7600 | 36 MFLOPS | AEC-Lawrence Livermore National Laboratory, California, USA
1974 | CDC STAR-100 | 100 MFLOPS | AEC-Lawrence Livermore National Laboratory, California, USA
1975 | Burroughs ILLIAC IV | 150 MFLOPS | NASA Ames Research Center, California, USA
1976 | Cray-1 | 250 MFLOPS | Energy Research and Development Administration (ERDA), Los Alamos National Laboratory, New Mexico, USA (80+ sold worldwide)
1981 | CDC Cyber 205 | 400 MFLOPS | (~40 systems worldwide)
1983 | Cray X-MP/4 | 941 MFLOPS | U.S. Department of Energy (DoE), Los Alamos National Laboratory; Lawrence Livermore National Laboratory; Battelle; Boeing
1984 | M-13 | 2.4 GFLOPS | Scientific Research Institute of Computer Complexes, Moscow, USSR
1985 | Cray-2/8 | 3.9 GFLOPS | DoE-Lawrence Livermore National Laboratory, California, USA
1989 | ETA10-G/8 | 10.3 GFLOPS | Florida State University, Florida, USA
1990 | NEC SX-3/44R | 23.2 GFLOPS | NEC Fuchu Plant, Fuchū, Tokyo, Japan
1993 | Thinking Machines CM-5/1024 | 59.7 GFLOPS | DoE-Los Alamos National Laboratory; National Security Agency
1993 | Fujitsu Numerical Wind Tunnel | 124.50 GFLOPS | National Aerospace Laboratory, Tokyo, Japan
1993 | Intel Paragon XP/S 140 | 143.40 GFLOPS | DoE-Sandia National Laboratories, New Mexico, USA
1994 | Fujitsu Numerical Wind Tunnel | 170.40 GFLOPS | National Aerospace Laboratory, Tokyo, Japan
1996 | Hitachi SR2201/1024 | 220.4 GFLOPS | University of Tokyo, Japan
1996 | Hitachi/Tsukuba CP-PACS/2048 | 368.2 GFLOPS | Center for Computational Physics, University of Tsukuba, Tsukuba, Japan
1997 | Intel ASCI Red/9152 | 1.338 TFLOPS | DoE-Sandia National Laboratories, New Mexico, USA
1999 | Intel ASCI Red/9632 | 2.3796 TFLOPS | DoE-Sandia National Laboratories, New Mexico, USA
2000 | IBM ASCI White | 7.226 TFLOPS | DoE-Lawrence Livermore National Laboratory, California, USA
2002 | NEC Earth Simulator | 35.86 TFLOPS | Earth Simulator Center, Yokohama, Japan
2004 | IBM Blue Gene/L | 70.72 TFLOPS | DoE/IBM Rochester, Minnesota, USA
2005 | IBM Blue Gene/L | 136.8 TFLOPS; 280.6 TFLOPS | DoE/U.S. National Nuclear Security Administration, Lawrence Livermore National Laboratory, California, USA
2007 | IBM Blue Gene/L | 478.2 TFLOPS | DoE/U.S. National Nuclear Security Administration, Lawrence Livermore National Laboratory, California, USA
2008 | IBM Roadrunner | 1.026 PFLOPS; 1.105 PFLOPS | DoE-Los Alamos National Laboratory, New Mexico, USA
2009 | Cray Jaguar | 1.759 PFLOPS | DoE-Oak Ridge National Laboratory, Tennessee, USA
Server-side hardware offloads of network communications:-
“Ah!” says the typical cluster-HPC user. “You mean RDMA, like InfiniBand!” (some
people might even remember to cite OpenFabrics, which includes iWARP).
No, that’s not what I mean, and that’s the one of the points of this entry.
The hardware offload that I’m referring to is a host-side network adapter that offloads
most of the networking “work” so that the server’s main CPU(s) don’t have to. In this
way, you can have dedicated (read: very fast/optimized) hardware do the heavy lifting
while the rest of the server’s resources are free to do other stuff. Among other things,
this means that the main CPU(s) don’t have to process all that network traffic,
protocol, and other random associated overhead. Depending on the network protocol
used, offloading to dedicated hardware may or may not save a lot of processing
cycles. Sending and receiving TCP data, for example, may take a lot of cycles in a
software-based protocol stack. Sending and receiving raw ethernet frames may not
(YMMV, of course—depending on your networking hardware, server hardware,
operating system, yadda yadda yadda).
That being said, it’s not just processor cycles that are saved. Caches—both
instruction and data—are likely not to be thrashed. Interrupts may be fired less
frequently. There may be (slightly) less data transferred across internal buses. ...and
so on. All of these things add up: server-side network hardware offload is a Good
Thing; it can make a server generally more efficient because of the combination of
several effects.
Over the years, many MPI implementations have benefited from one form of network
offload or another. MPI implementations that take advantage of hardware offload
typically not only increase efficiency as described above, but also provide true
communication / computation overlap (C/C overlap issues have been discussed in
academic literature for many years). True overlap allows a well-coded MPI
application to start a particular communication, then go off and do other
application-meaningful work while the MPI implementation (the network offload
hardware, for the purposes of this blog entry) progresses most, if not all, of the
message passing independently of the main server processor(s).
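The overlap pattern is easy to see in code. Below is an illustrative Python sketch of the nonblocking start/compute/wait sequence, with a background thread standing in for the offload hardware (`FakeRequest` is our invention for this sketch, not an MPI or vendor API):

```python
# Illustrative sketch of communication/computation overlap: start a
# nonblocking "send", compute while it proceeds, then wait for completion
# (the MPI_Isend/MPI_Wait pattern). FakeRequest and its thread are our
# stand-in for offload hardware, not a real API.
import threading
import time

class FakeRequest:
    def __init__(self, payload):
        self.done = threading.Event()
        # The "offload hardware" (a thread here) progresses the transfer
        # independently of the caller.
        threading.Thread(target=self._progress, args=(payload,)).start()

    def _progress(self, payload):
        time.sleep(0.01)  # pretend the wire transfer takes a while
        self.done.set()

    def wait(self):
        self.done.wait()

req = FakeRequest(b"x" * 1024)  # like MPI_Isend: returns immediately
result = sum(range(10000))      # overlap: useful work while the send runs
req.wait()                      # like MPI_Wait: block until it completes
print(result)                   # 49995000
```

The payoff is entirely in the middle line: any computation done between starting the communication and waiting on it is time the main processor would otherwise have spent pushing bytes.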
Network offload is typically most beneficial with long messages—sending a short
message in its entirety can frequently cost exactly the same as starting a
communication action and then polling it for completion later. The effective overlap
for short messages can be negligible (or even negative). Hence, the biggest “win” of
hardware offload is for well-coded applications that send and receive large messages.
That being said, hardware offload for small messages may also benefit, such as when
associated with deep server-side hardware buffering and the ability to continue
progressing flow control and protocol issues independent of the main processor.
All this being said, note that Remote Direct Memory Access (RDMA) is a popular /
well-known flavor of hardware offload these days—but it is one of many. Vendors
have churned out various forms of network hardware offload over the past 20+ years.
Indeed, there have been many academic discussions over the past few years
discussing returning to the idea of using a “normal” CPU/processor in “dedicated
network” mode (fueled by the fires of manycore, of course): if you have scads and
scads of cores, who’s going to miss one [or more?] of them? Dedicate a few of them
to act as the network proxies for the rest of the cores. Such schemes have both
benefits and drawbacks, of course (and it’s been tried before, but not necessarily in
exactly the same context as manycore). The jury’s still out on how both the
engineering and market forces will affect these ideas.
MPI will use whatever is available when trying to attain high performance—including
hardware offload (such as RDMA). But to be totally clear: RDMA is not what
enables high performance in MPI—hardware offload is (one way) to attain high
performance in an MPI implementation. RDMA just happens to be among the most
recent flavors of network hardware offload.
Data center:-
A data center or datacenter (or datacentre), also called a server farm,[1] is a facility
used to house computer systems and associated components, such as
telecommunications and storage systems. It generally includes redundant or
backup power supplies, redundant data communications connections,
environmental controls (e.g., air conditioning, fire suppression) and security
devices.
Contents
• 1 History
• 2 Requirements for modern data centers
• 3 Data center classification
• 4 Physical layout
• 5 Network infrastructure
• 6 Applications
History
Data centers have their roots in the huge computer rooms of the early ages of the
computing industry. Early computer systems were complex to operate and maintain,
and required a special environment in which to operate. Many cables were necessary
to connect all the components, and methods to accommodate and organize these were
devised, such as standard racks to mount equipment, elevated floors, and cable trays
(installed overhead or under the elevated floor). Also, old computers required a great
deal of power, and had to be cooled to avoid overheating. Security was important –
computers were expensive, and were often used for military purposes. Basic design
guidelines for controlling access to the computer room were therefore devised.
During the boom of the microcomputer industry, and especially during the 1980s,
computers started to be deployed everywhere, in many cases with little or no care
about operating requirements. However, as information technology (IT) operations
started to grow in complexity, companies grew aware of the need to control IT
resources. With the advent of client-server computing, during the 1990s,
microcomputers (now called "servers") started to find their places in the old computer
rooms. The availability of inexpensive networking equipment, coupled with new
standards for network cabling, made it possible to use a hierarchical design that put
the servers in a specific room inside the company. The use of the term "data center,"
as applied to specially designed computer rooms, started to gain popular recognition
about this time.
The boom of data centers came during the dot-com bubble. Companies needed fast
Internet connectivity and nonstop operation to deploy systems and establish a
presence on the Internet. Installing such equipment was not viable for many smaller
companies. Many companies started building very large facilities, called Internet data
centers (IDCs), which provide businesses with a range of solutions for systems
deployment and operation. New technologies and practices were designed to handle
the scale and the operational requirements of such large-scale operations. These
practices eventually migrated toward the private data centers, and were adopted
largely because of their practical results.
Requirements for modern data centers
IT operations are a crucial aspect of most organizational operations. One of the main
concerns is business continuity; companies rely on their information systems to run
their operations. If a system becomes unavailable, company operations may be
impaired or stopped completely. It is necessary to provide a reliable infrastructure for
IT operations, in order to minimize any chance of disruption. Information security is
also a concern, and for this reason a data center has to offer a secure environment
which minimizes the chances of a security breach. A data center must therefore keep
high standards for assuring the integrity and functionality of its hosted computer
environment. This is accomplished through redundancy of both fiber optic cables and
power, which includes emergency backup power generation.
Data center classification
The four tier levels are defined, and copyrighted, by the Uptime Institute, a Santa Fe,
New Mexico-based think tank and professional services organization. The tiers
describe the availability of data from the hardware at a location: the higher the tier,
the greater the availability.[4][5][6]
Physical layout
A data center can occupy one room of a building, one or more floors, or an entire
building. Most of the equipment is often in the form of servers mounted in 19 inch
rack cabinets, which are usually placed in single rows forming corridors between
them. This allows people access to the front and rear of each cabinet. Servers differ
greatly in size from 1U servers to large freestanding storage silos which occupy many
tiles on the floor. Some equipment such as mainframe computers and storage devices
are often as big as the racks themselves, and are placed alongside them. Very large
data centers may use shipping containers packed with 1,000 or more servers each;
when repairs or upgrades are needed, whole containers are replaced (rather than
repairing individual servers).[7]
• Air conditioning is used to control the temperature and humidity in the data
center. ASHRAE's "Thermal Guidelines for Data Processing Environments"[8]
recommends a temperature range of 16–24 °C (61–75 °F) and humidity range
of 40–55% with a maximum dew point of 15°C as optimal for data center
conditions.[9] The electrical power used heats the air in the data center. Unless
the heat is removed, the ambient temperature will rise, resulting in electronic
equipment malfunction. By controlling the air temperature, the server
components at the board level are kept within the manufacturer's specified
temperature/humidity range. Air conditioning systems help control humidity
by cooling the return space air below the dew point. Too much humidity, and
water may begin to condense on internal components. In case of a dry
atmosphere, ancillary humidification systems may add water vapor if the
humidity is too low, which can result in static electricity discharge problems
which may damage components. Subterranean data centers may keep
computer equipment cool while expending less energy than conventional
designs.
• Modern data centers try to use economizer cooling, where they use outside air
to keep the data center cool. Washington state now has a few data centers that
cool all of the servers using outside air 11 months out of the year. They do not
use chillers/air conditioners, which creates potential energy savings in the
millions.[10]
• Backup power consists of one or more uninterruptible power supplies and/or
diesel generators.
• To prevent single points of failure, all elements of the electrical systems,
including backup system, are typically fully duplicated, and critical servers are
connected to both the "A-side" and "B-side" power feeds. This arrangement is
often made to achieve N+1 Redundancy in the systems. Static switches are
sometimes used to ensure instantaneous switchover from one supply to the
other in the event of a power failure.
• Data centers typically have raised flooring made up of 60 cm (2 ft) removable
square tiles. The trend is towards 80–100 cm (31–39 in) void to cater for better
and uniform air distribution. These provide a plenum for air to circulate below
the floor, as part of the air conditioning system, as well as providing space for
power cabling. Data cabling is typically routed through overhead cable trays in
modern data centers, but some still recommend under-floor cabling for security
reasons, or to keep the space above the racks free for the later addition of cooling
systems. Smaller/less expensive
data centers without raised flooring may use anti-static tiles for a flooring
surface. Computer cabinets are often organized into a hot aisle arrangement to
maximize airflow efficiency.
• Data centers feature fire protection systems, including passive and active
design elements, as well as implementation of fire prevention programs in
operations. Smoke detectors are usually installed to provide early warning of a
developing fire by detecting particles generated by smoldering components
prior to the development of flame. This allows investigation, interruption of
power, and manual fire suppression using hand held fire extinguishers before
the fire grows to a large size. A fire sprinkler system is often provided to
control a full scale fire if it develops. Fire sprinklers require 18 in (46 cm) of
clearance (free of cable trays, etc.) below the sprinklers. Clean agent fire
suppression gaseous systems are sometimes installed to suppress a fire earlier
than the fire sprinkler system. Passive fire protection elements include the
installation of fire walls around the data center, so a fire can be restricted to a
portion of the facility for a limited time in the event of the failure of the active
fire protection systems, or if they are not installed. For critical facilities these
firewalls are often insufficient to protect heat-sensitive electronic equipment,
however, because conventional firewall construction is only rated for flame
penetration time, not heat penetration. There are also deficiencies in the
protection of vulnerable entry points into the server room, such as cable
penetrations, coolant line penetrations and air ducts. For mission critical data
centers fireproof vaults with a Class 125 rating are necessary to meet NFPA
75[11] standards.
• Physical security also plays a large role with data centers. Physical access to
the site is usually restricted to selected personnel, with controls including
bollards and mantraps.[12] Video camera surveillance and permanent security
guards are almost always present if the data center is large or contains
sensitive information on any of the systems within. The use of fingerprint-recognition
mantraps is starting to become commonplace.
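As a back-of-the-envelope illustration of why the duplicated A-side/B-side power feeds described in the bullets above matter: if each independent feed is available with probability a, the facility loses power only when both fail at once (a sketch assuming independent failures; the function name is ours):

```python
# Availability of two independent, fully duplicated power paths:
# an outage requires both to fail simultaneously, so
# combined availability = 1 - (1 - a)^2.
def dual_path_availability(a):
    return 1 - (1 - a) ** 2

# Two "two nines" (99%) feeds combine to roughly "four nines":
print(round(dual_path_availability(0.99), 6))  # ~0.9999
```

Real N+1 designs are more involved (shared components, correlated failures, switchover time), but the squaring effect is the basic reason full duplication is worth the cost.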
Network infrastructure
Communications in data centers today are most often based on networks running the
IP protocol suite. Data centers contain a set of routers and switches that transport
traffic between the servers and to the outside world. Redundancy of the Internet
connection is often provided by using two or more upstream service providers (see
Multihoming).
Some of the servers at the data center are used for running the basic Internet and
intranet services needed by internal users in the organization, e.g., e-mail servers,
proxy servers, and DNS servers.
Network security elements are also usually deployed: firewalls, VPN gateways,
intrusion detection systems, etc. Also common are monitoring systems for the
network and some of the applications. Additional off site monitoring systems are also
typical, in case of a failure of communications inside the data center.
Applications
The main purpose of a data center is running the applications that handle the core
business and operational data of the organization. Such systems may be proprietary
and developed internally by the organization, or bought from enterprise software
vendors. Such common applications are ERP and CRM systems.
A data center may be concerned with just operations architecture or it may provide
other services as well.
Often these applications will be composed of multiple hosts, each running a single
component. Common components of such applications are databases, file servers,
application servers, middleware, and various others.
Data centers are also used for off site backups. Companies may subscribe to backup
services provided by a data center. This is often used in conjunction with backup
tapes. Backups of servers can be taken locally onto tapes; however, tapes stored on
site pose a security threat and are also susceptible to fire and flooding. Larger
companies may also send their backups off site for added security. This can be done
by backing up to a data center. Encrypted backups can be sent over the Internet to
another data center where they can be stored securely.
For disaster recovery, several large hardware vendors have developed mobile
solutions that can be installed and made operational in very short time. Vendors such
as Cisco Systems,[13] Sun Microsystems,[14][15] IBM, and HP have developed systems
that could be used for this purpose.[16]
InfiniBand:-
Contents
• 1 Description
o 1.1 Signaling rate
o 1.2 Latency
o 1.3 Topology
o 1.4 Messages
• 2 Programming
• 3 History
Description
Like Fibre Channel, PCI Express, Serial ATA, and many other modern interconnects,
InfiniBand offers point-to-point bidirectional serial links intended for the connection
of processors with high-speed peripherals such as disks. It supports several signalling
rates and, as with PCI Express, links can be bonded together for additional bandwidth.
Effective theoretical throughput in different configurations:
    | Single (SDR) | Double (DDR) | Quad (QDR)
1X  | 2 Gbit/s     | 4 Gbit/s     | 8 Gbit/s
4X  | 8 Gbit/s     | 16 Gbit/s    | 32 Gbit/s
12X | 24 Gbit/s    | 48 Gbit/s    | 96 Gbit/s
Signaling rate
The serial connection's signalling rate is 2.5 gigabits per second (Gbit/s) in each
direction per connection. InfiniBand supports double (DDR) and quad (QDR) data
rates, for 5 Gbit/s or 10 Gbit/s respectively, at the same data-clock rate. Links use
8B/10B encoding (every 10 bits sent carry 8 bits of data), making the useful data
transmission rate four-fifths the raw rate. Thus single, double, and quad data rates
carry 2, 4, or 8 Gbit/s of useful data respectively.
Implementers can aggregate links in units of 4 or 12, called 4X or 12X. A quad-rate
12X link therefore carries 120 Gbit/s raw, or 96 Gbit/s of useful data. As of 2009 most
systems use either a 4X 10 Gbit/s (SDR), 20 Gbit/s (DDR) or 40 Gbit/s (QDR)
connection. Larger systems with 12x links are typically used for cluster and
supercomputer interconnects and for inter-switch connections.
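The rates in the table and paragraphs above follow directly from the 8B/10B encoding and lane aggregation; a quick Python check (the function name is ours, illustrative only):

```python
# Useful InfiniBand bandwidth: every 10 bits on the wire carry 8 data bits,
# so effective rate = per-lane signalling rate * lanes * 8/10.
def effective_gbps(signal_gbps, lanes):
    return signal_gbps * lanes * 8 / 10

print(effective_gbps(2.5, 1))   # SDR 1X:  2.0 Gbit/s useful
print(effective_gbps(10, 4))    # QDR 4X:  32.0 of the 40 Gbit/s raw
print(effective_gbps(10, 12))   # QDR 12X: 96.0 of the 120 Gbit/s raw
```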
Latency
The single data rate switch chips have a latency of 200 nanoseconds, and DDR switch
chips have a latency of 140 nanoseconds. End-to-end MPI latency ranges from
1.07 microseconds (Mellanox ConnectX HCAs) to 1.29 microseconds (Qlogic
InfiniPath HTX HCAs) to 2.6 microseconds (Mellanox InfiniHost III
HCAs).[citation needed] As of 2009 various InfiniBand host channel adapters
(HCA) exist in the market, each with different latency and bandwidth characteristics.
InfiniBand also provides RDMA capabilities for low CPU overhead. The latency for
RDMA operations is less than 1 microsecond (Mellanox ConnectX HCAs).
Topology
As in the channel model used in most mainframe computers, all transmissions begin
or end at a channel adapter. Each processor contains a host channel adapter (HCA)
and each peripheral has a target channel adapter (TCA). These adapters can also
exchange information for security or quality of service.
Messages
Data is transmitted in packets that are taken together to form a message. A message
can be:
• a direct memory access read from, or write to, a remote node (RDMA)
• a channel send or receive
• a transaction-based operation (that can be reversed)
• a multicast transmission
• an atomic operation
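As a purely illustrative grouping of these operation types (the names below are
descriptive labels, not identifiers from the InfiniBand specification):

```python
from enum import Enum, auto

# Descriptive labels for the message operation classes listed above;
# these are not actual wire-protocol or verbs identifiers.
class IBOperation(Enum):
    RDMA_READ = auto()       # direct memory read from a remote node
    RDMA_WRITE = auto()      # direct memory write to a remote node
    SEND = auto()            # channel send
    RECEIVE = auto()         # channel receive
    TRANSACTION = auto()     # reversible transaction-based operation
    MULTICAST = auto()       # multicast transmission
    ATOMIC = auto()          # atomic operation
```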
Programming
InfiniBand has no standard programming interface. The standard only lists a set of
"verbs": functions that must exist. The syntax of these functions is left to the
vendors. The most common syntax to date is the one developed by the OpenFabrics
Alliance, which has been adopted by most of the InfiniBand vendors for both Linux
and Windows. The InfiniBand software stack developed by the OpenFabrics Alliance is
released as the "OpenFabrics Enterprise Distribution" (OFED), under a choice of two
licenses: GPLv2 or a BSD license.
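As a rough sketch of how a verbs-style interface is typically driven, the stub
below traces the canonical setup-and-send sequence. The string labels are
hypothetical stand-ins loosely modelled on the OpenFabrics C verbs
(`ibv_open_device`, `ibv_post_send`, and so on); no real hardware is touched:

```python
# Hypothetical stand-ins for the verbs call sequence; real OFED programs
# use C functions such as ibv_open_device(), ibv_alloc_pd(), ibv_reg_mr(),
# ibv_create_cq(), ibv_create_qp(), ibv_post_send() and ibv_poll_cq().

def rdma_send_sequence():
    """Canonical verbs-style setup followed by one send operation (stubbed)."""
    calls = []
    calls.append("open_device")  # acquire a host channel adapter (HCA)
    calls.append("alloc_pd")     # allocate a protection domain
    calls.append("reg_mr")       # register (pin) a memory region
    calls.append("create_cq")    # create a completion queue
    calls.append("create_qp")    # create a queue pair (send + receive queues)
    calls.append("post_send")    # post a send/RDMA work request
    calls.append("poll_cq")      # poll the completion queue for the result
    return calls

print(" -> ".join(rdma_send_sequence()))
```

Vendors implement these steps with their own syntax; the ordering (device, then
protection domain, then memory registration, then queues) is the common pattern.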
History
InfiniBand originated from the 1999 merger of two competing designs: Future I/O,
backed by Compaq, IBM and Hewlett-Packard, and Next Generation I/O (NGIO), backed
by Intel, Microsoft and Sun. From the Compaq side, the roots of the technology
derived from Tandem's ServerNet. For a short time before the group settled on a new
name, InfiniBand was called System I/O.[1]
As of 2009 InfiniBand has become the de facto interconnect of choice for high
performance computing, and its adoption as seen in the TOP500 supercomputers list
has grown faster than that of Ethernet.[2] Note, however, that the TOP500 ranking
uses the Linpack benchmark, a highly parallel workload that places relatively
modest demands on the interconnect; InfiniBand should not be confused with the
custom-built interconnects of vector supercomputers. For example, the NEC SX-9
provides 128 GB/s of low-latency interconnect bandwidth between computing nodes,
compared to the 96 Gbit/s of an InfiniBand 12X quad data rate link. Adoption in
enterprise datacenters has been more limited. Today InfiniBand is used mostly for
performance-focused computer cluster applications, and there are some efforts to
adapt it as a "standard" interconnect between low-cost machines as well. A number
of the TOP500 supercomputers have used InfiniBand, including the former[3] reigning
fastest supercomputer, the IBM Roadrunner. In another example of InfiniBand use
within high performance computing, the Cray XD1 uses built-in Mellanox InfiniBand
switches to create a fabric between HyperTransport-connected Opteron-based
compute nodes.[citation needed]
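Since the SX-9 comparison above mixes units (gigabytes versus gigabits per
second), a small conversion makes the roughly tenfold gap explicit:

```python
# The SX-9 figure is quoted in gigabytes per second, the InfiniBand
# figure in gigabits per second; convert (1 byte = 8 bits) to compare.

def gbit_to_gbyte(gbit_per_s: float) -> float:
    return gbit_per_s / 8

infiniband_12x_qdr = gbit_to_gbyte(96)  # useful data rate in GB/s
sx9_per_node = 128.0                    # GB/s, as quoted in the text
print(infiniband_12x_qdr, sx9_per_node / infiniband_12x_qdr)
```

So a 12X QDR link delivers 12 GB/s of useful data, roughly a tenth of the SX-9's
per-node interconnect bandwidth.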
SGI, LSI and DDN, among others, have also released storage systems utilizing
InfiniBand "target adapters". These products essentially compete with architectures
such as Fibre Channel, SCSI, and other more traditional connectivity methods. Such
target-adapter-based disks can become part of the fabric of a given network, in a
fashion similar to DEC VMS clustering. The advantage of this configuration is lower
latency and higher availability to nodes on the network (because of the fabric
nature of the network). In 2009, the Jaguar Spider storage system used this type of
InfiniBand-attached storage to deliver over 240 gigabytes per second of bandwidth.
InfiniBand uses copper CX4 cable, which is also commonly used to connect SAS
(Serial Attached SCSI) HBAs to external (SAS) disk arrays. With SAS, this is known
as an SFF-8470 connector and is referred to as an "InfiniBand-style" connector.[4]
In 2008 Oracle Corporation released its HP Oracle Database Machine, built as a RAC
database (Real Application Clusters database) with storage provided by its Exadata
Storage Server, which uses InfiniBand as the back-end interconnect for all I/O and
interconnect traffic. Updated versions of the Exadata storage system, now using Sun
computing hardware, continue to use InfiniBand infrastructure.
In 2009, IBM announced a December 2009 release date for their DB2 pureScale
offering, a shared-disk clustering scheme (inspired by Parallel Sysplex for DB2 on
z/OS) that uses a cluster of IBM System p servers (POWER6/7) communicating with
each other over an InfiniBand interconnect.