
Porting Applications to the Grid

Charles Loomis

Laboratoire de l'Accélérateur Linéaire, Université Paris-Sud 11, Orsay, France

Lecture given at the Joint EU-IndiaGrid/CompChem GRID Tutorial on Chemical and Material Science Applications, Trieste, 15-18 September 2008

LNS0924004

loomis@lal.in2p3.fr

Abstract

Routine, heavy use of the EGEE Grid infrastructure by a broad spectrum of different scientific disciplines is a reality. The Grid infrastructure offers a collaborative environment where scientists can easily share not only their resources but also their data and expertise. Porting scientific applications to a Grid infrastructure can be challenging, but the EGEE project provides support for porting new applications, helping to expand and diversify both the user community and the set of running applications. Applications from high-energy physics, biology, earth science and other disciplines prove the utility of the Grid infrastructure for scientific research.

Contents
1 Enabling Grids for E-sciencE
  1.1 EGEE Grid infrastructure
  1.2 gLite middleware
  1.3 Support services
2 Porting applications
3 Conclusions


1 Enabling Grids for E-sciencE

A Grid provides uniform access to distributed computational resources. It federates resources from different administrative domains or institutions, which requires a coherent user authentication and authorization framework throughout. Scientists benefit from a Grid infrastructure because it permits individuals to perform new, larger, or more precise calculations than they would be able to do alone. The Grid acts as a platform for sharing data and expertise (e.g. code or algorithms). In short, it provides an ideal platform to analyze data, to publish result datasets, and to combine previous results.

EGEE [1] is a publicly funded project that provides a service-oriented Grid infrastructure. The infrastructure serves primarily the European scientific community, although the project has contacts with researchers throughout the world as well as contacts with industry. It collaborates with other Grid projects such as Open Science Grid (OSG) [2] in the United States, Nordic DataGrid Facility (NDGF) [3] in Northern Europe, and NAREGI [4] in Japan.

1.1 EGEE Grid infrastructure

By nearly all measures, the EGEE Grid infrastructure is the largest in the world with 260 sites from 45 countries participating. Around 80000 CPU cores and many petabytes of disk space are accessible via the Grid infrastructure. More than 13000 registered users, from many scientific disciplines, take advantage of these resources. In the first half of 2008, the total CPU utilization on the EGEE Grid infrastructure was approximately equivalent to 28000 CPUs running continuously over that 6-month period. The total CPU utilization has been doubling every 12 to 18 months and will likely continue to do so.
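To put these figures in perspective, the short calculation below converts them into CPU-hours and extrapolates the quoted doubling rate. It is only a back-of-the-envelope sketch: it assumes the 6-month period runs from 1 January to 30 June 2008, that the core count of around 80000 applied throughout the period, and that growth follows a steady exponential. The function name `projected_cpus` and the 3-year horizon are illustrative choices, not figures from the project.

```python
from datetime import date

# Figures quoted in the text above.
EQUIVALENT_CPUS = 28_000   # CPUs running continuously, first half of 2008
TOTAL_CORES = 80_000       # "around 80000" cores; an approximate figure

# Assumption: the period runs from 1 January to 30 June 2008 inclusive.
days = (date(2008, 7, 1) - date(2008, 1, 1)).days
cpu_hours = EQUIVALENT_CPUS * days * 24

# Rough average occupancy, assuming the core count held for the whole period.
occupancy = EQUIVALENT_CPUS / TOTAL_CORES


def projected_cpus(months_ahead, doubling_months):
    """Extrapolate utilization assuming steady exponential doubling."""
    return EQUIVALENT_CPUS * 2 ** (months_ahead / doubling_months)


print(f"{days} days, {cpu_hours:,} CPU-hours, occupancy {occupancy:.0%}")
for doubling in (12, 18):
    cpus = projected_cpus(36, doubling)
    print(f"doubling every {doubling} months: ~{cpus:,.0f} CPUs after 3 years")
```

Under these assumptions the two ends of the quoted doubling range differ by a factor of two after three years, which shows how sensitive any capacity planning is to the growth rate.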

1.2 gLite middleware

The gLite middleware [5] provides the core functionality available on the EGEE Grid infrastructure. Including software developed both by EGEE itself and by other projects, gLite provides a core software stack on which to run scientific applications. This core includes a common security infrastructure, services to access resources, and some high-level services, like meta-schedulers.

Real applications, however, usually need to use quite a lot of software on top of gLite. Although some of this additional software really is specific to the application, much of it can take the form of generic functionality or services. The EGEE project maintains a list of useful third-party software, providing, for instance, encrypted data management, tools for large task-based calculations, and interactive access.

[1] http://www.eu-egee.org/
[2] http://www.opensciencegrid.org/
[3] http://www.ndgf.org/
[4] http://www.naregi.org/index_e.html
[5] http://glite.org/

1.3 Support services

The EGEE project provides a full set of support services to the user community. Included are help-desk support, administrative support, and porting support (see below). In addition, the project regularly holds training sessions and tutorials for new and existing users. These traditional support services are complemented with annual User Forums that allow users to exchange information concerning their experiences with the Grid.

2 Porting applications

Often people come to the Grid with vague ideas about how the Grid can improve their scientific work. For people not experienced with Grid technology it can be extremely difficult to discover the best services and techniques to use when running an application on the Grid. An application support team within EGEE acts as a group of consultants who discuss applications with the scientists and develop a specific porting plan. This helps people better understand the EGEE Grid services and avoids initial frustration with the significant learning curve associated with Grid technology.

The nature of Grid infrastructures makes some applications more difficult to port than others. Those that are easy to port include trivially parallel applications, parameter sweep applications, those that calculate statistical properties from independent calculations, and those doing simple analyses of file-based data. Slightly more difficult to port are lightly-coupled applications, weakly-parallel or small MPI-based applications, workflows (chains of coupled applications), and those using metadata or databases. The most difficult to port are interactive applications, applications requiring their own services, and applications using commercial (or binary-only) executables.
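Parameter sweeps fall in the easy category because every parameter value becomes an independent job. The sketch below illustrates this by generating one job description per value, written in the style of the gLite Job Description Language (JDL). The script name `simulate.sh`, the parameter values, and the file names are hypothetical, and the attribute names follow common JDL usage but should be checked against the documentation for the gLite release actually deployed.

```python
def make_jdl(executable, argument, index):
    """Return a JDL-style description for one independent sweep point."""
    stdout, stderr = f"out_{index}.txt", f"err_{index}.txt"
    return "\n".join([
        f'Executable = "{executable}";',
        f'Arguments = "{argument}";',
        f'StdOutput = "{stdout}";',
        f'StdError = "{stderr}";',
        # Ship the script with the job and bring both output files back.
        f'InputSandbox = {{"{executable}"}};',
        f'OutputSandbox = {{"{stdout}", "{stderr}"}};',
    ])


def sweep(executable, values):
    """Build one description per parameter value; no job depends on another."""
    return [make_jdl(executable, str(v), i) for i, v in enumerate(values)]


# Hypothetical sweep over three values of a single parameter.
jobs = sweep("simulate.sh", [250, 300, 350])
print(f"generated {len(jobs)} job descriptions; the first one reads:")
print(jobs[0])
```

Because no description refers to another, the jobs can be submitted, retried, and collected in any order, which is what makes this class of application straightforward to port.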


There are examples of all of these types of applications running on the infrastructure. It is probably more important to know which applications are not suited to the EGEE Grid infrastructure. Large-scale parallel applications with significant communication between the nodes, typically run on supercomputers, are not well adapted to the EGEE infrastructure. At the other end of the spectrum, short, one-time analyses are usually better done on a desktop machine, avoiding the sometimes large latencies involved with Grid computations.
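The latency argument for short analyses can be made concrete with a simple estimate. The sketch below compares running independent tasks one after another on a desktop with running them on the Grid in parallel waves, where each wave pays a fixed latency for queuing and scheduling. The 30-minute latency, the 100 available slots, and the two workloads are hypothetical illustrations, not EGEE measurements. The model also ignores communication between tasks, so it says nothing about the tightly coupled case.

```python
import math


def desktop_minutes(tasks, minutes_per_task):
    """Run every task one after another on a single desktop core."""
    return tasks * minutes_per_task


def grid_minutes(tasks, minutes_per_task, latency_minutes, slots):
    """Run tasks in waves of `slots` parallel jobs; each wave pays the latency."""
    waves = math.ceil(tasks / slots)
    return waves * (latency_minutes + minutes_per_task)


# Hypothetical figures: 30 minutes of Grid latency per job, 100 usable slots.
LATENCY, SLOTS = 30, 100

for tasks, per_task in [(1, 5), (1000, 60)]:
    local = desktop_minutes(tasks, per_task)
    grid = grid_minutes(tasks, per_task, LATENCY, SLOTS)
    better = "desktop" if local <= grid else "Grid"
    print(f"{tasks} task(s) x {per_task} min: "
          f"desktop {local} min, Grid {grid} min -> {better}")
```

With these assumed numbers a single 5-minute analysis is dominated by the latency, while a thousand hour-long tasks finish far sooner on the Grid despite paying the latency in every wave.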

3 Conclusions

Routine, heavy use of the EGEE Grid infrastructure by a broad spectrum of different scientific disciplines is a reality. The general nature of the EGEE Grid infrastructure accommodates a broad range of different application types. Most importantly, the Grid serves as an ideal collaborative platform allowing scientists to analyze, publish, and combine previous results all within the same environment.

EGEE and Grid technology have evolved over the years and will continue to evolve to better meet the needs of the user community by expanding the types of resources available on the Grid, by introducing new services, and by incorporating promising technologies. The hope is that the EGEE projects will be followed by an initiative to guarantee the availability of this critical computational platform as a stable European Grid Infrastructure in the future.

Acknowledgements
The author gratefully acknowledges the support of the European Commission and of the various national governments participating in the EGEE project and their collaborators within the application activity (NA4) within the project. This work is co-funded by the European Commission through the EGEE-III project, contract number INFSO-RI-222667.
