1. INTRODUCTION

Computer science is facing a fundamental paradigm shift. Multicore architectures, application virtualization, and cloud computing each present, on their own, radical departures in the way software is built, deployed, and operated. Taken together, these trends frame a scenario that makes sense from many points of view (usability, scalability, flexibility, development cost) but is also radically different from what we know today. It is fair to say that a coherent answer to the challenges raised by the combination of these trends has yet to emerge from either academia or industry.

From an academic perspective, these challenges are particularly difficult because they imply a considerable departure from established procedures. To start with, multicore computers, the virtualization of computing platforms, and the replacement of the one-computer-one-local-copy-of-a-program approach by cloud computing each demand an interdisciplinary approach. In practice, it is likely that traditional disciplines such as operating systems, distributed systems, software engineering, or data management will require major revision as the boundaries between them become blurred.

A further challenge for both academic and (to some extent) industrial research is the impossibility of exploring the important design, architectural, and algorithmic challenges ahead using a small number of computers. Industrially meaningful systems today are large distributed platforms (hundreds, thousands, tens of thousands of nodes) with complex multi-layered architectures, often geographically distributed over the globe. Academia often lacks both access to such systems and basic information on the operations, constraints, and requirements involved.

The Systems Group and the Enterprise Computing Center (ECC) are two recent initiatives at the ETH Zurich Department of Computer Science to respond to these challenges. The goal of the Systems Group is to redefine, restructure, and reorganize systems research to avoid the pitfalls of looking at complex problems from a single, isolated perspective. The goal of the ECC is to establish new relationships between academia and industry that are longer term and more productive for all sides, and that give academic research direct access to real systems and empirical data about the functioning of these systems.

In this short paper, we present both these initiatives and some of the associated research projects. We hope our ideas will inspire others; we are convinced that such research structures are essential for academic research to remain competitive and relevant in today's computing environment.

2. VISION AND APPROACH

Our vision of how mainstream computing is evolving is based on the three trends mentioned above: multicore, virtualization, and cloud computing. To these, we add pervasive computing, and note that the most common information access terminals in the near future will be portable devices rather than traditional computers.

We summarize the vision as follows: software will run on manycore (>16 cores) computers with heterogeneous hardware (not all cores and processing units will have the same capabilities). Applications will be deployed on clusters of thousands if not millions of machines, distributed worldwide. The hardware resources of these clusters will be virtualized into logical, dynamically configured computing platforms. Through virtualization and the notion of software as a service, applications on such platforms will operate in a computing cloud (physically separated from the application client or human user). The principal access device for the cloud will be future portable computing and communication devices, of which today's mobile phones are a precursor.

Our research agenda revolves around the many related challenges in moving from where we are and what we know today towards this emerging vision of computing:

1. Managing resources in a heterogeneous multicore computer that itself increasingly resembles a distributed system, and architecting applications (especially data management applications) to efficiently exploit heterogeneous multicore machines.

2. Architecting applications to efficiently exploit thousands of heterogeneous multicore machines, and building software platforms (database systems, data stream processors, application servers) on this hardware infrastructure.

3. Evolving existing software systems into multicore applications, and determining which parts of standard applications can run on specialized, dedicated hardware and which are the appropriate abstractions for programming such hardware.

5.3.1 Data management in the cloud: Cloudy

Despite the potential cost advantages, cloud-based implementations of the functionality found in traditional databases face significant new challenges, and it appears that traditional database architectures are poorly equipped to operate in a cloud environment.

For example, a modern database system generally assumes that it has control over all hardware resources (so as to optimize queries) and all requests to data (so as to guarantee consistency). Unfortunately, this assumption limits scalability and flexibility, and does not correspond to the cloud model where hardware resources are allocated dynamically to applications based on current requirements. Furthermore, cloud computing mandates a loose coupling between functionality (such as data management) and machines. To address these challenges, we are developing a system called Cloudy [3, 4], a novel architecture for data management in the cloud. Cloudy is a vehicle for exploring design issues such as relaxed consistency models and the cost efficiency of running transactions in the cloud.

We are also rethinking the model for distributed and potentially long-running transactions across autonomous services (such as those found in the cloud). One key idea is to employ a reservation pattern in which updates are reserved before they are actually committed: in some sense, a generalization of 2-phase commit in which the ability to commit is reserved before the actual commit itself. We are exploring this pattern in collaboration with Oracle and Credit Suisse so as to understand its domain of applicability for large-scale applications and complex infrastructures.

5.3.2 Self-deploying applications: Rhizoma

Data management is only one challenge posed by deploying long-running services on cloud infrastructures. Selecting cloud providers is becoming more complex as more players enter the market, pricing structures change regularly through competition and innovation, individual providers experience transient failures and major outages, and application deployment must be adjusted (within constraints) to

5.3.3 Federated stream processing: MaxStream

Despite the availability of several data stream processing engines (SPEs) today, it remains hard to develop and maintain streaming applications. One difficulty is the lack of agreed standards, and the wide (and changing) variety of application requirements. Consequently, existing SPEs vary widely in data and query models, APIs, functionality, and optimization capabilities. Furthermore, data management for stored data and for streaming data are still mostly separate concerns, although applications increasingly require integrated access to both. In the MaxStream project, our goal is to design and build a federated stream processing architecture that seamlessly integrates multiple autonomous and heterogeneous SPEs with traditional databases, and hence facilitates the incorporation of new functionality and requirements.

MaxStream is a federation layer between client applications and a collection of SPEs and databases. The first key idea is to present at the application layer a common SQL-based query language and programming interface. The federation layer performs global optimizations and necessary translations to the native interfaces of the underlying systems. The second idea is to implement the federation layer itself using a relational database infrastructure. By doing so, we can build on existing support for SQL, persistence, transactions, and, most importantly, traditional federation functionality. Finally, MaxStream leverages the strengths of the underlying engines, while the federation layer can compensate for any missing functionality by itself adding a number of novel streaming features on top of the relational engine infrastructure. MaxStream is a collaboration with SAP Labs in the context of the ECC, and also builds on the SMS storage manager project.

5.3.4 Global stream overlays: XTream

The XTream project is looking at stream processing as the basis for a global-scale, collaborative data processing and dissemination platform where independent processing units are linked by channels to form intertwined data stream pro-