Sie sind auf Seite 1von 7

Distributed Systems Fö 1 - 1 Distributed Systems Fö 1 - 2

Course Information

Web page: http://www.ida.liu.se/~TDDD25

DISTRIBUTED SYSTEMS
Examination: written
(TDDD25)

Lecture notes: available from the web page, latest 24


hours before the lecture.
Petru Eles
Text book:
Institutionen för Datavetenskap (IDA)
Linköpings Universitet George Coulouris, Jean Dollimore, Tim Kindberg:
"Distributed Systems - Concepts and Design",
email: petel@ida.liu.se Addison Wesley Publ. Comp., 4th edition, 2005.
http://www.ida.liu.se/~petel
phone: 28 1396
B building, 329:220 Other titles can be used in addition:

Andrew S. Tanenbaum, Maarten van Steen: "Distributed


Systems", Prentice-Hall International, 2002.

Mukesh Singhal, Niranjan G. Shivaratri: "Advanced


Concepts in Operating Systems. Distributed,
Database, and Multiprocessor Operating
Systems", McGraw-Hill, 1994.

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 3 Distributed Systems Fö 1 - 4

Course Information (cont’d) DISTRIBUTED SYSTEMS

Basic Issues

Labss&Lessons:

Jakob Rosén 1. What is a Distributed System?


Institutionen för Datavetenskap (IDA)
email: jakro@ida.liu.se
http://www.ida.liu.se/~jakro 2. Examples of Distributed Systems
phone: 28 4046
B building, 329:224
3. Advantages and Disadvantages

Sergiu Rafiliu
Institutionen för Datavetenskap (IDA) 4. Design Issues with Distributed Systems
email: serra@ida.liu.se
http://www.ida.liu.se/~serra
phone: 28 2281 5. Preliminary Course Topics
B building, 329:228

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 5 Distributed Systems Fö 1 - 6

What is a Distributed System? Examples of Distributed Systems

A distributed system is a collection of autonomous Network of workstations


computers linked by a computer network that appear to
the users of the system as a single computer.
Workstations
Some comments:

• System architecture: the machines are


autonomous; this means they are computers which,
in principle, could work independently;
Local area network
• The user’s perception: the distributed system is
perceived as a single system solving a certain
problem (even though, in reality, we have several
computers placed in different locations).
Gateway to
Wide Area
Network
By running a distributed system software the computers File servers
are enabled to:
- coordinate their activities
- share resources: hardware, software, data. • Personal workstations + processors not assigned to
specific users.
• Single file system, with all files accessible from all
☞ According to this definition, the Internet as such, is machines in the same way and using the same
not a distributed system, but an infrastructure on path name.
which to implement distributed applications/services
(such as the World Wide Web). • For a certain command the system can look for the
best place (workstation) to execute it.

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 7 Distributed Systems Fö 1 - 8

Examples of Distributed Systems (cont’d) Examples of Distributed Systems (cont’d)

Automatic banking (teller machine) system Automotive system (a distributed real-time system)

Bank_2 data Bank_2 backup Actuators Sensors X-by wire


Teller
machines Input/Output Adaptive
FPGA Anti-lock cruise
breaking control
RAM
CPU
CACHE
Network Interf.

Gateway
Safety critical network

Engine control Trottle control

Teller
machines
Bank_1 data Bank_1 backup
Gateway
• Primary requirements: security and reliability. Non-safety critical high-speed network
• Consistency of replicated data.
• Concurrent transactions (operations which involve Non-safety critical low-speed network
accounts in different banks; simultaneous access Entertainment network
from several users, etc).
• Fault tolerance

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 9 Distributed Systems Fö 1 - 10

Why do we Need Them?


Advantages of Distributed Systems

Examples of Distributed Systems (cont’d)

Performance: very often a collection of processors can


provide higher performance (and better
price/performance ratio) than a
centralized computer.
Distributed Real-Time Systems

Distribution: many applications involve, by their nature,


spatially separated machines (banking,
• Synchronization of physical clocks commercial, automotive system).

• Scheduling with hard time constraints Reliability (fault tolerance): if some of the machines
crash, the system can survive.
• Real-time communication
Incremental growth: as requirements on processing
• Fault tolerance power grow, new machines can be added
incrementally.

Sharing of data/resources: shared data is essential to


many applications (banking, computer-
supported cooperative work, reservation
systems); other resources can be also
shared (e.g. expensive printers).

Communication: facilitates human-to-human


communication.

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 11 Distributed Systems Fö 1 - 12

Disadvantages of Distributed Systems Design Issues with Distributed Systems

Design issues that arise specifically from the distributed


Difficulties of developing distributed software: how nature of the application:
should operating systems, programming
languages and applications look like?
• Transparency

• Communication
Networking problems: several problems are created by
the network infrastructure, which have to • Performance & scalability
be dealt with: loss of messages,
overloading, ... • Heterogeneity

• Openness

Security problems: sharing generates the problem of • Reliability & fault tolerance
data security.
• Security

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 13 Distributed Systems Fö 1 - 14

Transparency Transparency (cont’d)

• Replication transparency
☞ How to achieve the single system image? - the system is free to make additional copies of
files and other resources (for purpose of
☞ How to "fool" everyone into thinking that the collection performance and/or reliability), without the
of machines is a "simple" computer? users noticing.
Example: several copies of a file; at a certain
request that copy is accessed which is the
closest to the client.
• Access transparency
• Concurrency transparency
- local and remote resources are accessed using
identical operations. - the users will not notice the existence of other
users in the system (even if they access the
same resources).
• Location transparency
- users cannot tell where hardware and software • Failure transparency
resources (CPUs, files, data bases) are locat-
ed; the name of the resource shouldn’t encode - applications should be able to complete their
the location of the resource. task despite failures occurring in certain
components of the system.
• Migration (mobility) transparency
• Performance transparency
- resources should be free to move from one
location to another without having their names - load variation should not lead to performance
changed. degradation.
This could be achieved by automatic
reconfiguration as response to changes of the
load; it is difficult to achieve.

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 15 Distributed Systems Fö 1 - 16

Communication Performance and Scalability

Several factors are influencing the performance of a


☞ Components of a distributed system have to distributed system:
communicate in order to interact. This implies
support at two levels: • The performance of individual workstations.

1. Networking infrastructure (interconnections & • The speed of the communication infrastructure.


network software).
• Extent to which reliability (fault tolerance) is
2. Appropriate communication primitives and models provided (replication and preservation of coherence
and their implementation: imply large overheads).
• communication primitives:
• Flexibility in workload allocation: for example, idle
- send
message passing processors (workstations) could be allocated
- receive automatically to a user’s task.
- remote procedure call (RPC)
• communication models
Scalability
- client-server communication: implies a
message exchange between two proc- The system should remain efficient even with a
esses: the process which requests a serv- significant increase in the number of users and
ice and the one which provides it; resources connected:
- group muticast: the target of a message is - cost of adding resources should be reasonable;
a set of processes, which are members of - performance loss with increased number of
a given group. users and resources should be controlled;
- software resources should not run out (number
of bits allocated to addresses, number of
entries in tables, etc.)

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 17 Distributed Systems Fö 1 - 18

Heterogeneity Openness

☞ Distributed applications are typically heterogeneous: ☞ One of the important features of distributed systems
- different hardware: mainframes, workstations, is openness and flexibility:
PCs, servers, etc.; - every service is equally accessible to every
- different software: UNIX, MS Windows, IBM OS/2, client (local or remote);
Real-time OSs, etc.; - it is easy to implement, install and debug new
- unconventional devices: teller machines, services;
telephone switches, robots, manufacturing - users can write and install their own services.
systems, etc.;
- diverse networks and protocols: Ethernet,
FDDI, ATM, TCP/IP, Novell Netware, etc.
☞ Key aspect of openness:
- Standard interfaces and protocols (like Internet
The solution communication protocols)
Middleware, an additional software layer to mask - Support of heterogeneity (by adequate
heterogeneity middleware, like CORBA)

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 19 Distributed Systems Fö 1 - 20

Openness (cont’d) Openness (cont’d)

Software Architecture: The same, looking at two distributed nodes:

Applications & Services


Applications & Services

Middleware
Middleware
platform 1

Operating Operating
platform 2

System System
Operating System
"the platform" Hardware: Hardware:
Comp.&Netw. Comp.&Netw.
Hardware: Computer&Network
Node 1 Node 2

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 21 Distributed Systems Fö 1 - 22

Reliability and Fault Tolerance Security

☞ One of the main goals of building distributed systems


is improvement of reliability. Security of information resources:

Availability: If machines go down, the system should work 1. Confidentiality


with the reduced amount of resources. Protection against disclosure to unauthorised person
• There should be a very small number of critical
resources; 2. Integrity
critical resources: resources which have to be up in Protection against alteration and corruption
order the distributed system to work.
• Key pieces of hardware and software (critical 3. Availability
resources) should be replicated ⇒ if one of them Keep the resource accessible
fails another one takes up - redundancy.

Data on the system must not be lost, and copies stored Distributed systems should allow
redundantly on different servers must be kept consistent. communication between programs/users/
resources on different computers.
• The more copies kept, the better the availability, but
keeping consistency becomes more difficult.

Fault-tolerance is a main issue related to reliability: the


system has to detect faults and act in a reasonable way: Security risks associated with free access.
• mask the fault: continue to work with possibly
reduced performance but without loss of data/
information.
The appropriate use of resources by different users has
• fail gracefully: react to the fault in a predictable way to be guaranteed.
and possibly stop functionality for a short period,
but without loss of data/information.

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH

Distributed Systems Fö 1 - 23 Distributed Systems Fö 1 - 24

Course Topics at a Glance Course Topics

• Introduction
Basics - just finished!
• Introduction
• Models of Distributed Systems • Communication in Distributed Systems
• Communication in Distributed Systems - Message passing and the client/server model
- Remote Procedure Call
Middleware - Group Communication
• Distributed Heterogeneous Applications and CORBA
• Distributed Heterogeneous Applications and CORBA
• Peer-to-Peer Systems
- Heterogeneity in distributed systems
- Middleware
Theoretical Aspects/Distributed Algorithms
- Objects in distributed systems
• Time and State in Distributed Systems
- The CORBA approach
• Distributed Mutual Exclusion
• Election and Agreement
• Peer-to-Peer Systems
- Basic design issues
Distributed Data and Fault Tolerance
- The Napster file sharing system
• Replication
- Peer-to-peer middleware
• Recovery and Fault Tolerance
• Time and State in Distributed Systems
Distributed Real-Time Systems
- Time in distributed systems
- Logical clocks
- Vector clocks
- Causal ordering of messages
- Global states and state recording

Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH


Distributed Systems Fö 1 - 25

Course Topics (cont’d)

• Distributed Mutual Exclusion


- Mutual exclusion in distributes systems
- Non-token based algorithms
- Token based algorithms
- Distributed elections

• Replication
- Motivation for replication
- Consistency and ordering
- Total and causal ordering
- Update protocols and voting

• Recovery and Fault Tolerance


- Transaction recovery
- Checkpointing and recovery
- Fault tolerance in distributed systems
- Hardware and software redundancy
- Byzantine agreement

• Distributed Real-Time Systems


- Physical Clocks
- Clock Synchronization
- Real-Time Scheduling
- Real-Time Communication

Petru Eles, IDA, LiTH

Das könnte Ihnen auch gefallen