
University of Pennsylvania

Distributed Systems


Introduction to Distributed Systems


Why do we develop distributed systems?
• Availability of powerful yet cheap microprocessors (PCs, workstations) and continuing advances in communication technology.

What is a distributed system?


A distributed system is a collection of independent computers that appears to its users as a single system.
Examples:
• Network of workstations
• Distributed manufacturing system (e.g., automated assembly
line)
• Network of branch office computers

Advantages of Distributed Systems over Centralized Systems

• Economics: A collection of microprocessors offers a better price/performance ratio than mainframes; a low price/performance ratio is a cost-effective way to increase computing power.
• Speed: A distributed system may have more total computing power than a mainframe. Ex.: 10,000 CPU chips, each running at 50 MIPS. It is not possible to build a single 500,000-MIPS processor, since that would require a 0.002 nsec instruction cycle. Performance is enhanced by distributing the load.
• Inherent distribution: Some applications are inherently distributed. Ex.: a supermarket chain.
• Reliability: If one machine crashes, the system as a whole can still survive. Higher availability and improved reliability.
• Incremental growth: Computing power can be added in small increments. Modular expandability.
• Another driving force: the existence of large numbers of personal computers and the need for people to collaborate and share information.
Advantages of Distributed Systems over Independent PCs


• Data sharing: Allow many users to access a common database
• Resource Sharing: Expensive peripherals like color
printers
• Communication: Enhance human-to-human
communication, e.g., email, chat
• Flexibility: Spread the workload over the available
machines

Disadvantages of Distributed Systems

• Software: It is difficult to develop software for distributed systems
• Network: The network can saturate or cause other problems
• Security: Easy access also applies to secret data


Hardware Concepts
All distributed systems consist of multiple CPUs. There are several different ways the hardware can be organized, in terms of how the CPUs are interconnected and how they communicate.
Flynn classified computer systems by two characteristics: the number of instruction streams and the number of data streams.
A computer with a single instruction stream and a single data stream is called SISD. All traditional uniprocessor computers fall into this category.
The next category is SIMD: single instruction stream, multiple data streams. This type refers to array processors with one instruction unit that fetches an instruction and then commands many data units to carry it out in parallel, each with its own data. Some supercomputers are SIMD.
The next category is MISD: multiple instruction streams, single data stream. No known computers fit this model.
Finally, MIMD means multiple instruction streams and multiple data streams, which essentially describes a group of independent computers, each with its own program counter, program, and data. All distributed systems are MIMD.
[Figure: a taxonomy of parallel and distributed computer systems.]

MIMD (Multiple-Instruction Multiple-Data)


Tightly Coupled versus Loosely Coupled
• Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
• Loosely coupled systems (multicomputers)
o private memory
o intermachine delay long, data rate low


Bus versus Switched MIMD

• Bus: a single network, backplane, bus, cable, or other medium that connects all machines. E.g., cable TV.
• Switched: individual wires from machine to machine, with
many different wiring patterns in use.
Multiprocessors (shared memory)
– Bus
– Switched
Multicomputers (private memory)
– Bus
– Switched


Switched Multiprocessors

[Figures: a crossbar switch; an omega switching network.]

• For connecting a large number (say, over 64) of processors
• crossbar switch: n**2 switch points
• omega network: 2×2 switches; for n CPUs and n memories, log2 n switching stages, each with n/2 switches, for a total of (n log2 n)/2 switches
• delay problem: e.g., with n = 1024 there are 10 switching stages from CPU to memory and another 10 back, a total of 20. At 100 MIPS the instruction execution time is 10 nsec, so a 0.5 nsec switching time is needed (see the sketch below)
• NUMA (Non-Uniform Memory Access): placement of program and data matters
• building a large, tightly-coupled, shared-memory multiprocessor is possible, but it is difficult and expensive
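The switch counts and the delay figure above can be reproduced with a short calculation. The following is a minimal sketch (the function name is ours, not from the slides), assuming an n × n omega network and the 100-MIPS CPUs of the example:

    import math

    def omega_network_stats(n_cpus, mips):
        """Switch counts and per-switch timing budget for an omega network."""
        stages = int(math.log2(n_cpus))      # log2(n) switching stages one way
        switches = stages * n_cpus // 2      # each stage holds n/2 2x2 switches
        instr_time_ns = 1000.0 / mips        # 100 MIPS -> 10 nsec per instruction
        hops = 2 * stages                    # CPU -> memory -> CPU crosses the net twice
        return stages, switches, hops, instr_time_ns / hops

    # n = 1024 at 100 MIPS: 10 stages, 5120 switches, 20 hops, 0.5 nsec per switch
    print(omega_network_stats(1024, 100))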

Multicomputers

Bus-Based Multicomputers

[Figure: a multicomputer consisting of workstations on a LAN.]


• easy to build
• communication volume is much smaller
• relatively slow LAN speed (10-100 Mbps, compared to 300 Mbps and up for a backplane bus)


Switched Multicomputers

• interconnection networks: e.g., grid, hypercube
• hypercube: an n-dimensional cube
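As a small illustration of the hypercube topology (a sketch of ours, not part of the slides): in an n-dimensional hypercube each of the 2**n nodes is wired to the n nodes whose binary addresses differ from its own in exactly one bit.

    def hypercube_neighbors(node, n):
        """Neighbors of a node in an n-dimensional hypercube: flip one address bit."""
        return [node ^ (1 << bit) for bit in range(n)]

    # 3-dimensional cube (8 nodes): node 5 = 101 links to 100, 111, and 001.
    print(hypercube_neighbors(5, 3))   # [4, 7, 1]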


Software Concepts

• For users, the software is even more important than the hardware
• Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing


Network Operating Systems


• loosely-coupled software on loosely-coupled hardware
• A network of workstations connected by a LAN
• Each machine has a high degree of autonomy
o rlogin machine
o rcp machine1:file1 machine2:file2
• File servers: client-server model
• Clients mount directories on file servers
• Best known network OS:
o Sun's NFS (Network File System) for shared file systems
• A few system-wide requirements: the format and meaning of all the messages exchanged


NFS
NFS Architecture
• Server exports directories
• Clients mount exported directories
NFS Protocols
• For handling mounting
• For read/write: no open/close, stateless
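To make "stateless" concrete, here is a minimal sketch (hypothetical names, not the actual NFS protocol or API): every read request carries the file handle, offset, and count, so the server never has to remember which files a client has open.

    def nfs_style_read(file_handle, offset, count, file_store):
        """Stateless read: each request carries everything the server needs."""
        data = file_store[file_handle]
        return data[offset:offset + count]

    # The client, not the server, keeps track of its position in the file.
    store = {"fh42": b"hello, distributed world"}
    print(nfs_style_read("fh42", 0, 5, store))    # b'hello'
    print(nfs_style_read("fh42", 7, 11, store))   # b'distributed'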
NFS Implementation


(True) Distributed Systems

• tightly-coupled software on loosely-coupled hardware
• provides a single-system image or a virtual uniprocessor
• a single, global interprocess communication mechanism, process management, and file system; the same system call interface everywhere
• Ideal definition:
“A distributed system runs on a collection of computers that do not have shared memory, yet looks like a single computer to its users.”

Multiprocessor Operating Systems

Tightly-coupled software on tightly-coupled hardware


• Examples: high-performance servers
• shared memory
• single run queue
• traditional file system as on a single-processor system: central block cache

Comparison of three different ways of organizing n CPUs

Item                                                | Network OS   | Distributed OS | Multiprocessor OS
Does it look like a virtual uniprocessor?           | No           | Yes            | Yes
Do all have to run the same operating system?       | No           | Yes            | Yes
How many copies of the operating system are there?  | N            | N              | 1
How is communication achieved?                      | Shared files | Messages       | Shared memory
Are agreed-upon network protocols required?         | Yes          | Yes            | No
Is there a single run queue?                        | No           | No             | Yes
Does file sharing have well-defined semantics?      | Usually no   | Yes            | Yes

Design Issues of Distributed Systems

• Transparency
• Flexibility
• Reliability
• Performance
• Scalability


1. Transparency
• How to achieve the single-system image, i.e., how to make a
collection of computers appear as a single computer.
• Hiding all the distribution from the users as well as the
application programs can be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to
programs.
Both 1) and 2) require uniform interfaces, e.g., for access to files and for communication.


Types of transparency

– Location Transparency: users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
– Migration Transparency: resources must be free to move from one location to another without their names changing. E.g., /usr/lee, /central/usr/lee
– Replication Transparency: the OS can make additional copies of files and resources without users noticing.
– Concurrency Transparency: users are not aware of the existence of other users. The system must allow multiple users to concurrently access the same resource, using lock and unlock for mutual exclusion (see the sketch below).
– Parallelism Transparency: automatic use of parallelism without having to program it explicitly. The holy grail for distributed and parallel system designers.
Users do not always want complete transparency: e.g., a fancy printer 1000 miles away.
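A minimal sketch of the lock/unlock idea behind concurrency transparency, using Python's standard threading module (the counter example is ours, not from the slides): the lock serializes access to the shared resource so concurrent updates are not lost.

    import threading

    counter = 0
    lock = threading.Lock()

    def deposit(times):
        global counter
        for _ in range(times):
            with lock:               # lock ... unlock around the critical section
                counter += 1

    threads = [threading.Thread(target=deposit, args=(100_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)   # always 200000 with the lock; updates may be lost without it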


2. Flexibility

• Make it easier to change.
• Monolithic Kernel: system calls are trapped and executed by the kernel; all system calls are served by the kernel, e.g., UNIX.
• Microkernel: the kernel provides only minimal services:
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level I/O
E.g., Mach can support multiple file systems and multiple system interfaces.

3. Reliability

• A distributed system should be more reliable than a single system. Example: 3 machines, each up with probability 0.95; the probability that at least one is up is 1 - 0.05**3 = 0.999875 (see the sketch below).
– Availability: fraction of time the system is usable. Redundancy improves it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover from errors.
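A small sketch of the availability arithmetic above (the helper function is ours): with n independent machines, each up with probability p, the system is usable whenever at least one machine is up.

    def availability(p_up, n):
        """Probability that at least one of n independent machines is up."""
        return 1 - (1 - p_up) ** n

    print(availability(0.95, 1))   # 0.95 for a single machine
    print(availability(0.95, 3))   # ~0.999875 = 1 - 0.05**3 for three machines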


4. Performance

• Without a gain here, why bother with distributed systems?
• Performance loss due to communication delays:
– fine-grain parallelism: a high degree of interaction, so delays hurt most
– coarse-grain parallelism: little interaction, so delays matter less
• Performance loss due to making the system fault tolerant.


5. Scalability

• Systems grow with time or become obsolete. Techniques that require resources that grow linearly with the size of the system are not scalable; e.g., broadcast-based queries won't work for large distributed systems.
• Examples of bottlenecks
o Centralized components: a single mail server
o Centralized tables: a single URL address book
o Centralized algorithms: routing based on complete
information
