
Distributed Systems

Middleware

Prof. Dr.-Ing. Torben Weis


University Duisburg-Essen
Outline

Remote procedure calls (RPC)
Middleware
Distributed objects

Online Shop Example

Implementing an online shop client and server with just sockets is tedious
Smart approach:
public Receipt order(Book[] books)
public Book[] search(String keyword)
Client calls procedures on the server as if they were local procedures -> RPC

RPC Overall Goal I

Provide distribution transparency
Programming as if there is no distribution
Hunt for the holy grail
We will get quite close, but we will never get hold of the grail
"We argue that objects that interact in a distributed system need to be dealt with in ways that are intrinsically different from objects that interact in a single address space. These differences are required because distributed systems require that the programmer be aware of latency, have a different model of memory access, and take into account issues of concurrency and partial failure." [1]
Degenerated case: Programming as if there is only distribution
[1] S. C. Kendall, J. Waldo, A. Wollrath and G. Wyant: A Note on Distributed Computing

RPC Overall Goal II

Hide expert knowledge in the tool chain:
Sockets
Session management
Data representation
Data transformation
Service discovery
... you name it

Remote Procedure Call

RPC sits on top of the transport layer
Hides network communication from the application programmer, i.e. builds an abstraction
Sockets etc. are not visible to the application programmer
Usually a request-reply protocol is specified:
Procedure invocation message (request)
Procedure result message (reply)
RPC is responsible for:
Marshalling & unmarshalling data (parameters and results)
External data representation
Addressing

Problems

Heterogeneity
Different data representations (little-endian vs. big-endian, ASCII vs. EBCDIC)
Addressing
How to identify a remote process?
Partial failure
What happens if the server crashes during execution?
Can we guarantee at-least-once semantics?
What happens if the client crashes?
Can we detect and remove orphans?
What happens if a crashed machine is rebooted?
Can addresses survive a reboot?
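
The heterogeneity problem can be made concrete with a few bytes. The following is a minimal sketch in Python (chosen here for illustration; the slides do not prescribe a language). It packs the same 32-bit integer in both byte orders with the standard `struct` module and encodes the same character in ASCII and in EBCDIC. Code page `cp037` is assumed as the EBCDIC variant.

```python
import struct

value = 1

# The same integer, laid out by a little-endian and a big-endian machine.
little = struct.pack("<i", value)   # least significant byte first
big = struct.pack(">i", value)      # most significant byte first

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack(">i", little)[0]

# The same character has different byte values in ASCII and EBCDIC.
ascii_bytes = "A".encode("ascii")   # 0x41
ebcdic_bytes = "A".encode("cp037")  # 0xC1 (cp037 is an assumed EBCDIC code page)

print(little.hex(), big.hex(), misread)
print(ascii_bytes.hex(), ebcdic_bytes.hex())
```

An external data representation avoids the misreading by fixing one byte order and one character encoding for everything that goes on the wire.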

How is it implemented?

Client-side proxy
Implements the interface of the target procedure on the client side
Client calls this interface locally (-> transparency)
Procedure is invoked on a remote machine
Server-side proxy
Implements the interface of the target procedure on the server side
Incoming requests are dispatched to this interface locally
Server does not realize that the call is a remote call
The result: almost distribution transparency, as long as there is no failure

Proxy Generation

We need a tool that generates the proxies
(to be continued)

Flow of Action I

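[Figure: timeline of a synchronous RPC. The client calls the remote procedure, sends the request, and waits for the result. The server, which was waiting for a request, calls the local procedure, returns the result in the reply, and waits for the next request. The client then returns from the call.]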

Flow of Action II

1. Client calls procedure on client proxy locally
2. Client proxy marshals parameters and sends message to server proxy
3. Server proxy unmarshals parameters and calls server procedure locally
4. Procedure computes result and returns result to the server proxy
5. Server proxy marshals result and sends message to client proxy
6. Client proxy unmarshals result and returns to client
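
The six steps can be sketched in a single Python process. This is an illustration under stated assumptions, not how a real RPC system is built: JSON stands in for the external data representation, the `transport` function stands in for the network, and the catalogue, `ClientProxy`, and `server_proxy` names are hypothetical.

```python
import json

def search(keyword):
    """Server procedure (step 4): looks up titles in a hypothetical catalogue."""
    catalogue = ["Distributed Systems", "Operating Systems", "Compilers"]
    return [title for title in catalogue if keyword.lower() in title.lower()]

PROCEDURES = {"search": search}

def server_proxy(request_bytes):
    # Step 3: unmarshal the parameters and call the server procedure locally.
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["params"])
    # Step 5: marshal the result into the reply message.
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    """Stand-in for the network: hands the request bytes to the server proxy."""
    return server_proxy(request_bytes)

class ClientProxy:
    def search(self, keyword):
        # Step 2: marshal the parameters and send the request message.
        request = json.dumps({"proc": "search", "params": [keyword]})
        reply = transport(request.encode("utf-8"))
        # Step 6: unmarshal the result and return it to the client.
        return json.loads(reply.decode("utf-8"))["result"]

# Step 1: the client calls the procedure on the proxy as if it were local.
shop = ClientProxy()
titles = shop.search("systems")
print(titles)
```

Only bytes cross the `transport` boundary, so replacing it with a socket would leave the client code unchanged.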

Book Store Example

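[Figure: on the client, the client process calls order(Book[] books) on the client proxy, which sends the message (order, books) to the server proxy. On the server, the server proxy calls order(Book[] books) in the server process. The receipt travels back as the reply along the same path.]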

Calling Semantics

Local case
Call-by-value
Call-by-reference
Call-by-reference remotely is difficult
Simulate by call-by-copy/restore
Transmit copy of the data and transmit changed copy back
Slightly different semantics!
Increased overhead for collections (graphs, lists, ...)
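
One place where the semantics differ is aliasing. In the sketch below (Python, with the hypothetical helper `call_by_copy_restore` simulating the remote case), the same list is passed as both arguments. With call-by-reference both updates hit the same list. With copy/restore each parameter is copied separately, so one update is lost when the copies are written back.

```python
import copy

def add_twice(a, b):
    """Appends one element to each argument."""
    a.append(1)
    b.append(1)

def call_by_copy_restore(proc, *args):
    # Transmit a separate copy of every parameter ...
    copies = [copy.deepcopy(arg) for arg in args]
    proc(*copies)
    # ... and transmit the changed copies back into the originals.
    for original, changed in zip(args, copies):
        original[:] = changed

local = []
add_twice(local, local)                          # call-by-reference

remote = []
call_by_copy_restore(add_twice, remote, remote)  # simulated remote call

print(local, remote)
```

The deep copies also show where the overhead for large collections comes from: the whole structure is transmitted twice.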

Proxy Generation

We need a tool that generates the proxies
The tool has to know about:
The interface
Procedure names
Parameter types
Return types
Exceptions
Data types

Interface Definition

Interfaces are defined by an Interface Definition Language (IDL)
Language-neutral
Usually C-style syntax
Proxies can be generated from IDL
Different proxies for different programming languages
E.g. client in Java -> client proxy in Java
E.g. server in C -> server proxy in C

IDL Example

module shops {
  interface bookshop {
    struct Book {
      string name;
      long isbn;
    };
    struct Receipt {
      string bank;
      long accountnumber;
      int amount;
    };
    Receipt order(in Book[] books);
    Book[] search(in string keyword);
  };
};

IDL in the Tool Chain

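[Figure: tool chain. The IDL compiler translates shop.idl into shop_cproxy.c, shop_sproxy.c, and the header shop.h. The C compiler turns the proxies into shop_cproxy.o and shop_sproxy.o. shop_client.c and shop_server.c #include shop.h and are compiled to shop_client.o and shop_server.o. The linker produces shop_client.exe on the client side and shop_server.exe on the server side.]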

IDL Pros & Cons

Pros
Language neutral
Cons
Generated interface can be ugly
Example: CORBA and C++
Developers have to master two languages
Requires top-down approach
First IDL, then implementation
Cannot simply use existing code & data structures
Solutions? Yes!
Java RMI or .NET Remoting do not require any IDL

RPC Failure I

Local call failure
When the call fails, the whole program fails
What can go wrong with RPC?
1. Client cannot find the server
2. Client crashes after sending request
3. Request gets lost
4. Server crashes after receiving request, before sending response
5. Response gets lost
6. Client crashes before receiving response

RPC Failure II

Fault detection
Wait for expected response
After timeout -> failure
What is a good timeout?
Maybe the network is too slow
Maybe the other computer is too slow
Usually no real-time guarantees -> calculation of an optimal timeout is impossible
No way to find out what went wrong
Remote machine does not respond. Why?
Machine crashed?
Message loss?

Client Crash

Processes on servers for non-existing clients (orphans)
Blocked resources
Solution
Client sends heartbeat
Server pings client
Pings & heartbeats cost resources and are subject to failures, too
Client restart
Do not mix new with old messages
E.g. counter for every restart
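
The restart counter can be sketched as follows. This is a minimal Python illustration with hypothetical names (`Server`, `handle`, `latest_epoch`): every message carries the sender's restart counter, and the server rejects anything older than the highest counter it has seen from that client.

```python
class Server:
    def __init__(self):
        # client id -> highest restart counter seen so far
        self.latest_epoch = {}

    def handle(self, client_id, epoch, payload):
        known = self.latest_epoch.get(client_id, 0)
        if epoch < known:
            # Message was sent before the client's last restart: drop it.
            return "rejected: stale message"
        self.latest_epoch[client_id] = epoch
        return f"accepted: {payload}"

server = Server()
first = server.handle("client-1", 1, "order A")
second = server.handle("client-1", 2, "order B")          # sent after a restart
late = server.handle("client-1", 1, "order A (delayed)")  # old message arrives late

print(first, second, late, sep="\n")
```

A server could use the same counter to detect orphans: a higher counter from a client means that work started under the old counter no longer has a caller.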

Server Crash

Client gets a timeout waiting for the response
Did the server process the request?
Possible semantics:
Maybe: nothing to be done
At-least-once: repeat until response received
At-most-once: serial numbers for requests
Exactly-once: transactions
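
How serial numbers turn a retrying client into at-most-once execution can be sketched in Python. The names are hypothetical and the lost replies are simulated with a counter. The client resends until it gets a reply; the server remembers the reply for each serial number and does not execute a duplicate again.

```python
class AtMostOnceServer:
    def __init__(self):
        self.balance = 0
        self.replies = {}  # (client id, serial number) -> cached reply

    def deposit(self, client_id, serial, amount):
        key = (client_id, serial)
        if key in self.replies:
            # Duplicate request: resend the old reply, do not execute again.
            return self.replies[key]
        self.balance += amount
        self.replies[key] = self.balance
        return self.balance

def call_with_retries(server, client_id, serial, amount, lost_replies):
    """Resends the same request until a reply gets through."""
    while True:
        reply = server.deposit(client_id, serial, amount)
        if lost_replies > 0:
            lost_replies -= 1  # simulated: reply lost, client times out
            continue
        return reply

server = AtMostOnceServer()
result = call_with_retries(server, "client-1", serial=1, amount=10, lost_replies=2)
print(result, server.balance)
```

Without the reply cache the three transmissions would have deposited the amount three times. The cache here lives in memory only, so this sketch does not cover a server that crashes and loses it.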

RPC Call Types

Synchronous (blocking) call
Parallelism in the distributed system is not exploited
No parallel invocations to multiple servers
Asynchronous RPC
Fire-and-forget
Call function handler once server has finished
Deferred synchronous RPC
Do something else while server is executing
Synchronize at given point
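
Deferred synchronous calls map naturally onto futures. The sketch below uses Python's standard `concurrent.futures` module. The function `remote_search` is a hypothetical stand-in for a remote call, and the sleep simulates network and server time.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def remote_search(server, keyword):
    """Stand-in for a remote call; the sleep simulates the round trip."""
    time.sleep(0.05)
    return f"{server}: results for {keyword}"

with ThreadPoolExecutor(max_workers=2) as pool:
    # Issue both calls without blocking: parallel invocations to two servers.
    future_a = pool.submit(remote_search, "server-a", "rpc")
    future_b = pool.submit(remote_search, "server-b", "rpc")

    # Do something else while the servers are executing.
    local_work = sum(range(1000))

    # Synchronize at a given point: block until both results are available.
    results = [future_a.result(), future_b.result()]

print(local_work, results)
```

Attaching a callback with `future_a.add_done_callback(...)` instead of calling `result()` would correspond to the handler variant of asynchronous RPC.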

Reentrance I

[Figure: Client1 calls proc(x) on Server1; Client2 calls proc(y) on Server1.]
The procedure proc is invoked by Client2 while it is working on the request of Client1
Solution: Serialize requests

Terms: Serialization vs. Serializability

Beware of terms
Serialization of objects/data
Context: data representation
Transforming data to an external format
Also known as marshalling or pickling
Serialization of requests
Context: transaction management
Bringing requests into an order

Reentrance II

[Figure: the Client calls proc(x) on Server1; Server1 calls proc2(x) on Server2; Server2 calls proc(y) back on Server1.]
The procedure proc is invoked by Server2 while it is working on the request of the Client
Serialization is NO solution. It would cause a deadlock
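
The deadlock can be reproduced in a few lines of Python. This is a sketch under stated assumptions: a non-reentrant `threading.Lock` models the serialization of requests on Server1, both servers are plain functions in one process, and a timeout is used so the example terminates instead of hanging.

```python
import threading

server1_lock = threading.Lock()  # serializes all requests to Server1

def server1_proc(value, depth=0):
    # A truly serialized server would block here forever; the timeout only
    # exists so that this sketch can report the deadlock and finish.
    if not server1_lock.acquire(timeout=0.1):
        return "deadlock: Server1 is still busy with the outer request"
    try:
        if depth == 0:
            return server2_proc2(value)  # Server1 calls Server2
        return f"proc done for {value}"
    finally:
        server1_lock.release()

def server2_proc2(value):
    # Server2 calls back into Server1 while the outer request is in progress.
    return server1_proc(value, depth=1)

outcome = server1_proc("x")
print(outcome)
```

The outer request holds the lock until the nested call returns, and the nested call cannot start until the lock is free, so neither can make progress.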

Example RPC Systems

Sun RPC
Used by the Network File System (NFS)
Distributed Computing Environment (DCE) RPC
Basis for Microsoft's DCOM (Distributed Component Object Model)
Microsoft's RPC (and with it DCOM) has been the target of several exploits

Outline

Remote procedure calls
Middleware
Distributed objects

Motivation

Implementing a distributed application on top of sockets is tedious; it means dealing with the challenges of distributed systems on your own:
Distribution transparency
Interoperability (heterogeneity: data representation)
Security
Common services: naming, persistency, events, transactions, ...
Higher level of abstraction: middleware
Deals with challenges of distributed systems (to a certain degree)
Provides additional services (e.g. naming, persistency, ...)

Definition

There is no good definition for middleware.
"The slash between client/server."
"Middleware is the intersection of the stuff that network engineers don't want to do with the stuff that application developers don't want to do."

Classical Approach

[Figure: Computers A, B, and C each run their own part of the distributed application directly on top of their own network services.]

Middleware

[Figure: one distributed application spans Computers A, B, and C. A common middleware layer sits between the application and the network services of each computer.]

Communication Models

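Message passing: client and server exchange messages with send() and receive()
Virtual shared memory: client and server communicate with write() and read() on shared data
Remote procedure call: client calls proc() on the server
Distributed object systems: client calls o.operate() on an object at the server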

Services

Persistency
Security
Naming
Events
Transaction
Trader
Accounting

Examples

Distributed object system: OMG CORBA
Remote procedure call: DCE RPC
Message passing: IBM MQSeries
Virtual shared memory: Linda

Outline

Remote procedure calls
Middleware
Distributed object systems

Distributed Objects

Today's commonly used programming languages are object-oriented
Remote objects
Objects that can receive method invocations from objects in other processes
Including processes on a different machine
Objects get a remote interface
Defined in IDL (generic)
Or by a programming language-specific interface
Client needs a remote object reference to perform a remote method invocation

Example

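[Figure: objects O1 to O5 are spread over Process A1 on Machine A and Processes B1 and B2 on Machine B. Invocations between objects in the same process are local invocations; invocations that cross a process or machine boundary are remote invocations.]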

Why (Distributed) Object Systems?

Middleware provides a programming abstraction
Programming languages changed
C, Pascal, Basic -> C++, Java, C#, Delphi, VisualBasic.NET
Object-orientation is the most prominent paradigm
Extend it to the remote case
Objects are well suited for proxies
Objects provide a public interface
The implementation is not visible to the outside
A proxy is just an object with a special (remote) implementation

Concept similar to RPC

1. Client calls method on client proxy object
2. Client proxy object marshals parameters and sends message to server proxy object
3. Server proxy object unmarshals parameters and calls server method locally
4. Method computes result and returns result to the server proxy object
5. Server proxy object marshals result and sends message to client proxy object
6. Client proxy object unmarshals result and returns to client

Flow Schematics

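[Figure: on the client, the client object calls order(Book[] books) on its proxy object, which sends the message (order, books) to the proxy object on the server. That proxy object calls order(Book[] books) on the remote object. The receipt is returned along the same path.]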

Difference to RPC

class A { void foo( int x ); };
vs.
void foo( A* object, int x );
In the local case there is no difference
In the remote case there is a difference
A* object is a pointer that cannot be marshalled
Distributed object systems introduce object references
An object reference is the remote equivalent of a pointer

Remote Object References

Similar to local object references
Uniquely identify an object in the distributed system
Can be passed between processes on different machines
E.g. host, port, object key
Client must bind to an object using the reference
Binding builds a proxy on the client side
Remote methods can be invoked on the proxy
Reference must contain enough information to allow binding (e.g. endpoint)
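
A reference of the form (host, port, object key) and the binding step can be sketched in Python. All names here (`RemoteObjectReference`, `Proxy`, `bind`, `fake_send`, the host and port) are hypothetical, and the transport is replaced by a function that only reports where the call would be sent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RemoteObjectReference:
    """Plain data, so it can be marshalled and passed between processes."""
    host: str
    port: int
    object_key: str

class Proxy:
    def __init__(self, reference, send):
        self._reference = reference
        self._send = send  # transport function, injected for this sketch

    def invoke(self, method, *params):
        # Every call carries the reference so the server can find the object.
        return self._send(self._reference, method, params)

def bind(reference, send):
    """Binding builds a proxy on the client side from the reference."""
    return Proxy(reference, send)

def fake_send(reference, method, params):
    """Stand-in for the network: reports the endpoint instead of contacting it."""
    endpoint = f"{reference.host}:{reference.port}"
    return f"{method} on {reference.object_key} at {endpoint} with {list(params)}"

ref = RemoteObjectReference("shop.example.org", 4711, "bookshop")
shop = bind(ref, fake_send)
reply = shop.invoke("search", "rpc")
print(reply)
```

Because the reference is an immutable value rather than a pointer, two processes holding equal references address the same remote object.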

Remote Object Activation

Bookstore example:
We treat every book as an object
Every remotely accessible object has a remote object reference
However, books are stored in a database
We cannot hold all book objects in memory
Solution
Create object references for virtual objects, for example (www.mybookstore.com, 80, ISBN:1-12345-434-5)
Virtual objects are incarnated (i.e. created from the database) upon invocation
They are garbage collected afterwards
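
Activation on demand can be sketched in Python. The dictionary `DATABASE`, its contents, and the `Book` and `ObjectAdapter` classes are hypothetical stand-ins. The reference has the shape used in the example above, and the object exists only for the duration of one invocation.

```python
# Hypothetical stand-in for the bookstore database, keyed by ISBN.
DATABASE = {"1-12345-434-5": {"title": "Example Book", "price": 20}}

class Book:
    def __init__(self, isbn, row):
        self.isbn = isbn
        self.title = row["title"]
        self.price = row["price"]

    def get_price(self):
        return self.price

class ObjectAdapter:
    def __init__(self):
        self.incarnations = 0  # counts how often an object was created

    def invoke(self, reference, method, *params):
        host, port, object_key = reference
        isbn = object_key.split(":", 1)[1]
        # Activate: incarnate the virtual object from the database.
        book = Book(isbn, DATABASE[isbn])
        self.incarnations += 1
        # Dispatch the method call to the freshly created object.
        return getattr(book, method)(*params)
        # After the return nothing refers to the object any more,
        # so it can be garbage collected.

adapter = ObjectAdapter()
reference = ("www.mybookstore.com", 80, "ISBN:1-12345-434-5")
price = adapter.invoke(reference, "get_price")
print(price, adapter.incarnations)
```

A real server would typically keep recently used objects in a cache instead of incarnating them again for every call.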

Remote Object Activation

[Figure: inside the server, the proxy object dispatches each method call to the requested remote object. If the object is not in memory, it is activated first, i.e. incarnated from the database.]

Distributed Objects Realization

Language integrated
Definition of remote objects at language level
Easy to use
Language dependent
E.g. Java RMI
Language independent
IDL to specify interface
Objects can be implemented in any language
Even in a procedural language, using procedures and data structures as object state
More programming overhead
E.g. CORBA

Static vs. Dynamic Invocation

Static invocation
Interface of the remote objects is known while the client is being developed
Client must be recompiled when the interface changes
Example: C++, Java
Dynamic invocation
Compose method invocation at runtime
Inspect target object, or interface is implicit in client implementation
Available methods, parameters, ...
any invoke(object, method, parameters[])
Typically used for interpreted languages & scripting languages, e.g. TCL, Python, Ruby, ...
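
In a language with reflection, the generic invoke operation takes only a few lines. The Python sketch below composes a call at runtime from a method name and a parameter list and also lists the available methods of the target. The `Bookshop` class is a local, hypothetical stand-in for a remote object.

```python
class Bookshop:
    """Hypothetical target object; in a real system this would be remote."""
    def search(self, keyword):
        return [f"book about {keyword}"]

    def order(self, titles):
        return {"items": len(titles)}

def invoke(obj, method, parameters):
    """Generic invoke(object, method, parameters[]) composed at runtime."""
    target = getattr(obj, method, None)
    if not callable(target):
        raise AttributeError(f"no such method: {method}")
    return target(*parameters)

def available_methods(obj):
    """Inspects the target object for its public methods."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )

shop = Bookshop()
methods = available_methods(shop)
found = invoke(shop, "search", ["rpc"])
receipt = invoke(shop, "order", [found])
print(methods, found, receipt)
```

The client needs no generated proxy and no recompilation when the interface changes, but a misspelled method name is only detected at runtime.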