Chap 2

Message passing
• Key concepts
  - Introduction
  - IPC
  - Remote Procedure Calls
  - Group communication
Introduction
• In a distributed system, processes executing on different computers often need to communicate with each other to achieve some common goal
• Interprocess communication (IPC) requires information sharing among two or more processes
• The two basic methods for information sharing are
  1. Original sharing, or the shared-memory approach
  2. Copy sharing, or the message-passing approach
• Shared-memory approach
  Figure: one process writes A into a shared common memory area, and another process reads A from it
• Message-passing approach
  Figure: one process executes Send A, and the other executes Receive A
• A message-passing system is a subsystem of a distributed OS that provides a set of message-based IPC protocols
• It serves as a suitable infrastructure for building other higher-level IPC systems, such as remote procedure call (RPC) and distributed shared memory (DSM)
Desirable features of a good message-passing system
• Simplicity
• Uniform semantics for
  - Local communication
  - Remote communication
• Efficiency
  - If the message-passing system is not efficient, IPC becomes expensive and users will avoid the mechanism
  - Some optimizations normally adopted for efficiency:
    o Avoiding the cost of establishing and terminating a connection between the processes for each and every message exchange
    o Minimizing the cost of maintaining connections
    o Piggybacking of acknowledgements
• Reliability
  - Distributed systems are prone to node crashes and link failures, so a lost message may have to be retransmitted (typically based on timeouts)
  - Timeout-based retransmission can produce duplicate messages
  - A good message-passing system should have IPC protocols to handle these issues
• Correctness (related to group communication)
  - Atomicity: a message is delivered either to all receivers or to none
  - Ordered delivery: messages arrive in an order acceptable to the application
  - Survivability: message delivery is guaranteed despite failures
• Flexibility
  - IPC primitives must be flexible enough to permit any kind of control flow between the cooperating processes, including synchronous and asynchronous send and receive
• Security
  - Authentication of the receiver and the sender
  - Encryption of messages
• Portability
  - Two aspects of portability:
    o The message-passing system itself should be portable
    o Applications written using the IPC primitives of the message-passing system should be portable. So heterogeneity must be considered while designing a message-passing system
Issues in IPC by message passing
• A message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process
• It consists of a fixed-length header and a variable-size collection of typed data objects
• The header consists of:
  - Address: identifies the sending and receiving processes
  - Sequence number: a message identifier used to detect lost and duplicate messages
  - Structural information
    1. Type: actual data or a pointer to the data
    2. Length of the variable-size message data
Figure: A typical message structure
  Variable-size collection of typed data:
    - Actual data or pointer to the data
  Fixed-length header:
    - Structural information: number of bytes/elements, and type
    - Sequence number or message ID
    - Addresses: receiving process address and sending process address
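As a minimal sketch of this message structure, the following Python code places a fixed-length header in front of variable-size data. The field widths, the integer process addresses, the type codes, and the names `pack_message` and `unpack_message` are assumptions made for illustration; the slides do not prescribe a concrete layout.

```python
import struct
from dataclasses import dataclass

# Assumed layout: receiver address, sender address, sequence number,
# type code, and data length, all in network byte order.
HEADER_FORMAT = "!IIIBI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # fixed-length header

TYPE_DATA = 0     # assumed code: the message carries the actual data
TYPE_POINTER = 1  # assumed code: the message carries a pointer to the data


@dataclass
class Header:
    receiver: int
    sender: int
    seq_no: int
    data_type: int
    length: int


def pack_message(receiver, sender, seq_no, payload):
    """Prefix the variable-size payload with the fixed-length header."""
    header = struct.pack(HEADER_FORMAT, receiver, sender, seq_no,
                         TYPE_DATA, len(payload))
    return header + payload


def unpack_message(raw):
    """Split a raw message into its header fields and its payload."""
    header = Header(*struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE]))
    payload = raw[HEADER_SIZE:HEADER_SIZE + header.length]
    return header, payload
```

Because the header has a fixed size, the receiver can always read it first and then use the length field to find the end of the variable-size part.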
• In the design of an IPC protocol, the following important issues need to be considered:
  - Who is the sender?
  - Who is the receiver?
  - Is there one receiver or many receivers?
  - Is the message guaranteed to have been accepted by its receiver(s)?
  - Does the sender need to wait for a reply?
  - What should be done if a node crash or link failure occurs?
  - What should be done if the receiver is not ready to accept the message?
  - If there are several outstanding messages for a receiver, can it choose the order in which to service them?
• Issues in IPC are addressed by:
  - Synchronization
  - Buffering
  - Multidatagram messages
  - Encoding and decoding
  - Process addressing
  - Failure handling
  - Group communication
Synchronization
• Semantics used for synchronization may be broadly classified as
  - Blocking: invoking the primitive blocks the execution of its invoker
  - Nonblocking: invoking the primitive does not block the execution of its invoker
• How does a nonblocking receiving process learn that a message has arrived?
  - Polling: the receiver periodically polls the kernel to check the buffer status
  - Interrupt: when a message has been placed in the buffer, a software interrupt notifies the receiving process
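The difference between a blocking receive and a nonblocking receive with polling can be sketched as follows. This is only an illustration inside one Python process: the `mailbox` queue stands in for the kernel's message buffer, and the function names and polling parameters are assumptions, not part of any real IPC API.

```python
import queue
import time

mailbox = queue.Queue()  # stands in for the kernel's message buffer


def send(message):
    mailbox.put(message)


def blocking_receive():
    """Blocking receive: the caller is suspended until a message arrives."""
    return mailbox.get()


def polling_receive(poll_interval=0.01, max_polls=50):
    """Nonblocking receive: check the buffer, do other work, check again."""
    for _ in range(max_polls):
        try:
            return mailbox.get_nowait()  # returns immediately
        except queue.Empty:
            time.sleep(poll_interval)    # placeholder for other useful work
    return None                          # nothing arrived while polling
```

With polling, the receiver keeps control between checks at the price of repeated buffer inspections; an interrupt-based design would avoid those checks but needs support from the runtime.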
Figure: Synchronous mode of communication with both send and receive primitives having blocking-type semantics
  1. The receiver executes Receive (message); its execution is suspended (blocked state)
  2. The sender executes Send (message); its execution is suspended while the message travels to the receiver
  3. The receiver's execution is resumed; it executes Send (acknowledgement)
  4. The sender's execution is resumed when the acknowledgement arrives
Buffering
• Messages can be transmitted from one process to another by copying the body of the message from the address space of the sending process to the address space of the receiving process
• The message-buffering strategy in IPC is strongly related to the synchronization strategy
• Four types of buffering strategy are:
  - Null buffer (or no buffering)
  - Single-message buffer
  - Buffer with unbounded capacity
  - Finite-bound (or multiple-message) buffer
• Null buffer (or no buffering)
  - There is no place to temporarily store the message
  - Strategies used are:
    o The message remains in the sender process's address space, and the execution of the send is delayed until the receiver executes the corresponding receive
    o The message is simply discarded, and the sender resends it after a timeout period
• The logical path of message transfer is directly from the sender's address space to the receiver's address space, involving a single copy operation
Figure: Message transfer in synchronous send with no buffering strategy (MSG is copied directly from the sending process to the receiving process)
• Single-message buffer
  - The null-buffer strategy is not suitable for synchronous communication between processes in a distributed system
    o If the receiver is not ready, a message has to be transferred two or more times, and the receiver has to wait for the entire time taken to transfer the message across the network
  - Synchronous communication mechanisms in distributed systems therefore use a single-message buffer strategy
  - A buffer with the capacity to store a single message is used on the receiver's node
• The idea is to keep the message ready for use at the location of the receiver
• The request message is buffered on the receiver's node if the receiver is not ready to receive it
• The message buffer may be either in the kernel's address space or in the receiver process's address space
Figure: Message transfer in synchronous send with single-message buffering strategy (two copy operations needed: from the sending process across the node boundary into the single-message buffer, then from the buffer to the receiving process)
• Unbounded-capacity buffer
  - In the asynchronous mode of communication, a sender does not wait for the receiver to be ready, so there may be several pending messages that the receiver has not yet accepted
  - Supporting asynchronous communication with the assurance that every message sent to the receiver will be delivered requires an unbounded-capacity buffer that can store all unreceived messages
• Finite-bound (or multiple-message) buffer
  - A buffer of unbounded capacity is practically impossible
  - With a finite-bound buffer, the problem is buffer overflow
  - Buffer overflow can be dealt with in one of two ways:
    o Unsuccessful communication: message transfers simply fail whenever there is no more buffer space
    o Flow-controlled communication: the sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages
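The two ways of handling overflow can be sketched with a bounded queue. This is an illustration only: the buffer capacity of 2 and the function names are assumptions, and Python's `queue.Queue` stands in for the finite-bound message buffer.

```python
import queue

buffer = queue.Queue(maxsize=2)  # finite-bound buffer, assumed capacity of 2


def send_unsuccessful(message):
    """Unsuccessful communication: the transfer fails when the buffer is full."""
    try:
        buffer.put_nowait(message)
        return True
    except queue.Full:
        return False


def send_flow_controlled(message, timeout=None):
    """Flow-controlled communication: block until the receiver frees space."""
    buffer.put(message, timeout=timeout)


def receive():
    return buffer.get()
```

The first variant pushes the retry decision to the sender, while the second makes an otherwise asynchronous send behave synchronously whenever the buffer is full.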
Figure: Message transfer in asynchronous send with multiple-message buffering strategy (Send places MSG in the multiple-message buffer, also called a mailbox or port; Receive takes it out)
• The message is first copied from the sending process's memory into the receiving process's mailbox
• The message is then copied from the mailbox to the receiver's memory when the receiver calls for the message
Multidatagram messages
• Maximum transfer unit (MTU)
  - Upper bound on the size of data that can be transmitted at a time
• A message whose size is greater than the MTU has to be fragmented into pieces no larger than the MTU, which are sent separately
• Each fragment is sent in a packet (known as a datagram)
• Messages smaller than the MTU can be sent in a single packet (known as single-datagram messages)
• Messages larger than the MTU have to be split and sent in multiple packets (known as multidatagram messages)
• Disassembling and reassembling messages on the sender and receiver sides is the responsibility of the message-passing system
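A minimal sketch of the disassembly and reassembly steps follows. The packet representation (a tuple of fragment index, total fragment count, and data) and the function names are assumptions chosen for illustration; a real message-passing system would carry this information in the packet header.

```python
def fragment(message, mtu):
    """Split a message into packets whose data part is at most `mtu` bytes."""
    total = max(1, -(-len(message) // mtu))  # ceiling division
    return [(index, total, message[index * mtu:(index + 1) * mtu])
            for index in range(total)]


def reassemble(packets):
    """Rebuild the message from its packets, whatever order they arrived in."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    total = ordered[0][1]
    if [packet[0] for packet in ordered] != list(range(total)):
        raise ValueError("some packets are missing or duplicated")
    return b"".join(packet[2] for packet in ordered)
```

For example, a 10-byte message with an assumed MTU of 4 bytes becomes three packets, and the receiver can rebuild it even if they arrive in reverse order.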
Encoding and decoding of message data
• The structure of the message data should be preserved between the sending and receiving processes
• This goal is difficult to achieve in both heterogeneous and homogeneous systems, for two reasons:
  - An absolute pointer value loses its meaning when transferred from one process address space to another
  - Different program objects occupy varying amounts of storage space
    o A message normally contains several types of program objects, such as long integers, short integers, variable-length character strings, and so on
• Two representations are used for encoding and decoding message data:
  - Tagged representation
    o The type of each program object is encoded in the message along with its value
    o Because the coded data format is self-describing, the receiving process does not need prior knowledge of the message layout
  - Untagged representation
    o The message data contains only program objects
    o No information is included in the message data to specify the type of each program object
    o The receiving process must have prior knowledge of how to decode the received data
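The contrast between the two representations can be sketched with Python's `struct` module. The one-byte tags, the two supported types (32-bit integer and 64-bit float), and the function names are assumptions for illustration, not a standard wire format.

```python
import struct

TAG_INT, TAG_FLOAT = b"i", b"d"  # assumed one-byte type tags


def encode_untagged(number, value):
    """Untagged: only the values; the receiver must already know the layout."""
    return struct.pack("!id", number, value)


def decode_untagged(data):
    return struct.unpack("!id", data)  # layout knowledge is hard-coded here


def encode_tagged(values):
    """Tagged: each value is preceded by a tag that names its type."""
    out = b""
    for value in values:
        if isinstance(value, int):
            out += TAG_INT + struct.pack("!i", value)
        else:
            out += TAG_FLOAT + struct.pack("!d", value)
    return out


def decode_tagged(data):
    """The receiver discovers the types from the tags as it reads."""
    values, pos = [], 0
    while pos < len(data):
        fmt = "!i" if data[pos:pos + 1] == TAG_INT else "!d"
        size = struct.calcsize(fmt)
        values.append(struct.unpack(fmt, data[pos + 1:pos + 1 + size])[0])
        pos += 1 + size
    return values
```

The tagged form costs extra bytes per object but decodes without prior agreement; the untagged form is more compact but only works when both sides share the layout.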
Process addressing
• A message-passing system usually supports two types of process addressing
  - Explicit addressing
    o The process with which communication is desired is explicitly named as a parameter in the communication primitive used
    o Send (process_id, message): send a message to the named process
    o Receive (process_id, message): receive a message from the named process
  - Implicit addressing
    o A process willing to communicate does not explicitly name a process for communication
    o Send_any (service_id, message): send a message to any process that provides the service of type "service_id"
    o Receive_any (process_id, message): receive a message from any process and return the "process_id" of the process from which the message was received
• Processes can be identified by the combination of three fields: machine_id, local_id, machine_id
  - The first field identifies the node on which the process was created
  - The second field is a local identifier generated by the node on which the process was created
  - The third field identifies the last known location (node) of the process
  - The values of the first two fields of an identifier never change; the third field, however, may
  - This method of addressing is known as link-based addressing
• Link-based addressing:
  - When a process migrates from its current node to a new node, link information {process id, machine id} is left on its previous node
  - On the new node, a new local_id is assigned to the process, and its process identifier and new local_id are entered in a mapping table that the kernel of the new node maintains for all processes created on another node but running on this node
  - If the value of the third field is equal to the first field, the message is sent to the node on which the process was created
• Drawbacks: even though link-based addressing supports process migration, it suffers from two main drawbacks
  - The overhead of locating a process may be large if the process has migrated several times during its lifetime
  - It may not be possible to locate a process if an intermediate node on which the process once resided during its lifetime is down
• Both process-addressing methods are nontransparent because the machine identifier must be specified
• What are the alternatives?
1. Centralized process identifier allocator
   - It maintains a counter. When it receives a request for an identifier, it returns the current value of the counter and then increments the counter
   - It suffers from poor reliability and scalability
2. Two-level naming scheme for processes
   - Each process has
     1. a machine-independent high-level name
     2. a machine-dependent low-level name
   - A centralized (or replicated, distributed) name server maintains the mapping table that maps each high-level name to its low-level name
Failure handling
• Possible problems in IPC due to different types of system failures:
  - Loss of request message
    o The communication link between sender and receiver fails, or the receiver's node is down when the request reaches it
  - Loss of response message
    o The communication link fails, or the sender's node is down when the response message reaches it
  - Unsuccessful execution of the request
    o The receiver's node crashes while the request is being processed
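Figure: Possible problems in IPC due to system failures (sender on the left, receiver on the right)
  a) Request message is lost: the sender sends the request, and the request message is lost before it reaches the receiver
  b) Response message is lost: the sender sends the request, the receiver executes it successfully and sends the response, and the response message is lost
  c) Receiver's computer crashed: the sender sends the request, the receiver's node crashes while handling it, and the receiver is later restarted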
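• Four-message reliable IPC protocol for client-server communication between two processes
  Figure: message sequence (the client is blocked while the server is executing)
    1. Client to server: Request
    2. Server to client: Acknowledgement
    3. Server to client: Reply
    4. Client to server: Acknowledgement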
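• Three-message reliable IPC protocol for client-server communication
  Figure: message sequence (the client is blocked until the reply arrives, and the reply also acknowledges the request)
    1. Client to server: Request
    2. Server to client: Reply
    3. Client to server: Acknowledgement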
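• Two-message reliable IPC protocol for client-server communication
  Figure: message sequence (the client is blocked until the reply arrives, and the reply serves as the acknowledgement of the request)
    1. Client to server: Request
    2. Server to client: Reply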
• Fault-tolerant communication between a client and a server
  Figure: timeline of an exchange based on timeouts and retransmission
    1. The client sends the request; the request message is lost; the client times out
    2. The client retransmits the request; the server crashes during execution (unsuccessful execution) and is restarted; the client times out
    3. The client retransmits the request; the server executes it successfully and sends the response; the response is lost; the client times out
    4. The client retransmits the request; the server executes it successfully again, and the response reaches the client
  Note: these two successful executions of the request may produce different results
• Idempotency and handling of duplicate request messages
  - Idempotency means repeatability
  - An idempotent operation produces the same result, without any side effects, no matter how many times it is performed with the same arguments
  - Example: a simpleInterest procedure produces the same result when executed repeatedly with the same arguments
  - A nonidempotent operation produces different results for the same set of arguments when executed repeatedly
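• Example: a nonidempotent operation
/* Total_Marks is server-side state that persists across calls. */
int Total_Marks = 43;   /* initial value taken from the worked example */

/* The thresholds and bonuses marked "assumed" were lost in the source; they
   are chosen so that attendance 87 earns the 2 marks used in the example. */
int Cal_Final_Marks(int End_Sem_Marks, int attndnce)
{
    Total_Marks += End_Sem_Marks;
    if (attndnce > 95)           /* assumed threshold */
        Total_Marks += 5;        /* assumed bonus */
    else if (attndnce > 90)      /* assumed threshold */
        Total_Marks += 3;
    else if (attndnce > 85)      /* assumed threshold */
        Total_Marks += 2;
    else
        Total_Marks += 1;        /* assumed bonus */
    return Total_Marks;
}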
Figure: A nonidempotent procedure (server state is initially Total_Marks = 43)
  1. The client sends the request Cal_Final_Marks(34, 87)
  2. The server executes Cal_Final_Marks: Total_Marks = 43 + 34 + 2 = 79, and returns 79
  3. The response is lost, and the client times out
  4. The client retransmits the request Cal_Final_Marks(34, 87)
  5. The server executes Cal_Final_Marks again: Total_Marks = 79 + 34 + 2 = 115, and returns 115
  6. The client receives Total_Marks = 115
• When no response is received by the client, it cannot determine whether the failure was due to a server crash or to the loss of the request or response message
• On a timeout, the client resends the request
• Repeated execution of nonidempotent requests results in "orphan" executions
• How can a nonidempotent request be executed only once?
  - By using exactly-once semantics
  - Exactly-once semantics is implemented using a unique identifier for each request on the client side and a reply cache on the server side
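Figure: Exactly-once semantics using request identifiers and reply cache (server state is initially Total_Marks = 43)
  1. The client sends Request01: Cal_Final_Marks(34, 87)
  2. The server checks the reply cache for Request01: not found
  3. The server executes Cal_Final_Marks: Total_Marks = 43 + 34 + 2 = 79, saves the reply in the cache, and returns 79
  4. The response is lost, and the client times out
  5. The client retransmits Request01: Cal_Final_Marks(34, 87)
  6. The server checks the reply cache for Request01: found; it extracts the cached reply and returns 79 without executing the procedure again
  7. The client receives Total_Marks = 79

  Reply cache contents:
    Request identifier | Reply to be sent
    Request01          | 79
    Request02          | 45
    ..                 | ..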
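The reply-cache idea can be sketched in a few lines of Python. The class and method names are hypothetical, and the attendance bonus is passed in directly as a number (2 in the worked example) so that the sketch does not depend on the attendance thresholds.

```python
class MarksServer:
    """Server with a nonidempotent procedure guarded by a reply cache."""

    def __init__(self, total_marks=43):
        self.total_marks = total_marks  # server-side state
        self.reply_cache = {}           # request identifier -> reply sent

    def _cal_final_marks(self, end_sem_marks, bonus):
        # Nonidempotent: every execution changes total_marks.
        self.total_marks += end_sem_marks + bonus
        return self.total_marks

    def handle(self, request_id, end_sem_marks, bonus):
        # A retransmitted request is answered from the cache, not re-executed.
        if request_id in self.reply_cache:
            return self.reply_cache[request_id]
        reply = self._cal_final_marks(end_sem_marks, bonus)
        self.reply_cache[request_id] = reply
        return reply
```

Calling `handle("request01", 34, 2)` twice returns 79 both times, whereas executing the procedure twice without the cache would yield 79 and then 115. A practical server would also need a policy for discarding old cache entries, which this sketch leaves out.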

Keeping track of lost and out-of-sequence packets in multidatagram messages
• How can reliable delivery of all the packets of a multidatagram message be ensured?
• A simple approach is the stop-and-wait protocol
  - Each packet is acknowledged separately
  - Disadvantage: communication overhead
• A better approach is the blast protocol
  - A single acknowledgement packet is used for all the packets of a multidatagram message
• When the blast protocol is used, a node crash or communication link failure can lead to
  - Loss of packets
  - Out-of-sequence delivery of packets
• To solve this:
  - A bitmap is used to identify the packets of a message, by adding two extra fields to the header of each packet: the total number of packets, and a bitmap specifying the position of this packet in the message
  - The "selective repeat" method is used to retransmit only the lost packets after a timeout period
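A receiver-side sketch of the bitmap bookkeeping follows. The class name, the zero-based packet positions, and the use of a Python integer as the bitmap are assumptions for illustration; they are not taken from a specific protocol implementation.

```python
class Reassembler:
    """Tracks which packets of one multidatagram message have arrived."""

    def __init__(self, total_packets):
        self.total = total_packets
        self.slots = [None] * total_packets  # buffer with one slot per packet
        self.bitmap = 0                      # bit p is set once packet p arrives

    def receive(self, position, data):
        # Out-of-sequence packets simply land in their own slot.
        self.slots[position] = data
        self.bitmap |= 1 << position

    def missing(self):
        """Positions to request again after the timeout (selective repeat)."""
        return [p for p in range(self.total) if not self.bitmap & (1 << p)]

    def message(self):
        if self.missing():
            raise ValueError("message is incomplete")
        return b"".join(self.slots)
```

After a timeout the receiver would send `missing()` back to the sender, which retransmits only those packets instead of the whole message.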
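Figure: Recovery of lost packets with a bitmap and selective repeat (message of 4 packets)
  - The sender sends the request message, and the packets of the response message follow
  - When the first packet arrives, the receiving side creates a buffer for 4 packets and places that packet in its position (position 1 in the figure)
  - After a timeout, it requests retransmission of the missing packets
  - Only the missing packets are resent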
Group communication
• Three types of group communication:
  - One to many (single sender and multiple receivers)
  - Many to one (multiple senders and single receiver)
  - Many to many (multiple senders and multiple receivers)
• One-to-many communication
  - Also known as multicast communication
  - A special case of multicast communication is broadcast communication
    o The message is sent to all processors connected to a network
• Group management
  - Closed group: only the members of the group can send a message to the group
  - Open group: any process in the system can send a message to the group
  - Centralized group server (with replication): used for dynamic management of group membership
• Group addressing
  - A two-level naming scheme is normally used for group addressing
  - The high-level group name is an ASCII string that is independent of the location information of the processes in the group
  - The low-level group name depends on the underlying hardware
  - A special address to which multiple machines can listen is called a multicast address
  - Networks that do not support multicast addresses can use the broadcasting facility with a broadcast address
• Message delivery to receiver processes
  - User applications use high-level group names in programs
  - The centralized group server maintains a mapping of high-level group names to their low-level names
  - The group server also maintains, for each group, a list of the process identifiers of all its member processes
• Buffered and unbuffered multicast
  - Multicast is an asynchronous communication mechanism
  - A multicast send cannot be synchronous because:
    o It is unrealistic to expect a sending process to wait until all the receiving processes that belong to the multicast group are ready to receive the multicast message
    o The sending process may not be aware of all the receiving processes that belong to the multicast group
  - For unbuffered multicast, the message is not buffered for the receiving process
    o It is lost if the receiving process is not in a state to receive it
  - For buffered multicast, the message is buffered for the receiving process
    o Each process of the group eventually receives the message
• Two types of semantics for one-to-many communication
  - Send-to-all semantics
  - Bulletin-board semantics
• Bulletin-board semantics is more flexible than send-to-all semantics because it takes into account two factors that send-to-all ignores:
  - The relevance of a message to a particular receiver may depend on the receiver's state
  - Messages not accepted within a certain time after transmission may no longer be useful
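• Flexible reliability in multicast communication
  - In one-to-many communication, the degree of reliability is normally expressed as one of:
    o 0-reliable: no response is expected from any receiver. Example: time signal generation
    o 1-reliable: a response is expected from any one receiver. Example: request for a service
    o m-out-of-n-reliable (1 < m < n): responses are expected from m of the n receivers. Example: consistency control algorithm
    o All-reliable: a response is expected from every receiver. Example: updating of replicas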
• Atomic multicast
  - Has an all-or-nothing property
  - When a message is sent to a group, it is either received by all the processes that are members of the group or not received by any of them
• Many-to-one communication
  - Multiple senders send messages to a single receiver
  - The single receiver may be selective or nonselective
  - A selective receiver specifies a unique sender
    o Message exchange takes place only if that sender sends a message
  - A nonselective receiver specifies a set of senders
    o Message exchange takes place if any sender in the set sends a message to this receiver
• Many-to-many communication
  - Multiple senders send messages to multiple receivers
  - An important issue is ordered message delivery
  - Ordered message delivery ensures that all messages are delivered to all receivers in an order acceptable to the application
  - Ordered message delivery requires message sequencing
  - Commonly used semantics for ordered delivery of multicast messages are:
    o Absolute ordering
    o Consistent ordering
    o Causal ordering
• Absolute ordering
  - All messages are delivered to all receiver processes in the exact order in which they were sent
  - The system is assumed to have a clock at each machine, and the clocks are synchronized with each other
  - Global timestamps are used as message identifiers
  - The kernel of the receiver places each message in a queue
  - A sliding-window mechanism is used to deliver messages periodically
  - Messages whose timestamps fall within the current window are delivered to the receiver
Figure: Absolute ordering of messages. S1 sends m1 at time t1 and S2 sends m2 at time t2, with t1 < t2; both R1 and R2 receive m1 before m2.
• Consistent ordering
  - All messages are delivered to all receiver processes in the same order
  - However, this order may be different from the order in which the messages were sent
Figure: Consistent ordering of messages. S1 sends m1 at time t1 and S2 sends m2 at time t2, with t1 < t2; both R1 and R2 receive m2 before m1.
• Implementation of consistent ordering
  - Approach I: centralized sequencer method
    o The many-to-many scheme is made to appear as a combination of many-to-one and one-to-many schemes
    o The kernels of the sending machines send messages to a single receiver (known as the sequencer)
    o The sequencer assigns a sequence number to each message and then multicasts it
    o The kernel of each receiving machine saves all incoming messages meant for a receiver in a separate queue
    o Messages in the queue are delivered immediately to the receiver unless there is a gap in the sequence numbers
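The sequencer method can be sketched as two small classes. This is an in-process illustration under assumed names (`Sequencer`, `Receiver`, `on_arrival`); network transport and sequencer failure are deliberately left out.

```python
import itertools


class Sequencer:
    """Single process that stamps every message with the next sequence number."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, message):
        return next(self._counter), message


class Receiver:
    """Receiving kernel: delivers in sequence order, holding back after a gap."""

    def __init__(self):
        self.next_expected = 1
        self.held = {}        # sequence number -> message waiting in the queue
        self.delivered = []   # messages handed to the receiving process

    def on_arrival(self, seq_no, message):
        self.held[seq_no] = message
        # Deliver as long as there is no gap in the sequence numbers.
        while self.next_expected in self.held:
            self.delivered.append(self.held.pop(self.next_expected))
            self.next_expected += 1
```

Because every receiver applies the same rule to the same sequence numbers, all receivers deliver the messages in the same order, even if the multicasts reach them in different orders.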
    o The sequencer-based method is subject to a single point of failure and hence has poor reliability
  - Approach II: ABCAST protocol (distributed)
    o A sequence number is assigned to a message by distributed agreement among the group members and the sender
    1. The sender assigns a temporary sequence number to the message and sends it to all members of the multicast group
       This temporary sequence number must be larger than any previous sequence number used by the sender, so a simple counter can be used
    2. On receiving the message, each member of the group returns a proposed sequence number to the sender
       Member $i$ calculates its proposed sequence number as
         $\max(F_{\max}, P_{\max}) + 1 + i/N$
       where
         $F_{\max}$ = largest final sequence number agreed upon so far for a message received by the group
         $P_{\max}$ = largest sequence number proposed so far by this member
         $N$ = total number of members in the multicast group
         $i$ = member number
    3. When the sender has received the proposed sequence numbers from all the members, it selects the largest one as the final sequence number for the message and sends it to all members in a COMMIT message
       - On receiving the COMMIT message, each member attaches the final sequence number to the message
       - Committed messages with final sequence numbers are delivered to the application programs in order of their final sequence numbers
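The arithmetic of the ABCAST agreement can be sketched as follows, assuming the proposal formula is max(Fmax, Pmax) + 1 + i/N. The function names are hypothetical, and exact fractions are used so that the i/N term does not suffer from floating-point rounding.

```python
from fractions import Fraction


def propose(f_max, p_max, i, n):
    """Sequence number proposed by member i of a group with n members."""
    return max(f_max, p_max) + 1 + Fraction(i, n)


def final_sequence_number(proposals):
    """The sender commits the largest proposal it received."""
    return max(proposals)
```

For a group of three members whose (Fmax, Pmax) pairs are (4, 5), (4, 4), and (6, 2), the proposals are 19/3, 17/3, and 8, so the sender would commit 8. The i/N term appears to act as a per-member offset that keeps proposals from different members apart; that reading is an interpretation rather than something the slides state.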
• Causal ordering
  - Ensures that if the event of sending one message is causally related to the event of sending another message, the two messages are delivered to all receivers in the correct order
  - Two message-sending events are said to be causally related if they are related by the happened-before relation
Figure: Causal ordering of messages (senders S1 and S2; receivers R1, R2, and R3; messages m1, m2, and m3 plotted against time)
• Implementation of causal ordering
  - CBCAST protocol
    1. Each member process of a group maintains a vector of "n" components, where "n" is the total number of members in the group
    2. Each member is assigned a sequence number from 1 to n
    3. The ith component of the vector corresponds to the member with sequence number i, and it is equal to the number of the last message received in sequence by this process from member i
    4. To send a message, a process increments the value of its own component in its own vector and sends the vector as part of the message
    5. When the message arrives at a receiver process's site, it is buffered by the runtime system, which tests the following two conditions to decide whether the message can be delivered or must be delayed to ensure causal-ordering semantics:
       - $S[i] = R[i] + 1$, and
       - $S[j] \le R[j]$ for all $j \ne i$
       where $S$ is the vector of the sender process (carried in the message), $R$ is the vector of the receiver process, and $i$ is the sequence number of the sender
       - $S[i] = R[i] + 1$ ensures that the receiver has not missed any message from the sender
       - $S[j] \le R[j]$ for all $j \ne i$ ensures that the sender has not received any message that the receiver has not yet received
    6. If the message passes these two tests, the runtime system delivers it to the user process
    7. Otherwise, the message is left in the buffer and the test is carried out again for it when a new message arrives
Figure: CBCAST protocol for implementing causal ordering
  Status of vectors at some instant of time:
    Process A: (3, 2, 5, 1)
    Process B: (3, 2, 5, 1)
    Process C: (2, 2, 5, 1)
    Process D: (3, 2, 4, 1)
  Process A sends a new message to the other processes, carrying the vector (4, 2, 5, 1) along with the message data:
    - Process B: deliver
    - Process C: delay, because the condition A[1] = C[1] + 1 is false
    - Process D: delay, because the condition A[3] <= D[3] is not true
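The CBCAST delivery test can be written as a single function and checked against the vectors of this example. The function name and the zero-based index (process A is index 0, so A[1] in the figure is `sender_vector[0]` here) are assumptions of this sketch.

```python
def can_deliver(sender_vector, receiver_vector, i):
    """CBCAST test for a message from member i (zero-based index)."""
    # The receiver must have seen every earlier message from the sender.
    if sender_vector[i] != receiver_vector[i] + 1:
        return False
    # The sender must not have seen any message the receiver has not seen.
    return all(s <= r
               for j, (s, r) in enumerate(zip(sender_vector, receiver_vector))
               if j != i)


message_vector = [4, 2, 5, 1]  # vector sent by process A
receivers = {"B": [3, 2, 5, 1], "C": [2, 2, 5, 1], "D": [3, 2, 4, 1]}
decisions = {name: can_deliver(message_vector, vector, 0)
             for name, vector in receivers.items()}
```

Here `decisions` marks B as deliverable and C and D as delayed, matching the example: C has missed an earlier message from A, and D has not yet received a message that A had already seen.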
Remote Procedure Calls
• RPC is a special case of the general message-passing model of IPC
• RPC has become a widely accepted IPC mechanism in distributed systems because of the following features:
  - Simple call syntax
  - Familiar semantics (similar to local procedure calls)
  - Well-defined interface
  - Ease of use
  - Generality
  - Efficiency
  - It can be used as an IPC mechanism between processes on different machines as well as between different processes on the same machine
RPC model
• The RPC model is similar to the procedure call model used for the transfer of control and data within a program, which works in the following manner:
  - To make a procedure call, the caller places the arguments to the procedure in some well-specified location
  - Control is then transferred to the sequence of instructions that constitutes the body of the procedure
  - The procedure body is executed in a newly created execution environment
  - After the procedure's execution, control returns to the calling point, possibly returning a result
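Typical model of an RPC
Figure: interaction between the caller (client process) and the callee (server process)
  1. The caller calls the procedure and waits for the reply
  2. The callee receives the request and starts procedure execution
  3. The procedure executes
  4. The callee sends the reply and waits for the next request
  5. The caller resumes execution
  Note: the call can be asynchronous, so that the client can do other tasks while waiting for the reply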
Transparency of RPC
• A transparent RPC mechanism is one in which local procedures and remote procedures are indistinguishable to programmers
• Transparent RPC requires
  1. Syntactic transparency
     - An RPC should have exactly the same syntax as a local procedure call
  2. Semantic transparency
     - The semantics of an RPC should be identical to those of a local procedure call
• Differences between RPC and local procedure calls (LPC):
  - With RPC, the called procedure is executed in an address space that is disjoint from the calling program's address space. So the remote procedure cannot access any variables or data values in the calling program's environment
  - RPCs are more vulnerable to failure than LPCs
    o They involve two different processes and possibly a network and two different computers
  - RPCs consume much more time (100 to 1000 times more) than LPCs
    o This is due to the involvement of a communication network
Implementation of RPC mechanism
• Implementation of an RPC mechanism involves five program elements:
  1. The client
  2. The client stub
  3. The RPCRuntime
  4. The server stub
  5. The server
• The client, the client stub, and one instance of RPCRuntime execute on the client machine
• The server, the server stub, and one instance of RPCRuntime execute on the server machine
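Figure: Implementation of the RPC mechanism (the numbers give the order of the steps)
  Client machine:
    - Client process: 1 Call, 10 Return
    - Client stub: 2 Pack, 9 Unpack
    - RPCRuntime: 3 Send, then Wait, then Receive
  Server machine:
    - RPCRuntime: Receive, then 8 Send
    - Server stub: 4 Unpack, 7 Pack
    - Server process: 5 Call, Execute, 6 Return
  Step 3 carries the call packet from the client machine to the server machine; step 8 carries the result packet back.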
• Client
  - A user process that initiates an RPC
  - It makes a perfectly normal local procedure call, which in turn invokes the corresponding procedure in the client stub
• Client stub
  - Two tasks:
    o On receipt of a call request from the client, it packs a specification of the target procedure and the arguments into a message and then asks the local RPCRuntime to send it to the server stub
    o On receipt of the result of procedure execution, it unpacks the result and passes it to the client
• RPCRuntime
  - Handles transmission of messages across the network between the client and server machines
  - It is responsible for retransmission, acknowledgements, packet routing, and encryption
  - The RPCRuntime on the client machine receives the call request message from the client stub and sends it to the server machine. It also receives the result message from the server machine and passes it to the client stub
  - The RPCRuntime on the server machine receives the result message from the server stub and sends it to the client machine. It also receives the call request message from the client machine and passes it to the server stub
• Server stub
  - Two tasks:
    o On receipt of a call request message from the local RPCRuntime, it unpacks it and makes a perfectly normal call to invoke the appropriate procedure in the server
    o On receipt of the result of procedure execution from the server, it packs the result into a message and then asks the local RPCRuntime to send it to the client stub
• Server
  - On receiving a call request from the server stub, the server executes the appropriate procedure and returns the result of procedure execution to the server stub
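The division of work between the stubs can be sketched in one Python process. Everything here is illustrative: JSON is an assumed message encoding, `rpc_runtime_send` replaces the network with a direct function call, and `add`, `PROCEDURES`, and the stub names are hypothetical.

```python
import itertools
import json


# --- Server side -----------------------------------------------------------
def add(a, b):
    """The actual procedure implemented by the server."""
    return a + b


PROCEDURES = {"add": add}


def server_stub(call_message):
    """Unpack the call message, call the server procedure, pack the result."""
    call = json.loads(call_message)
    result = PROCEDURES[call["procedure"]](*call["args"])
    return json.dumps({"id": call["id"], "result": result}).encode()


# --- Transport (stands in for the two RPCRuntime instances) ----------------
def rpc_runtime_send(call_message):
    return server_stub(call_message)


# --- Client side -----------------------------------------------------------
_message_ids = itertools.count(1)


def client_stub_add(a, b):
    """Looks like a local procedure, but packs the call into a message."""
    call = {"id": next(_message_ids), "procedure": "add", "args": [a, b]}
    reply = json.loads(rpc_runtime_send(json.dumps(call).encode()))
    return reply["result"]
```

The client simply calls `client_stub_add(2, 3)` as if it were a local procedure; the packing, sending, and unpacking stay hidden inside the stubs, which is the syntactic transparency an RPC mechanism aims for.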
• Stub generation:
  - Two ways:
    o Manually: the RPC implementor provides a set of translation functions from which a user can construct stubs
    o Automatically: an Interface Definition Language (IDL) is used to define the interface between a client and a server, and the stubs are generated from that definition
• RPC messages:
  - Two types of messages are involved in the implementation of an RPC system:
    o Call messages
    o Reply messages
• Call messages:
  - The two basic components necessary in a call message are:
    o The identification information of the remote procedure to be executed
    o The arguments necessary for the execution of the procedure
  - In addition to these fields, a call message normally has
    o A message identification field
    o A message type field
    o A client identification field
A typical RPC call message format (fields in order):
  | Message identifier (seq. no.) | Message type (call/reply) | Client identifier | Remote procedure identifier: program number, version number, procedure number | Arguments |
RPC reply message format
  a) A successful reply message format:
     | Message identifier | Message type | Reply status (successful) | Result |
  b) An unsuccessful reply message format:
     | Message identifier | Message type | Reply status (unsuccessful) | Reason for failure |
Server Management
1) Server implementation
2) Server creation
• Based on the style of server implementation, servers can be classified as
  1. Stateful server: maintains client state information, so the client need not resend that information with every request
  2. Stateless server: does not maintain client state information
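Stateful server
Client to server: Open ( Filename, Mode )
Server to client: Return ( Fid )
Client to server: Read ( Fid, 200, buffer )
Server to client: Return ( bytes 0 to 199 )
Client to server: Read ( Fid, 5, buffer )
Client to server: Close ( Fid )
Server to client: Return ( Successful )
State table kept by the server: File Id | Mode | R/W Pointer
Figure: Stateful file server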
Stateless server
Client to server: Read ( Filename, 0, 200, buffer )
Client to server: Read ( Filename, 400, 20, buffer )
State table kept by the client: File Id | Mode | R/W Pointer
Figure: Stateless file server
Stateless vs. Stateful servers
ô Stateful servers provide an easier programming paradigm, since clients
need not keep track of state information
ô Stateful servers are typically more efficient than stateless servers
ô Stateless servers make crash recovery easier in the event of a server
crash
ô The choice between a stateless and a stateful server is purely application
dependent
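The difference between the two styles can be sketched as two small file servers. This is only an illustration: the in-memory FILES dictionary, the class names and the method signatures are assumptions made for the example.

```python
# Hypothetical in-memory "disk": one file of 1024 bytes.
FILES = {"notes.txt": bytes(range(256)) * 4}


class StatefulFileServer:
    """Keeps a table of open files, so a read only needs the file id."""

    def __init__(self):
        self.table = {}      # fid -> [filename, mode, read/write pointer]
        self.next_fid = 1

    def open(self, filename, mode):
        fid = self.next_fid
        self.next_fid += 1
        self.table[fid] = [filename, mode, 0]
        return fid

    def read(self, fid, n):
        filename, _mode, pos = self.table[fid]
        data = FILES[filename][pos:pos + n]
        self.table[fid][2] = pos + len(data)   # server advances the pointer
        return data

    def close(self, fid):
        del self.table[fid]
        return "Successful"


class StatelessFileServer:
    """Keeps nothing between calls: every request carries name and position."""

    def read(self, filename, position, n):
        return FILES[filename][position:position + n]
```

If the stateful server crashes, its table is lost and open file ids become meaningless; the stateless server can simply be restarted, because each request is self-contained.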
Server Creation Semantics
ô Server processes may either be created and installed before their client
processes or be created on demand.
ô Based on the time duration for which RPC servers survive, they are
classified as
1. Instance-per-call server
2. Instance-per-session server
3. Persistent server
1. Instance-per-call server: The server exists only for the duration of a
single call.
It is created by the RPCRuntime on the server machine only when the call
message arrives.
The server is deleted after the call has been executed.
This approach is not commonly used because
It is a stateless approach, so state information must be maintained
either by the client process (time consuming and a loss of data
abstraction) or by the server operating system (expensive)
Invoking the same server multiple times becomes expensive, since each call
creates a new server.
2. Instance-per-session server: The server exists for the entire session
for which the client and server interact. The server can maintain internal
state information, and the overhead of creation and destruction is
minimized.
3. Persistent server: The server remains in existence indefinitely. Unlike
the other two types, a persistent server can be shared by multiple clients.
Communication protocols for RPCs
1. The Request (R) protocol
First RPC:
Client to server: Request message
Server: Procedure execution
Next RPC:
Client to server: Request message
Server: Procedure execution
ô The Request protocol
Used for RPCs in which the called procedure has nothing to return and the
client requires no confirmation that the procedure has been executed
Only one message per call is transmitted
An RPC that uses the R protocol is called an asynchronous RPC
In an asynchronous RPC, the RPCRuntime does not take responsibility for
retrying a request in case of communication failure
Asynchronous RPCs over an unreliable transport protocol are generally useful
for implementing periodic update services
2. The Request/Reply (RR) protocol
First RPC:
Client to server: Request message
Server: Procedure execution
Server to client: Reply message (also serves as acknowledgement for the
request message)
Next RPC:
Client to server: Request message (also serves as acknowledgement for the
reply of the previous RPC)
Server: Procedure execution
Server to client: Reply message (also serves as acknowledgement for the
request message)
ô The Request/Reply (RR) protocol
Suitable for simple RPCs in which all the arguments and results fit in a
single packet buffer, and in which the duration of a call and the interval
between calls are both short (less than the transmission time)
It is based on the idea of using implicit acknowledgement to eliminate
explicit acknowledgement messages
In this protocol
è A server's reply message is regarded as an acknowledgement of the client's
request message
è A subsequent call packet from a client is regarded as an
acknowledgement of the server's reply message for the previous call
made by that client
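The implicit acknowledgements can be simulated in-process. This is a sketch under assumptions: the RRServer class, the squaring procedure and the lose_first_reply switch (which stands in for a lost reply followed by a client timeout) are all hypothetical.

```python
class RRServer:
    def __init__(self):
        self.last = {}        # client_id -> (request_id, reply): one cached reply per client
        self.executions = 0   # how many times the procedure really ran

    def handle(self, client_id, request_id, x):
        cached = self.last.get(client_id)
        if cached is not None and cached[0] == request_id:
            return cached[1]  # retransmitted request: resend the reply, do not re-execute
        # A request with a new id implicitly acknowledges the previous reply,
        # so the old cache entry can simply be overwritten.
        self.executions += 1
        reply = x * x         # hypothetical remote procedure
        self.last[client_id] = (request_id, reply)
        return reply


def call(server, client_id, request_id, x, lose_first_reply=False):
    """Client side: retransmit the same request until a reply arrives."""
    attempts = 0
    while True:
        attempts += 1
        reply = server.handle(client_id, request_id, x)
        if lose_first_reply and attempts == 1:
            continue          # simulate a lost reply and a timeout
        return reply, attempts
```

No explicit acknowledgement message appears anywhere: the reply acknowledges the request, and the next request acknowledges the reply.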
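3. The Request/Reply/Acknowledge-Reply (RRA) protocol
First RPC:
Client to server: Request message
Server: Procedure execution
Server to client: Reply message
Client to server: Reply acknowledgement message
Next RPC:
Client to server: Request message
Server: Procedure execution
Server to client: Reply message
Client to server: Reply acknowledgement message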
ô The RRA protocol
Message identifiers associated with request messages are ordered
The client acknowledges a reply message only if it has received the replies
to all the previous requests
The server deletes a reply from its cache only after receiving an
acknowledgement for it from the client
Loss of an acknowledgement is harmless, since a later acknowledgement
message guarantees the receipt of the replies to all earlier requests
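The cache handling in the RRA protocol can be sketched as follows. The class names, the assumption that request identifiers start at 1 and increase by 1, and the trivial x + 1 procedure are illustrative choices, not taken from the slides.

```python
class RRAServer:
    def __init__(self):
        self.reply_cache = {}   # request_id -> reply, kept until acknowledged

    def handle_request(self, request_id, x):
        if request_id not in self.reply_cache:
            self.reply_cache[request_id] = x + 1   # hypothetical procedure
        return self.reply_cache[request_id]

    def handle_ack(self, acked_id):
        # An acknowledgement for acked_id also covers every earlier request,
        # so all cache entries up to and including acked_id can be deleted.
        for request_id in [r for r in self.reply_cache if r <= acked_id]:
            del self.reply_cache[request_id]


class RRAClient:
    def __init__(self):
        self.received = set()

    def on_reply(self, request_id):
        """Record a reply; return the id to acknowledge, or None."""
        self.received.add(request_id)
        # Acknowledge only if the replies to all earlier requests have arrived
        # (request ids are assumed to be 1, 2, 3, ...).
        if all(r in self.received for r in range(1, request_id + 1)):
            return request_id
        return None
```

Because an acknowledgement is cumulative, losing the acknowledgement for request 1 does no harm once the acknowledgement for request 2 reaches the server.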
Client-Server Binding
ô Binding: The process by which a client becomes associated with a server so
that calls can take place.
Server locating:
1. Broadcasting:
A message to locate the server is broadcast to all nodes.
The node housing the desired server responds.
Easy to implement and suitable for small networks. Expensive for large
networks.
2. Binding Agent:
A name server used to bind a client to a server.
The name server maintains the binding table.
Figure: Name server acting as the binding agent between client and server
(steps 1 to 4; in the final step the client calls the server)
Advantages of using a binding agent:
ô Can support multiple servers having the same interface type, so that any
of the available servers may be used to service the client's request.
ô The binding agent can balance the load evenly among the servers providing
the same service.
ô A user authorization facility can be provided for binding
Disadvantages:
ô The overhead becomes large when many client processes are short lived.
ô The binding agent may become a performance bottleneck
Binding time:
1. Compile time binding: The server's network address is hard coded.
Extremely inflexible if the configuration changes.
2. Link time binding: The client contacts the binding agent before making
a call
The server process exports its service by registering it with the binding
agent
The client makes an import request to the binding agent for the service
before making a call
The binding agent returns the server details to the client
The client caches them to avoid contacting the binding agent for subsequent
calls
3. Call time binding
The client is bound to a server at the time when it calls the server for the
first time during its execution.
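The export, import and caching steps can be sketched with a small in-process binding agent. The class and method names, the interface name and the address are hypothetical; a real binding agent would be a separate network service.

```python
class BindingAgent:
    def __init__(self):
        self.binding_table = {}   # interface name -> list of server addresses
        self.lookups = 0          # counts import requests, for illustration

    def register(self, interface, address):
        """Server side: export a service by registering it."""
        self.binding_table.setdefault(interface, []).append(address)

    def lookup(self, interface):
        """Client side: import a service; returns the first registered server."""
        self.lookups += 1
        return self.binding_table[interface][0]


class Client:
    def __init__(self, agent):
        self.agent = agent
        self.cache = {}           # interface name -> server address

    def locate(self, interface):
        # Contact the binding agent only on the first use of an interface.
        if interface not in self.cache:
            self.cache[interface] = self.agent.lookup(interface)
        return self.cache[interface]
```

Always returning the first registered server is the simplest possible policy; a binding agent that balances load would choose among the registered addresses instead.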
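Client-Server Binding - Call time Binding
Figure: Call time binding through the binding agent (steps 1 to 5 between the
client process, the binding agent and the server process); subsequent calls
are sent directly from the client process to the server process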
Complicated RPCs
ô The two types of complicated RPCs are:
1. RPCs involving long-duration calls or large gaps between calls
è Two methods are used to handle these
o Periodic probing of the server by the client
o Periodic generation of an acknowledgement by the server
2. RPCs involving arguments and/or results that are too large to fit in a
single-datagram packet
è A long RPC argument or result is fragmented and transmitted in
multiple packets
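Fragmentation and reassembly of a long argument can be sketched as below. The packet layout (sequence number, total count, chunk) and the function names are assumptions made for the example.

```python
def fragment(payload, max_size):
    """Split payload into packets of the form (sequence number, total, chunk)."""
    chunks = [payload[i:i + max_size] for i in range(0, len(payload), max_size)] or [b""]
    return [(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the payload from packets that may arrive in any order."""
    total = packets[0][1]
    by_seq = {seq: chunk for seq, _total, chunk in packets}
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        # A real protocol would request retransmission of these fragments.
        raise ValueError(f"missing fragments: {missing}")
    return b"".join(by_seq[seq] for seq in range(total))
```

Carrying the total count in every packet lets the receiver detect missing fragments no matter which packets arrive.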
Special types of RPCs
1. Callback RPC
2. Broadcast RPC
3. Batch-mode RPC
1. Callback RPC
ô In a normal RPC, the caller and callee processes have a client-server
relationship, whereas callback RPC uses a peer-to-peer paradigm
in which a node acts as both client and server.
ô Callback RPC is suited to interactive applications that require
intermediate user input
ô During procedure execution, the server process makes a callback RPC
to the client process
ô Callback RPC
Server: Start procedure execution
Server: Stop procedure execution temporarily
Client: Process callback request and send reply
Client to server: Reply (result of callback)
Server: Resume procedure execution
Server: Procedure execution ends
ô Implementation of callback RPC should address:
Providing the server with the client's handle
è The server must have the client's handle to call the client back. The
client's handle uniquely identifies the client. Using the handle, the server
makes a normal RPC to the client
Making the client process wait for the callback RPC
è A callback RPC should not be mistaken for the reply to the original RPC
Handling callback deadlocks
è Care must be taken to avoid callback deadlocks, which are discussed
later
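The role of the client's handle can be sketched with plain method calls standing in for RPCs. The Client and Server classes, the book_ticket procedure and the question text are hypothetical.

```python
class Client:
    def __init__(self, answers):
        self.answers = answers    # hypothetical intermediate user inputs

    def callback(self, question):
        """Procedure that the server may call back during its execution."""
        return self.answers[question]

    def call_server(self, server):
        # The client passes its own handle so that the server can call it back.
        return server.book_ticket(client_handle=self, destination="Pune")


class Server:
    def book_ticket(self, client_handle, destination):
        # Execution stops here temporarily while the callback runs on the client.
        seat = client_handle.callback("seat preference?")
        # Execution resumes with the result of the callback.
        return f"booked {seat} seat to {destination}"
```

The value returned by callback is the reply to the callback RPC, and the value returned by book_ticket is the reply to the original RPC; keeping the two apart is exactly the second implementation issue listed above.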
Special types of RPCs (contd.)
ô Handling callback deadlocks
Figure: Callback deadlock, in which each process is waiting for a reply (R)
from another process
2. Broadcast RPC
The client's request is broadcast on the network and processed by all the
servers that provide the service.
Two ways
è Using a binding agent, which forwards the request to all servers
registered with it.
è Using broadcast ports of the servers.
o The client process may wait for zero, one, m-out-of-n, or all replies,
depending on the reliability desired.
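The m-out-of-n waiting rule can be sketched with servers modelled as plain functions. This is an assumption-laden illustration: a server that returns None stands for one that never answers, and TimeoutError stands for the client giving up.

```python
def broadcast_rpc(servers, request, m):
    """Deliver request to every server and return the first m replies."""
    replies = [server(request) for server in servers]   # all servers process it
    answered = [reply for reply in replies if reply is not None]
    if len(answered) < m:
        raise TimeoutError(f"only {len(answered)} of the required {m} replies arrived")
    return answered[:m]
```

With m = 0 the client does not wait at all, with m = 1 it takes the first reply, and with m equal to the number of servers it requires all replies.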
3. Batch-mode RPC
Separate RPC requests are queued in a transmission buffer on the client side
and sent over the network in a batch.
è Reduces the overhead of sending each RPC request separately.
è Applications requiring higher RPC call rates (- RPC sec) can be
implemented easily.
è The transmission buffer is flushed when
o A predetermined interval elapses.
o A predetermined number of requests has been queued.
o The amount of batch data exceeds the buffer size.
o A call is made to one of the server's procedures for which a result is
expected (nonqueuing RPC).
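Two of the flush conditions (request count and nonqueuing call) can be sketched as follows. The BatchingClient class and the send function are hypothetical; the timer-based and size-based flushes are left out to keep the example short.

```python
class BatchingClient:
    def __init__(self, send, max_requests=5):
        self.send = send                  # assumed: takes a list of requests, returns results
        self.max_requests = max_requests
        self.buffer = []                  # the transmission buffer

    def queue_call(self, procedure, *args):
        """Queue an RPC for which no result is expected."""
        self.buffer.append((procedure, args))
        if len(self.buffer) >= self.max_requests:
            self.flush()                  # predetermined number of requests reached

    def call(self, procedure, *args):
        """Nonqueuing RPC: flush the buffer with this call and return its result."""
        self.buffer.append((procedure, args))
        return self.flush()[-1]

    def flush(self):
        batch, self.buffer = self.buffer, []
        return self.send(batch) if batch else []
```

Queued calls return immediately and cost no network message of their own; the single batch message carries them together with the call that needs a result.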
Optimizations in RPC for better performance
1. Concurrent access to multiple servers
a) Use of threads: Each thread can independently make calls to
different servers.
b) Early reply approach:
è The RPC is split into two RPC calls
1. One RPC for passing the parameters
2. One RPC for requesting the result
c) Call buffering approach
ô Early reply approach: to provide concurrent access to multiple
servers.
Server to client: Return (tag), sent as Reply (tag)
Client: Carry out other activities
Server: Execute the procedure, then Store (result)
Client to server: Request result (tag)
Server to client: Return (result), sent as Reply (result)
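The early reply approach can be sketched with a thread pool standing in for the server's own execution. The EarlyReplyServer class and its method names are assumptions; only concurrent.futures and itertools from the standard library are used.

```python
from concurrent.futures import ThreadPoolExecutor
import itertools


class EarlyReplyServer:
    def __init__(self):
        self.pool = ThreadPoolExecutor(max_workers=2)
        self.results = {}                 # tag -> Future holding the stored result
        self.tags = itertools.count(1)

    def call(self, procedure, *args):
        """First RPC: accept the parameters and reply at once with a tag."""
        tag = next(self.tags)
        self.results[tag] = self.pool.submit(procedure, *args)
        return tag

    def request_result(self, tag):
        """Second RPC: return the stored result (waits if it is not ready yet)."""
        return self.results.pop(tag).result()
```

Because call returns as soon as the tag is issued, a client can start procedures on several servers and collect the results later, in any order.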
ô Call buffering approach: to provide concurrent access to multiple servers.
Clients and servers do not interact directly with each other
They interact indirectly via a call buffer server
To make an RPC call
è A client sends its call request to the call buffer server
è The client then performs other activities until it needs the result
è When it needs the result, the client periodically polls the call buffer
server
è If the result is available, the client retrieves it
ô Call buffering approach (contd.)
On the server side
è When a server is free, it periodically polls the call buffer server to
check whether there is any call for it
è If there is, it retrieves the call request, executes it, and makes a call
back to the call buffer server to return the result of execution
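Figure: Call buffering approach (participants: Client, Call Buffer Server,
Server)
Server: Polling for a waiting request (Check for a waiting request)
Call buffer server to client: Reply (Tag)
Client: Carry out other activities
Call buffer server to server: Reply (Tag, Parameter)
Server: Execute the procedure
Client: Polling for result (Check for result (Tag))
Acknowledgement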
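The call buffer server can be sketched as a small in-process object that both sides poll. The class name, method names and the "adder" server are hypothetical; None is used to mean "nothing available yet", which is a simplification.

```python
class CallBufferServer:
    def __init__(self):
        self.pending = {}     # server name -> list of (tag, params) waiting to be served
        self.results = {}     # tag -> result of execution
        self.next_tag = 1

    def submit(self, server_name, params):
        """Client: deposit a call request and get back a tag."""
        tag = self.next_tag
        self.next_tag += 1
        self.pending.setdefault(server_name, []).append((tag, params))
        return tag

    def poll_request(self, server_name):
        """Server: check for a waiting request; None if there is none."""
        queue = self.pending.get(server_name, [])
        return queue.pop(0) if queue else None

    def store_result(self, tag, result):
        """Server: return the result of execution to the call buffer server."""
        self.results[tag] = result

    def poll_result(self, tag):
        """Client: check for the result; None if it is not available yet."""
        return self.results.pop(tag, None)
```

Client and server never talk to each other directly; the tag is the only thing that links a request to its result.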
2. Serving multiple requests simultaneously
Delays commonly encountered in RPC systems:
è A delay occurs while a server waits for a resource that is temporarily
unavailable
è A delay can occur when a server calls a remote function that involves
a considerable amount of computation or a considerable transmission delay
A multithreaded server with a dynamic thread creation facility performs
better, because it can accept and process other requests instead of
being idle while waiting
3. Reducing per-call workload of servers
One way to achieve this improvement is to use stateless servers
4. Reply caching of idempotent remote procedures
5. Proper selection of timeout values
è Too small a timeout value will cause timers to expire too often, resulting
in unnecessary retransmissions
è Too large a timeout value will cause a needlessly long delay in the event
that a message is actually lost
è Servers are likely to take varying amounts of time to service individual
requests, depending on factors such as server load, network routing,
and network congestion
è If clients keep retrying requests, the server load and network
congestion will become worse
è One method for proper selection of timeout values is to use a back-off
strategy of exponentially increasing timeout values
6. Proper design of RPC protocol specification
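The back-off strategy for timeout values mentioned under item 5 can be sketched as a generator. The function name, the doubling factor and the ceiling of 60 seconds are assumptions chosen for the example.

```python
def backoff_timeouts(initial, factor=2.0, ceiling=60.0, retries=5):
    """Yield the timeout (in seconds) to use for each successive retransmission."""
    timeout = initial
    for _ in range(retries):
        yield timeout
        # Grow the timeout exponentially, but never beyond the ceiling.
        timeout = min(timeout * factor, ceiling)
```

A client would wait for each yielded timeout in turn before retransmitting, so a slow or congested server is retried less and less often.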
Case studies
ô Sun RPC
Steps in creating an RPC application in Sun RPC
è The application programmer manually writes the client program and the
server program for the application
è The client program file is compiled to get a client object file
è The server program file is compiled to get a server object file
è The client stub file and the XDR filters file are compiled to get a client
stub object file
è The server stub file and the XDR filters file are compiled to get a
server stub object file
è The client object file, the client stub object file, and the client-side
RPCRuntime library are linked together to get the client executable
file
è The server object file, the server stub object file, and the server-side
RPCRuntime library are linked together to get the server executable
file
End of chapter