Service Execution
2. Motivation
Taking advantage of parallelism is one of the most
important methods for improving performance of
distributed applications. In current web service-based
applications, a service is a software agent that is
distributed over the network [19]. A service's operations
are therefore inherently capable of being processed in
parallel, and the key is how to make the service
invocations execute concurrently at the caller side.
Currently, the most common approach is to write
concurrent programs using techniques such as
multithreading or event-driven programming. That is,
developers must manually analyze which services can
be invoked simultaneously and explicitly write code to
invoke services in parallel. Some applications,
especially the so-called “embarrassingly parallel”
applications, can fit the concurrent programming model
nicely. However, for applications that are not well
structured and regular, concurrent programming is
usually a burdensome and error-prone job. As
Sutter and Larus observe [7], “Humans are quickly
overwhelmed by concurrency and find it much more
difficult to reason about concurrent than sequential
code. Even careful people miss possible interleaving
among even simple collections of partially ordered
operations”. Consequently, a natural conclusion is:
“Concurrency is fundamentally hard; avoid whenever
possible” [10].

How can we avoid concurrent programming while still
making the modules of a program run concurrently? One
possible solution [13] [16] is to have compilers
automatically exploit parallelism in sequential programs.
However, these automatic techniques mainly focus on
“embarrassingly parallel” programs rather than
general-purpose programs.

Another approach is to exploit parallelism via the
run-time system. For example, in computer hardware
design, a processor can execute several instructions in
a pipeline. The success of the run-time approach
depends on techniques that solve two problems: the first
is how to determine the parts of a program that can be
executed concurrently at run time; the second is how to
determine the dependence between concurrent units.
For example, in a CPU, the concurrent part is an
instruction, and the CPU is able to check the dependence
between instructions.

Therefore, in order to support automatic parallelism,
we must solve these two problems for service-based
applications. We now introduce a use case, the travel
agent application provided by W3C [23], to explain the
possibility of a successful run-time approach.

Figure 1. The Operation Sequence of the Travel Agent Web Service
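The second problem, determining the dependence between concurrent units, can be illustrated with a small sketch. The Python below is not taken from the paper: it assumes a hypothetical program representation in which each instruction names the variable it writes and the variables it reads, and it reports only read-after-write dependences, roughly in the spirit of the CPU analogy.

```python
def raw_dependences(program):
    """For each instruction, return the indices of the earlier instructions
    whose results it reads (read-after-write dependences only)."""
    last_writer = {}   # variable name -> index of the latest instruction writing it
    deps = []
    for index, (dest, sources) in enumerate(program):
        deps.append({last_writer[s] for s in sources if s in last_writer})
        last_writer[dest] = index
    return deps


# Hypothetical sequential program; each entry is (written variable, read variables).
program = [
    ("a", []),           # a = service_x()
    ("b", []),           # b = service_y()
    ("c", ["a", "b"]),   # c = service_z(a, b)
    ("d", ["b"]),        # d = service_w(b)
]

deps = raw_dependences(program)
for index, needed in enumerate(deps):
    print(index, sorted(needed))
```

Instructions 0 and 1 have no dependences and could run concurrently, while instruction 2 must wait for both and instruction 3 only for instruction 1.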
Figure 1 shows the operation sequence of the travel
agent Web service. In this example, OP1, OP2 and OP3
can be executed concurrently. Because of data
dependence, OP4 and OP5 cannot be processed until
OP1, OP2 and OP3 have finished. OP6, OP7 and OP8,
however, can proceed without waiting for OP1, OP2
and OP3 to return. Similarly, OP9 and OP10 cannot be
executed until OP6, OP7 and OP8 have returned. OP11,
OP12 and OP13 also have operations on which they depend.

In a service-based application, the first problem is
rather easy: a parallel part is exactly a service
invocation. The difficulty lies in determining whether
an invocation can be executed. With run-time data
dependence analysis, however, the run-time environment
is able to make this decision. For example, OP12 depends
on the reserved ID and the authorized ID. Therefore, the
run-time environment must check whether OP10 and OP11
have finished before deciding whether OP12 can be
executed.

In this use case, in order to automatically parallelize
the execution of service-based applications, a run-time
environment should first be able to determine the
parallel granularity of a service-based application. Next,
it should be able to analyze the dependence of the
parallel part, which in this case is a service operation.
Finally, when the program is running, it should be able
to dynamically execute independent services
simultaneously. These issues are the most important
considerations in the design and implementation of the
Abacus Virtual Machine.

3. Design and Implementation of AVM
In this section, we introduce the design and
implementation of the Abacus Virtual Machine (AVM), a
run-time environment for service-based applications that
supports implicit parallelism via a dynamic service
execution engine.

• Reducing cost: Run-time parallelism detection
  and dynamic service execution bring extra
  processing and memory costs. So another
  important design goal is to reduce the costs
  incurred by AVM while it brings performance
  benefits to service-based applications.

3.2 Design Principles
In Section 2, we summarized three key issues for an
automatically concurrent run-time environment. First, it
must automatically determine the concurrent
granularity; second, it must detect whether a
concurrent unit can be executed; finally, it must
execute independent service invocations simultaneously.

For the first issue, AVM adopts a service invocation
as the basic unit of parallelism. As mentioned before,
the stateless property of Web services is the key to
implementing implicit parallelism for service-based
applications. Therefore, AVM needs mechanisms to
recognize which entities are services. In AVM, a service
is a fundamental abstraction, and operations on services,
such as creation and invocation, are virtualized as normal
instructions of the virtual machine. The virtual machine
simply regards a service invocation as an ordinary
operation that has a long and uncertain latency.

For the second issue, the dependence analysis
module of AVM detects the dependence between
instructions. If instruction i depends on a long-running
instruction j, which is usually a service invocation
instruction, AVM delays the execution of i until j has
finished. Using an appropriate dependence analysis
policy, the virtual machine can guarantee the
correctness of the data flow.

For the third issue, as mentioned in Section 2, the
dynamic execution engine of AVM can execute two or
more service invocations simultaneously. Whether a
service invocation depends on other instructions is
determined by the dependence analysis module.
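The idea of delaying an instruction until the invocations it depends on have finished, while letting independent invocations run together, can be sketched as follows. This is a minimal illustration in Python and not the AVM implementation: `run_dataflow` and `fake_service` are hypothetical names, the dependences are given explicitly rather than detected, and the operation names follow the part of the travel agent example whose dependences are stated in Section 2 (OP1 to OP10).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_dataflow(program, invoke, max_workers=8):
    """Run (name, deps) pairs, issuing each operation once all deps have results."""
    results = {}              # finished operation -> returned value
    pending = list(program)   # operations not yet issued, in program order
    running = {}              # future -> operation name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            blocked = []
            for name, deps in pending:
                if all(d in results for d in deps):
                    args = [results[d] for d in deps]
                    running[pool.submit(invoke, name, args)] = name
                else:
                    blocked.append((name, deps))   # delayed until its inputs exist
            pending = blocked
            if not running:
                raise ValueError("remaining operations can never become ready")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


log = []
log_lock = threading.Lock()


def fake_service(name, args):
    """Stand-in for a remote invocation with some latency (an assumption)."""
    with log_lock:
        log.append(("start", name))
    time.sleep(0.05)
    with log_lock:
        log.append(("end", name))
    return f"{name}({','.join(args)})"


first = ["OP1", "OP2", "OP3"]
second = ["OP6", "OP7", "OP8"]
program = (
    [(op, []) for op in first]
    + [("OP4", first), ("OP5", first)]
    + [(op, []) for op in second]
    + [("OP9", second), ("OP10", second)]
)

results = run_dataflow(program, fake_service)
print(results["OP4"])
```

The program list is written in sequential order, yet OP6 to OP8 are issued in the same pass as OP1 to OP3, and OP4 is held back only until its three inputs are available.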
Figure 3. Performance Comparison (series: S-NGB on SEE VM, T-NGB on SEE VM, S-NGB on DEE VM; categories: ED, HC, VP, MB)

Figure 4. CPU Time Comparison (series: User time, Sys time, Total time; SEE and DEE shown for each of ED, HC, VP, MB)

Second, we measure the maximal memory cost of the
NGB program in both DEE VM and SEE VM.

[Memory chart (series: DEE VM Mem(k), SEE VM Mem(k); categories: ED, HC, VP, MB)]

Table 1. Speedup of NGB Programs

      S-NGB on SEE VM   T-NGB on SEE VM             S-NGB on DEE VM
      Execution Time    Execution Time   Speedup    Execution Time   Speedup
ED    670.576           79.34            8.45       79.045           8.48
HC    103.49            106.941          0.97       104.924          0.99
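The speedup columns of Table 1 appear to be the S-NGB execution time on SEE VM divided by the execution time of the configuration being compared. The short Python check below works under that assumption, using only the ED and HC rows shown above; the variable names are illustrative.

```python
# Execution times copied from Table 1 (units as reported in the table).
baseline = {"ED": 670.576, "HC": 103.49}      # S-NGB on SEE VM
t_ngb_see = {"ED": 79.34, "HC": 106.941}      # T-NGB on SEE VM
s_ngb_dee = {"ED": 79.045, "HC": 104.924}     # S-NGB on DEE VM


def speedup(base_time, other_time):
    """Assumed definition: baseline time divided by the compared time."""
    return base_time / other_time


speedups = {
    name: (
        round(speedup(baseline[name], t_ngb_see[name]), 2),
        round(speedup(baseline[name], s_ngb_dee[name]), 2),
    )
    for name in baseline
}
print(speedups)
```

Under this definition the rounded ratios match the table: a value near 1 for HC means neither alternative changes the running time much, while ED runs more than eight times faster.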
Figure 6. Dataflow of MB Problem

The following code simply describes the dataflow of the
MB problem sequentially. AVM is able to automatically
execute this section of code in parallel, and the speedup
is 2.609 on three machines, which is close to the ideal
speedup of this problem.

// This service needs a taskmanager service that has already been deployed
extern TaskManager TaskManager.task_manager;

service NGBServer{
    IOService ios=new IOService();
    SystemService ss = new SystemService();
    // Initialization
    for(i=0;i<dfg_node_num;i+=1){
        DGNode nd=dfg.getnode(i);
        nd.setverified(1);
        dfg.getnode(i).setattribute(0);
    }
    // the first three tasks take the initial input
    task_result[0]=task_manager.getExecutor(0).step(task_input);
    task_result[1]=task_manager.getExecutor(1).step(task_input);
    task_result[2]=task_manager.getExecutor(2).step(task_input);
    // later tasks take the results of earlier tasks as input
    task_input=new BMRequest[2];
    task_input[0]=task_result[0];
    ...
    task_result[6]=task_manager.getExecutor(6).step(task_input);
    task_input=new BMRequest[3];
    task_input[0]=task_result[3];
    task_input[1]=task_result[4];
    task_input[2]=task_result[5];
    task_result[7]=task_manager.getExecutor(7).step(task_input);
    task_input=new BMRequest[1];
    task_input[0]=task_result[5];
    task_result[8]=task_manager.getExecutor(8).step(task_input);
    // wait for all tasks to finish and get the results
    int finnum=0;
    BMResults ret[]=new BMResults[dfg.getnumNodes()];
    while(finnum<dfg_node_num){
        ss.sleep(1);
        for(i=0;i<dfg_node_num;i+=1){
            // skip nodes whose result has already been collected
            if(dfg.getnode(i).getattribute()==1) continue;
            finnum+=1;
            dfg.getnode(i).setattribute(1);
            // fetch the result of task i before reading its verification flag
            ret[i]=task_result[i].getResults();
            dfg.getnode(i).setverified(ret[i].getverified());
        }
    }