
A Project Report on Load Balancing in Cloud Computing

In Partial Fulfillment of the Requirement for the 8th Semester, B.E. (Computer Science and Engineering)

Submitted by Deepti Agrawal, Swati Nasre, Vibhuti Kumar Upadhyay, Mohammed Taslim Alam Ansari

Under the guidance of Prof. S.D. Chaudhari

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

J. L. CHATURVEDI COLLEGE OF ENGINEERING


846, New Nandanvan Layout, Nagpur 440 009.

Session 2011-2012

CERTIFICATE

This is to certify that the Project entitled

Load Balancing in Cloud Computing


is submitted by

Deepti Agrawal, Swati Nasre, Vibhuti Kumar Upadhyay, Mohammed Taslim Alam Ansari


to Rashtrasant Tukadoji Maharaj Nagpur University, in partial fulfillment of the requirement for the Project in the 8th semester of "Computer Science and Engineering" for the academic year 2011-2012. This report is a record of the work carried out by them, and the work underwent the requisite direction as per the University curriculum.

Under Guidance of

Prof. S.D. Chaudhari

Prof. S. D. Chaudhari (Head of Department)

Dr. S. S. Salankar (Principal)

ACKNOWLEDGEMENT

The success of any work depends on the efforts of many individuals. We would like to take this opportunity to express our deep gratitude to those who extended their support and guided us to complete this project work. We wish to express our sincere and deepest gratitude to our guide, Prof. S. D. Chaudhari, for her invaluable and unique guidance. We would also like to thank her for being a constant source of help, inspiration and encouragement in the successful completion of the project. It has been our privilege and pleasure to work under her expert guidance. We would like to thank Prof. S. D. Chaudhari (HOD) for providing us the necessary information about the topic. We would again like to thank Dr. S. S. Salankar, Principal of our college, for providing us the necessary help and facilities we needed. We express our thanks to all the staff members of the CSE Department who have directly or indirectly extended their kind co-operation in the completion of our project report.

Date: Place:

Deepti Agrawal, Swati Nasre, Vibhuti Kumar Upadhyay, Mohammed Taslim Alam Ansari

INDEX

1. Problem Definition
2. Introduction
   2.1 Cloud Computing
   2.2 Cloud Components
   2.3 Types of Clouds
   2.4 Virtualization
   2.5 Services provided by Cloud Computing
   2.6 Characteristics of Cloud Computing
   2.7 Load Balancing
3. Platform, Tools & Techniques (Hardware & Software)
   3.1 Back End
   3.2 Front End
   3.3 Hardware
4. Data Flow Diagrams
5. Project Methodology
   5.1 The Stages of SDLC
   5.2 Planning & Analysis Stage
   5.3 Development Stage
   5.4 Integration and Testing Stage
6. Screen Shots
7. Advantages & Applications
8. Future Scope & Conclusion
9. Publications
10. Bibliography

LIST OF FIGURES

1.1 A Typical Load Balancer used in Cloud Computing
2.1 Three components make up a cloud computing solution
2.2 Types of Cloud
2.3 Full Virtualization
2.4 Partial Virtualization
2.5 Software as a Service (SaaS)
2.6 Platform as a Service (PaaS)
2.7 Hardware as a Service (HaaS)
4.1 The Data Flow Diagram of the System
4.2 Flow Chart for getting details of clients connected to the server
4.3 Flow Chart for Task Execution
4.4 Flow Chart for Selecting the appropriate client for job assignment
5.1 An illustration of a distributed object application
5.2 Interfaces and Classes in the java.rmi package
5.3 Integration of the system
6.1 Main Window (1)
6.2 Main Window (2)
6.3 Task Placement Window
6.4 Factorial Task Placement
6.5 File Read Task Placement
6.6 The About Window

LIST OF TABLES

5.1 Method Summary of CpuInfo class of SIGAR
5.2 Method Summary of Mem class of SIGAR
5.3 Method Summary of SysInfo class of SIGAR

Problem Definition
The goal of a cloud-based architecture is to provide some form of elasticity: the ability to expand and contract capacity on demand. The implication is that at some point additional instances of an application will be needed in order for the architecture to scale and meet demand. That means there needs to be some mechanism in place to balance requests between two or more instances of that application. The mechanism most likely to succeed at such a task is a load balancer.

The challenges of attempting to build such an architecture without a load balancer are staggering. There is no other good way to take advantage of the additional capacity introduced by multiple instances of an application that is also efficient in terms of configuration and deployment. All other methods require modifications and changes to multiple network devices in order to properly distribute requests across multiple instances of an application. Likewise, when the additional instances of that application are de-provisioned, the changes to the network configuration need to be reversed. A load balancer provides the means by which instances of applications can be provisioned and de-provisioned automatically, without requiring changes to the network or its configuration. It automatically handles the increases and decreases in capacity and adapts its distribution decisions based on the capacity available at the time a request is made. Because the end user is always directed to a virtual server, or IP address, on the load balancer, the increase or decrease of capacity provided by the provisioning and de-provisioning of application instances is non-disruptive. As is required by even the most basic definitions of cloud computing, the end user is abstracted by the load balancer from the actual implementation and need not care about it. The load balancer makes one, two, or two hundred resources, whether physical or virtual, appear to be one resource; this decouples the user from the physical implementation of the application and allows the internal implementation to grow, to shrink, and to change without any obvious effect on the user.

Choosing the right load balancer at the beginning of such an initiative is imperative to the success of more complex implementations later. The right load balancer will be able to provide the basics required to lay the foundation for more advanced cloud computing architectures in the future, while supporting even the most basic architectures today. The right load balancer will be extensible.

Figure 1.1: A Typical Load Balancer used in Cloud Computing (clients reach a virtual server on the load balancer, which fronts the physical servers).

Load balancing ensures that every processor in the system, or every node in the network, does approximately the same amount of work at any instant of time. The technique can be sender initiated, receiver initiated, or symmetric (a combination of the sender-initiated and receiver-initiated types).

Our objective is to develop an effective load balancing algorithm to maximize or minimize different performance parameters (for example, throughput and latency) for clouds of different sizes (the virtual topology depending on the application requirement).

The system will take as input the service requested by the cloud customer, and then the role of the load balancer begins: it will communicate with the clients present in its network and assign the load to the least loaded one.

Introduction

2.1 Cloud Computing


Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility (like the electricity grid) over a network (typically the Internet). Cloud computing provides computation, software applications, data access, data management and storage resources without requiring cloud users to know the location and other details of the computing infrastructure. End users access cloud-based applications through a web browser or a lightweight desktop or mobile app, while the business software and data are stored on servers at a remote location. Cloud application providers strive to give the same or better service and performance as if the software programs were installed locally on end-user computers.

At the foundation of cloud computing is the broader concept of infrastructure convergence (or converged infrastructure) and shared services. This type of data centre environment allows enterprises to get their applications up and running faster, with easier manageability and less maintenance, and enables IT to more rapidly adjust IT resources (such as servers, storage, and networking) to meet fluctuating and unpredictable business demand.

In cloud computing, services can be used from diverse and widespread resources, rather than from remote servers or local machines. There is no standard definition of cloud computing. Generally it consists of a bunch of distributed servers, known as masters, providing demanded services and resources to different clients in a network, with the scalability and reliability of a datacenter. The distributed computers provide on-demand services. Services may be software resources (e.g. Software as a Service, SaaS), platform resources (e.g. Platform as a Service, PaaS) or hardware/infrastructure (e.g. Hardware as a Service, HaaS, or Infrastructure as a Service, IaaS). Amazon EC2 (Amazon Elastic Compute Cloud) is an example of a cloud computing service.

2.2 Cloud Components


A cloud system consists of three major components: clients, the datacenter, and distributed servers. Each element has a definite purpose and plays a specific role.

Figure 2.1: Three components make up a cloud computing solution.

2.2.1 Clients

End users interact with the clients to manage information related to the cloud. Clients generally fall into three categories:

Mobile: mobile devices such as Windows Mobile smartphones, a BlackBerry, or an iPhone.
Thin: clients that don't do any computation work; they only display the information, and the servers do all the work for them. Thin clients don't have any internal memory.
Thick: clients that use different browsers, such as IE, Mozilla Firefox or Google Chrome, to connect to the Internet cloud.

Nowadays thin clients are more popular than other clients because of their low price, security, low power consumption, low noise, and easy replacement and repair.

2.2.2 Datacenter

The datacenter is a collection of servers hosting different applications. An end user connects to the datacenter to subscribe to different applications. A datacenter may exist at a large distance from the clients.

Nowadays a concept called virtualisation is used: software is installed that allows multiple instances of virtual servers to run on the same physical hardware.

2.2.3 Distributed Servers

Distributed servers are the parts of a cloud which are present throughout the Internet, hosting different applications. While using an application from the cloud, the user will feel that he is using the application from his own machine.

2.3 Type of Clouds


Based on the domain or environment in which clouds are used, clouds can be divided into the following categories:

Public Cloud Private Cloud Hybrid Cloud Community cloud

2.3.1 Public Cloud

Applications, storage, and other resources are made available to the general public by a service provider. Public cloud services may be free or offered on a pay-per-usage model. A limited number of service providers, such as Microsoft and Google, own all the infrastructure at their datacenters, and access is through the Internet only; no direct connectivity is proposed in the public cloud architecture.

2.3.2 Private Cloud

Private cloud is infrastructure operated solely for a single organization, whether managed internally or by a third party, and hosted internally or externally. Private clouds have attracted criticism because users "still have to buy, build, and manage them" and thus do not benefit from less hands-on management, essentially "[lacking] the economic model that makes cloud computing such an intriguing concept".

2.3.3 Hybrid Cloud

Hybrid cloud is a composition of two or more clouds (private, community or public) that remain unique entities but are bound together, offering the benefits of multiple deployment models.

2.3.4 Community Cloud

Community cloud shares infrastructure between several organizations from a specific community with common concerns (security, compliance, jurisdiction, etc.), whether managed internally or by a third party and hosted internally or externally. The costs are spread over fewer users than a public cloud (but more than a private cloud), so only some of the cost savings potential of cloud computing is realized.

Figure 2.2: Types of Cloud

2.4 Virtualization
Virtualization is a very useful concept in the context of cloud systems. Virtualisation means something which isn't real but gives all the facilities of a real machine. It is the software implementation of a computer which will execute different programs like a real machine. Virtualisation is related to the cloud because, using virtualisation, an end user can use different services of a cloud. The remote datacenter will provide the services in a fully or partially virtualised manner.

The two types of virtualization found in clouds are:

Full virtualization
Partial virtualization

2.4.1 Full Virtualization

In full virtualisation, a complete installation of one machine is done on another machine. The result is a virtual machine which has all the software that is present in the actual server.

Figure 2.3: Full Virtualization

Here the remote datacenter delivers the services in a fully virtualised manner. Full virtualization has been successful for several purposes:

Sharing a computer system among multiple users
Isolating users from each other and from the control program
Emulating hardware on another machine

2.4.2 Partial Virtualization

In partial virtualisation, the hardware allows multiple operating systems to run on a single machine by efficient use of system resources such as memory and processor (e.g. VMware software). Here not all the services are fully available; rather, the services are provided partially.

Figure 2.4: Partial virtualization.

Partial virtualization has the following advantages:

Disaster recovery: In the event of a system failure, guest instances are moved to other hardware until the machine is repaired or replaced.
Migration: As the hardware can be replaced easily, migrating or moving guest instances to a new machine is faster and easier.
Capacity management: In a virtualised environment, it is easier and faster to add more hard drive capacity and processing power. As the system parts or hardware can be moved, replaced or repaired easily, capacity management is simpler and easier.

2.5 Services provided by Cloud computing


Service here means the different types of applications provided by different servers across the cloud; each is generally offered "as a service". Services in a cloud are of three types:

Software as a Service (SaaS)
Platform as a Service (PaaS)
Hardware as a Service (HaaS) or Infrastructure as a Service (IaaS)

2.5.1 Software as a Service (SaaS)

In SaaS, the user uses different software applications from different servers through the Internet. The user uses the software as it is, without any change, does not need to make lots of changes, and does not require integration with other systems. The provider does all the upgrades and patching while keeping the infrastructure running.

Figure 2.5: Software as a service (SaaS).

The client has to pay for the time he uses the software. Software that does a simple task without any need to interact with other systems is an ideal candidate for Software as a Service. Customers who are not inclined to perform software development but need high-powered applications can also benefit from SaaS. Some of these applications include:

Customer resource management (CRM)
Video conferencing
IT service management
Accounting
Web analytics
Web content management

Benefits: The biggest benefit of SaaS is that it costs less money than buying the whole application. The service provider can generally offer cheaper and more reliable applications than the organisation itself. Some other benefits include familiarity with the Internet, better marketing, smaller staff, reliability of the Internet, data security, and more bandwidth.

Obstacles: SaaS isn't of any help when the organisation has a very specific computational need that doesn't match the SaaS services. While making a contract with a new vendor there may be a problem, because the old vendor may charge a moving fee, which adds unnecessary costs. SaaS also faces challenges from the availability of cheaper hardware and open source applications.

2.5.2 Platform as a Service (PaaS)

PaaS provides all the resources that are required for building applications and services completely from the Internet, without downloading or installing software. PaaS services include software design, development, testing, deployment, and hosting. Other services can be team collaboration, database integration, web service integration, data security, storage and versioning.

Downfall: Lack of portability among different providers; and if the service provider goes out of business, the user's applications and data will be lost.

Figure 2.6: Platform as a service (PaaS)

2.5.3 Hardware as a Service (HaaS)

It is also known as Infrastructure as a Service (IaaS). It offers the hardware as a service to an organisation, so that the organisation can put anything into the hardware according to its will. HaaS allows the user to rent resources such as:

Server space Network equipment Memory CPU cycles Storage space

Figure 2.7: Hardware as a service (HaaS)

Cloud computing provides Service Oriented Architecture (SOA) and Internet of Services (IoS) type applications, including fault tolerance, high scalability, availability, flexibility, reduced information technology overhead for the user, reduced cost of ownership, on-demand services, etc. Central to these issues lies the establishment of an effective load balancing algorithm.

2.6 Characteristics of Cloud Computing


Virtualization: Virtualization technology allows servers and storage devices to be shared, and utilization is increased. Applications can be easily migrated from one physical server to another.

Multi-tenancy: Multi-tenancy enables sharing of resources and costs across a large pool of users, thus allowing for:

Centralization of infrastructure in locations with lower costs (such as real estate, electricity, etc.)
Peak-load capacity increases (users need not engineer for the highest possible load levels)
Utilization and efficiency improvements for systems that are often only 10-20% utilized

Reliability: Reliability is improved if multiple redundant sites are used, which makes well-designed cloud computing suitable for business continuity and disaster recovery.

Scalability: Scalability and elasticity are achieved via dynamic ("on-demand") provisioning of resources on a fine-grained, self-service basis in near real-time, without users having to engineer for peak loads.

Performance: Performance is monitored, and consistent, loosely coupled architectures are constructed using web services as the system interface.

Security: Security could improve due to centralization of data, increased security-focused resources, etc., but concerns can persist about loss of control over certain sensitive data and the lack of security for stored kernels. Private cloud installations are in part motivated by users' desire to retain control over the infrastructure and avoid losing control of information security.

Maintenance: Maintenance of cloud computing applications is easier, because they do not need to be installed on each user's computer and can be accessed from different places.

2.7 Load balancing


Load balancing is a process of reassigning the total load to the individual nodes of the collective system, to make resource utilization effective and to improve the response time of the job, while simultaneously removing the condition in which some of the nodes are overloaded and some others are underloaded. A load balancing algorithm which is dynamic in nature does not consider the previous state or behaviour of the system; that is, it depends only on the present behaviour of the system. The important things to consider while developing such an algorithm are: estimation of load, comparison of load, stability of different systems, performance of the system, interaction between the nodes, nature of the work to be transferred, selection of nodes, and many others. The load considered can be in terms of CPU load, amount of memory used, delay, or network load.

2.7.2 Goals of Load balancing

The goals of load balancing are:

To improve the performance substantially
To have a backup plan in case the system fails even partially
To maintain the system stability
To accommodate future modification in the system
2.7.3 Types of Load balancing algorithms

Depending on who initiated the process, load balancing algorithms can be of three categories:

Sender Initiated: If the load balancing algorithm is initialised by the sender Receiver Initiated: If the load balancing algorithm is initiated by the receiver Symmetric: It is the combination of both sender initiated and receiver initiated
Depending on the current state of the system, load balancing algorithms can be divided into 2 categories: Static Load Balancing Algorithm Dynamic Load Balancing Algorithm

Here we will concentrate upon the latter type of algorithm.

Dynamic Load balancing algorithm: In a distributed system, dynamic load balancing can be done in two different ways: distributed and non-distributed. In the distributed form, the dynamic load balancing algorithm is executed by all nodes present in the system, and the task of load balancing is shared among them. The interaction among nodes to achieve load balancing can take two forms: cooperative and non-cooperative. In the first form, the nodes work side-by-side to achieve a common objective, for example, to improve the overall response time. In the second form, each node works independently toward a goal local to it, for example, to improve the response time of a local task. Dynamic load balancing algorithms of distributed nature usually generate more messages than the non-distributed ones, because each of the nodes in the system needs to interact with every other node. A benefit of this is that even if one or more nodes in the system fail, it will not cause the total load balancing process to halt; it instead would affect the system performance to some extent.

Platform, Tools & Techniques (Hardware & Software)

3.1 Back End
MySQL Version 5.1.36

3.2 Front End
Java 1.6 (JDK 1.6.0_04)

3.3 Hardware
Port Fast Ethernet Switch (D-Link)
LAN Wires

Data Flow Diagrams

Figure 4.1: The Data Flow Diagram of the System.

Figure 4.2: Flow Chart for getting details of clients connected to the server (the server binds to a multicast socket address, checks for client connections, identifies the protocol type of each message as Welcome, Goodbye or SystemInformation, and inserts, removes or updates the corresponding database entry before refreshing the main window).

Figure 4.3: Flow Chart for Task Execution (based on the type of task, Factorial or File Read, the server searches the RMI registry for the appropriate client, assigns the job, executes it, and returns the factorial of the number or the content of the file).

Figure 4.4: Flow Chart for Selecting the appropriate client for job assignment (the server walks the database entries of connected clients and returns the name of the client with minimum CPU usage, using minimum RAM usage as the tie-breaker).
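A minimal Java sketch of the selection logic in Figure 4.4 follows: walk the client records pulled from the database and keep the one with the lowest CPU usage, breaking ties by RAM usage. The ClientInfoDTO type and its field names are assumptions standing in for whatever DTO the server actually fills.

import java.util.List;

// Hypothetical DTO; field names are assumptions based on Figure 4.4.
class ClientInfoDTO {
    String name;
    double cpuUsage;   // percent of CPU in use
    double ramUsage;   // percent of RAM in use
    ClientInfoDTO(String name, double cpu, double ram) {
        this.name = name; this.cpuUsage = cpu; this.ramUsage = ram;
    }
}

class ClientSelector {
    /** Returns the name of the least-loaded client: lowest CPU usage,
     *  ties broken by lowest RAM usage; null if the list is empty. */
    static String selectClient(List<ClientInfoDTO> clients) {
        ClientInfoDTO best = null;
        for (ClientInfoDTO c : clients) {
            if (best == null
                    || c.cpuUsage < best.cpuUsage
                    || (c.cpuUsage == best.cpuUsage && c.ramUsage < best.ramUsage)) {
                best = c;
            }
        }
        return best == null ? null : best.name;
    }
}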

Project Methodology

5.1 The Stages of SDLC: Agile Model


Agile Modeling is a practice-based methodology for modeling and documentation of software-based systems. It is intended to be a collection of values, principles, and practices for modeling software that can be applied on a software development project in a more flexible manner than traditional modeling methods. Agile software development is a group of software development methods based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. It promotes adaptive planning, evolutionary development and delivery, and a time-boxed iterative approach, and it encourages rapid and flexible response to change. It is a conceptual framework that promotes foreseen interactions throughout the development cycle.

Characteristics: Agile methods break tasks into small increments with minimal planning and do not directly involve long-term planning. Iterations are short time frames (timeboxes) that typically last from one to four weeks. Each iteration involves a team working through a full software development cycle, including planning, requirements analysis, design, coding, unit testing, and acceptance testing, when a working product is demonstrated to stakeholders. This minimizes overall risk and allows the project to adapt to changes quickly. Team composition in an agile project is usually cross-functional and self-organizing, without consideration for any existing corporate hierarchy or the corporate roles of team members. Agile methods emphasize face-to-face communication over written documents when the team is all in the same location. Agile development emphasizes working software as the primary measure of progress; this, combined with the preference for face-to-face communication, produces less written documentation than other methods.

Testing in Agile and Waterfall: One of the similarities of the agile and traditional methods, such as the waterfall model of software design, is that the testing of the software is conducted as it is being developed. The key difference is that in the agile method, the customer and developers are in close communication, whereas in the traditional method, the customer is initially represented by the requirement and design documents.

5.2 The Planning & Analysis Stage


5.2.1 Operational Feasibility Study: The important things to consider while developing a load balancing algorithm are: estimation of load, comparison of load, stability of different systems, performance of the system, interaction between the nodes, nature of the work to be transferred, selection of nodes, and many others. The load considered can be in terms of CPU load, amount of memory used, delay, or network load.

Depending on the current state of the system, load balancing algorithms can be divided into 2 categories as given below:

Static Load Balancing: It doesn't depend on the current state of the system; prior knowledge of the system is needed.

Dynamic Load Balancing: Decisions on load balancing are based on the current state of the system; no prior knowledge is needed, so it is better than the static approach.

For the design of the dynamic load balancer we need some way of fetching the current state of the system frequently, so that the balancing decision can be based upon it.

The entire system of balancing is divided into the following three basic tasks (a code sketch follows the list):

Take out the information: The part of the dynamic load balancing algorithm responsible for collecting information about the nodes in the system.

Select an appropriate system: It specifies the processors involved in the load exchange, i.e. the part of the load balancing algorithm which selects a destination node for a transferred task.

Execution of task: The part of the dynamic load balancing algorithm which selects a job for transferring from a local node to a remote node.
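One way to make this three-part division concrete is as three small Java interfaces, one per task. This is only an illustrative sketch; the interface and type names below are our assumptions, not the project's actual classes.

import java.util.List;

// Information policy: collects the current load of every node in the system.
interface InformationPolicy {
    List<NodeStatus> collect();
}

// Location (selection) policy: picks the destination node for a transferred task.
interface LocationPolicy {
    NodeStatus selectDestination(List<NodeStatus> nodes);
}

// Execution policy: selects a job and transfers it to the chosen node.
interface ExecutionPolicy {
    void execute(Job job, NodeStatus destination);
}

// Minimal carrier types, assumed for illustration only.
class NodeStatus {
    String name;
    double cpuUsage;   // percent
    double ramUsage;   // percent
}

class Job {
    String description;
}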

5.2.2 Technical Feasibility Study:

Specific Technology Keywords:

JAVA

Java is a high-level, third-generation programming language, like C, FORTRAN, Smalltalk, Perl, and many others. Users can use Java to write computer applications that crunch numbers, process words, play games, store data or do any of the thousands of other things computer software can do. Compared to other programming languages, Java is most similar to C. However, although Java shares much of C's syntax, it is not C. Knowing how to program in C or, better yet, C++, will certainly help a user learn Java more quickly, but one doesn't need to know C to learn Java. Unlike C++, Java is not a superset of C: a Java compiler won't compile C code, and most large C programs need to be changed substantially before they can become Java programs. What's most special about Java in relation to other programming languages is that it lets one write special programs called applets that can be downloaded from the Internet and played safely within a web browser. A Java applet cannot write to the hard disk without permission; it cannot write to arbitrary addresses in memory and thereby introduce a virus into your computer; and it should not crash the system.

Java is a blend of the best elements of its rich heritage combined with the innovative concepts required by its unique environment. It is a language grounded in the needs and experiences of the people who devised it. Java is cohesive and logically consistent. It gives the programmer full control; it is a language for professional programmers. Java enhances and refines the object-oriented paradigm used by C++. The original impetus for Java was not the Internet; instead, the primary motivation was the need for platform independence. The goal for Java's developers was "write once, run anywhere, any time, forever". Java code runs on a variety of CPUs under differing environments, and it doesn't require an expensive compiler to be built for each CPU separately. In Java there are clearly defined ways to accomplish a given task, and it has various packages and classes, along with the requisite methods, to implement vast requirements. When developing a Java program it is important to select the appropriate Java Graphical User Interface (GUI) components.

Users have no way of checking downloaded programs for bugs or for out-and-out malicious behavior before downloading and running them. Java solves this problem by severely restricting what an applet can do. Java was also designed to make it much easier to write bug-free code; the most important part of helping programmers write bug-free code is keeping the language simple.

Java is easy to learn: Java was designed to be easy to use and is therefore easier to write, compile, debug, and learn than other programming languages.

Java is object-oriented: This allows you to create modular programs and reusable code.

Java is platform-independent: One of the most significant advantages of Java is its ability to move easily from one computer system to another. The ability to run the same program on many different systems is crucial to World Wide Web software, and Java succeeds at this by being platform-independent at both the source and binary levels.

Java is distributed: Java is designed to make distributed computing easy, with networking capability inherently integrated into it. Writing network programs in Java is like sending and receiving data to and from a file.

Java is secure: Java considers security as part of its design. The Java language, compiler, interpreter, and runtime environment were each developed with security in mind.

Java is robust: Robust means reliable. Java puts a lot of emphasis on early checking for possible errors, as Java compilers are able to detect many problems that would first show up at execution time in other languages.

Java is multithreaded: Multithreading is the capability of a program to perform several tasks simultaneously. In Java, multithreaded programming has been smoothly integrated, while in other languages operating-system-specific procedures have to be called in order to enable multithreading.

The Sigar API

The SIGAR API provides a portable interface for gathering system information such as:

System memory, swap, CPU, load average, uptime, logins
Per-process memory, CPU, credential info, state, arguments, environment, open files
File system detection and metrics
Network interface detection, configuration info and metrics
TCP and UDP connection tables
Network route table

This information is available in most operating systems, but each OS has its own way(s) of providing it. SIGAR provides developers with one API to access this information regardless of the underlying platform. The core API is implemented in pure C, with bindings currently implemented for Java, Perl, Ruby, Python, Erlang, PHP and C#.
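As a hedged example of what gathering this information looks like through the Java binding (package org.hyperic.sigar), the snippet below takes a one-shot snapshot of the local machine; the output formatting is our own, not the project's.

import org.hyperic.sigar.CpuInfo;
import org.hyperic.sigar.CpuPerc;
import org.hyperic.sigar.Mem;
import org.hyperic.sigar.Sigar;
import org.hyperic.sigar.SigarException;

public class SystemSnapshot {
    public static void main(String[] args) throws SigarException {
        Sigar sigar = new Sigar();

        // CPU description: vendor, model, speed and core count.
        CpuInfo[] cpus = sigar.getCpuInfoList();
        for (CpuInfo cpu : cpus) {
            System.out.println(cpu.getVendor() + " " + cpu.getModel()
                    + " @ " + cpu.getMhz() + " MHz, "
                    + cpu.getTotalCores() + " cores");
        }

        // Current overall CPU load as a percentage.
        CpuPerc perc = sigar.getCpuPerc();
        System.out.println("CPU used: " + CpuPerc.format(perc.getCombined()));

        // Memory totals and usage.
        Mem mem = sigar.getMem();
        System.out.println("RAM: " + mem.getRam() + " MB, used "
                + mem.getUsedPercent() + "%");

        sigar.close();  // release native resources
    }
}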

MySQL 5.1

The following are the features that make MySQL worth using in the project:

I. Partitioning: This capability enables distributing portions of individual tables across a file system, according to rules which can be set when the table is created.

II. Row-based replication: Replication capabilities in MySQL originally were based on propagation of SQL statements from master to slave; this is called statement-based replication. MySQL 5.1 adds row-based replication, in which changes to individual table rows are propagated instead.

III. Plugin API: MySQL 5.1 adds support for a very flexible plugin API that enables loading and unloading of various components at runtime, without restarting the server. Although the work on this is not finished yet, plugin full-text parsers are a first step in this direction. This permits users to implement their own input filter on the indexed text, enabling full-text search capability on arbitrary data such as PDF files or other document formats.

IV. Event scheduler: MySQL Events are tasks that run according to a schedule. When you create an event, you are creating a named database object containing one or more SQL statements to be executed at one or more regular intervals, beginning and ending at a specific date and time.

V. Server log tables: Before MySQL 5.1, the server wrote general query log and slow query log entries only to log files; MySQL 5.1 adds the option of writing them to tables.

JDBC
JDBC is a Java-based data access technology (Java Standard Edition platform) from Sun Microsystems, Inc. It is not officially an acronym, though it is unofficially taken to stand for Java Database Connectivity. This technology is an API for the Java programming language that defines how a client may access a database. It provides methods for querying and updating data in a database. JDBC is oriented towards relational databases. A JDBC-to-ODBC bridge enables connections to any ODBC-accessible data source in the JVM host environment.

JDBC allows multiple implementations to exist and be used by the same application. The API provides a mechanism for dynamically loading the correct Java packages and registering them with the JDBC Driver Manager. The Driver Manager is used as a connection factory for creating JDBC connections. JDBC connections support creating and executing statements. These may be update statements such as SQL's CREATE, INSERT, UPDATE and DELETE, or they may be query statements such as SELECT. Additionally, stored procedures may be invoked through a JDBC connection. (A short JDBC sketch follows the NetBeans list below.)

NETBEANS 6.7

The following windows and tools are central to performing daily development tasks in the IDE:

Projects window
Files window
Runtime window
Navigator window
Source Editor
GUI Builder
Compiler
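Picking up the JDBC discussion above, the sketch below shows how the report's server could read client status rows over JDBC. It is only an illustration: the database name, the credentials, and the client_info table with client_name, cpu_usage and ram_usage columns are all assumptions, not the project's actual schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ClientInfoReader {
    public static void main(String[] args) throws Exception {
        // Load the MySQL Connector/J driver explicitly (needed on pre-JDBC4 setups).
        Class.forName("com.mysql.jdbc.Driver");

        // Assumed URL, user and password -- placeholders for this sketch.
        Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/loadbalancer", "root", "password");
        try {
            Statement st = con.createStatement();
            // Assumed table: client_info(client_name, cpu_usage, ram_usage).
            ResultSet rs = st.executeQuery(
                    "SELECT client_name, cpu_usage, ram_usage FROM client_info"
                            + " ORDER BY cpu_usage, ram_usage");
            while (rs.next()) {
                System.out.println(rs.getString("client_name")
                        + "  cpu=" + rs.getDouble("cpu_usage") + "%"
                        + "  ram=" + rs.getDouble("ram_usage") + "%");
            }
            rs.close();
            st.close();
        } finally {
            con.close();  // always release the connection
        }
    }
}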

5.3 The Development Stage:


5.3.1 Module 1: Implementation of the Information Policy

This module is implemented using the SIGAR API (Application Programming Interface). The main classes used in this module are:

CpuInfo: this class has the following methods that take out information about the CPU:

long getCacheSize() - Get the CPU cache size.
int getCoresPerSocket() - Get the number of CPU cores per CPU socket.
int getMhz() - Get the CPU speed.
java.lang.String getModel() - Get the CPU model.
int getTotalCores() - Get the total CPU cores (logical).
int getTotalSockets() - Get the total CPU sockets (physical).
java.lang.String getVendor() - Get the CPU vendor id.

Table 5.1: Method Summary of CpuInfo class of SIGAR.

Mem: this class has the following methods that take out information about memory:

long getActualFree() - Get the actual total free system memory.
long getActualUsed() - Get the actual total used system memory.
long getFree() - Get the total free system memory.
double getFreePercent() - Get the percent of total system memory that is free.
long getRam() - Get the system Random Access Memory (in MB).
long getTotal() - Get the total system memory.
long getUsed() - Get the total used system memory.
double getUsedPercent() - Get the percent of total system memory that is used.

Table 5.2: Method Summary of Mem class of SIGAR.

SysInfo: this class has the following methods that take out information about the operating system:

java.lang.String getDescription() - Get the description.
java.lang.String getMachine() - Get the machine.
java.lang.String getName() - Get the name.
java.lang.String getPatchLevel() - Get the patch_level.
java.lang.String getVendor() - Get the vendor.
java.lang.String getVendorCodeName() - Get the vendor_code_name.
java.lang.String getVendorName() - Get the vendor_name.
java.lang.String getVendorVersion() - Get the vendor_version.
java.lang.String getVersion() - Get the version.

Table 5.3: Method Summary of SysInfo class of SIGAR.

5.3.2 Module 2: Implementation of the Location Policy

After receiving the details of all the clients connected in the network by using the information policy, the location policy transfers the job to the respective client. For this we use the following technology:

RMI: Java Remote Method Invocation

This is a technical literature study whose purpose is to describe the basic parts of Java Remote Method Invocation. Remote Method Invocation, abbreviated RMI, provides support for distributed objects in Java, i.e. it allows objects to invoke methods on remote objects. The calling objects can use exactly the same syntax as for local invocations. The Java RMI model has two general requirements: the first is that the RMI model shall be simple and easy to use, and the second is that the model shall fit into the Java language in a natural way.

Distributed Object Application: An RMI application is often composed of two separate programs, a server and a client. The server creates remote objects and makes references to those objects accessible; then it waits for clients to invoke methods on the objects. The client gets remote references to remote objects in the server and invokes methods on those remote objects. The RMI model thus provides a distributed object application: a mechanism that the server and the client use to communicate and pass information between each other. A distributed object application has to handle the following properties:

Locate remote objects: The system has to obtain references to remote objects. This can be done in two ways: either by using RMI's naming facility, the rmiregistry, or by passing and returning remote objects.

Communicate with remote objects: The programmer doesn't have to handle communication between the remote objects, since this is handled by the RMI system. The remote communication looks like an ordinary method invocation to the programmer.

Load class bytecodes for objects that are passed as parameters or return values: All mechanisms for loading an object's code and transmitting data are provided by the RMI system.

Figure 5.1, below, illustrates an RMI distributed application. In this example the RMI registry is used to obtain references to a remote object. First the server associates a name with a remote object in the RMI registry. When a client wants access to a remote object it looks up the object, by its name, in the registry. Then the client can invoke methods on the remote object at the server.

Figure 5.1: An illustration of a distributed object application

Interfaces and Classes: Since Java RMI is a single-language system, the programming of a distributed application in RMI is rather simple. All interfaces and classes for the RMI system are defined in the java.rmi package. Figure 5.2, below, illustrates the relationship between some of the classes and interfaces. The RemoteObject class implements the Remote interface, while the other classes extend RemoteObject.

Figure 5.2: Interfaces and Classes in the java.rmi package

The Remote Interface: A remote interface is defined by extending the Remote interface that is provided in the java.rmi package. The remote interface is the interface that declares methods that clients can invoke from a remote virtual machine. The remote interface must satisfy the following conditions:

It must extend the interface Remote.
Each remote method declaration in the remote interface must include the exception RemoteException (or one of its superclasses) in its throws clause.

The RemoteObject Class: RMI server functions are provided by the class RemoteObject and its subclasses RemoteServer, UnicastRemoteObject and Activatable. Here is a short description of what the different classes handle: RemoteObject provides implementations of the methods hashCode, equals and toString from the class java.lang.Object. The classes UnicastRemoteObject and Activatable create remote objects and export them, i.e. the classes make the remote objects available to remote clients.

The RemoteException Class: The class RemoteException is a superclass of the exceptions that the RMI system throws during a remote method invocation. Each remote method that is declared in a remote interface must specify RemoteException (or one of its superclasses) in its throws clause to ensure the robustness of applications in the RMI system. When a remote method invocation fails, the exception RemoteException is thrown. Communication failure, protocol errors and failure during marshalling or unmarshalling of parameters or return values are some reasons for RMI failure. RemoteException is an exception that must be handled by the caller of the remote method, i.e. it is a checked exception; the compiler ensures that the programmer has handled these exceptions.

Implementation of a simple RMI system: This is a simple RMI system with a client and a server. The server contains one method (helloWorld) that returns a string to the client. To build the RMI system, all files have to be compiled. Then the stub and the skeleton, which are the standard mechanisms for communicating with remote objects, are created with the rmic compiler. This RMI system contains the following files:

HelloWorld.java: The remote interface.
HelloWorldClient.java: The client application in the RMI system.
HelloWorldServer.java: The server application in the RMI system.

When all files are compiled, the following command will create the stub and the skeleton:

rmic HelloWorldServer

Two classes will then be created, HelloWorldServer_Stub.class and HelloWorldServer_Skel.class, where the first class represents the client side of the RMI system and the second represents the server side.
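For concreteness, a compact version of the HelloWorld system described above might look like the sketch below, with all three pieces shown in one listing. The registry port and binding name are assumptions; note also that on Java 5 and later the rmic step is generally unnecessary, because stubs are generated dynamically when the object is exported.

import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

// HelloWorld.java -- the remote interface; every method throws RemoteException.
interface HelloWorld extends Remote {
    String helloWorld() throws RemoteException;
}

// HelloWorldServer.java -- implements the interface and binds it in the registry.
class HelloWorldServer extends UnicastRemoteObject implements HelloWorld {
    protected HelloWorldServer() throws RemoteException {
        super();  // exports this object so remote clients can call it
    }

    public String helloWorld() throws RemoteException {
        return "Hello, world!";
    }

    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);               // default RMI port
        Naming.rebind("rmi://localhost/HelloWorld", new HelloWorldServer());
        System.out.println("HelloWorld server is bound in the registry");
    }
}

// HelloWorldClient.java -- looks the object up by name and invokes the method.
class HelloWorldClient {
    public static void main(String[] args) throws Exception {
        HelloWorld hello = (HelloWorld) Naming.lookup("rmi://localhost/HelloWorld");
        System.out.println(hello.helloWorld());  // prints the server's string
    }
}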

5.4 The Integration & Testing Stage:


Figure 5.3: Integration of the system (the application manager contains a communication manager, which receives Welcome, Goodbye and SystemInformation messages and updates the database, and a GUI manager, which updates the main window).

5.4.1 Integration of the system

The system is integrated in such a way that when its execution starts, the application manager is invoked; the application manager starts every time the application starts. The application manager has two sub-managers inside it, named the communication manager and the GUI manager. The communication manager receives the packet coming from a node and tests it to determine which protocol category the data it contains belongs to. The communication manager has three types of protocols, namely the Welcome protocol, the Goodbye protocol and the SystemInformation protocol. The Welcome protocol adds the name of the client to the database, the Goodbye protocol deletes the name of the client from the database, and the SystemInformation protocol updates the database by adding the current status of the client machine. This is done by the protocol handler that is present inside all three protocols (a code sketch follows). The operations performed by the communication manager are taken into account by the GUI manager, which puts all the modifications onto the screen so that the user sees the result.
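The dispatch described in this paragraph could be sketched roughly as below. The Protocol interface and the class and method names are our assumptions, modelled on the Welcome/Goodbye/SystemInformation description; the real project would update the MySQL database instead of printing.

// Hypothetical sketch of the communication manager's protocol dispatch.
interface Protocol {
    /** Applies this protocol's database action for the given client. */
    void handle(String clientName, String payload);
}

class WelcomeProtocol implements Protocol {
    public void handle(String clientName, String payload) {
        // Would INSERT the new client's name into the database.
        System.out.println("add " + clientName);
    }
}

class GoodbyeProtocol implements Protocol {
    public void handle(String clientName, String payload) {
        // Would DELETE the client's row from the database.
        System.out.println("remove " + clientName);
    }
}

class SystemInformationProtocol implements Protocol {
    public void handle(String clientName, String payload) {
        // Would UPDATE the client's row with the current CPU/RAM status.
        System.out.println("update " + clientName + ": " + payload);
    }
}

class CommunicationManager {
    void onPacket(String type, String clientName, String payload) {
        Protocol p;
        if ("WELCOME".equals(type))      p = new WelcomeProtocol();
        else if ("GOODBYE".equals(type)) p = new GoodbyeProtocol();
        else                             p = new SystemInformationProtocol();
        p.handle(clientName, payload);   // the GUI manager would then refresh the table
    }

    public static void main(String[] args) {
        new CommunicationManager().onPacket("WELCOME", "client-1", "");
    }
}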

5.4.2 Testing of the system

During the software development process, errors are inevitably introduced, and some of them are even amplified as a project progresses. Software testing is the process of executing a program or system with the intent of finding errors. Planning for software testing involves organizing testing at three levels: unit, integration, and high-order. The intent and scope of testing varies for these three levels. Planning for software testing also involves procuring tools to automate testing and identifying the people who will perform testing. A problem with software testing is that testing all combinations of inputs and preconditions is not feasible when testing anything other than a simple product. This means that the number of defects in a software product can be very large, and defects that occur infrequently are difficult to find in testing. More significantly, non-functional dimensions of quality, for example usability, scalability, performance, compatibility and reliability, can be highly subjective; something that constitutes sufficient value to one person may be intolerable to another.

Software bugs will almost always exist in any software module of moderate size: not because programmers are careless or irresponsible, but because the complexity of software is generally intractable, and humans have only limited ability to manage complexity. It is also true that for any complex system, design defects can never be ruled out completely.

When is testing carried out? A common practice of software testing is that it is performed by an independent group of testers after the functionality is developed but before it is shipped to the customer. This practice often results in the testing phase being used as a project buffer to compensate for project delays, thereby compromising the time devoted to testing. Another practice is to start software testing at the same moment the project starts and continue it as a continuous process until the project finishes. Testing is usually performed for the following purposes:

To improve quality

As computers and software are used in critical applications, the outcome of a bug can be severe. Bugs can cause huge losses. Bugs in critical systems have caused airplane crashes, allowed space shuttle missions to go awry, halted trading on the stock market, and worse. Bugs can kill. In a computerized, embedded world, the quality and reliability of software is a matter of life and death. Quality means conformance to the specified design requirements; meeting the minimum requirement of quality means performing as required under specified circumstances. Debugging, a narrow view of software testing, is performed heavily by the programmer to find design defects; finding the problems and getting them fixed is the purpose of debugging in the programming phase.

For Verification & Validation

Testing can serve as a metric. It is heavily used as a tool in the verification and validation process. Testers can make claims based on interpretations of the testing results: either the product works under certain situations, or it does not. We can also compare the quality of different products under the same specification, based on results from the same test.

We cannot test quality directly, but we can test related factors to make quality visible. Quality has three sets of factors -- functionality, engineering, and adaptability. These three sets of factors can be thought of as dimensions in the software quality space. Each dimension may be broken down into its component factors and considerations at successively lower levels of detail.

For reliability estimation

Software reliability has important relations with many aspects of software, including its structure and the amount of testing it has been subjected to. Based on an operational profile (an estimate of the relative frequency of use of various inputs to the program), testing can serve as a statistical sampling method to gain failure data for reliability estimation. The system is tested in steps, in line with the planned build and release strategies, from individual units of code through integrated subsystems to the deployed releases and the final system. Testing proceeds through various physical levels of the application development lifecycle. Each completed level represents a milestone on the project plan, and each stage represents a known level of physical integration and quality. These stages of integration are known as test levels. Levels of testing include the following:

1. Unit Test - Verifies the program specifications against the internal logic of the program or module and validates the logic.
2. Integration Test - Verifies proper execution of application components, including interfaces. Communication between modules within the sub-system is tested in a controlled and isolated environment within the project. String testing is part of the integration testing level/phase and covers both the detection and the correction of programming/code generation problems. Once a series of components or units which must eventually work or communicate with each other have been coded and unit tested, an initial "string" test is conducted. This "stringing" together of the components or units for execution as a unit tends to be a somewhat informal process directed at finding any communication or parameter passing problems which may not yet have been detected. A "sign-off" should not be given until the entirety of the connected/integrated units or components is working as a smooth, seamless, and error-free module.
3. System Test - Verifies proper execution of the entire application's components, including interfaces to other applications. Both functional and structural types of tests are performed to verify that the system is functionally and operationally sound.

4. System Integration Test - Verifies the integration of all applications, including interfaces internal and external to the organization, with their hardware, software and infrastructure components in a production-like environment. 5. User Acceptance Test - Verifies that the system meets user requirements as specified. It simulates the user environment and emphasizes security, documentation and regression tests. 6. Operability Test - Verifies that the application can operate in the production environment. Operability tests are performed after, or concurrent with User Acceptance Tests.

When to stop testing? Testing is potentially endless: we cannot test until all the defects are unearthed and removed, which is simply impossible. At some point we have to stop testing and ship the software; the question is when. Realistically, testing is a trade-off between budget, time and quality, and it is driven by profit models. The pessimistic, and unfortunately most often used, approach is to stop testing whenever some or any of the allocated resources (time, budget, or test cases) are exhausted. The optimistic stopping rule is to stop testing when either reliability meets the requirement or the benefit from continued testing cannot justify the testing cost. This will usually require the use of reliability models to evaluate and predict the reliability of the software under test. Each evaluation requires repeated running of the following cycle: failure data gathering, modelling, prediction. This method does not fit well for ultra-dependable systems, however, because the real field failure data will take too long to accumulate.

Screen Shots

Figure 6.1: Main Window (1)

The output screen shows a scenario where only one client is connected to the server. The table on the screen shows the system details of the client connected to it.

Figure 6.2: Main Window (2)

The output screen shows a scenario where there are three clients connected to the server. The table on the screen shows the system details of the clients connected to it.

Figure 6.3: Task Placement Window

This window allows the user to choose a task or application that he/she desires to use on the cloud. Here we've two small applications: one is Factorial Calculation and the other is the File Read Service.

Figure 6.4: Factorial Task Placement

The figure shows that the task is placed on the least-loaded client, and gives the client name and the total time required for the execution, along with the result.

Figure 6.5: File Read Task Placement

The figure shows that the task is placed on the least-loaded client, and gives the client name and the total time required for the execution, along with the result.

Figure 6.6: The About Window

Advantages & Applications

Advantages
In complex and large systems, there is a tremendous need for load balancing. To simplify load balancing globally (e.g. in a cloud), one option is to employ techniques that act at the components of the cloud in such a way that the load of the whole cloud is balanced.

Applications
This application is useful in all types of load balancing applications, whether sender initiated, receiver initiated or symmetric, where parameters like CPU usage, memory usage and network load have great significance. The application can also be modified so that it can modularize a task and assign its subparts to different clients present in the network. This module can be useful in both centralized and decentralized computing environments. The modularization of tasks into smaller subtasks speeds up the computing process and improves the performance.

Future Scope & Conclusion

Future Work
Cloud computing is a vast concept, and load balancing plays a very important role in clouds. There is huge scope for improvement in this area. We have discussed only two divisible load scheduling algorithms that can be applied to clouds, but there are still other approaches that can be applied to balance the load in clouds. The performance of the given algorithms can also be increased by varying different parameters. This module can further be refined by adding a feature for modularizing larger tasks into smaller ones, so that services that need more time and hardware may be executed by different clients on the network, thereby improving performance and reducing execution time.

Conclusion
Cloud computing provides a Service Oriented Architecture (SOA) and Internet of Services (IoS) type applications, including fault tolerance, high scalability, availability, flexibility, reduced information technology overhead for the user, reduced cost of ownership, on demand services etc. Central to these issues lies the establishment of an effective load balancing algorithm. A load balancer provides the means by which instances of applications can be provisioned and de-provisioned automatically, without requiring changes to the network or its configuration. It automatically handles the increases and decreases in capacity and adapts its distribution decisions based on the capacity available at the time a request is made.

Publications

Paper 1:
Deepti Agrawal, Swati Nasre, Vibhuti Kumar Upadhyay and Mohammed Taslim Alam Ansari presented a paper on the topic "Load Balancing in Cloud Computing" at the national-level event SPITZE 2012 on 18/2/2012, held at J. L. Chaturvedi College of Engineering, Nagpur.

Load Balancing in Cloud Computing


Deepti Agrawal
B.E. (CSE) J. L. C. C. E., Nagpur

Vibhuti Upadhyay
B.E. (CSE) J. L. C. C. E., Nagpur

Md. Taslim Alam Ansari
B.E. (CSE) J. L. C. C. E., Nagpur

Swati Nasre
B.E. (CSE) J. L. C. C. E., Nagpur

Abstract: Managing large compute clusters requires benchmarks with representative workloads to evaluate performance metrics such as task scheduling delays and machine resource utilizations, machine configurations, and scheduling algorithms. Existing approaches to workload characterization for high performance computing and grids focus on requirements for CPU, memory, disk, I/O and network. However, in addition to resource requirements, cloud computing workloads commonly include task placement constraints. Task placement constraints (hereafter, just constraints) address machine heterogeneity, diverse application requirements, application optimization, and fault tolerance. Constraints limit the machines on which a task can run. A load balancer provides the means by which instances of applications can be provisioned and de-provisioned automatically, without requiring changes to the network or its configuration. It automatically handles the increases and decreases in capacity and adapts its distribution decisions based on the capacity available at the time a request is made. Because the end user is always directed to a virtual server, or IP address, on the load balancer, the increase or decrease of capacity provided by the provisioning and de-provisioning of application instances is non-disruptive. As is required by even the most basic of cloud computing definitions, the end user is abstracted by the load balancer from the actual implementation and need not care about it. The load balancer makes one, two, or two hundred resources, whether physical or virtual, appear to be one resource; this decouples the user from the physical implementation of the application and allows the internal implementation to grow, to shrink, and to change without any obvious effect on the user.

Introduction

Cloud computing is where software applications, processing power, data and potentially even artificial intelligence are accessed over the Internet. Today cloud computing is gaining much more importance in the IT field, as it provides different types of services, namely SaaS (Software as a Service), PaaS (Platform as a Service) and IaaS (Infrastructure as a Service), also known as HaaS (Hardware as a Service). Due to the on-demand service offered by the cloud vendor, the use of cloud computing has tremendously increased, and in turn there is a basic need to balance the load of the tasks placed by the clients. This requires certain benchmarks like CPU utilization, RAM utilization, and the operating system needed. In order to understand these concepts in a better way, consider the example given below.

Figure 1: Illustration of the impact of constraints on machine utilization in load balancing in cloud computing. Constraints are indicated by combinations of line thickness and style; a task can be scheduled only on the machines that have the corresponding line thickness and style.

Figure 1 illustrates the impact of constraints on machine utilization in a compute cluster. There are six machines M1, ..., M6 (depicted by squares) and ten tasks T1, ..., T10 (depicted by circles). There are four constraints c1, ..., c4, indicated by the combinations of line thickness and line style. In this example, each task requests a single constraint, and each machine satisfies a single constraint. A task can only be assigned to a machine that satisfies its constraint; that is, the line style and thickness of a circle must be the same as those of its containing square. One way to quantify machine utilization is the ratio of tasks to machines.

Proposed Architecture: The proposed architecture for load balancing comprises a client-side program, a central server program and a clustered server program. The several clusters connected with the central server send their OS, RAM and CPU information to the central server. When the client issues any request, the central server checks the configuration of all the machines and sends the task for execution to the machine whose response time is less than a second. Those clusters whose response time is very high will not be given the task for execution, as that would increase the scheduling delay in the system.

Application:
1. Efficiently manages task placement in a compute cluster.
2. Efficiently manages task placement in a distributed network.

References:

1. M. F. Arlitt and C. L. Williamson. Web server workload characterization: the search for invariants. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 1996.
2. P. Barford and M. Crovella. Generating representative web workloads for network and server performance evaluation. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 1998.
3. M. Calzarossa and D. Ferrari. A sensitivity study of the clustering approach to workload modeling. SIGMETRICS Performance Evaluation Review, 1986.

Bibliography

References

[1] David Flanagan, "Java in a Nutshell", O'Reilly, 2005.
[2] Michael Morrison, "Java 1.1 Unleashed", 3rd Edition, 1997.
[3] Steven Holzner, "Java 2 Programming AWT Swing Black Book", 2005.
[4] Herbert Schildt, "Java: The Complete Reference", 6th Edition, 2006.
[5] Zbigniew M. Sikora, "Java: Practical Guide for Programmers", 2003.
[6] Patrick Niemeyer and Jonathan Knudsen, "Learning Java", 2005.
[7] Anthony T. Velte, Toby J. Velte and Robert Elsenpeter, "Cloud Computing: A Practical Approach", Tata McGraw-Hill Edition, 2010.
[8] Martin Randles, David Lamb and A. Taleb-Bendiab, "A Comparative Study into Distributed Load Balancing Algorithms for Cloud Computing", 2010 IEEE 24th International Conference on Advanced Information Networking and Applications Workshops.
[9] http://www-03.ibm.com/press/us/en/pressrelease/22613.wss
[10] http://www.amazon.com/gp/browse.html?node=201590011
[11] http://www.enisa.europa.eu/act/rm/files/deliverables/cloud-computingriskassessment/at_download/fullReport
[12] http://download.microsoft.com/download/e/4/3/e43bb484-3b52-4fa8-a9f9ec60a32954bc/Azure_Services_Platform.pdf
