View Point

Performance Monitoring in Cloud
- Vineetha V

Abstract
Performance monitoring is an integral part of maintenance. The requirements for a monitoring solution for Cloud are quite different
from those for a legacy or virtualized environment.
Many third-party tools and solutions are available for monitoring a cloud, but no standard model exists that specifies which
parameters such a solution needs to cover so that an exhaustive performance report can be produced for the service provider.
With Cloud Computing becoming the most sought-after technology, there is a need for such a standard model.
This paper gives a very brief introduction to Cloud computing and its architecture, illustrates how Cloud performance
monitoring differs from traditional monitoring, and describes the different types of monitoring in cloud and the metrics of
interest to various users. The paper also proposes a high-level view of a model which can be used as a guideline when designing
performance monitoring solutions for cloud.

www.infosys.com

I. What Is Cloud Computing?
Cloud Computing can be simply defined as computing in a remote location, or location-independent computing, with shared and dynamic resources available on demand. Cloud computing describes a new supplement, consumption, and delivery model for IT services based on the Internet, and it typically involves over-the-Internet provision of dynamically scalable and often virtualized resources. This frequently takes the form of web-based tools or applications that users can access and use through a web browser as if they were programs installed locally on their own computer.
In a Cloud computing environment everything becomes a Service. Also, characteristics like Scalability, Elasticity, Multi-tenancy and Pay-per-Use make Cloud computing the most wanted technology today. The primary motive behind more organizations moving to cloud is the reduction in cost and dynamic resource allocation. Since the underlying infrastructure is hosted by the cloud provider, SMEs do not need to worry about its maintenance. Of course there are some areas of concern in Cloud, like security and reliability.

There are different types of Cloud:

Public Cloud
A public cloud is one based on the standard cloud computing model, in which a service provider makes resources, such as applications and storage, available to the general public over the Internet. Public cloud services may be free or offered on a pay-per-usage model. The main benefits of using a public cloud service are:
• Easy and inexpensive set-up because hardware, application and bandwidth costs are covered by the provider
• Scalability to meet needs
• No wasted resources because you pay for what you use
Examples of public clouds include Amazon Elastic Compute Cloud (EC2), IBM’s Blue Cloud, Sun Cloud, Google AppEngine and Windows Azure Services Platform.

Private Cloud
A private cloud (also called internal cloud or corporate cloud) is a cloud set up within a corporation or organization that provides hosted services to a limited number of people behind a firewall.

Hybrid Cloud
A hybrid cloud is a cloud computing environment in which an organization provides and manages some resources in-house and has others provided externally. For example, an organization might use a public cloud service, such as Amazon’s Elastic Compute Cloud (EC2), for general computing but store customer data within its own data center.

II. Cloud Architecture
The most significant components of Cloud are the Front End and the Back End. The Front End is the Client side, the User of the applications hosted in Cloud. This covers the client’s network and the application used to access the Cloud through an interface like a web browser. The Back End is the actual Cloud itself, with various servers, storage devices, network etc. Theoretically a Cloud has all types of dedicated servers present in it to host practically any computer program. A central server administers the system, monitoring traffic and client demands to ensure everything runs smoothly. A set of rules, together with Middleware, helps the networked computers communicate with each other.

[Figure: Cloud Service Provider, Cloud Service Consumer, Cloud Service Developer. Cloud Services: Software-as-a-service, Platform-as-a-service, Infrastructure-as-a-service. Virtualized Infrastructure: Server Virtualization, Storage Virtualization, N/W Virtualization. System Resources (Hardware): Servers, Storage, N/W.]

III. How/Why Performance Monitoring Differs from Traditional Server Monitoring
Since we have various types of IT components in a cloud, traditional performance management, which focuses on specific components, will not work for Cloud. Such solutions are not well equipped to provide a more holistic view of the cloud environment. More than independent management of physical and virtual infrastructure elements, the focus should be on how they perform to deliver the Business Service to the User. Service Level Agreements (SLAs) are very important in a Cloud environment. Since the customer pays for the services/infrastructure used, the customer needs to be assured of a level of service at any time. Thus performance monitoring of Cloud should monitor the capability of the components of the cloud in delivering the expected service. It should help the provider assess whether the customer’s demands can be met with the current resources/performance.
To a large extent Clouds are based on virtualized resources. Most of the virtualization vendors provide management/monitoring solutions which collect a robust set of resource utilization statistics. Even though they provide a good view of individual components, they fail to provide a complete picture of the performance of the entire cloud environment. For example, with VMware’s vCenter/vSphere we can get basic resource utilization information for the ESX/ESXi host and virtual machines, but these tools fail to provide visibility into the performance of the business applications hosted on these platforms. Although it is possible to approximate the performance of physical infrastructure based on how its resources are utilized, that is not the case with virtual components due to the shared and dynamic nature of resource allocation. Thus a Virtualized/Cloud environment requires a specialized monitoring solution which can provide an effective, convenient and holistic view of the entire environment.
In a nutshell, we need a monitoring model for Cloud which can provide a view of the health of the entire cloud in delivering a service. We also need a view of the individual applications hosted on the cloud. Depending on the agreement between the service provider and the customer, resources will be dynamically allocated to applications. The monitoring solution should therefore be capable of dynamically identifying the VMs in which the application is currently running and then collecting the parameters.

IV. Cloud Performance Monitoring
When we consider monitoring the performance of a Cloud, we can broadly classify it into 2 categories: monitoring from the Service Provider’s view and monitoring from the Cloud Consumer’s view.

1. Infrastructure Performance - A Cloud service provider is interested in this kind of report. This involves the performance of the various infrastructure components in the cloud like Virtual Machines, Storage, Network etc. Since the performance of individual components fails to provide an accurate view of the overall cloud performance, a new approach called Infrastructure Response Time is being researched to get a more accurate picture of the performance of a virtualized/cloud environment.
Infrastructure Response Time (IRT) is defined as the time it takes for any workload (application) to place a request for work on the virtual environment and for the virtual environment to complete the request (from the guest to spindle and back again). The request could be a simple data exchange between 2 VMs or a complex request which involves a database transaction and writes into a storage array. [definition from http://www.virtualinstruments.com/files/pdfs/WP_APM-Experts-Infrastructure-PerformanceManagement-for-Virtualized-Systems.pdf]
Typical data flow between various components in a DB transaction is as follows:
Servers/VMs -> VM Network -> Network Fabric -> SAN Fabric -> Storage
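The IRT definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a monitoring agent can report the time a request spends at each hop of the data flow in both directions; the HopTiming type, the function names and the latency figures are hypothetical and do not come from any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class HopTiming:
    hop: str            # component on the data path, e.g. "SAN Fabric"
    outbound_ms: float  # time spent on the way from guest to spindle
    return_ms: float    # time spent on the way back to the guest


def infrastructure_response_time(timings: list[HopTiming]) -> float:
    """Sum both directions over every hop (guest to spindle and back)."""
    return sum(t.outbound_ms + t.return_ms for t in timings)


def slowest_hop(timings: list[HopTiming]) -> str:
    """Name of the hop that contributes most to the round trip."""
    return max(timings, key=lambda t: t.outbound_ms + t.return_ms).hop


# Made-up timings for one DB transaction, one entry per hop of the flow.
sample = [
    HopTiming("Servers/VMs", 0.5, 0.25),
    HopTiming("VM Network", 0.25, 0.25),
    HopTiming("Network Fabric", 0.5, 0.5),
    HopTiming("SAN Fabric", 1.0, 0.75),
    HopTiming("Storage", 4.0, 0.0),
]

irt_ms = infrastructure_response_time(sample)
print(f"IRT = {irt_ms} ms, slowest hop = {slowest_hop(sample)}")
```

Breaking the round trip down per hop is one possible design choice; it lets the same data answer both "how long did the request take" and "which component is the bottleneck".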

Infrastructure Response Time is the key metric, along with the various resource utilization metrics as follows:
• CPU usage, total – all CPUs, per CPU, and delta between CPUs
• Disk usage, total, free, used
• Disk Latency
• Percentage Busy
• Percentage Ready
• Memory, percentage used, swap activity
• Network, bytes in/out, packets
• Host System State
• Host System Resource Usage
• Virtual Machine Configuration
• Virtual Machine State

Key requirements for an Infrastructure Performance Management Solution for a Virtual/Cloud environment can be identified as below:
• Support any application hosted in the environment. The solution should be able to automatically identify the applications and their topologies, and this needs to be independent of the application architecture.
• IRT must be calculated across the breadth & depth of the virtual environment. The full scope of the environment needs to be considered.
• Continuous discovery. The solution needs to dynamically identify the virtual and physical resources used by the application at a given point in time.
• Be prepared for new applications and new infrastructure being added to the environment. The solution should be able to automatically discover and calculate IRT for these new entrants as well.
• Support for multiple platforms like VMware, Hyper-V etc.

Metrics/Parameters of Interest in Infrastructure Performance Monitoring
Sample list of parameters monitored for Host Machine and Virtual Machine for VMware ESX/ESXi:
• State: Running State, Overall Alarm Status, Guest OS and heartbeats in period
• CPU: Number of CPUs, Number of CPU Cores, CPU Threads, CPU Speed per Core, Overall CPU Usage, CPU Usage in MHz, CPU sample count, CPU extra and guaranteed, CPU Active average and peak over a period, CPU running average and peak over a period, CPU Refused average over a period
• Memory: Memory Size, Memory heap, Memory active, active (% of memory), consumed, granted, balloon, overhead and shared, Memory swap in/swap out, Guest and Host memory usage
• Disk: Disk Capacity, Usage, Free, Disk read rate, read and write requests, command abort and issued, VM Disk read and write rate
• Network: Network data receive and transmit rate, Network packets received and transmitted, VM network data receive and transmit rate
• Amount of guaranteed resource

2. Application Performance - Performance of the applications hosted in the cloud. A Cloud consumer, whose application is hosted in the cloud, is interested in this kind of report. Application Response Time is the key metric in Application Performance management; it measures the time taken for the application to respond to user requests. In calculating Application Performance, too, we cannot go by the resources utilized by the application, because in a cloud applications move around and so the monitoring solution needs to track and map them. Since this is of more interest to the application owners, applications designed to be hosted in a cloud tend to have a monitoring solution built into the application itself.
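The need to track and map applications as they move around can be sketched as follows. This is a minimal illustration, not a description of any product: it assumes the monitoring solution receives a placement event whenever an application's set of VMs changes, and the PlacementTracker class, the app_cpu_usage helper, the VM ids and the readings are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


class PlacementTracker:
    """Keeps a time-ordered history of which VMs host each application."""

    def __init__(self):
        # app name -> list of (timestamp, frozenset of VM ids)
        self._history = defaultdict(list)

    def record(self, app, timestamp, vm_ids):
        # Assumes events for a given app arrive in timestamp order.
        self._history[app].append((timestamp, frozenset(vm_ids)))

    def vms_at(self, app, timestamp):
        """VMs hosting `app` at `timestamp` (empty set if unknown)."""
        events = self._history.get(app, [])
        times = [t for t, _ in events]
        i = bisect_right(times, timestamp)
        return events[i - 1][1] if i else frozenset()


def app_cpu_usage(tracker, app, timestamp, cpu_by_vm):
    """Average CPU usage over the VMs hosting `app` at `timestamp`."""
    vms = tracker.vms_at(app, timestamp)
    readings = [cpu_by_vm[vm] for vm in vms if vm in cpu_by_vm]
    return sum(readings) / len(readings) if readings else None


tracker = PlacementTracker()
tracker.record("billing", 0, ["vm-1"])             # initial placement
tracker.record("billing", 100, ["vm-2", "vm-3"])   # app moved and scaled out

cpu_by_vm = {"vm-1": 40.0, "vm-2": 60.0, "vm-3": 80.0}
print(sorted(tracker.vms_at("billing", 50)))
print(app_cpu_usage(tracker, "billing", 150, cpu_by_vm))
```

Keeping the placement history, rather than only the current placement, means utilization data collected earlier can still be attributed to the right application afterwards.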

V. Parameters of Interest for Cloud Service Provider
A Cloud Service Provider needs an exhaustive view of the health of the entire cloud to assess the situation. A lot of decision making, including the determination of SLAs, is driven by Cloud performance. Cloud service providers would be interested in the details below to assess the actual performance of the cloud.

A. Resource Utilization details
Just like in any other performance monitoring, the utilization parameters of the physical servers/infrastructure are an important factor in cloud monitoring, as these servers make up the cloud.

B. Infrastructure Response Time (IRT)
As already discussed, IRT gives a clear picture of the overall performance of the cloud, as it checks the time taken for each transaction to complete. IRT is very crucial as it has an impact on application performance and availability, which in turn affects the SLAs.

C. Virtualization metrics
Similar to the physical machines, we need to collect resource utilization data from the Virtual Machines. This provides a picture of how much of the VM is being utilized, and this data helps in analysing the resource utilization by applications and in deciding on the scaling requirements. Other important parameters related to Virtual Machines, like
• Number of VMs used by application
• Time taken to create a new VM
• Time taken to move an app from one VM to another
• Time taken to allocate additional resources to VM
are of importance as they also contribute to IRT and to the performance of the applications hosted in the cloud.

D. Transaction metrics
Transaction metrics can be considered a derivative of IRT. Metrics like success percentage of transactions, count of transactions etc. for an application would give a clearer picture of the performance of an application in the cloud at a particular instant.

An ideal monitoring solution for Cloud should be capable of providing all the above details.

VI. Reporting and Collecting Performance Data

Reporting
The following reports would help the service provider to understand the cloud usage and its performance.
• Multi-dimensional reports - Different levels of report for different users
  - Infrastructure reports like overall infrastructure usage, usage reports of specific resources/datacenters etc.
  - Application level reports like reports showing the infrastructure usage by each application and the performance of the resources in the cloud
• Busy-hour / peak Usage Report - Helps to get a clear view of the usage of the application for better planning of resources and SLAs
• What If analysis
• Trend Analysis

Collecting
• The monitoring application should collect performance parameters like CPU utilization, memory utilization etc. from physical as well as virtual hosts.
• Collect virtualization metrics from the underlying virtualization platform. Need to ensure that the solution works well with different virtualization platforms.
• The solution should be capable of collecting the response time per transaction and per application. This could be done either using an agent residing in the VM or an external monitoring agent. The agent needs to track and capture details of each transaction happening with the applications hosted in the VM.
• Derive transaction metrics/data from the collected data on response time.
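Deriving transaction metrics and a busy-hour figure from collected response-time data can be sketched as below. This is an illustrative sketch only: the Transaction record, its field names and the sample values are assumptions about what an agent might capture, not a prescribed format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Transaction:
    app: str
    started: datetime
    response_ms: float
    succeeded: bool


def transaction_metrics(transactions, app):
    """Count, success percentage and average response time for one app."""
    rows = [t for t in transactions if t.app == app]
    if not rows:
        return {"count": 0, "success_pct": None, "avg_response_ms": None}
    ok = sum(1 for t in rows if t.succeeded)
    return {
        "count": len(rows),
        "success_pct": 100.0 * ok / len(rows),
        "avg_response_ms": sum(t.response_ms for t in rows) / len(rows),
    }


def busy_hour(transactions, app):
    """Hour with the most transactions for `app`, as (hour_start, count)."""
    per_hour = Counter(
        t.started.replace(minute=0, second=0, microsecond=0)
        for t in transactions
        if t.app == app
    )
    return per_hour.most_common(1)[0] if per_hour else None


# Made-up records for a hypothetical application called "shop".
records = [
    Transaction("shop", datetime(2012, 1, 2, 9, 5), 100.0, True),
    Transaction("shop", datetime(2012, 1, 2, 9, 40), 200.0, True),
    Transaction("shop", datetime(2012, 1, 2, 9, 55), 300.0, False),
    Transaction("shop", datetime(2012, 1, 2, 10, 10), 200.0, True),
]

print(transaction_metrics(records, "shop"))
print(busy_hour(records, "shop"))
```

The same per-transaction records feed both the transaction metrics of section V and the busy-hour report, which is why capturing response time per transaction is listed as a collection requirement.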

Below is a proposed high-level model for a Cloud monitoring solution.

[Figure: proposed model. Layers, top to bottom: Reporting, Processing, Data Collection. A Monitoring Agent gathers IRT, Application Data, VM Resource Data and Virtualization Metrics from the Application Platform, the VMs and the Physical Infrastructure.]
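The layering of the proposed model can be sketched as three small functions, one per layer. This is only an illustration under stated assumptions: the agents are hypothetical callables returning (source, metric, value) tuples, and averaging is just one example of what the Processing layer might do.

```python
from statistics import mean


def collect(agents):
    """Data Collection layer: pull one batch of samples from every agent."""
    samples = []
    for agent in agents:
        samples.extend(agent())
    return samples


def process(samples):
    """Processing layer: group samples by (source, metric) and average them."""
    grouped = {}
    for source, metric, value in samples:
        grouped.setdefault((source, metric), []).append(value)
    return {key: mean(values) for key, values in grouped.items()}


def report(summary):
    """Reporting layer: render one line per (source, metric) pair."""
    return [
        f"{source} {metric}: {value:.1f}"
        for (source, metric), value in sorted(summary.items())
    ]


# Hypothetical monitoring agents with made-up readings.
def vm_agent():
    return [("vm-1", "cpu_pct", 40.0), ("vm-1", "cpu_pct", 60.0)]


def irt_agent():
    return [("billing", "irt_ms", 8.0)]


lines = report(process(collect([vm_agent, irt_agent])))
for line in lines:
    print(line)
```

Keeping the layers as separate functions mirrors the model: new agents can be added to the collection layer without changing processing or reporting.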

Conclusion
This paper is intended to provide a guideline, or a set of requirements, for developing a cloud monitoring solution for service providers. A detailed framework for the given high-level model is the next stage of study. A detailed view of the components involved in the framework, and their integration with interfacing components, can be considered as part of that framework. This can be extended further to host the monitoring solution on the cloud.

About Infosys
Many of the world's most successful organizations rely on Infosys to deliver measurable business value. Infosys provides business consulting, technology, engineering and outsourcing services to help clients in over 30 countries build tomorrow's enterprise.
For more information, contact askus@infosys.com

© 2012 Infosys Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document.