
International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 6, June 2013

ISSN: 2231-2803 http://www.ijcttjournal.org Page 1959



Storage Management and Data Acquisition
Kavya Bhat
Computer Network Engineering, R.V. College of Engineering, Bangalore

Abstract— Enterprise storage is computer data storage designed for the large-scale, high-technology environments of modern enterprises, where stored data must be accessible in very little time. Compared with consumer storage, it has higher scalability, higher reliability and better fault tolerance; the criticality of data also varies between enterprises. The challenges currently faced in storing big data include cost, data loss, efficiency of data access, maintaining data consistency, and many more. To provide a better storage and data-management solution, the proposed solution is built on the Workflow Automation (WFA) platform. WFA is an active management tool that allocates storage directly on the storage server in response to client requests. It depends on a data source, OnCommand Unified Manager (OCUM), to monitor the storage components. OCUM acts as a passive reporting tool that polls all storage data at different timestamps. The monitored data includes parameters and attributes of storage components, such as corrupted-disk data, normal-disk data, or a shortage of storage space. WFA has cache-based intelligence and acquires only the contextually relevant data from OCUM. Based on this acquired cache data, WFA can provide better storage solutions and data management: it maintains the health of the storage and takes appropriate actions such as migrating data or replacing a corrupted disk. The acquired cache data can be queried through filters and finders to select a storage component as the resource on which data is stored; workflows then execute specific tasks of interest on the selected resource. Query results return the count of storage components and related information, verifying consistency and confirming that no data has been lost from any storage resource. Hence the proposed solution aids performance tuning of big-data storage solutions in terms of data access time, reliability, efficiency, data consistency and security. It reduces the cost of managing storage and enables adherence to best practices for storage processes.

Keywords— Workflow Automation (WFA), Workflow, WFA Server, Cache Acquisition, Data Source, OnCommand Unified Manager (OCUM).
I. INTRODUCTION
In today's world, enterprises must respond quickly to evolving business conditions with extremely flexible storage solutions. With automation and advanced virtualization technologies, business enterprises deliver competitive advantages that allow businesses to become more successful and contribute to a smarter planet. Enterprise storage is computer data storage designed for the large-scale, high-technology environments of modern enterprises. Compared with consumer storage, it has higher scalability, higher reliability, better fault tolerance, and a much higher initial price.
Enterprise storage is a centralized repository for
business information that provides common data
management and protection, as well as data sharing
functions, through connections to numerous (and possibly
dissimilar) computer systems. Developed for enterprises that deal with heavy workloads of business-critical information, enterprise storage systems should scale to workloads of up to 300 gigabytes without relying on excessive cabling or the creation of subsystems. Other important aspects of an enterprise storage system are unlimited connectivity and support for all the different platforms in operation.
Enterprise storage involves the use of a storage
area network (SAN), rather than a distributed storage
system, and includes benefits such as high availability and
disaster recovery, data sharing, and efficient, reliable
backup and restoration functions, as well as centralized
administration and remote support. Through the SAN,
multiple paths are created to all data, so that failure of a
server never results in a loss of access to critical
information.
Workflow Automation (WFA) is software developed for fast, customized deployment of storage for critical applications. It is a highly flexible automation product that can be used to automate storage management processes such as provisioning, setup, migration and decommissioning. Workflows are series of steps grouped together to perform an end-to-end storage process. Automating such repetitive workflows reduces operational costs and saves money for the business. The tool gives the storage architect the flexibility to customize workflows to suit customer-defined processes, both at deployment time and on an ongoing basis, and hence full control over defining the storage-automation process at a fine level of granularity. It also provides a simple, easy-to-use portal through which storage administrators and operators can execute predefined workflows with little training [1].
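A workflow, as described above, can be thought of as an ordered list of steps executed end to end. The following sketch illustrates that idea only; the step names and the shared-context convention are illustrative assumptions, not the actual WFA implementation:

```python
# Illustrative sketch: a workflow as an ordered series of steps.
# Step names and the context dict are hypothetical, not WFA's real API.

def provision_volume(ctx):
    # Step 1: provision a volume for the requesting client.
    ctx["volume"] = f"vol_{ctx['client']}"
    return ctx

def export_volume(ctx):
    # Step 2: make the provisioned volume available to the client.
    ctx["exported"] = True
    return ctx

def run_workflow(steps, ctx):
    """Execute each step in order, passing a shared context dict along."""
    for step in steps:
        ctx = step(ctx)
    return ctx

result = run_workflow([provision_volume, export_volume], {"client": "acme"})
print(result)
```

Automating the sequence, rather than running each step by hand, is what removes the repetitive manual effort the text refers to.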
A cache is used to increase storage performance, i.e., to make read and write operations faster and more reliable. The cache plays a very important role, since it stores information about the basic building blocks of WFA, and further data access depends entirely on it. Performance tuning can therefore be driven from the cache data to improve storage data access performance. Cache acquisition in WFA provides proper, customized and reliable data access with less waiting time.
Various clients need storage solutions for different services, and WFA is one application that provides an end-to-end solution for them. Access to WFA is given through a web service that executes different workflows. The WFA cache stores different objects along with their parameter values. OnCommand management software improves storage and service efficiency through functions that help control, automate and analyze the shared storage infrastructure. A workflow is executed using data acquired from OnCommand and ends with the necessary changes to the storage system; at the same time, a trigger updates the cache about changes to the storage-system objects. Hence, whenever a client needs access to the same data again, the cache helps reduce the data access time.
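The way the cache cuts access time on repeated requests can be sketched as a minimal read-through cache. This is an illustrative model only: the `poll_ocum` call simulates a slow data-source poll and is not OCUM's real interface.

```python
# Minimal read-through cache sketch: the first lookup goes to the (slow)
# data source; repeated lookups are served from memory.

class ReadThroughCache:
    def __init__(self, fetch):
        self._fetch = fetch      # slow data-source call (e.g., an OCUM poll)
        self._store = {}         # in-memory cache table

    def get(self, key):
        if key not in self._store:          # cache miss: poll the data source
            self._store[key] = self._fetch(key)
        return self._store[key]             # cache hit: no round trip

calls = []
def poll_ocum(obj_id):
    # Hypothetical stand-in for polling OCUM; records each expensive call.
    calls.append(obj_id)
    return {"id": obj_id, "state": "online"}

cache = ReadThroughCache(poll_ocum)
cache.get("vol1")
cache.get("vol1")        # second access is served from the cache
print(len(calls))        # the data source was polled only once
```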
The purpose and scope of the application is to acquire data from data sources, which in turn poll storage-management data from the storage system. Maintaining the consistency of data across the storage system and maintaining the health of the storage system are very complicated tasks. Such error-prone, expensive and complex tasks are performed here without any manual effort, which in practice saves a great deal of manual work and the cost involved in such complex processes. The application also provides an extensible platform for flexible automation that utilizes the power of smart storage. Since the performance of WFA depends directly on the cache, cache acquisition is needed [2].
II. RELATED WORK
A literature survey was carried out to obtain background information on the various issues to be considered. Many organizations perform most of their storage management tasks manually. This time-consuming process impedes staff use of the applications, potentially delaying important project tasks. What's more, even well-trained IT personnel may make errors in these manual processes, which can lead to provisioning delays, unavailability of data and reduced productivity. The manual processes themselves may not incorporate best practices or take advantage of advanced features. As a result, storage resources may not be optimized, increasing the cost of storage. At the same time, cloud-based processes require high levels of automation, not manual processes [3].
The alternatives, custom software and orchestration solutions, are respectively difficult and expensive to operate, maintain and scale, and lacking a comprehensive storage component. Different customers seek services in terms of storage and its management; after obtaining storage services, customers access the stored data by executing sets of operations called workflows. Each customer looks to automate these processes using their own unique settings, naming standards, options and best practices. Many storage processes require different operating-system configurations (VMs, tickets, storage switches, etc.) depending on customer requirements. Storage teams that provide storage services often lack the programming skill sets specific to customer standards, and vendor-based products often can't cater to every use case.
A study of key factors and issues, such as the different workflows based on storage systems and the applications depending on them, led to the development of cache acquisition to increase their performance. Delays and errors in accessing data for various clients created the need for more efficient ways of accessing stored data; acquiring the data in a cache, and automating that acquisition so that uncorrupted data can be served at a faster rate, was therefore required. The complexity of the storage system also makes it difficult for inexpert end users or applications to process data, so the process needs to be automated to reduce that burden. The study of all such issues motivated the development of the proposed application [4].
Based on the study of the survey data and papers, the proposed solution is needed. The main reasons for developing it are:
- Custom software development: Many organizations turn to in-house software engineers or partners to write custom automation code. These approaches often prove expensive to operate and slow to adapt to the changing needs of storage consumers. The total cost of ownership of custom software makes it an ineffective solution in most cases [2].
- Reliance on data-center orchestration solutions: Although orchestration solutions are beneficial for end-to-end automation, they lack a comprehensive storage component to meet customer process needs. As seen in other domains such as monitoring, an expert storage solution is required to address storage-automation requirements. The do-nothing approach is also expensive [5]: its costs include staff time, the inability to deploy cloud and self-service environments, and mistakes that stem from manual operation.
III. SYSTEM OVERVIEW
This section gives an overview of the proposed solution. Workflow Automation (WFA) is a highly flexible automation product that can be used to automate storage management processes such as provisioning, setup, migration and decommissioning. To provide storage solutions to customers, certain sets of operations, called workflows, are carried out. Each workflow is a series of steps that completes an end-to-end process.

Performance tuning of high-end data storage services in terms of data access time, reliability, efficiency and security is achieved by automatically acquiring the related storage data in a cache; the consistency and persistence of the acquired data are then verified.
The WFA cache involves creating and editing cache schemes, cache tables, cache queries and data-source types. It has reservation tables that store information about newly created resources and cache tables that store information about their updates. Collecting and storing all such information about the various resources and their transaction updates in the cache is called cache acquisition. Accessing the WFA cache database by developing filters and testing SQL queries directly on the database allows its contents to be verified quickly. WFA has cache-based intelligence through which storage solutions and related decisions are taken in circumstances such as disaster recovery or data loss. WFA depends on data acquired from the data source, which in turn depends on data polled from the clusters.
Figure 1 presents a high-level overview of the proposed solution. WFA cache acquisition must acquire data from the OnCommand storage-management software, which in turn polls data from a storage system such as ONTAP. Doing this with manual effort raises several issues, so the acquisition is automated, taking the following into account:
- Cache acquisition in different workflow-based storage systems and applications should increase their performance.
- Delays and errors in accessing data for various clients create the need for efficient access to stored data, which requires acquiring the data in a cache and automating that acquisition.
- The complexity of the storage system makes it difficult for inexpert end users or applications to process data, so the process must be automated to reduce the burden.
- Network connectivity to the OnCommand management software should be adequate.

Figure 1: System Overview
IV. SYSTEM DESIGN
Design is one of the most important phases of software development. It is a creative process in which a system organization is established that satisfies the functional and non-functional system requirements. The main intent of design is to identify the various independent and dependent modules, called subsystems: large systems are decomposed into subsystems that each provide a related set of services, and interactions between the subsystems are established to achieve the desired requirements. The initial process of identifying these subsystems and establishing a framework for subsystem control and communication is called architectural design, and the output of this design process is a description of the software architecture. The architectural design process is concerned with establishing a basic structural framework for a system; it involves identifying the major components of the system and the communications between these components [6].
The system architecture shows the various subsystems along with their interactions. Figure 2 shows the system architecture: larger modules, called systems, are divided into smaller modules, called subsystems. This section therefore covers the design considerations and constraints taken into account during design, which ultimately lead to the system architecture [3].
System Architecture

Figure 2: System Architecture
V. IMPLEMENTATION
The proposed application of WFA cache acquisition can be modularized into the following modules:
- Web Service Development: A web service is developed to invoke acquisition from different data sources through the WFA portal. WFA supports the use and monitoring of its workflows via web services, using the two common bindings, REST and SOAP. When a request arrives from an orchestration system, the REST binding forms an acquisition request through WFA, and the scheduler schedules its execution.
- WFA Cache Acquisition: When an acquisition request arrives from the WFA portal, the scheduler schedules the acquisition. In WFA, the data source from which data is to be acquired is added with the proper input parameters; data sources poll data from different ONTAP storage systems. Once the data source has been added, data is acquired from it into the WFA cache. An end-to-end (E2E) infrastructure is developed to support the cache-acquisition web service. Cache acquisition is verified by writing direct queries, in the form of filters, against the cache database. Count and attribute filters are developed to verify the completion of the acquisition, the consistency of the acquired data, and performance. When developing filters for querying different storage resources such as volumes and aggregates, the relevant object templates are followed [7].
- Workflow execution: Acquisition of data results in reporting the status of storage management with only the relevant parameters. Based on these parameters, WFA workflows, which are sets of commands, can be executed to provide better solutions. Workflow execution acts directly on the Data ONTAP storage systems.
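The web-service invocation described in the first module above can be sketched as follows. This is a hedged illustration: the endpoint path, host name and JSON payload are hypothetical assumptions, and the real request shape should be taken from WFA's actual web-service documentation.

```python
# Sketch of how an orchestration client might form a REST request to
# trigger cache acquisition. URL path and payload are assumptions, not
# WFA's documented API. The request is built but not sent.
import json
import urllib.request

def build_acquisition_request(host, data_source):
    # Encode the acquisition request body as JSON (payload is hypothetical).
    body = json.dumps({"dataSource": data_source}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/rest/datasources/acquire",  # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_acquisition_request("wfa.example.com", "ocum-01")
print(req.get_method(), req.full_url)
```

In a real deployment the request would be submitted with `urllib.request.urlopen` (or an HTTP client of choice) and the scheduler would pick up the resulting acquisition job.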
The processes involved throughout the project can be divided as follows:
- Product Development: This module includes the development of the web service for invoking cache acquisition. WFA supports the use and monitoring of its workflows via web services, using the two common bindings, REST and SOAP. The next consideration is the development of filters and finders for resource selection: different filters and finders are developed to select different resources based on different criteria. Developing filters involves writing SQL queries against the WFA cache, and custom filters are added to test all cacheable objects. After the cache acquires data about the different resources, storage, volumes, aggregates, resource groups and datasets, customized filters are defined or developed to verify them.
- Automation Work: This includes automating the cache acquisition for any one version of OnCommand. Acquiring data from different data sources or controllers is automated, reducing the manual effort involved, and acquisition is performed across the different supported OnCommand (OC) versions. An E2E infrastructure is developed to support the cache-acquisition web service: web services are developed to serve client automation requests, and the E2E infrastructure supports their invocation.
- Validation: This includes executing filters for validation. The acquired data is validated by selecting different filters that verify it: if the values of different attributes are found to have changed, the validation fails. The object count, i.e., the number of objects acquired after automated cache acquisition, should match the count obtained by manual acquisition from the same data source. Validating consistency means that polling the cache database at regular intervals should return the same data without any modification; filter queries selecting any data over the cache acquisition must succeed across all supported OC versions. The validation should also ensure that no warnings appear in the logs.
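The attribute and count filters described above can be sketched against an in-memory SQLite table standing in for the WFA cache database. The table and column names here are illustrative assumptions, not the real WFA cache schema.

```python
# Sketch of attribute and count filters over a stand-in "cache DB".
# Schema (table "volume", columns name/aggregate/state) is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volume (name TEXT, aggregate TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO volume VALUES (?, ?, ?)",
    [("vol1", "aggr1", "online"),
     ("vol2", "aggr1", "offline"),
     ("vol3", "aggr2", "online")],
)

# Attribute filter: select resources matching a criterion.
online = conn.execute(
    "SELECT name FROM volume WHERE state = 'online'"
).fetchall()

# Count filter: verify acquisition completeness against the expected total
# reported by the data source (here hard-coded for illustration).
(acquired_count,) = conn.execute("SELECT COUNT(*) FROM volume").fetchone()
expected_count = 3
assert acquired_count == expected_count, "possible data loss in acquisition"
print(len(online), acquired_count)
```

Running the same count query at regular intervals and comparing the results is one simple way to check the consistency property described above.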
The same process is represented in the data-flow diagram shown in figure 3.


Figure 3: DFD for WFA Cache Acquisition
In this data-flow diagram, WFA's management of storage is subdivided into several processes: acquiring the WFA cache from the data sources, validating the acquired data directly against the cache, generating a validation report that indicates whether the acquisition succeeded, and, after acquisition, executing different workflows from WFA directly on the ONTAP storage systems [8].
VI. CONCLUSION
The proposed solution helps in performance tuning of high-end data storage services in terms of data access time, reliability, efficiency and security. Some of its key benefits are:
- Automates a range of storage processes, including provisioning, setup, migration and decommissioning.
- Reduces the cost of managing storage.
- Enables fast and reliable turnkey storage deployments for key applications.
- Helps enable adherence to best practices for storage processes.
- Integrates with key internal IT systems.
- Provides process customization to meet organizational needs.
ACKNOWLEDGEMENTS
I would like to thank Dr. S. R. Swamy from R.V.
College of Engineering, Bangalore for feedback on the
project.

REFERENCES

[1] E. Vairavanathan, S. Al-Kiswany, L. B. Costa, Z. Zhang, D. S. Katz, M. Wilde, M. Ripeanu, "A Workflow-Aware Storage System: An Opportunity Study", The University of British Columbia / University of Chicago, 2012.
[2] X. Du, J. Song, Y. Zhao, "The Workflow Management System of the Acquisition based on Capacity and Simulation", The University of British Columbia / University of Chicago / The Academy of Equipment Command and Technology, Beijing, 2010.
[3] J. Wozniak and M. Wilde, "Case Studies in Storage Access by Loosely Coupled Petascale Applications", Petascale Data Storage Workshop, 2009.
[4] L. B. Costa and M. Ripeanu, "Towards Automating the Configuration of a Distributed Storage System", ACM/IEEE International Conference on Grid Computing (Grid), 2010.
[5] Z. Zhang, D. Katz, M. Ripeanu, M. Wilde, et al., "AME: An Anyscale Many-Task Computing Engine", Workshop on Workflows in Support of Large-Scale Science, 2011.
[6] S. Bharathi, A. Chervenak, E. Deelman, G. Mehta, et al., "Characterization of Scientific Workflows", Workshop on Workflows in Support of Large-Scale Science, 2008.
[7] S. Al-Kiswany, A. Gharaibeh, and M. Ripeanu, "The Case for Versatile Storage System", Workshop on Hot Topics in Storage and File Systems (HotStorage), 2009.
[8] Jilani, Nadeem A., Tai-hoon Kim, Eun-suk Cho, "Formal Representations of the Data Flow Diagram: A Survey", Advanced Software Engineering and Its Applications, 2008.
