Table of Contents
The First Step: Measure
The Next Steps: Manage and Improve
Transaction Timing
Measuring Unpredictable Response Times
Measuring Unpredictable Occurrence Patterns
Data Usage Patterns
Measuring Transactions with Unpredictable Purposes
Measuring Unpredictable Access Paths
Storage Requirements
Measuring Unused Data
Data Usage Products for Performance Management in the Complex Data Environment
Embarcadero Technologies Security Solutions – Monitoring and Management Made Easy
About Embarcadero Technologies
We all want optimal database performance, and in the past we could rely on traditional
monitoring techniques to help us achieve that. But in today’s Complex Data Environment (CDE),
traditional database monitoring is no longer an adequate tool for managing database performance.
Over the last two decades, corporations have reshaped themselves to compete in a world driven
by the changing needs of the customer. Data is paramount for processing critical transactions and
making decisions that increase competitiveness. Today’s CDE includes many database types:
transactional, data warehouses, data marts, operational data stores, departmental, mixed-use, ad-
hoc, reporting, and so forth. These have become the vehicles to provide corporations the
information needed to survive and grow.
Whatever the type of application, databases drive the application and are the backbone of the IT
infrastructure. IT departments are charged with providing better and faster access to the right data
while controlling the bottom line. IT is responsible for providing constant availability to vast
amounts of data throughout multiple, heterogeneous databases. With data being a key asset to an
organization’s operations, IT must ensure the peak performance of the CDE.
How can IT management understand what is happening at the data level of their operations and
use that information to improve and ensure performance and end user satisfaction? The answer is
in three simple steps:
• Measure
• Manage
• Improve
It is difficult to manage when we don’t know what we don’t know! Since we can’t improve what
we can’t manage, and we can’t manage what we can’t measure, IT must first focus on measuring
the complex data environment. Then we can use the measurements to gain insight into steps that
can be taken to improve performance, control costs, increase efficiency and improve IT resource
planning.
The First Step: Measure
In simpler times, databases had limited purposes, such as Online Transaction Processing (OLTP),
in which technicians could predefine and control the types of transactions that would occur and
could easily optimize performance for those transactions. There was predictability about what
data would be accessed and exactly how. Back then, internal metrics were adequate for ensuring
performance was stable and meeting service levels.
But in today’s CDE, databases serve broader purposes. They provide a foundation for managing
historical information and serve as a basis for business intelligence. The world of the CDE is
typified by instability, and the nature of the work being done is inherently unpredictable. Usage
patterns may change daily! Even many OLTP systems have evolved to mixed-use, in which non-
OLTP activity can interfere with OLTP transactions.
To manage performance today, we need external metrics in addition to the internal ones. What do
we mean by internal and external?
Internal Metrics – These focus primarily on the internal workings of the operating system and
the database, and include metrics such as cache hit rates, CPU usage, disk I/O activity, and so
forth. These make up the core of traditional monitoring.
External Metrics – These focus outside the operating system and the database, on the interaction
of users with the data, such as who is using what data, when, how, how long, how often, and from
what applications. This allows IT to understand how the business is using the data in order to
proactively manage and provide the best performance possible.
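The distinction between the two kinds of metrics can be sketched in code. This is a minimal illustration in Python; the record names, fields and threshold are assumptions made for the example, not the schema of any particular monitoring product.

```python
from dataclasses import dataclass

@dataclass
class InternalMetrics:
    # Internal workings of the OS and database engine.
    cache_hit_rate: float      # fraction of reads served from cache
    cpu_usage: float           # fraction of CPU in use
    disk_reads_per_sec: int

@dataclass
class ExternalMetrics:
    # Interaction of a user with the data: who, what, when, how long.
    user: str
    application: str
    tables_accessed: list
    started_at: str            # ISO timestamp
    duration_sec: float

def needs_attention(internal: InternalMetrics,
                    external: ExternalMetrics,
                    slow_threshold_sec: float = 1800.0) -> bool:
    """Flag a session when internal pressure or a long-running external
    interaction appears -- the kind of cross-check that internal-only
    monitoring cannot make."""
    return internal.cpu_usage > 0.9 or external.duration_sec > slow_threshold_sec
```

Pairing the two views is the point: a 30-minute query may be normal for one application and a problem for another, and only the external record tells you which user and application are involved.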
The challenge for IT is to understand the underlying factors that drive performance problems
and how they can be managed and improved to ensure end user satisfaction. Then IT can make
informed decisions on how to improve the usability, predictability, manageability and cost of
their data environments.
The underlying factors all derive from the differences between yesterday’s simpler data
environment and the modern CDE. They are:
• Transaction timing
• Data usage patterns
• Storage requirements
Now let’s look at these underlying factors and at recommendations on how to manage and
improve them.
Transaction Timing
Many CDE transactions are fundamentally different from predefined transactions. Predefined
transactions are measured in the number of seconds required for execution. Typically they require
the access and manipulation of only a few units of data, so response time is expected to be
immediate. And they occur at predictable times of the day.
CDE transactions, on the other hand, have response times from sub-second up to many hours.
Thirty minutes is not an unusual response time for a CDE transaction. CDE transactions may
involve a few units of data or millions of units of data. And they may occur at unpredictable
times of the day.
Because of these very fundamental differences between unpredictable CDE transactions and
repetitive, static transactions, measuring and monitoring must be different as well.
For CDE transactions, response time must be measured in two ways: the time until the first data
is returned and the time until the last data is returned. In some cases the difference between
the two times will be minimal or even zero. But in other cases the difference will be
substantial. Both measurements need to be made because the end users' perception of response
time is shaped by both. Some users will care about how long it takes to get any data. Others
will care only how long it takes to get all the data. For those applications, longer response
times may be acceptable, as a response time of a few seconds may not be expected.
A long response time does not by itself indicate a problem. IT management must determine what
is acceptable to the end user.
By measuring response times by department, user, query type or tables accessed, IT can work
with users to determine whether the response times are a problem, and changes can be explored.
Perhaps new query tool capabilities are needed. Maybe queries could be run outside of peak
usage times, or at night when there is less usage. Perhaps there are network issues that can
be addressed.
In the end, IT must know 1) the response time for first data returned, 2) the response time for last
data returned, and 3) user expectations for the above two measurements so that management and
improvement can be addressed.
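Measurements 1) and 2) can be captured together from a single pass over the result set. A minimal sketch, assuming the query result arrives as any iterable of rows (for example, a database cursor); the function name is hypothetical:

```python
import time

def timed_fetch(row_iter):
    """Return (rows, first_sec, last_sec): elapsed seconds until the first
    row arrives and until the last row arrives. first_sec is None when the
    result set is empty."""
    start = time.monotonic()
    rows, first = [], None
    for row in row_iter:
        if first is None:
            # Time to first data: what users watching the screen perceive.
            first = time.monotonic() - start
        rows.append(row)
    # Time to last data: what users waiting for a full report perceive.
    last = time.monotonic() - start
    return rows, first, last
```

Logged per query alongside user and department, these two numbers give IT the facts needed for the conversation about what response times are acceptable.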
IT can address improving access and performance of data environments once they have a
comprehensive view of all daily activities. By understanding the activities of the day and the time
of day when they occurred, IT management can identify ways to manage the varying workload.
When we identify peak periods and light periods, we may be able to shift some queries to the
lighter times, smoothing out response times. Users may need additional training or new tools.
Perhaps a SQL tuning tool is needed, or a user relied on an inefficient or unsupported SQL
generator. IT management needs to be aware of how the choice of tools, or a lack of training,
can affect performance.
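Identifying peak and light periods is a straightforward aggregation over a usage log. A sketch, assuming a log of (user, hour-of-day) records per query; the log schema is an assumption for illustration:

```python
from collections import Counter

def hourly_load(query_log):
    """query_log: iterable of (user, hour_of_day) tuples from a usage log.
    Returns queries per hour plus the peak and lightest hours, so candidate
    queries can be shifted to quieter times."""
    per_hour = Counter(hour for _user, hour in query_log)
    # Peak hour: where shifting work away helps; light hour: where it can go.
    peak = max(per_hour, key=per_hour.get)
    light = min(per_hour, key=per_hour.get)
    return per_hour, peak, light
```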
In short, the workload of modern data environments is inherently unstable and unpredictable.
Yet this workload requires management just as much as a static, predictable operational
workload does.
Data Usage Patterns
Also, whereas workload sampling is adequate for monitoring predefined environments, the
unpredictability of CDE workloads requires that data usage tracking be continuous so that we
have a complete picture of who uses what data and how. This ensures that our data management
decisions are based on a true understanding of how the business uses the data.
Once usage patterns are understood, IT can take the following steps to improve access to the data
and, ultimately, the performance of the query:
• Indices can be added
• Redundant units of data can be intelligently deleted
• The physical location of data can be optimized
• The merging of tables with like patterns of access can be accomplished
• Summary tables can be built
• Standard queries can be designed
In short, once the pattern of access and usage of data are understood, there are many ways in
which the data can be manipulated in order to make data access most efficient. However, without
knowing about the usage of the data, there is no way to intelligently reorganize it. Data usage
tracking allows us to find patterns so we can ensure that the data design keeps up with the
changing needs of the business users.
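Two of the steps above, adding an index and building a summary table, can be sketched concretely. This example uses an in-memory SQLite database purely for illustration; the table and column names are assumptions, not from the source:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 5.0), ("west", 7.5)])

# Step: add an index on a column that usage tracking shows is filtered often.
conn.execute("CREATE INDEX idx_sales_region ON sales(region)")

# Step: build a summary table so common aggregate queries skip the detail rows.
conn.execute("""CREATE TABLE sales_summary AS
                SELECT region, SUM(amount) AS total
                FROM sales
                GROUP BY region""")

totals = dict(conn.execute("SELECT region, total FROM sales_summary"))
```

The usage data decides which columns to index and which aggregates to precompute; without it, such changes are guesswork.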
Storage Requirements
Another major issue in the management of CDE systems is managing and controlling the volume
of data. These systems demand large amounts of historical data, which in turn requires large
amounts of storage. In the past, operational systems contained small amounts of historical
data at most; history was generally kept to a minimum. CDE systems show exactly the opposite
trend: they may contain multiple years of historical information, and huge amounts of disk
space are required.
In addition, the CDE requires the storage of fine detail. Detail is required in order to let the users
look at data in more than one way. Since there is no way to accurately predict what data will be
needed, it is necessary to store data at the lowest level of detail.
Finally, summary data is stored as well as detail data. Summaries are stored to make some CDE
transactions run faster. This requires space over and above the detailed data.
The CDE often grows at an astonishing rate. When these environments first grow beyond their
hardware capacity, we generally expand disks, processors and database licenses. As they mature,
it is natural for the pattern of growth to continue. But at some point, as growth continues,
the question must be asked: is all of this data actually being used?
There will be a point at which significant data has accumulated, yet much of it is not being
used. When this point is reached, it makes sense, both economically and technologically, to
remove unused data and reuse the space it has been occupying rather than to purchase new
storage. In addition, removing dormant or redundant data improves user access times.
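Finding the dormant data is simple once continuous usage tracking records when each table was last touched. A sketch, assuming a feed of table-name to last-access-date mappings; the names and threshold are illustrative:

```python
from datetime import date

def dormant_tables(last_access, today, dormant_days=365):
    """last_access: mapping of table name -> date of most recent access,
    as recorded by usage tracking. Returns tables untouched for at least
    dormant_days -- candidates for archiving so their space can be reclaimed."""
    return sorted(t for t, d in last_access.items()
                  if (today - d).days >= dormant_days)
```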
We saw that traditional monitoring focuses on the internal functioning of an operating system or
database management system. By going beyond the internals and focusing on how users are
interacting with the data, IT management gains a comprehensive view of how business actually
uses the data environment.
By getting real-time and historical insight from external metrics – who is using what data, when,
how, how long, how often and from what applications – regular improvements can be made that
ensure peak performance of mission-critical applications across the enterprise. We can improve
business performance, control costs, increase efficiency and effectiveness for users, achieve
service level promises and enhance the user experience.
Embarcadero’s solutions provide 24x7 usage tracking of all user sessions, queries, table/column
accesses and dormant data. Extensive data usage reports provide information on a daily, weekly
and monthly basis for analysis. IT management now has the facts to manage and improve the data
environment.
Embarcadero, the Embarcadero Technologies logos and all other Embarcadero Technologies product or service
names are trademarks of Embarcadero Technologies, Inc. All other trademarks are property of their respective
owners.