
Understanding Data Growth and Best Methodologies for SQL Optimization with Toad for Oracle Xpert

A White Paper February 2005

Written by Bryan Huddleston, Product Marketing Manager, Toad

Contents

Introduction
Data Growth
    The Growth of Data in Numbers
    Why Do We Have All of This Data?
    Will It Get Any Better?
        Radio Frequency Identification (RFID)
        Voice Over IP - VOIP
        Digital Audio/Video
Getting Data to the User
    The SQL Execution Process: What Happens When I Hit Execute?
    SQL Optimization Challenges
        Complex/Difficult Task
        Time Consuming
    Methodology
        Go-To Person
        Team Approach
    Quest Methodology: Identify, Rewrite and Test
    Implementation
        Manual
        Productivity Tools
    Quest Tools
Conclusion
About the Author
About Quest Software

Introduction
If your shop has not experienced data growth in the last two to 10 years, you are in the minority. The explosion of the information age has increased the amount of data we send, receive, process and use. Whether it's customer or employee information, portals, dashboards or even personal data like MP3s and downloadable videos, our IT systems and personal computers are becoming just like my Dad: packrat junkies for data.

This paper is the result of reading industry articles and analyst papers and conducting numerous customer interviews to draw some conclusions about what the future holds for data growth, what it means for application owners and, ultimately, what it means for the person responsible for application performance. It informs application owners about the business considerations that result from data growth, and explains how these owners can prepare their applications by employing a methodical approach to optimizing code within their organization.

This paper focuses on what developers have control over, what will arguably create the biggest gains in productivity and what it takes for management to adopt a "be prepared" approach. Ironically, this is usually the last approach to be implemented for data growth and SQL statement optimization. Lastly, this paper suggests a successful methodology for optimizing SQL code and explains how Toad for Oracle Xpert can make SQL optimization seamless and less time-consuming by identifying potential performance-limiting factors.

The reader should walk away with an understanding of the following areas:

- The breadth and growth of data
- What factors contribute to that growth
- How to apply that knowledge within their company
- How to incorporate a methodical approach for code optimization
- Ways to overcome potential performance issues when applications query data

Data Growth
This section discusses the impact of data growth in the industry, provides some examples of companies using data for a strategic advantage, explains why we are inundated by data growth and identifies future trends.

The Growth of Data in Numbers


So what type of data growth are we talking about here? Is it doubling? Tripling? There are numerous analyst papers on what to expect from data growth in the future.

- META Group: Net annual storage growth will average 20-25 percent for enterprise (monolithic), 50-55 percent for midrange (modular), and 80-85 percent for low-cost capacity-based (SATA/ATA) storage, yielding aggregate storage growth of about 45 percent per year.1
- According to Winter Corporation's annual 2003 storage survey, France Telecom's database (the top winner in database size at 29.2TB) was three times as big as the winners for 2001.2
- In addition to data storage, transaction numbers are increasing. The U.S. Bureau of Customs and Border Protection almost doubled its workload from its 2001 level, to 51,450 transactions per second (tps).
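To get a feel for what an aggregate rate of about 45 percent per year means, the growth can be compounded over several years. The short Python sketch below is only an illustration: the starting capacity of 10 TB is an invented figure, and the assumption that the 45 percent rate holds steady year after year is mine, not META Group's.

```python
import math


def projected_capacity(start_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding annual_growth (0.45 means 45 percent) for `years` years."""
    return start_tb * (1.0 + annual_growth) ** years


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years of steady compound growth needed to multiply capacity by `factor`."""
    return math.log(factor) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    start = 10.0  # hypothetical starting capacity in TB
    for year in range(1, 6):
        print(f"year {year}: {projected_capacity(start, 0.45, year):.1f} TB")
    print(f"doubling time: {years_to_multiply(2, 0.45):.2f} years")
```

Under these assumptions, storage doubles in a little under two years, which is why a shop that plans capacity only once per budget cycle can be caught out.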

The best example is Wal-Mart. Wal-Mart has 460 terabytes of data stored on Teradata mainframes at its Bentonville headquarters. To put that in perspective, the Internet has less than half that much data, according to experts.3

Now you may be saying to yourself, "I'm a small-to-medium business (SMB); there's no way my data growth can come close to these examples." Wrong. At 2004 GigaWorld, Bob Zimmerman, Forrester analyst, cited an example of an SMB that acquired a company that subscribed to a service that would deliver a terabyte of new data every four to six weeks. This single service would increase data by 250 percent in 12 months. The article goes on to say that administrators were unprepared and ill-equipped to accept the first block of data.

Another excellent example is the U.S. Air Force. The USAF is designing a flight data collection system to prototype in 2005. Each flight of a new aircraft will generate more than a terabyte of data. By the time this gets into production in 2006, data volumes are projected to approach 20 petabytes per year, excluding replication.

Zimmerman explains that every IT shop should be prepared for a 200 percent unanticipated increase in data volume at any given time. This preparation must consider the impact of such extreme changes on operations, data security and applications.4

Why Do We Have All of This Data?


A recent research note from Gartner claims that CIOs who have business acumen and are technology-savvy are the most valuable.5 I would argue that this is not limited to C-level management roles in IT, but applies to all levels of IT management. All IT managers are providing a service to better the organization.6 Understanding business needs, and the accompanying data flow, ensures that your company's IT department is valued and respected instead of exploited.

This section discusses the business reasons for data growth. Understanding your business and upcoming organizational needs enables you and your staff to take proactive measures for dealing with data.

So, companies have data coming in from multiple sources. Why do we have it? Where is it coming from? What do we expect in the future? The reasons are five-fold:

- Due to business needs, improved instrumentation that captures digital, rather than analog, data has driven the growth of scientific, engineering and production data. Take the USAF example above. USAF designers use data to make performance changes to their planes and give pilots a competitive advantage in combat. Without aircraft instrumentation collecting these vast amounts of data, competitive advantage is far more difficult to obtain.
- Automated enterprise business processes such as enterprise resource planning (ERP) and customer relationship management have brought systems that capture employee, customer and financial data.7 More recently, regulatory mandates like Sarbanes-Oxley and HIPAA have moved into this space. In the future, it is quite possible that tracking/change management data could be greater than the actual data itself.
- Individual productivity software such as e-mail and word processing is creating as much data as some ERP applications.7 I will use a personal example to illustrate this. My personal work mailbox is greater than 140MB. My archived mailbox is over 300MB. That is almost half a gigabyte of data in two years. Multiply my situation by 2,000 employees and you have a lot of data sitting in Microsoft Exchange. What's more interesting is how I use the data in my mailbox. I regularly search that data to reference what I can't remember. This is becoming more and more important as the term Enterprise Search becomes ubiquitous. Everyone in the organization will have the ability to make decisions based on the data I store and create, improving individual productivity.
- Analytics are used to improve a company's business processes and outcomes.7 When Hurricane Frances was on its way to Florida, Wal-Mart executives were able to use their data, combined with predictive technology, to mine trillions of bytes of shopper history and determine which stores would need certain products. Wal-Mart was able to stock shelves prior to the storm and, according to the company, most of the products that were stocked for the storm sold out quickly.3
- The price/capacity of storage has been and will continue to be the driver for this trend. As if the 250GB drives for $120 (before rebate) in the Sunday Best Buy ad are not enough validation, META Group indicates that like-for-like price/capacity of storage will improve 35 percent per year.1

These are the business reasons for the growth of data. In a management capacity, tying this information back to the business allows companies not merely to use IT, but to exploit it to their advantage. In these cases, data and the ability to manipulate it are not only powerful, but also a competitive advantage in regulatory compliance, improved productivity and profit.

Will It Get Any Better?


We have established that data is growing because of business needs and trends in the industry. We also know the data will continue to grow in the future from new and impending sources. But where is this data going to come from? There are several sources: RFID, VOIP and digital audio/video.

Radio Frequency Identification (RFID)


Wal-Mart is my favorite example in this paper because they are not only using data today for multiple reasons with numerous benefits, they are also looking to the future. RFID, combined with data tracking, could transform Wal-Mart from a retail business into a logistics business.11 Every level of Wal-Mart's organization (including suppliers) is told within hours, depending on the event, that they need to do something. With RFID tags on every product, Wal-Mart can track a specific product throughout the supply chain. If they can track the product from beginning to end, "Wal-Mart will never take those products onto its books," said Bruce Hudson, a retail analyst at META Group. "If you think of the impact of shedding $50 billion of inventory, that is huge."3 For profitability, it makes sense for Wal-Mart to use RFID and data tracking. Consider the amount of data that will be stored in Wal-Mart's database: 460 terabytes will quickly become a mere drop in the bucket as each item is tracked throughout the supply chain.

Voice Over IP - VOIP


VOIP is a great technology because every phone call and message can be digitally stored. This is not a new concept, as telemarketing organizations have been doing this for years. Couple this with the ability to intelligently search for specific words within the VOIP data and you have a productivity tool.10 At Oracle Open World 2004, such technology was demonstrated on the main stage for Oracle 10g. When someone did a corporate search for a specific phrase, a phone conversation from a couple of months earlier was identified in the search results. This is a boon to people like me, who have a tough time remembering what they did yesterday, much less a phone conversation. It improves my productivity to be able to go back and understand what I was discussing with my team and how that strategically fits in with both current and future planning.

Digital Audio/Video
Everyone I know either has an iPod or iPod envy. It's the data, the digital video and music that is stored everywhere, that presents the future challenge. Drawing again on personal experience, I purchased an 80-hour, 160GB TiVo over Christmas. My family had missed the last two hours and 40 minutes of the Survivor finale because of a VCR setup error. Upon setting the TiVo up, I read about TiVoToGo, which lets you take shows and put them on a laptop to take with you. This is great for me because I travel, so I can put shows on my work laptop and watch them at my leisure.

Just as retail will have to cope with RFID, media will have to cope with digital formats of video and music. I'd love to be able to link readers to the two-hour special on the History channel I watched about the rise of Wal-Mart for this paper. Better yet, Google has introduced a new video search utility.8 The real question is how this technology on the consumer side will be used in the corporate world. What digital media will be stored on corporate servers, and how will users have effective access to this stored data?

Getting Data to the User


Considering current growth trends, we will have even more data to present to customers and users. This data will impact business decisions, and rapid data collection will lead to rapid decision-making. Today, a great deal of data is stored in databases, so optimizing the interface and the SQL that interacts with the database is the best chance users have to proactively improve performance.

An explosion of data growth has a direct impact on application performance. Much like filling a bucket with water, how do you know when it is going to overflow and affect the area around it? The growth of data will have the same effect on the performance of current and future applications. There are several areas that will impact database performance, and they should be examined as data grows and performance becomes an issue. They are:

- Hardware Upgrades
- Operating System Configuration
- Database Configuration
- Database Design
- Indexes
- SQL Statements

The following section describes the SQL execution process, what challenges face developers, development teams, QA teams and ultimately the people responsible for the performance of the application. It also suggests a methodology to overcome these challenges and demonstrates how Toad for Oracle Xpert from Quest can optimize SQL for improved application performance.

The SQL Execution Process: What Happens When I Hit Execute?


The database process is complex. I will use a non-technical example to illustrate. Russ the handyman is running an errand to gather building materials for a home project. Time is of the essence, as Russ has dinner plans later in the evening. Russ will have four stops before returning home: the tool shop, the lumberyard, the home building center and the designer's office. Russ has a new 2005 Dodge Ram with a Hemi. The shortest time between stops will ensure that Russ makes his dinner plans.

There are many different ways Russ could get directions. He could look them up on a map, ask his friend Denise or input the target stops in MapQuest. He chooses MapQuest because it takes less time. MapQuest identifies that there are over 1,000 different ways Russ could potentially travel in order to complete his errands. He chooses what MapQuest determines to be the shortest, grabs his directions from the printer and heads out. However, once Russ gets on the freeway toward his first stop, he hits road construction and, even with his powerful engine, is slowed down. After his first stop, he discovers that another street has a detour and he will have to take a different route. It is not until he completes all his errands and returns home that he sees a different path would have provided a much faster errand completion time.

SQL statement execution is much like Russ and his plan for accomplishing multiple errands. A SQL statement with data from four tables in the database has to make four stops to pick up the data and deliver the results to the person querying the database. Process efficiency (that is, the shortest distance that the SQL statement takes to retrieve its data from each of the four tables) determines the amount of time it takes to provide results. Only once this SQL statement is identified can all paths through the Oracle database be calculated and tested and the shortest route determined. This process is called SQL tuning or SQL optimization.
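Russ's errand can be written down as a small search problem. The Python sketch below is only an analogy for the idea of comparing candidate paths: the travel times are invented, only the order of the four stops is varied, and a real database optimizer relies on statistics and a cost model rather than simply trying every ordering.

```python
from itertools import permutations

# The four stops from the example; "home" is the start and end of every route.
STOPS = ["tool shop", "lumberyard", "home building center", "designer's office"]

# Hypothetical travel times in minutes between each pair of locations.
MINUTES = {
    frozenset(["home", "tool shop"]): 15,
    frozenset(["home", "lumberyard"]): 20,
    frozenset(["home", "home building center"]): 25,
    frozenset(["home", "designer's office"]): 10,
    frozenset(["tool shop", "lumberyard"]): 10,
    frozenset(["tool shop", "home building center"]): 20,
    frozenset(["tool shop", "designer's office"]): 30,
    frozenset(["lumberyard", "home building center"]): 5,
    frozenset(["lumberyard", "designer's office"]): 25,
    frozenset(["home building center", "designer's office"]): 15,
}


def route_minutes(order):
    """Total minutes for leaving home, visiting the stops in `order`, and returning."""
    path = ["home", *order, "home"]
    return sum(MINUTES[frozenset((a, b))] for a, b in zip(path, path[1:]))


def all_routes():
    """Every possible visiting order: 4 stops give 4! = 24 candidates."""
    return list(permutations(STOPS))


def best_route():
    """The visiting order with the smallest total travel time."""
    return min(all_routes(), key=route_minutes)


if __name__ == "__main__":
    best = best_route()
    print(len(all_routes()), "candidate routes")
    print(" -> ".join(best), f"({route_minutes(best)} minutes)")
```

Even this toy version shows why the search grows quickly: each additional stop (or table) multiplies the number of orderings to consider.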

SQL Optimization Challenges


Optimizing SQL has two main challenges. It is a complex skill to master and it is a time-consuming task. At the simplest level, the person optimizing SQL must:

- Understand how the database processes SQL statements
- Know the database structure
- Know the data in the database
- Try different ways to write a SQL statement

Complex/Difficult Task
Undoubtedly, this is a difficult task. There are an untold number of papers, books, articles and experts on the subject of the mechanical nature of tuning SQL. As an example, ask members of your team to explain how the database processes SQL statements and how they use that knowledge to optimize SQL, and see how many different answers you get.

Time Consuming
Below is an example of a very simple query of the kind that developers and DBAs create every day. This simple SQL statement can be rewritten in a total of 11,901 different ways.

    select emp_name, dpt_name, grd_desc
      from employee, department DEPARTMENT1, grade
     where emp_grade = grd_id
       and emp_dept = dpt_id
       and EXISTS (SELECT 'X'
                     from department DEPARTMENT2
                    WHERE dpt_avg_salary in (select min(dpt_avg_salary)
                                               from department DEPARTMENT3)
                      AND dpt_id = EMPLOYEE.emp_dept)

The above SQL statement has 36 words in it. This sentence has 15 words and took me 22 seconds to type and spell check. Doing this task nonstop 11,901 times would take 261,822 seconds, 4,363.7 minutes, 72.7 hours or 3.03 days. Is a manual effort to rewrite and test even the simplest of SQL statements an ideal use of developer/DBA time? If a simple statement takes this much time, think of the time required for a complex SQL statement.
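The arithmetic behind those figures can be reproduced in a few lines. The Python sketch below simply multiplies the number of rewrites by the 22 seconds quoted above; the function name is my own, and it deliberately ignores the time needed to run and compare each rewrite, so it is a lower bound.

```python
def manual_rewrite_effort(rewrites: int, seconds_each: float) -> dict:
    """Total typing time for `rewrites` statements at `seconds_each` seconds apiece."""
    total_seconds = rewrites * seconds_each
    hours = total_seconds / 3600
    return {
        "seconds": total_seconds,
        "minutes": total_seconds / 60,
        "hours": hours,
        "days": hours / 24,  # working around the clock, with no breaks
    }


effort = manual_rewrite_effort(11_901, 22)

if __name__ == "__main__":
    print(f"{effort['seconds']:,} seconds")
    print(f"{effort['minutes']:,.1f} minutes")
    print(f"{effort['hours']:.1f} hours")
    print(f"{effort['days']:.2f} days")
```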

Methodology
Based on numerous customer interviews I have conducted over the past five years, most organizations have some sort of code review process in place for optimizing SQL. This process could be using a single individual with SQL optimization expertise (Go-To person) or using a QA or review team (Team Approach) to conduct SQL optimization prior to production. However, there are flaws associated with both processes.

Go-To Person
Some organizations have a Go-To person for optimizing SQL. This single individual provides the expertise to optimize SQL; however, it is only a subset of what they do. Typically, this person is a senior or principal developer or a DBA who has, by default, become the Go-To person. Their time is spread across the project(s) they are currently working on and ad hoc requests from other developers. They are usually only contacted when a statement is noticeably slow. With this method, not all SQL statements will be optimized and, depending on the Go-To person's available time, even crucial statements may be overlooked.

Team Approach
Another method is to have a QA or review team optimize SQL. During QA or load testing, the QA team may discover performance issues in the database. When this occurs, the QA team will send the offending code back to the developers for optimization. It is more efficient to optimize the code at the outset than to have QA identify an issue, send it to development for correction and retesting, then retest again in QA.

This leads us to the recommended Quest methodology for optimizing SQL code, implementation strategies and an overview of Toad for Oracle Xpert.

Quest Methodology: Identify, Rewrite and Test


Based on our expertise in database development, Quest Software recommends that a more stringent methodology be put in place for coding applications with embedded SQL. Quest recommends an Identify, Rewrite and Test methodology at each phase of the life cycle.

1. Identify Problematic SQL


1. Locate SQL Statements
   a. Copy from SQL-Editor
   b. Sort through source code
   c. Extract from DB Objects
2. Review EXECUTION PLAN
   a. Offensive operations:

2. Rewrite SQL
Transform the SQL to obtain different versions

3. Test SQL Alternatives

- Get the original SQL's run time
- Always test run your SQL alternatives
- Do not completely trust the Oracle cost estimation
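The three steps above can be sketched in code. The following is a minimal illustration, not Toad's implementation: it uses Python's built-in sqlite3 module as a stand-in for Oracle, so SQLite's EXPLAIN QUERY PLAN plays the role of the Oracle execution plan, and the table definitions and rows are invented to fit the sample query from the Time Consuming section. The rewritten statement is one hand-written alternative and is equivalent only under the assumption that dpt_id is unique in the department table.

```python
import sqlite3
import time

# Step 1 (Identify): the paper's sample statement, unchanged apart from layout.
ORIGINAL_SQL = """
select emp_name, dpt_name, grd_desc
  from employee, department DEPARTMENT1, grade
 where emp_grade = grd_id
   and emp_dept = dpt_id
   and EXISTS (SELECT 'X' from department DEPARTMENT2
                WHERE dpt_avg_salary in (select min(dpt_avg_salary)
                                           from department DEPARTMENT3)
                  AND dpt_id = EMPLOYEE.emp_dept)
"""

# Step 2 (Rewrite): one alternative; assumes dpt_id is unique (hypothetical schema).
REWRITTEN_SQL = """
select emp_name, dpt_name, grd_desc
  from employee
  join department on emp_dept = dpt_id
  join grade on emp_grade = grd_id
 where dpt_avg_salary = (select min(dpt_avg_salary) from department)
"""


def build_sample_db():
    """Create an in-memory database with invented tables and rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE grade (grd_id INTEGER PRIMARY KEY, grd_desc TEXT);
        CREATE TABLE department (dpt_id INTEGER PRIMARY KEY, dpt_name TEXT,
                                 dpt_avg_salary REAL);
        CREATE TABLE employee (emp_name TEXT, emp_grade INTEGER, emp_dept INTEGER);
    """)
    con.executemany("INSERT INTO grade VALUES (?, ?)",
                    [(1, "Junior"), (2, "Senior")])
    con.executemany("INSERT INTO department VALUES (?, ?, ?)",
                    [(10, "Sales", 40000.0), (20, "IT", 60000.0)])
    con.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                    [("Ann", 1, 10), ("Bob", 2, 20), ("Cy", 2, 10)])
    return con


def execution_plan(con, sql):
    """Return the plan steps SQLite reports (the last column holds the description)."""
    return [row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]


def timed_rows(con, sql):
    """Step 3 (Test): run the statement and return (sorted rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = sorted(con.execute(sql).fetchall())
    return rows, time.perf_counter() - start


def compare(con, original, alternative):
    """Check that an alternative returns the same rows and report both run times."""
    base_rows, base_secs = timed_rows(con, original)
    alt_rows, alt_secs = timed_rows(con, alternative)
    return {"same_result": base_rows == alt_rows,
            "original_seconds": base_secs,
            "alternative_seconds": alt_secs}


if __name__ == "__main__":
    db = build_sample_db()
    print(execution_plan(db, ORIGINAL_SQL))
    print(compare(db, ORIGINAL_SQL, REWRITTEN_SQL))
```

On a three-row table the timings are meaningless; the point is the shape of the loop. An alternative is accepted only if it returns the same rows, and its measured run time, not the estimated cost, decides whether it is better.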

This should be implemented by making everyone responsible for the performance of the application at every phase of the lifecycle. No matter the skill/experience level or the role, everyone from the junior developer to the DBA should be responsible for application performance. (See graphic below).

This ensures that mechanics are in place for quality code creation. It takes the pressure off individuals who are inundated by requests within the organization and shortens QA/Testing time, thereby decreasing delivery time for applications to move to production.

Implementation
Depending on each company's needs, one can implement this methodology either manually or through the use of productivity tools.

Manual
With manual implementation, every member of the team must become an expert in optimizing SQL. This could be done through lunch-and-learn sessions or internal training sessions. Though effective, this method is both complex and time-consuming.

Productivity Tools
A more productive method for SQL optimization is the use of database tools that automate the code quality process. In an inquiry through Forrester, Noel Yuhanna explains, "Usually SQL tuning tools can help improve performance by several times, but it largely depends on how optimized the SQL query already is. Based on customer feedback, on two out of three occasions, SQL tuning tools usually help in improving the performance of a query. Typically developers focus mainly on the logic, not as much on performance, therefore the SQL tuning tool helps fill the gap, enhances productivity and makes applications run faster using less system resources."9

Quest Tools
Quest Software's de facto standard development tool, Toad for Oracle, has SQL optimization technology in the Xpert edition. This edition allows developers and DBAs of any skill level, in any phase of the project lifecycle, on most applications, to optimize and enhance the performance of SQL code. Toad for Oracle Xpert has embedded functionality to implement a SQL optimization methodology in any organization. Teams that already have in-house expertise in optimizing SQL will find increased productivity through the simple use of this tool. Those who have not yet built SQL optimization into their process will find the intuitive interface easy to use and implement. For more information on Toad for Oracle Xpert, please go to: http://www.quest.com/toad/index.asp

Conclusion
Data growth is, and will continue to be, one of the ongoing challenges IT organizations face. Technologies such as RFID, VOIP and digital media will be the largest data drivers companies contend with in the near future. Over the next few years, many companies will see exponential growth in collected data that is used strategically to maintain a competitive advantage. Understanding industry trends will enable IT management to be prepared to present solutions, not only to efficiently handle data growth, but also to put in place a methodology for SQL optimization. The best defense is a solid plan of preparation for potential data growth issues. For applications, that means providing developers with the tools to optimize SQL code, which leads to maximum application performance and satisfied end users. Companies must ensure that all individuals involved in application development and administration are responsible for data integrity and SQL code quality.


References:
1. IT Imperative: Managing Robust Storage Growth, Carl Greiner, Rob Schafer, META Group, December 21, 2004.
2. Winter Corporation's TopTen Grand Prize Winners, Kathy Auerbach, DM Review, March 2004.
3. What Wal-Mart Knows About Customers' Habits, Constance L. Hays, The New York Times, November 14, 2004.
4. Capacity Matters: Plan Ahead for Terabyte Data Growth, Bob Zimmerman, Forrester paper, July 12, 2004.
5. CIOs' 'Must Do' Resolutions for 2005, J. Mahoney, M. McDonald, M. Raskino, Gartner Research Note, 17 December 2004.
6. Four Giant Steps to Maximize the Business Value of IT, M. Gerrard, Gartner Commentary, 22 December 2004.
7. The Tsunami of Data Growth, Elliot King, Windows IT Pro Magazine, March 1, 2004.
8. Google to Branch Into Television, Michael Liedtke, AP Business Writer, January 25, 2005.
9. SQL Tuning Tools Remain Important, Especially For Custom Applications, Noel Yuhanna, Forrester IdeaByte, January 2, 2004.
10. IT Moves into Voice Communications, Caron Carlson, eWeek, December 20/27, 2004.
11. Get Ready for RFID, Renee Boucher Ferguson, eWeek, January 17, 2005.


About the Author


Bryan Huddleston is the Product Marketing Manager of Database Development at Quest Software. He specializes in database development tools, most notably Toad. Prior to joining Quest, Bryan worked as a PeopleSoft technical consultant for four years. With more than 12 years of experience in IT, he brings several years of consulting and account management practice to Quest. Bryan has spoken at numerous tradeshows, user groups and customer technology days, to more than 3,500 people throughout the world on various topics and technologies.

About Quest Software


Quest Software, Inc. delivers innovative products that help organizations get more performance and productivity from their applications, databases and infrastructure. Through a deep expertise in IT operations and a continued focus on what works best, Quest helps more than 18,000 customers worldwide meet higher expectations for enterprise IT. Quest Software, headquartered in Irvine, Calif., can be found in offices around the globe and at www.quest.com.

World Headquarters
8001 Irvine Center Drive
Irvine, CA 92618
www.quest.com
e-mail: info@quest.com
Inside U.S.: 1.800.306.9329
Outside U.S.: 1.949.754.8000

Please refer to our Web site for regional and international office information. For more information on Quest Software solutions, visit www.quest.com.

Copyright 2005 Quest Software, Inc. Toad and Toad for Oracle Xpert are registered trademarks of Quest Software. The information in this publication is furnished for information use only, does not constitute a commitment from Quest Software Inc. of any features or functions discussed and is subject to change without notice. Quest Software, Inc. assumes no responsibility or liability for any errors or inaccuracies that may appear in this publication.

