Planning Your ASE 15.0 Migration
Tips, Tricks, Gotchas & FAQ
ver 1.0
Table of Contents

Introduction & Pre-Upgrade Planning
    High Level Upgrade Steps for ASE 15
    General ASE 15 Migration FAQ
SySAM 2.0 Installation & Implementation
    SySAM 2.0 Background
    SySAM 2.0 Implementation Steps
    ASE 15 & SySAM FAQ
Preparing for the Upgrade
    Review Database Integrity
    Find all currently partitioned tables
    Increase HW Resource Requirements
    Preparing for Post-Upgrade Monitoring for QP Changes
    Review Trace Flags
    Run Update Index Statistics
Installation & Upgrade
    Installing the Software
    Alternative to Dump/Load for ASE 12.5.1+
    Cross Platform Dump/Load
    Installation & Upgrade FAQ
Partitioned Tables
    Partitions & Primary Keys/Unique Indices
    Global vs. Local Indexes
    Multiple/Composite Partition Keys & Range Partitioning
    Semantic Partitions & Data Skew
    Partitioned Tables & Parallel Query
    Partitioning Tips
    Dropping Partitions
    Creating a Rolling Partition Scheme
    Partitioning FAQ
Query Processing Changes
    Query Processing Change Highlights
    Determining Queries Impacted During Migration
    Diagnosing and Fixing Issues in ASE 15.0
    Query Processing FAQ
Storage & Disk IO Changes
    Very Large Storage Support
    DIRECTIO Support & FileSystem Devices
    Tempdb & FileSystem devices
Changes to DBA Maintenance Procedures
    Space Reporting System Functions
    Sysindexes vs. syspartitions & storage
    VDEVNO column
    DBA Maintenance FAQ
Update Statistics & datachange()
    Automated Update Statistics
    Datachange() function
    Update Statistics Frequency and ASE 15
    Update Statistics on Partitions
    Needs Based Maintenance: datachange() and derived_stats()
    Update Statistics FAQ
Computed Columns/Function-Based Indices
    Computed Column Evaluation
    Non-Materialized Computed Columns & Invalid Values
Application and 3rd Party Tool Compatibility
    #Temp Table Changes
    3rd Party Tool Compatibility
    Application/3rd Party Tool Compatibility FAQ
dbISQL & Sybase Central
    dbISQL & Sybase Central FAQ
Appendix A - Common Performance Troubleshooting Tips

Introduction & Pre-Upgrade Planning
ASE 15.0 is a significant upgrade for Sybase customers. In addition to the new features that are available,
the query processor has been completely re-written. Consequently, customers should expect to spend more
time testing their applications for ASE 15.0. This document is a living document, and will attempt to
highlight some of the known tips, tricks or problem areas to help customers prepare for the upgrade, and in
particular highlight how some of the new features may help speed the upgrade process (for example: using
sysquerymetrics to isolate regression queries).
The purpose of this document is to provide a comprehensive look at the new features of ASE 15.0, focusing
on either their application impact or how they affect migrations. It is not intended to be a complete
migration guide with step-by-step instructions for migrating existing systems and applications to ASE
15; that is the focus of the official ASE 15.0 Migration Guide. Instead, the goals of this document are to:
- Document what steps application developers and DBAs will need to take prior to migration.
- Document system changes that will impact maintenance procedures, monitoring scripts, or third party tools.
- Document which new features of ASE 15.0 customers are expected to adopt early in the migration cycle,
  particularly any migration tips or gotchas that would be of assistance.
- Document which tools included with ASE 15.0 can be used to facilitate migration.
The rationale for this document is that the official Migration Guide initially treated all of these items lightly
and instead focused on documenting testing procedures. While an attempt was made to add some of
this document's content to it, publication issues resulted in the content being altered in such a way that
important points were lost or the context changed - in some cases resulting in significant inaccuracies. As a
result, where the content in this document overlaps the Migration Guide, this document should be
considered to supersede the ASE 15.0 Migration Guide. Note that this document covers only
application impact and migration considerations. Guidance about the upgrade procedures can
still be found in the ASE 15.0 Migration Guide.
This document is organized into the following sections:
- Implementing SySAM 2.0
- Preparing for the Upgrade
- Installation and Upgrade
- System Table & System Function Changes
- IO Subsystem Changes
- Updating Statistics & datachange()
- Semantic Partitions
- Computed Columns/Function-Based Indices
- Query Processing
- dbISQL & Sybase Central
In each section, we will attempt to provide as much information as possible to assist DBAs in determining
which features to implement, the best strategies for implementing them, and the merits of implementing the changes.
An important point is that this document assumes that you are upgrading from ASE 12.5.x. This
assumption is based on the fact that ASE 12.0 and previous releases have already reached the end of their
support cycle. Additionally, some of the recommendations may be made obsolete by later Interim Releases
(IRs) of 15.0 such as 15.0.2 (planned for Q2 2007 release). This document is based on ASE 15.0.1 ESD
#1, which was released in early November 2006.
High Level Upgrade Steps for ASE 15
While the ASE 15.0 Migration Guide contains much more detailed migration steps, this document would
not be complete without pointing out some of the highlights and the specific differences between ASE 15.0
migrations and those for previous releases.
Plan Your SySAM 2.0 License Management Architecture
ASE 15.0 is the first product in the Sybase family to introduce version 2.0 of the SySAM technology.
There are several options for controlling license management, including unserved file-based licenses.
However, for the most flexibility in terms of the ability to control licenses, ease product installation, etc., it is
likely that most Sybase customers will want to use a license server. It is important that you determine how
you are going to manage your licenses prior to installing the software for the first time, as you will be
prompted for license host information during the software installation.
Determine Your Upgrade Path
Generally, most DBAs are very familiar with ASE migrations and have long established procedures for
their installations for migrations. This section however will detail two important aspects that may affect
those procedures.
Alternative to Dump/Load
In the past, there generally have been only two upgrade paths: upgrade in place, or dump/load. Because of
the mount/unmount feature added in ASE 12.5.1, there is now a third method that is likely considerably faster
than dump/load for customers using SAN disk technology.
1. Quiesce the database using a manifest file
2. Copy/move the devices along with the manifest file to the 15.0 host
3. Mount the database using the mount database command
4. Bring the database online
It is during the online step that the database upgrade is performed. Note that this technique is only
supported between machines from the same vendor and platform; cross-platform moves are not supported (you
must use the XPDL feature for this).
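As a rough sketch of the four steps above, the command sequence might look like the following. The tag name (upg_tag), the database name (salesdb) and the manifest path are hypothetical placeholders. Optional clauses, such as with override on quiesce database or the using clause of mount database when device paths differ on the target host, are not shown and should be checked against the ASE Reference Manual for your release.

```sql
-- Step 1 (source 12.5.1+ server): suspend writes and create the manifest file.
-- upg_tag, salesdb and the manifest path are hypothetical names.
quiesce database upg_tag hold salesdb
    for external dump
    to "/sybase/manifest/salesdb.mfst"
go

-- Step 2: copy/move the devices and the manifest file at the OS or SAN level,
-- then release the hold on the source server.
quiesce database upg_tag release
go

-- Step 3 (ASE 15.0 server): mount the copied devices using the manifest.
mount database all from "/sybase/manifest/salesdb.mfst"
go

-- Step 4: bringing the database online is what triggers the upgrade.
online database salesdb
go
```

Because the upgrade runs during the online step, it is worth allowing for that time when scheduling the cutover window.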
Reduced 32-bit Platform Availability
In addition, ASE 15.0 is only released as a 64-bit application on most platforms, with 32-bit versions
available only for Windows, Linux and Solaris. If you are running a 32-bit version of Sybase ASE,
particularly on HP or AIX, you will need to make sure you are running a later 64-bit version of the
operating system. The rationale for this change is that most of the Unix OS vendors have fully
implemented 64-bit versions of their operating systems and have discontinued support (in most cases) for
their 32-bit versions. This lack of support as well as in some cases degraded performance of 32-bit
applications on the 64-bit OSs, has made it impractical for Sybase to continue providing 32-bit binaries for
these platforms. For a full list of which 64-bit versions and patches are required for your platform, consult
the ASE certification page at http://certification.sybase.com/ucr/search.do.
If you are upgrading from a 32-bit to a 64-bit release as a result, and take advantage of the additional
memory offered, you may want to observe the application carefully through the MDA table
monOpenObjectActivity. With increased memory availability, a table scan that previously resulted in
physical IO may now fit completely in data cache and be all logical IOs. Because they no longer
need to yield the CPU to wait for physical IOs, processes performing in-memory table scans will
use their full time slice, yield the CPU and then immediately jump back on the runnable queue. The result of this
behavior is an immediate and dramatic increase in CPU utilization - sometimes a near-constant 100% -
and a resulting drop in application performance. Table scans on application tables can be
identified with a query similar to:
select *
  from master..monOpenObjectActivity
 where DBID not in (1, 2, 3)   -- add additional tempdb dbids if using multiple tempdbs
   and UsedCount > 0           -- only objects that have actually been used
   and IndexID = 0             -- the table itself rather than one of its indexes
 order by LogicalReads desc, UsedCount desc
The reason for excluding tempdb(s) is that table scans of temporary tables are common, for no other
reason than that such tables are typically small and usually lack indexes.
Prepare for the Upgrade
After determining how you are going to upgrade, you will then need to prepare for the upgrade itself. With
some of the changes present in ASE 15 and some of the new features, we have collected some guidance
around additional hardware resources, etc. that you should review prior to attempting the upgrade. The
details are provided in the section on this topic later in the paper.
One item that may help is to create a quick IO profile of the application in 12.5.x and 15.0.x. This can be
done by collecting MDA monOpenObjectActivity and monSysWaits data for a representative period of
time. For monOpenObjectActivity data, you may wish to collect two sets of data - one for application
databases and another for temp databases. If you compare the monOpenObjectActivity data with ASE
15.0, you may notice the following:
- A huge increase in table scans on one or more tables (see the earlier discussion on 32-bit to 64-bit
  for the logic). This could be due to an optimization issue or just a move to a merge join instead of a
  nested loop join using an index.
- A large decrease in IOs in tempdb - particularly when joins between two or more temp tables
  are involved. This is likely due to merge joins.
- A significant drop in the UsedCount column for particular indexes (when IndexID > 1).
  This is likely the result of missing statistics if the index contains more than one column.
While this may not point out the exact queries affected (discussed later), it can help reduce the effort to find
the queries to just those involving specific tables. Keep in mind that the observation periods should be
during the same relative loading to ensure that the results are comparable.
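One way to build such a profile is to copy the MDA counters into a work table at the start and end of each observation period, then compare the 12.5.x and 15.0.x results. The sketch below is illustrative only. The dba_db database and the ooa_snap, io_profile_125 and io_profile_150 tables are hypothetical names. The two profile tables are assumed to hold one row per table or index, with the counter differences for the observation period and an ObjectName column resolved on the originating server. The factor-of-two thresholds are arbitrary starting points.

```sql
-- First snapshot: creates the hypothetical work table dba_db..ooa_snap.
-- (select into needs the 'select into/bulkcopy/pllsort' option on dba_db.)
select SnapTime = getdate(), DBID, ObjectID, IndexID,
       LogicalReads, PhysicalReads, UsedCount
  into dba_db..ooa_snap
  from master..monOpenObjectActivity
 where DBID not in (1, 2, 3)
go

-- Later snapshots: append to the same table.
insert dba_db..ooa_snap
select getdate(), DBID, ObjectID, IndexID,
       LogicalReads, PhysicalReads, UsedCount
  from master..monOpenObjectActivity
 where DBID not in (1, 2, 3)
go

-- Compare the per-period profiles from the two releases: flag objects whose
-- logical reads at least doubled or whose UsedCount dropped by half or more.
select o.DBName, o.ObjectName, o.IndexID,
       LogicalReads_125 = o.LogicalReads, LogicalReads_150 = n.LogicalReads,
       UsedCount_125 = o.UsedCount, UsedCount_150 = n.UsedCount
  from dba_db..io_profile_125 o, dba_db..io_profile_150 n
 where o.DBName = n.DBName
   and o.ObjectName = n.ObjectName
   and o.IndexID = n.IndexID
   and (o.LogicalReads < n.LogicalReads / 2 or n.UsedCount < o.UsedCount / 2)
 order by n.LogicalReads desc
go
```

Joining on names rather than object IDs is a deliberate choice here, since IDs are not guaranteed to match if the two servers were built separately.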
Stress Test Your Application with ASE 15 QP
Because of the new QP engine, it is critical that Sybase customers stress test their applications before and
after the upgrade to determine any changes in query processing behavior. Some advice, common issues
and particularly useful tools are discussed later in this paper.
Post-upgrade Tasks
As is normal, a successful Sybase ASE upgrade does not finish with bringing the system
online for end-user access. After the upgrade itself is complete, there are a number of tasks that need to be
considered, including (but not limited to) the following:
- Post-upgrade monitoring of query processing
- Implementation of new ASE features that do not require application changes
- Upgrade of 3rd party tools to versions that support ASE 15.0
General ASE 15 Migration FAQ
Are there customers running ASE 15.0 in production? How many?
Yes. Customers are running ASE 15.0 in production across a broad spectrum of industries, from
financial companies to state governments to electronics companies. We can't tell how many, as customers
typically don't tell Sybase when they have moved systems into production on the new release.
Consequently, we are only aware of customers that are running in production when interaction with field
sales teams or technical support staff specifically identifies a system in production on ASE 15.0. At this
point, the adoption of ASE 15.0 is proceeding much the same as with previous releases such as ASE 12.5.
However, more customers are looking to migrate sooner at this stage than with previous releases because of the new
partitioning feature, the performance improvements in the QP engine and other improvements.
Consequently, we expect the adoption of ASE 15.0 overall to be quicker than for previous releases.
Why should we upgrade to ASE 15.0?
The exact reason why a specific customer should upgrade to ASE 15.0 will depend on their
environment. For customers with large data volumes, the semantic partition feature in ASE 15.0 may be
enough of a driving reason. For others with complex queries, the substantial performance enhancements in
ASE's QP engine for complex queries (3 or more tables in a join), the performance improvements for
GROUP BY or other enhancements may be the reason.
If the benefits in ASE 15.0 are not a sufficient reason, then platform stability may be a consideration. It is a
fact that Sybase compiles ASE on specific OS versions from each vendor. As new hardware is released by
the vendor, it often requires newer versions of the OS to leverage the hardware advances. This in turn
changes the behavior of the core OS kernel, making older versions of Sybase software unstable
on the newer OS versions. Sybase works with OS vendors to ensure that Sybase ASE remains compatible
with newer OS versions during a reasonable lifespan after it was compiled - typically 4-5 years.
After that point, the advances in technology, both software and hardware, make compatibility impossible to maintain.
ASE 12.5 was first released in Q2 of 2002 and is expected to be EOL'd in 2008 - a life span of 6 years,
which stretches the service life of the software to its maximum. This is analogous to running Windows 98
on a desktop today: while fully possible, it can't take advantage of today's dual core processors that are
common in desktop systems or the multi-tasking capabilities of today's software, and device drivers for newer
disk drives (SATA RAID, etc.) are not available, which prevents the hardware from achieving its full
potential.
But Isn't 12.5.4 Also Planned?
ASE 12.5.4 was indeed planned and subsequently released, but with a much more limited scope
than originally planned. The primary goal of ASE 12.5.4 is to provide a GA release for the features
introduced in the Early Adopter 12.5.3a IR (the "a" signified an adopter release) and to officially merge the
12.5.3a code changes into the main 12.5.x code path. This is in keeping with Sybase's stated policy of
having two supported releases in market simultaneously. For the full description of the Sybase ASE & RS
lifecycle policy, please refer to:
http://www.sybase.com/products/informationmanagement/adaptiveserverenterprise/lifecycle
However, the EOL date for 12.5.x has already been established as September 2008. Consequently, upgrades
to 12.5.4 should only be considered a temporary measure until a 15.x migration can be effected.
It is also important to note that many of the features introduced in 12.5.x, including 12.5.3a, have already
been merged into the 15.0 codeline (as of ESD #2). Future enhancements to those features, including
RTMS, Encrypted Columns, etc., are being planned in the 15.0 codeline and may not be back-ported to ASE
12.5.x. This is in line with Sybase's policy that while two releases may be supported simultaneously, the
earlier release is effectively in maintenance mode, with no new functionality being added
except in limited circumstances.
SySAM 2.0 Installation & Implementation
While this topic is dealt with in greater detail in other documents, given the frequency of questions, this
paper would be remiss if it did not mention the topic at all. For more detailed information on this
topic, please go to http://www.sybase.com/sysam, which has recorded webcasts, presentations, whitepapers
and other information describing SySAM 2.0. This paper will attempt to provide a background for SySAM
2.0, and provide a planning guide along with implementation steps and considerations for customers.
SySAM 2.0 Background
Even before Sarbanes-Oxley (SOX), many customers were requesting strong license management and
reporting capabilities from Sybase. Sarbanes-Oxley made this even more important as it held the CEO of
corporations responsible for any financial wrong-doing, including misuse of software assets. As the
banking industry was one of those most heavily watched as a result of the legislation, and given Sybase's
position within the financial community, Sybase adopted a more stringent license management
implementation.
While SySAM 2.0 is built using the same FLEXlm technology used in SySAM 1.0, it was decided that a true network
license management implementation should be provided. In SySAM 1.0, it was common for customers
to simply deploy ASE servers using the same license keys on all the servers. This reuse of the same license
string, of course, was not legitimate and many times resulted in over-deployment of software relative to what was
actually licensed. As SOX-driven audits were performed, it became apparent that this was an issue that required
resolving - sometimes resulting in draconian measures taken by the company, such as restricting access to
Sybase's website as a means of controlling software access. Since SOX applies to all publicly traded
companies, Sybase opted to implement SySAM 2.0 to make licensing compliance easier to measure for
corporate users. In fact, one of the critical components of SySAM 2.0 is a reporting feature that
allows IT managers to monitor and report on license compliance to corporate officers as necessary.
However, to support the kinds of flexibility the real world
requires during hardware transitions, unexpected growth, testing/evaluation periods, etc., SySAM 2.0
provides features such as license overdrafts, license borrowing (for mobile tools use), grace periods and
other techniques to allow temporary software use without restriction. At no time does the SySAM software
report license usage to Sybase. So, much like before, any long term use of Sybase software beyond the
licensing agreements must be reconciled with the company's Sybase sales representative.
While ASE was the first product to use SySAM 2.0, subsequent product releases including RS 15.0,
PowerDesigner 12.0, and future product releases will use SySAM 2.0 as well. Consequently, it is wise to
consider the broader scope for SySAM 2.0 than thinking of it as limited to ASE.
It is important to understand one aspect very clearly. SySAM 2.0 will NEVER contact Sybase to report
license compliance issues. The purpose of SySAM 2.0 is to aid corporations in assuring compliance to
meet legislative requirements, not to report software compliance to Sybase, Inc. In fact, if you opt to accept
the optional license overdrafts, any use of overdrafted licenses would not be reported to Sybase, and any
purchase orders to correct the overdrafts would need to be initiated by the customer manually by reporting
the overdraft to Sybase and requesting pricing information on the continued use.
SySAM 2.0 Implementation Steps
Before installing ASE 15.0 or another SySAM 2.0 managed product, you will need to complete the following
steps:
1. Determine license management architecture (served, unserved, OEM, mixed)
2. Determine number of license servers, location and infrastructure
3. Inventory product license quantities and types by host/location
4. Determine host ids of designated license servers and redundant host ids
5. Generate license files from Sybase Product Download Center (SPDC)
6. Install SySAM software on license server hosts
7. Deploy license key files to license servers
8. Generate license file stubs for deployed software
9. Implement reporting and monitoring requirements
10. Update licenses congruent with Sybase Support Contracts
Most of these steps will be discussed in more detail in the following sections. However, not all will be, and
the intent will be not to reiterate the product documentation as much as to provide best practices, insights
and work-arounds to the gotchas involved.
To facilitate this discussion, the following mythical customer scenario will be used: Bank ABC is a
large Sybase customer with several different business units operating as separate profit and loss centers.
Bank ABC is headquartered in New York City with production systems there as well as in northern New
Jersey, where IT staff and developers are located. The NJ office also operates as a primary backup facility
to the production systems, with a full DR site in Houston, TX (with some IT staff located there as well).
Bank ABC also operates foreign offices in London, Frankfurt, Australia and Tokyo for its trading business
units, with a group of developers in the London and Australia locations. London and Frankfurt serve as
DR sites for each other, as do Australia and Tokyo. Bank ABC also operates several web applications
(for its retail as well as trading subsidiaries) outside the corporate firewall with mirrored sites in NY,
Houston, San Francisco, London, Tokyo and Australia. In other words: the usual multi-national corporate
mess that drives IT license management crazy.
License Architecture
Generally, SySAM license management architectures fall into two categories: served and unserved. A
third case, OEM licensing (used for embedded applications), will not be discussed in this document as the
best implementation is dependent on the application, deployment environment and licensing agreements
between Sybase and the vendor.
It is expected that Sybase customers most likely will use served licenses or a mix of served and
unserved licenses. The rationale is that served licenses take advantage of one or more centralized
license servers that provide licensing to all the deployed software locations. This eliminates the hassle of
maintaining license keys and individually tying Sybase licenses to specific hosts in the case of hardware
upgrades, etc. However, a license management server could be impractical in small environments or when
license server access is restricted such as when the systems run outside the corporate firewall. To get a
better idea of how this works and the impact it could have on your implementation, each of these cases
(served/unserved) will be discussed in detail.
Unserved Licenses
It is easier to understand served licenses (and their attractiveness) once unserved licenses are discussed.
Unserved licenses are exactly what you would expect and everything that you likely wanted to avoid.
When using an unserved license, the Sybase product (ASE) looks for a license file locally on the same
host and performs the license validation and checkout. While a more detailed list of considerations is
contained in the SySAM documentation, consider the following short list of advantages and disadvantages:
| Advantages | Disadvantages |
|---|---|
| Local license file - Because the license file is local, network issues or license server process failures will not affect product licensing and force grace mode operations. | License generation - Because the license is node-locked, each unserved license will have to be generated uniquely at SPDC. |
| Works best for remote locations, such as outside the firewall, where limited administration access is required. | Licenses will need to be regenerated for each node when support renews (typically yearly) or when the host hardware changes. |
| Licenses are dedicated and always available. | Licenses can't be pooled and used as needed (pursuant to licensing agreement restrictions). |
| No extra hardware or software installation requirements. | License usage reporting is not available. |
Like most DBAs, our staff at Bank ABC really can't be bothered manually generating and maintaining
hundreds of licenses each year for all of their servers. However, because the web applications running outside the
firewall pose a problem for administration, it is decided that the web-application servers will use unserved
licenses. Assuming Bank ABC has two servers in each location (one for retail applications and one for
trading), we have a total of 12 servers in 6 locations. Fortunately, this is not likely one person doing all of this:
like most corporations, Bank ABC has divided its DBAs along the business units, and in this case the
retail DBAs will take care of their systems and the trading DBAs will take care of theirs, leaving 6 each at the
most. But even then it is likely that the London and Australian staff will be responsible for the servers in
their local areas as well, so each geography is likely only concerned with 2 unserved licenses.
Served Licenses
As the name suggests, served licenses imply that one or more license servers service multiple deployed software
host platforms. For example, Bank ABC could have a single license server that provided all the licenses to
every server that had network access to it - the ultimate in license management simplicity. One license
server, one license file, one license generation from SPDC and done. Of course, this is not exactly realistic, but
how many license servers you should have is a topic for a later discussion. For now, consider
the following diagram:
Figure 1 License Server Provisioning Licenses for Multiple ASE Software Installations
There are a couple of key points about the above diagram:
1. The only license server process running is on the license management host (top). The
deployed ASE machines are not running a license server.
2. Sybase SySAM 2.0 compatible products will check their local files for licenses. The license
stub file will point the product to the license server(s) to use for actually acquiring the
license.
On the license server, license files are saved to the directory $SYBASE/SYSAM-2_0/licenses. Each
license file can contain more than one type of license and more than one file may be used. The typical
license file will resemble the following:
SERVER tallman-d820 0013028991D3
VENDOR SYBASE
USE_SERVER
PACKAGE ASE_EE SYBASE COMPONENTS=ASE_CORE OPTIONS=SUITE SUPERSEDE \
ISSUED=14-jun-2006 SIGN2="0F70 418E B42B 2CC9 D0E4 8AEC 1FD0 \
B6C7 69CE 1A05 F6BF 45F5 BEE4 408C C415 1AA5 18B8 6AA1 3641 \
6FDD 52E1 45B6 5561 05D4 9C62 AD6B 02AA 9171 5FAC 2434"
INCREMENT ASE_EE SYBASE 2007.08150 15-aug-2007 2 \
VENDOR_STRING=SORT=100;PE=EE;LT=SR PLATFORMS=i86_n DUP_GROUP=H \
ISSUER="CO=Sybase, Inc.;V=15.0;AS=A;MP=1567" \
ISSUED=14-jun-2006 BORROW=720 NOTICE="Sybase - All Employees" \
SN=500500300-52122 TS_OK SIGN2="0E12 0DC8 5D26 CA5B D378 EB1A \
937B 93F9 CAF2 CDD8 0C3E 4593 CA29 E2F1 8F95 15D1 2E60 11C0 \
10BE 26EC 8168 4735 8A52 DD9F C239 5E88 36D7 1530 A947 1A7C"
Note that this file is only located on the central license server; it is not distributed with the software to the
deployed installations. These machines will have a much simpler license stub that resembles:
SERVER TALLMAN-D820 ANY
USE_SERVER
This tells the Sybase products to contact the license host (tallman-d820 in this case) in order to acquire a
license. This has a huge benefit with respect to license administration in that when licenses change, only the
license file at the license server host has to be updated. By default, when a SySAM 2.0 product is installed,
a default stub file, SYBASE.lic, is generated and contains a sample license stub based on the answers to the
license server question from the software installer GUI.
There is a variation of the above file that may need to be used if you are running within a firewall
environment in which common ports are blocked. That format is:
# Replace the port # with any unused port that your firewall supports
USE_SERVER
VENDOR SYBASE PORT=27101
It is really helpful to first understand the detailed architecture of FlexLM. The Macrovision FlexLM software
has a license manager daemon (lmgrd) that runs on the host machine. This process is not responsible for
actual license management, but rather for managing the other vendor licensing daemons that need to be
executed. On Windows platforms, the Sybase vendor daemon is the %SYBASE%\SYSAM-2_0\bin\SYBASE.exe
executable, while on Unix, the path is $SYBASE/SYSAM-2_0/bin/SYBASE. During startup, the lmgrd starts the
SYBASE licensing daemon and any other daemons it finds. When a program requests a license, the
program first contacts the FlexLM lmgrd daemon to find the port number that the SYBASE daemon is
running on. While the FlexLM lmgrd daemon normally listens in the range 27000-27009, the vendor
daemons can be started on any available port. For example, consider the following snippet of the
SYBASE.log file in the SySAM directory:
0:18:42 (lmgrd) pid 2380
0:18:42 (lmgrd) Detecting other license server manager (lmgrd) processes...
0:18:44 (lmgrd) Done rereading
0:18:44 (lmgrd) FLEXnet Licensing (v10.8.0 build 18869) started on tallman-d820 (IBM PC) (12/9/2006)
0:18:44 (lmgrd) Copyright (c) 1988-2005 Macrovision Europe Ltd. and/or Macrovision Corporation. All
Rights Reserved.
0:18:44 (lmgrd) US Patents 5,390,297 and 5,671,412.
0:18:44 (lmgrd) World Wide Web: http://www.macrovision.com
0:18:44 (lmgrd) License file(s): C:\sybase\SYSAM-2_0\licenses\SYBASE.lic
0:18:44 (lmgrd) lmgrd tcp-port 27000
0:18:44 (lmgrd) Starting vendor daemons ...
0:18:44 (lmgrd) Started SYBASE (pid 1828)
0:18:45 (SYBASE) FLEXnet Licensing version v10.8.0 build 18869
0:18:45 (SYBASE) Using options file: "C:\sybase\SYSAM-2_0\licenses\SYBASE.opt"
0:18:45 (lmgrd) SYBASE using TCP-port 1060
In the above log, the lines tagged (lmgrd) show that lmgrd starts up, starts listening on port 27000, and
then starts the vendor daemons. The SYBASE licensing daemon, whose lines are tagged (SYBASE), starts up,
reads the options file, then reads all the licenses, and finally starts listening on port 1060. Note that this
port number could change. If running in a firewalled environment in which you need to use a consistent
port, then you use the PORT=<#####> notation on the VENDOR line to specify which port the specific
vendor daemon should use. Note that you may also have to specify both the lmgrd and the SYBASE
executables in any firewall software's permitted applications/exclusions list, particularly on MS Windows
systems using either the supplied MS firewall or 3rd party firewall software.
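When opening firewall ports it helps to know which ports the daemons actually bound to. The following is a minimal Python sketch, not part of SySAM, that pulls both port numbers out of log text shaped like the excerpt above. The function name find_ports and the regular expression are illustrative assumptions based only on the two "tcp-port" line formats shown in that excerpt.

```python
import re

# Sample lines copied from the SYBASE.log excerpt above.
LOG = """\
0:18:44 (lmgrd) lmgrd tcp-port 27000
0:18:44 (lmgrd) Starting vendor daemons ...
0:18:44 (lmgrd) Started SYBASE (pid 1828)
0:18:45 (lmgrd) SYBASE using TCP-port 1060
"""

# Matches 'lmgrd tcp-port 27000' as well as 'SYBASE using TCP-port 1060'.
PORT_LINE = re.compile(
    r"\(lmgrd\)\s+(\w+)\s+(?:using\s+)?tcp-port\s+(\d+)", re.IGNORECASE
)


def find_ports(log_text):
    """Return {daemon_name: port} for every tcp-port line in the log text."""
    ports = {}
    for line in log_text.splitlines():
        match = PORT_LINE.search(line)
        if match:
            ports[match.group(1)] = int(match.group(2))
    return ports


if __name__ == "__main__":
    print(find_ports(LOG))
```

If the vendor daemon port reported here differs between restarts, that is the cue to pin it with PORT= on the VENDOR line as described above.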
Another interesting point is that while the license server may have multiple license files (for example, one
for each product), the deployed software only needs to have that one license stub, regardless of how many
SySAM 2.0 products are actually installed on that machine. This is illustrated in Figure 1 above in that the
license server has multiple files, but the deployed hosts have a single license stub file, most likely simply
SYBASE.lic (the default). We will discuss this and other deployment techniques later.
License Server Quantity & Locations
The number of license servers you will need depends on a number of factors:
The geographic distribution of your business locations and IT infrastructure to support it
Number and locations of Disaster Recovery sites
SySAM availability (logical clustering/redundancy)
The number and locations of development organizations
Business unit cost allocations/IT support agreements
The one consideration that is not likely to matter is the number of hosts accessing the license server for
licenses. A single license server running on a workstation class machine can easily support hundreds of
server products. In fact, an internal Sybase license server providing well over 100 licenses to ASE hosts
only uses 5 minutes of CPU time per week for the SySAM activity. The factors listed above are discussed in
the following sections.
Geographic Distribution
While not absolutely necessary, Sybase suggests that additional license servers be used for each geographic
region; this may even be advisable on a site/campus basis. The rationale for this is that the products have
built-in heartbeats to the license server as well as the need to acquire licenses during the startup phase.
Network issues between locations could prevent products from starting (hampering DR
procedures) or cause running products to enter grace mode and possibly shut down.
Given our example ABC Bank scenario above, we have the following locations and activities:
Site   Development   Web System   Production   DR   Comments
Manhattan, NY 9 9
New Jersey 9 9 9 Primary DR for NYC
Houston, TX 9 9 9 DR for NYC & SF
London, UK 9 9 9 9 DR for Frankfurt
Frankfurt, Germany 9 9 DR for London
Melbourne, Australia 9 9 9 9 DR for Tokyo
Tokyo, Japan 9 9 9 DR for Melbourne
San Francisco, CA 9 9
It is likely that we will want license servers in all 8 locations. Strictly speaking, given that Manhattan and
NJ are in very close proximity, it is possible for the NYC systems to use the NJ license servers. However,
in this case, since ABC Bank is leasing the lines between Manhattan and Newark NJ installed by the local
telephone company, the DBAs are considering a separate license server to avoid outages caused by
telephone company hardware problems.
Disaster Recovery (DR) Sites
While it is true that Sybase software has built-in grace periods, for ASE, the installation grace period is 30
days from the time the software is installed. Consequently, if you only use a single license server at the
production facility and you experience a site failure, you likely will not be able to access those licenses
during the DR procedures when starting up the mirrored system. On top of this, warm-standby
and active clustering systems need to be up and running already, so they will need access to a license server
at any point. Again, however, if the primary site fails (and takes out the single license server), an
unexpected shutdown at the DR facility (e.g., for a configuration change) could result in not being able to
restart the DR facility servers. For this reason, it is strongly recommended that a license server be installed
at each DR facility. This server should minimally have the licenses used by the DR systems.
License Server Redundancy
There are several mechanisms used by the license server software to attempt to prevent software issues due
to license server failure. The first mechanism, of course, is the grace periods. However, they should be
viewed only as an immediate and temporary resolution and should not be depended upon for more than a
few hours at most. While it is true that the runtime grace period is 30 days (more than long enough to fix
a failed hardware component or to totally change license servers), the risk becomes more pronounced in
that if the ASE is shut down for any reason, it likely will not restart, as it will be relying on the installation
grace period, which likely expired long ago.
Another approach that could work is to use multiple independent license servers. As the Sybase software
starts up or conducts its license heartbeat, it reads the first license file it comes to lexicographically in the
license directory. If it fails to obtain a license, it then reads the next license file, and so on, until all the
license files in the directory have been read. On the surface, this seems to be a simple problem to solve: ABC
Bank, for example, could simply put 6 license servers around the globe, and if one should go down, the other
5 would be available. The problem is the pesky license quantity issue. As software products check out
licenses, the pool of available licenses shrinks. While there is an optional overdraft, this (by default) is
10% of the licenses being checked out for the given license server. To see the problem here, let's assume
that both NYC and NJ have 10 ASE servers in production each (ignoring all other locations/servers).
Consequently, when the licenses for each were generated, the DBA picked 10 for each server and accepted
the overdraft. Using the default of 10%, each site could now actually run 11 servers. You should start to
see the problem. If NYC fails, 9 of its ASEs will not be able to restart at the NJ failover site: 10 of the NJ
licenses are already in use by the NJ servers, and the first NYC server started there will check out the
single remaining license. This technique is an option for site license customers, however, as they simply
need to generate the license pool for each license server to accommodate all the licenses they would expect
to use in a worst-case scenario. While the overdraft does make sense when doing platform migrations,
special benchmarks, etc., for license server redundancy it is not likely to help much.
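The NYC/NJ arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the rounding of the overdraft (whole licenses, rounded down) is an assumption made for illustration; the document only gives the 10% default.

```python
def usable_licenses(generated, overdraft_pct=10):
    """Licenses a server can hand out: the generated count plus the overdraft.

    Assumption: the overdraft is rounded down to a whole license.
    """
    return generated + (generated * overdraft_pct) // 100


def stranded_after_failover(local_in_use, failed_site_servers, generated,
                            overdraft_pct=10):
    """Number of servers from the failed site that cannot obtain a license."""
    spare = max(0, usable_licenses(generated, overdraft_pct) - local_in_use)
    return max(0, failed_site_servers - spare)


if __name__ == "__main__":
    # NJ license server: 10 licenses generated, all 10 used by NJ's own ASEs.
    print(usable_licenses(10))                      # capacity at NJ
    print(stranded_after_failover(10, 10, 10))      # NYC servers left without a license
```

Raising the generated count in the sketch to cover both sites (20 in this example) is the site-license approach described above: the pool at each license server is sized for the worst case.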
A common thought then is to install the SySAM software on a clustered server. The general thinking is that if
one box fails, the other will restart the license server immediately. The problem with this approach is that
the license server uses node-locking on the unique information from the host machine: either the primary
network MAC address or the OS hostid. When the license manager is restarted on the other half of the
cluster, the MAC or hostid will not match what is in the SySAM license file and the licenses will be
useless. Which brings us to the most likely implementation: SySAM clustering.
The FlexLM software used by SySAM 2.0 supports a three-node logical clustering of license servers.
There are a couple of key points that should be mentioned about this:
2 of the 3 license servers must be available at all times
The license servers should be fairly closely located. More on this in a minute.
While it may not be required to have license server high availability due to grace periods, it
may be recommended for larger installations
Now then, the SySAM documentation (specifically the FlexLM Licensing End User Guide) states that
a three-node SySAM redundancy configuration should have all the license servers on the same subnet.
This requirement is based upon having guaranteed good communications vs. a WAN implementation.
Even given this restriction, however, there are ways of deploying the license servers in a three-node
configuration very effectively. For example, you could have one server be the first server for production
systems, one be the first server for development systems, and one act as a failover.
When configuring a three-node license server logical cluster, there are some additional steps that need to be
completed:
The three hostids have to be provided at the time of license generation
The license files and license stubs will have three server lines instead of one
When the license files are generated, they can then be distributed to all three host machines
being used in the license server redundant setup. In addition, the license stubs with the three server lines
would be distributed directly or rolled into the software distribution package. The stub would resemble:
SERVER PRODLICSRVR ANY
SERVER BACKUPLICSRVR ANY
SERVER DEVELLICSRVR ANY
USE_SERVER
Note that the order of the hostnames listed in the license stub file on the installed hosts specifies the order
in which the license servers are contacted. If the above file were the one distributed, each SySAM 2 product
would attempt to acquire a license from the license server running on the PRODLICSRVR host first, then
attempt the others in order. This can be exploited by rearranging the server lines on some hosts so that they
contact one of the other license servers first. This can be particularly useful for a license server managing
certain development products that may have a much shorter license heartbeat.
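For sites that build stubs as part of a deployment package, the reordering can be scripted. The following Python sketch is illustrative only: build_stub is a hypothetical helper, and it emits nothing beyond the SERVER ... ANY and USE_SERVER lines shown in the stub above.

```python
def build_stub(servers, preferred=None):
    """Return license stub text, with `preferred` moved to the first SERVER line."""
    ordered = list(servers)
    if preferred is not None:
        if preferred not in ordered:
            raise ValueError(f"unknown license server: {preferred}")
        ordered.remove(preferred)
        ordered.insert(0, preferred)
    lines = [f"SERVER {host} ANY" for host in ordered]
    lines.append("USE_SERVER")
    return "\n".join(lines) + "\n"


# Host names taken from the three-server stub shown above.
SERVERS = ["PRODLICSRVR", "BACKUPLICSRVR", "DEVELLICSRVR"]

if __name__ == "__main__":
    # Stub for development hosts: contact DEVELLICSRVR first.
    print(build_stub(SERVERS, preferred="DEVELLICSRVR"))
```

The relative order of the remaining servers is preserved, so development hosts in this example would still fall back to PRODLICSRVR and then BACKUPLICSRVR.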
Development System License Servers
While Sybase does provide a free development license for ASE, this license is often too restrictive for
real development organizations. The biggest problems with it are:
Limited resources (single CPU, small number of connections, etc.)
Lack of maintenance (cannot apply EBFs)
As a result, most organizations with normal-sized development teams will have fully licensed
Development/Test/QA systems. While these systems could use the same license server as the production
systems, having a separate license server for these systems achieves the following:
Protects the production systems from development/testing system issues
Provides an ideal environment for Sybase DBAs to test SySAM procedures
Reduces the overhead on production license servers from tools interaction
The last bullet takes a bit of explaining. While ASE has a license heartbeat measured in hours (4 hours or
more by default), some of the tools may have a license heartbeat measured in minutes (5 minutes or less).
While this may seem a bit extreme, the rationale is that it is necessary for rapid reclamation of floating
licenses from the pool of developers. By having a separate server for providing development licenses, the
production license management servers are under less stress and can respond to license
heartbeats faster as well as service larger numbers of systems, reducing the actual number of license
servers necessary.
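To put rough numbers on the heartbeat difference, the following Python sketch compares heartbeats per day at the two intervals quoted above. The 4-hour and 5-minute values are the bounds given in the text ("4 hours or more", "5 minutes or less"), so the result is an estimate rather than a measurement.

```python
def heartbeats_per_day(interval_minutes):
    """Number of license heartbeats one client sends in 24 hours."""
    return (24 * 60) / interval_minutes


ASE_INTERVAL_MIN = 4 * 60   # ASE: 4 hours (or more) between heartbeats
TOOL_INTERVAL_MIN = 5       # some tools: 5 minutes (or less)

if __name__ == "__main__":
    ase = heartbeats_per_day(ASE_INTERVAL_MIN)
    tool = heartbeats_per_day(TOOL_INTERVAL_MIN)
    print(ase, tool, tool / ase)
```

Under these assumptions, a single developer tool generates as much heartbeat traffic as 48 ASE servers, which is the reason for keeping the tool licenses on a separate license server.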
Business Unit License Provisioning
In most medium to large corporations, the business units are divided into different cost centers. In these
cases, IT infrastructure costs are often allocated among the business units on a project basis with some
centralized costs for common components such as network infrastructure. This can impact the number of
license servers in several ways:
If it is decided that the SySAM license servers will be placed on separate hardware from the
software, the costing for this hardware acquisition (small as it may be) may need to be
charged to the appropriate business units.
DBA resources may be assigned on a business unit level and may not even have access to
machines belonging to the other business units, and therefore are not able to generate or maintain
the license files for those machines.
One concern that some may think should be included in the above is that one business unit may unfairly grab
all of the available licenses. However, SySAM 2.0 allows license provisioning via the OPTIONS file, in
which licenses can be pre-allocated based on logical application groups by hostid. Note that this also could
be used within an environment to provision production licenses away from development/test licenses, or
for a similar requirement, to ensure that some quantity of licenses is preserved for specific use. How to do
this is covered in the FlexLM Licensing End User Guide in the chapter on the OPTIONS file. Note that for
Sybase, this file should have a default name of SYBASE.opt and be located in the SySAM licenses
subdirectory.
Inventory Current Deployment
Now that we have an idea of the considerations that go into determining how many license servers we may
need, the next step in determining how many we will actually use and where they might be located is
to take an inventory of the current software deployment and license usage. For production systems, this
may be easy; finding all the development licenses may be more fun. To make your job easier, you may
want to create a spreadsheet with the following columns (this will really speed up the SPDC generation, as
it will help you reduce the number of interactions necessary, especially changes).
Location  Host machine  CPUs/Cores  Bus Unit / Application  Software/Option  License Types  Licenses Reqd  SySAM Server
Consider the following example using our mythical ABC Bank:
Location  Host machine  CPUs/Cores  Bus Unit / Application  Software/Option  License Types  Licenses Reqd  SySAM Server
NYC       Prod_01       12          Trading                 ASE              SR             1              nyc_sysam_01
NYC       Prod_01       12          Trading                 ASE partitions   CPU            12             nyc_sysam_01
NYC       Prod_01       12          Trading                 ASE encrypt col  CPU            12             nyc_sysam_01
NYC       Web_01        8           Trading                 ASE              CPU            8              (file)
NYC       Web_01        12          Trading                 ASE encrypt col  CPU            12             (file)
NJ        Prod_01_HA    12          Trading                 ASE              SR             1              nj_sysam_01
At first, of course, you may not know the license server host (last column), but it can be filled in later after
you've completed your inventory and you've decided how many license servers you want to deal with or
need based on availability requirements.
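Once the inventory exists, the quantities to request from SPDC for each license server can be totaled mechanically. The Python sketch below does this for the ABC Bank rows above. The tuple layout and the function name totals_by_server are illustrative assumptions, not a Sybase format; in practice the rows would come from your own spreadsheet export.

```python
from collections import defaultdict

# Rows from the ABC Bank example:
# (location, host, cores, business unit, software/option,
#  license type, licenses required, SySAM server)
INVENTORY = [
    ("NYC", "Prod_01", 12, "Trading", "ASE", "SR", 1, "nyc_sysam_01"),
    ("NYC", "Prod_01", 12, "Trading", "ASE partitions", "CPU", 12, "nyc_sysam_01"),
    ("NYC", "Prod_01", 12, "Trading", "ASE encrypt col", "CPU", 12, "nyc_sysam_01"),
    ("NYC", "Web_01", 8, "Trading", "ASE", "CPU", 8, "(file)"),
    ("NYC", "Web_01", 12, "Trading", "ASE encrypt col", "CPU", 12, "(file)"),
    ("NJ", "Prod_01_HA", 12, "Trading", "ASE", "SR", 1, "nj_sysam_01"),
]


def totals_by_server(rows):
    """Sum the licenses required per (SySAM server, software/option, license type)."""
    totals = defaultdict(int)
    for _loc, _host, _cores, _unit, software, lic_type, qty, server in rows:
        totals[(server, software, lic_type)] += qty
    return dict(totals)


if __name__ == "__main__":
    for key, qty in sorted(totals_by_server(INVENTORY).items()):
        print(key, qty)
```

Each resulting (server, option, type) total corresponds to one licensing action in SPDC, which is why filling in the last column before generating licenses saves rework.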
License Server HostIDs
Obtaining your HostID is fairly simple, although it does differ for each platform. For example, on some
platforms, you simply issue the Unix hostid command, while on others, you will need the primary
network interface's physical MAC address. Detailed instructions are contained in the FlexLM Licensing
End User Guide shipped with the software.
However, this does require a bit of warning. When using the MAC address, the NIC that will be used for
node locking will be the first NIC bound in order when the system boots. On Microsoft Windows systems,
this is more than just a bit of fun, as Microsoft and other hardware vendors love to create network devices out
of FireWire ports, USB streaming cameras, etc. If you are running a Microsoft system, you can ensure the
NIC binding order by opening the Control Panel and navigating to the Network Connections folder. When
it is open, select Advanced Settings from the Advanced menu at the top (same menu bar
as File/Edit, etc.). This should open the Advanced Settings dialog. Select the Adapters and Bindings tab
and, in the top pane, arrange the connections in the desired binding order.
If you are using an operating system other than Microsoft Windows that uses MAC based hostids (Linux,
MacOS), check with your system administrator to see if you can specifically control the network adapter
binding order.
Generate Licenses from SPDC
After you have determined your license servers and/or file based node locked host machines, you are ready
to generate licenses from the Sybase Product Download Center. While it is fairly straightforward and
based on a web-form wizard that walks you through the process, it can be a bit tedious, especially if you
make a mistake (you will have to check the licenses back in and then check them out again to fix a mistake).
A couple of items to specifically watch out for when generating licenses:
The license quantity will often default to all remaining (first time) or the same number as the
last licensing action.
After the first time you enter a license server's info, subsequent licensing actions for the same
license server will be quicker as the license server can simply be selected from the list.
Each licensing action will result in a single license file containing a single license. Save the
files to a safe place where they will be backed up.
If desired, rather than having dozens of files with individual licenses, you could combine
some of the licenses into one or more license files.
A sample license file may look like the following:

VENDOR SYBASE
USE_SERVER
PACKAGE ASE_EE SYBASE COMPONENTS=ASE_CORE OPTIONS=SUITE SUPERSEDE \
ISSUED=14-jun-2006 SIGN2="0F70 418E B42B 2CC9 D0E4 8AEC 1FD0 \
B6C7 69CE 1A05 F6BF 45F5 BEE4 408C C415 1AA5 18B8 6AA1 3641 \
6FDD 52E1 45B6 5561 05D4 9C62 AD6B 02AA 9171 5FAC 2434"
INCREMENT ASE_EE SYBASE 2007.08150 15-aug-2007 2 \
VENDOR_STRING=SORT=100;PE=EE;LT=SR PLATFORMS=i86_n DUP_GROUP=H \
ISSUER="CO=Sybase, Inc.;V=15.0;AS=A;MP=1567" \
ISSUED=14-jun-2006 BORROW=720 NOTICE="Sybase - All Employees" \
SN=500500300-52122 TS_OK SIGN2="0E12 0DC8 5D26 CA5B D378 EB1A \
937B 93F9 CAF2 CDD8 0C3E 4593 CA29 E2F1 8F95 15D1 2E60 11C0 \
10BE 26EC 8168 4735 8A52 DD9F C239 5E88 36D7 1530 A947 1A7C"
While the SySAM documentation describes the license components in greater detail, the key fields appear
on the INCREMENT line above. Taken in order, the first field is a date. This is the expiration date for Sybase
support for this group of servers. After this date, since support is expired, attempts to apply patches will
succeed or fail depending on the support grace period of the product. Prior to this date, this customer will
have to renew their support agreement for these systems and then do a license upgrade.
The second field is the number of licenses, in this case 2. By itself, it is meaningless and has to be
looked at in conjunction with the last field, LT=SR, which says that this license is a Server license vs.
CPU (or other form). The result is that this license allows 2 different host machines to check out ASE
server licenses. The reason this is brought to your attention is that for CPU licenses, the above file might
change slightly to resemble something like:
VENDOR SYBASE
USE_SERVER
INCREMENT ASE_PARTITIONS SYBASE 2007.08150 15-aug-2007 4 \
VENDOR_STRING=PE=EE;LT=CP PLATFORMS=i86_n ISSUER="CO=Sybase, \
Inc.;V=15.0;AS=A;MP=1567;CP=0" ISSUED=14-jun-2006 BORROW=720 \
NOTICE="Sybase - All Employees" SN=500500048-52182 TS_OK \
SIGN2="1BBA CEE0 89FD 7CDA 6729 E6AB D37C B48C A97F D3F7 4AC0 \
E4C9 4310 DCA9 7FE7 1FD2 5C38 1345 931F 7D14 9A34 DB84 6157 \
8B2A 3E90 9654 5177 A539 E362 9A73"
In this license, the license server can provide 4 CPUs of the ASE partitions license to ASEs on other hosts
that may be requesting it. Site license customers or other licensing implementations that have an unlimited
quantity of licenses need to make sure when generating licenses that the number of CPUs that they
generate licenses for is the same as or greater than the number of CPU cores.
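When auditing a directory of license files, it can be handy to extract the quantity and license type from each INCREMENT line. The following Python sketch does so for the CPU license shown above. It assumes the general FlexLM layout of an INCREMENT line (feature, vendor, version, expiration date, count, then keyword options); the function name parse_increment and the returned dictionary keys are illustrative, and the signature portion of the license is omitted from the sample for brevity.

```python
import re

# Excerpt of the CPU license shown above (SIGN2 signature omitted).
LICENSE_TEXT = r"""INCREMENT ASE_PARTITIONS SYBASE 2007.08150 15-aug-2007 4 \
VENDOR_STRING=PE=EE;LT=CP PLATFORMS=i86_n ISSUER="CO=Sybase, \
Inc.;V=15.0;AS=A;MP=1567;CP=0" ISSUED=14-jun-2006 BORROW=720
"""


def parse_increment(text):
    """Return a list of dicts, one per INCREMENT line in the license text."""
    # Join continuation lines that end with a backslash.
    joined = re.sub(r"\\\s*\n", "", text)
    results = []
    for line in joined.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "INCREMENT":
            continue
        vendor_string = re.search(r"VENDOR_STRING=(\S+)", line)
        properties = {}
        if vendor_string:
            # VENDOR_STRING holds ';'-separated KEY=VALUE pairs, e.g. PE=EE;LT=CP.
            for item in vendor_string.group(1).split(";"):
                key, _, value = item.partition("=")
                properties[key] = value
        results.append({
            "feature": fields[1],
            "version": fields[3],
            "exp_date": fields[4],
            "count": int(fields[5]),
            "license_type": properties.get("LT"),
        })
    return results


if __name__ == "__main__":
    print(parse_increment(LICENSE_TEXT))
```

Reading the count together with license_type avoids the misinterpretation described above: the same number means host machines for LT=SR and CPUs for LT=CP.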
One difference for customers that had only licensed the core ASE product in previous versions: in ASE
15.0, the ASE_CORE license includes the Java, XML, and XFS (external file system or content
management) licenses, or these licenses may be available at no additional cost via the SPDC (as is the case
for XML, etc.). Premium options such as HA, DTM and others still require separate licensing. For further
information, contact your Sybase sales representative or Sybase Customer Service at 1-800-8SYBASE.
The Developer's Edition (DE) includes all non-royalty options from Sybase. Royalty options include the
Enhanced Full Text Search, Real Time Data Services, and most of the tools such as DBXray or Sybase
Database Expert. When installing the Developer's Edition, simply pick the unserved license model, as the
Developer's Edition uses the file based licensing mechanism. The Developer's Edition keys are included in
the product CD image. Consequently, after installing the software, no further SySAM actions are required.
SySAM Software Installation
The SySAM installation has several gotchas for the uninitiated. The first hurdle is simply acquiring the
software itself. While this may seem obvious, consider a Sun, HP or IBM hardware organization
that decides to use a small Linux workstation for the license server. Without a valid ASE license for Linux,
they will not be able to obtain the software. Fortunately, this can be resolved quite easily. One option is to
download the ASE Developer's Edition or ASE Express Edition (Linux only) for the platform you
wish to run the license server on. After getting the CD image, run the InstallShield program and only select
the license server components to be installed.
The second hurdle is starting up the license server process. The gotcha here is that you must have at least
one served license file installed in the licenses directory or else the license server will not run. This is
especially obvious on Microsoft Windows systems, in which after installation the usual nonsensical reboot
is required, and on restart you get a services start error on the license server process. In any case, simply
copy or ftp the license files you downloaded earlier into the appropriate directory (making sure they have a
.lic extension), navigate to the SySAM bin directory and issue sysam start. You can verify everything is
working via the sysam status command as well as by looking at the log files.
The third hurdle is to make sure that the license manager starts with each system reboot. For most Unix
systems, this means adding the SySAM startup to the rc.local startup scripts.
Finally, for email alerts, you will need to have an SMTP email server running. For Unix systems, this is
pretty simple, as most Unix variants include a basic email server with the operating system. For Microsoft
Windows users, however, this can be a bit more challenging. The gotcha with this aspect is that it is the
product, not the license server, that alerts the DBA to a license in grace mode or other similar problem.
As an alternative, you could create a centralized polling process that uses the ASE sp_lmconfig stored
procedure. However, an easier approach that presents the licensing information in a more easily parsed
format is to use the MDA table monLicense:
1> use master
2> go
1> select Name, Quantity, Type, Status, GraceExpiry
2> from monLicense
3> go
Name               Quantity Type             Status    GraceExpiry
------------------ -------- ---------------- --------- ------------
ASE_JAVA                  1 Server license   expirable NULL
ASE_EFTS                  1 Server license   expirable NULL
ASE_ENCRYPTION            1 CPU license      expirable NULL
ASE_RLAC                  1 CPU license      expirable NULL
ASE_PARTITIONS            1 CPU license      expirable NULL
ASE_CORE                  1 Server license   expirable NULL
(6 rows affected)
A simple license grace detection query would merely check:

select Name, Quantity, Type, Status, GraceExpiry
from master..monLicense
where GraceExpiry is not null
While this technique works for ASE, it won't work for other products such as Replication Server.
For other products (as well as for ASE), you may wish to use the Unified Agent (UA) process to check for
occurrences of a graced license error in the errorlog (error 131274), such as the following:
2006/05/24 14:51:16.82 kernel SySAM: Checked out graced license for 1 ASE_CORE (2005.1030) will
expire Fri Jun 09 01:18:07 2006.
2006/05/24 14:51:16.82 kernel SySAM: Failed to obtain 1 license(s) for ASE_CORE feature with
properties 'PE=EE;LT=SR'.
2006/05/24 14:51:16.82 kernel Error: 131274, Severity: 17, State: 1
2006/05/24 14:51:16.97 kernel SySAM: WARNING: ASE will shutdown on Fri Jun 09 01:18:07 2006, unless a
suitable ASE_CORE license is obtained before that date.
2006/05/24 14:51:16.97 kernel This product is licensed to:
2006/05/24 14:51:16.97 kernel Checked out license ASE_CORE
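Where a Unified Agent is not available, the same check can be approximated with a small script run against the errorlog. The Python sketch below is one possible approach, not a Sybase-supplied tool: the function name find_grace_errors and the regular expression are assumptions based solely on the line format in the excerpt above.

```python
import re

# Lines copied from the errorlog excerpt above.
ERRORLOG = """\
2006/05/24 14:51:16.82 kernel SySAM: Failed to obtain 1 license(s) for ASE_CORE feature with
properties 'PE=EE;LT=SR'.
2006/05/24 14:51:16.82 kernel Error: 131274, Severity: 17, State: 1
2006/05/24 14:51:16.97 kernel SySAM: WARNING: ASE will shutdown on Fri Jun 09 01:18:07 2006, unless a
suitable ASE_CORE license is obtained before that date.
"""

# Timestamp (date and time), then 'kernel', then the graced-license error number.
GRACE_ERROR = re.compile(r"^(\S+ \S+)\s+kernel\s+Error: 131274,")


def find_grace_errors(log_text):
    """Return the timestamp of every error 131274 line in the errorlog text."""
    timestamps = []
    for line in log_text.splitlines():
        match = GRACE_ERROR.match(line)
        if match:
            timestamps.append(match.group(1))
    return timestamps


if __name__ == "__main__":
    for stamp in find_grace_errors(ERRORLOG):
        print("graced license reported at", stamp)
```

A scheduled job could run this against each server's errorlog and mail the DBA when the list is non-empty; how the log is fetched and how alerts are sent is left to the site's own tooling.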
Software Deployments & SySAM
Many of our customers prefer to build their own deployment packages for Sybase installation using
internally certified ASE versions and patch levels, often using tarballs or other electronic packaging.
Previously, such images were built by simply tarring up the same license key and using it for all the
servers. As mentioned earlier, this doesn't change much. When using a license server, the only
requirement is to make sure that each host that will be running Sybase software has a license file stub. A
sample one (SYBASE.lic) should have been created for you when you first installed the software from
the CD image. If not, the format is extremely simple:
USE_SERVER
For redundant installations, the file will have three SERVER lines vs. the one. From this point, the
software installation builds can be deployed wherever necessary. The only post-extraction requirement
that might be necessary is to change the license server names in the file if planning on using different
license server(s) than those listed in the file when the installation package was created.
An alternative to listing the license servers in the license stub file is to set the license servers using the
LM_LICENSE_FILE environment variable for served licenses. For unserved licenses, the licenses are
interrogated in alphabetical order of the license files. Both of these are described in the SySAM
documentation.
ESDs, IRs and Updating SySAM licenses
So far, every ASE 15.0 ESD and IR released - except ASE 15.0 ESD #1 - has had a license component
change. Each time, the license change was due to a new feature being added and product bundling resulted
in the name of an existing license component changing - or previously configured options failing as the new
license component may be required. Additionally, as an IR, ASE 15.0.1 required updating the ASE_CORE
and other option licenses to the 15.0.1 level from 15.0. As a result, ASE 15.0 ESD #2, ASE 15.0.1, and
ASE 15.0.1 required DBAs to Upgrade their licenses via the SPDC. This would be in addition to any
upgrade performed as a result of annual support renewal efforts.
Consequently, it is strongly encouraged that DBAs check all licensable options by applying an ESD or IR
first to a test system with all the customer licensed options enabled. By doing this, unexpected last-minute
licensing actions or attempts to uninstall a patch due to licensing changes can be avoided.
ASE 15 & SySAM FAQ
The below questions should have been answered in the above discussions but are repeated here to ensure
that the points are not missed in the details above. Further information or assistance on SySAM is available
at http://www.sybase.com/sysam. In addition, you can call Sybase Customer Service or Technical Support
at 1-800-8SYBASE, or talk to your local Sybase Sales Team.
Is SySAM 2.0 Required?
Yes. In order to ensure compliance with SOX and other legislative requirements that publicly traded
companies must now face, and given the penalties that corporate executives face for financial issues,
including misuse of software licenses, ASE now must check out a valid license from the license server.
I have 100 ASE servers, do I have to register all of their nodes with Sybase?
No. Only the node where the license server runs needs to be provided to Sybase. This node id is part of the
license key, which is why it must be provided. See the discussion above on served licenses (under License
Architecture).
Is a license required for the ASE Developer's Edition?
No. Although the installer prompts you for license information, the Developer's Edition license enabling all
the non-royalty options of Sybase ASE is included in the software. Simply pick unserved from the
installer option and proceed with the install. Note that this also, by default, does not install the license
server, which is not needed for the Developer's Edition.
My company paid for a license for a server that operates outside our WAN environment where
it can't contact a license server. Do we need a separate license server for it?
You have two options here. The first, of course, is to set up a separate license server outside the WAN for
it. However, that is likely impractical. As a result, Sybase also provides for unserved licenses in which
the ASE software simply reads its licensing information from a local file. This latter method is probably
the best choice for remotely deployed software.
My company has locations in Europe, Asia, Africa, Latin America and North America. Is it
reasonable to use only a single license server, or is more than one necessary?
Possible: Yes. Reasonable: Likely not. For geographically dispersed deployments, you may wish to set up
a separate license server for each different geography.
Do I need to install the License Server on a cluster or otherwise provide redundant capability?
Due to the nodeid lock, the license server would not work if it failed over to another server in a hardware
clustering configuration. If license server redundancy is desired, you can either use the multi-license
server implementation (may be most applicable for site license customers), or use the three-node
redundant server configuration.
I normally create tarballs for distributing certified releases within my company. Will SySAM
2.0 prevent this or cause me problems?
No. Unlike previous releases in which the license keys were distributed to every host, the license file for
the deployed ASE software only needs to have the location of the license host and port number (if other
than 27000).
Where can we go for more information?
http://www.sybase.com/sysam is the official location for SySAM 2.0 materials. In addition, you can call
Sybase Customer Service or Technical Support at 1-800-8SYBASE, or talk to your local Sybase Sales
Team.
Preparing for the Upgrade
The Sybase ASE 15.0 Installation guide contains the full steps for upgrading to ASE 15.0 including the
usual reserved word check utility, etc. This section of the paper concentrates on particular aspects that have
resulted in upgrade issues to date.
Review Database Integrity
The following database integrity considerations should be reviewed prior to upgrading from ASE 12.x.
Check Security Tables (sysusers)
Sometimes, to provide customized control of user ids, login ids or roles, DBAs have manually modified the
values of user or role ids to resemble system accounts. As with all new versions of Sybase software,
increased functionality sometimes requires new system roles to control those features. As a result, during a
database upgrade, the upgrade process may attempt to add a new role to the database, and if one already
exists with that id, it will fail, causing the entire upgrade to fail. Prior to upgrading, you may wish to review
sysusers and sysroles to ensure any manually inserted user or role ids are within the normal Sybase user
range.
If you are doing the upgrade via dump/load from another server, after the ASE 15.0 server has been
created, you will need to copy the existing login ids and role information from the previous server to the
new ASE 15.0 server prior to loading the first database to be upgraded.
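As a starting point for the sysusers and sysroles review suggested above, a plain listing ordered by id can make manually assigned values easier to spot. This is only a sketch: testdb is a placeholder database name, the query assumes the standard sysusers and sysroles catalog columns, and which id values count as "normal" should be confirmed against the documentation for your ASE version.

```sql
use testdb
go
-- list the users, groups and roles known to this database, ordered by id
select name, uid, gid, suid
from sysusers
order by uid
go
-- list the server role to local role mappings for this database
select id, lrid, type, status
from sysroles
order by lrid
go
```

Run the listing in every database to be upgraded, since both tables are per-database catalogs.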
Run dbcc checkstorage
Although this is documented in the Installation guide, it is often skipped. Unfortunately, existing
corruptions will cause the upgrade process to fail as nearly every page in the database is often touched
during the upgrade.
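A minimal sketch of the check follows. It assumes that dbccdb has already been created and configured for the target database (testdb is a placeholder name); the setup steps themselves are in the product documentation.

```sql
-- assumes dbccdb is installed and configured for testdb
dbcc checkstorage(testdb)
go
-- summarize the run, then list any faults that were recorded
sp_dbcc_summaryreport testdb
go
sp_dbcc_faultreport 'short', testdb
go
```

Any faults reported should be resolved before the upgrade is attempted.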
Check database space
Make sure you have plenty of free space within the databases to be upgraded. The pre-upgrade utility will
check this for you, especially calculating the space necessary for the catalog changes. However, it is
always advisable when performing an upgrade to have plenty of free space available in the database. Try to
have at least 25% of the space free for databases less than 10GB. Additionally, make sure the transaction
log is as clear as possible. The upgrade instructions specifically remind you to dump the transaction log (to
truncate it) after performing a full database dump. This is important as all the catalog changes will require
some log space.
This is also true of the system databases such as master or sybsystemprocs. Make sure master is only using
about 50% of the available disk space (remember to make a dump of master and to then dump the tran log
to truncate it; often the cause of space consumption in master is the transaction log) and that
sybsystemprocs has sufficient disk space for all the new stored procedures.
Additionally, several of the system databases take up more space in 15.0. While the amount of additional
space required is a factor of the server pagesize (for 2K pagesize, the increase is generally about 4MB), it is
recommended that you increase the system database sizes prior to the upgrade itself.
If using utilities such as DBExpert to compare query plans, these utilities use abstract query plans which
are captured in the system segment. Consequently, you will need to make sure that you have enough free
space in the system segment beyond the catalog expansion requirements. Additionally, ASE 15 includes a
feature called sysquerymetrics that uses system segment space; you may wish to prepare for this by
expanding the existing system segment to other devices if it was restricted to a smaller disk or disk with
little free space.
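The free-space and log-truncation checks described in this section can be sketched as follows. The database name and dump file path are placeholders, and the sequence simply mirrors the order given above (full dump first, then truncate the log).

```sql
-- report allocated and free space per device fragment, including the log
sp_helpdb testdb
go
-- take a full dump first (the dump file path is a placeholder)
dump database testdb to '/opt/sybase/dumps/testdb_pre_upgrade.dmp'
go
-- then truncate the inactive portion of the transaction log
dump transaction testdb with truncate_only
go
-- repeat the space check for the system databases
sp_helpdb master
go
sp_helpdb sybsystemprocs
go
```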
Find all currently partitioned tables
As a result of system table changes, and the changes to support semantic partitions, tables currently
partitioned using segment-based partitioning or slices will be unpartitioned during the upgrade process.
Optionally, to save time and effort during the upgrade itself, you may wish to manually unpartition these
tables prior to running the upgrade, particularly tables for which the partitioning scheme will be changed
to one of the semantic partitioning schemes.
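One way to find the affected tables on the existing 12.x server is sketched below. It assumes the pre-15.0 meaning of syspartitions, in which rows exist only for tables partitioned into slices, so it should be run before the upgrade and in each database; testdb and trade_detail are placeholder names.

```sql
use testdb
go
-- in 12.x, syspartitions holds rows only for tables that have been partitioned
select distinct object_name(id) as table_name
from syspartitions
order by 1
go
-- unpartition a table ahead of the upgrade (table name is a placeholder)
alter table trade_detail unpartition
go
```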
Increase HW Resource Requirements
Because ASE 15 uses new in-memory sorting and grouping algorithms, ASE 15.0 will need additional
memory compared to ASE 12.5. Since this involves sorting, you will need additional procedure cache
space for the auxiliary scan buffers used to track sort buffers and additional memory for sorting. Since this
in-memory sorting replaces the need for worktables, the additional memory requirement will be in
whichever data cache tempdb is bound to. While it is not possible to suggest exactly how much memory,
remember, this is replacing former use of worktables, which used data cache as well as disk space;
consequently, the additional requirements are likely a small percentage above the current allocations (e.g.
5%). This is one of the areas to carefully monitor after upgrading. In ASE 15.0.1, some new MDA tables
will help monitor this from a procedure cache standpoint - monProcedureCacheModuleUsage, and
monProcedureCacheMemoryUsage. Both tables contain a High Water Mark (HWM) column so that the
max memory usage for particular activities can be observed on a procedure cache module basis.
In addition, due to the increased number of first class objects implemented within the server to support
encrypted columns, partitioned tables and other new features, you may need to increase the number of open
objects/open indexes. This is particularly likely to be true if the ASE server contains a number of
databases. You can monitor the metadata cache requirements including object and index descriptors through
sp_sysmon.
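A hedged sketch of the monitoring described above follows. It assumes the MDA tables are installed in master and that the column names shown (ModuleName, Active, HWM) match your release; verify them against the installed table definitions. sp_monitorconfig is used here as a quick alternative to sp_sysmon for checking descriptor usage.

```sql
-- procedure cache usage by module, highest high-water mark first
select ModuleName, Active, HWM
from master..monProcedureCacheModuleUsage
order by HWM desc
go
-- current and maximum usage of the object and index descriptors
sp_monitorconfig 'open objects'
go
sp_monitorconfig 'open indexes'
go
```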
Preparing for Post-Upgrade Monitoring for QP Changes
Especially post-upgrade, the MDA tables along with sysquerymetrics will be heavily used to find affected
queries as well as changes in overall system resource consumption. This document will include several
techniques useful in spotting queries performing differently using both sysquerymetrics and the MDA
tables. The MDA setup instructions are in the product documentation, but essentially consist of creating a
loopback server (not needed after ASE 15.0 ESD #2), adding mon_role privileges and then installing the
monitor tables from the installmontable script. Make sure that you install the most current montable script;
if you previously installed the montables and then applied an EBF, you may need to reinstall them to pick
up differences. The MDA table descriptions are in the ASE product documentation (ASE Performance &
Tuning). You should also familiarize yourself with the sysquerymetrics table, sp_metrics system procedure
and the associated techniques to exploit this new feature to rapidly identify underperforming queries.
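As a first look at sysquerymetrics, something like the following may be useful. It is a sketch only: testdb is a placeholder, the choice of columns and the ordering by average logical I/O are illustrative, and metrics capture adds overhead, so it is best enabled on a test system first.

```sql
-- turn on metrics capture server-wide
sp_configure 'enable metrics capture', 1
go
use testdb
go
-- after a representative workload has run, list the most expensive statements
select top 20 cnt, lio_avg, pio_avg, elap_avg, exec_avg, qtext
from sysquerymetrics
order by lio_avg desc
go
```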
Review Trace Flags
Either to influence ASE behavior or to correct issues, some customers may be starting ASE with trace flags
in the RUN_SERVER file (e.g. -T3605). As many of these trace flags are specific to 12.5.x, they are not
necessary in 15.0. The table below gives a list of some of these trace flags and their ASE 15.0 applicability.
Customers with trace flags should check with Sybase Technical Support as well before removing the trace
flag, as the trace flag may be used for another purpose than the main one listed here.
Trace Flag | Description | 12.5.x | 15.0 | Additional Remarks
-----------|-------------|--------|------|-------------------
291 | When ON, for predicates of the form col1 <relop> fn(col2), where the datatype of col2 is higher than that of col1, the expression fn(col2) will be cast to the datatype of the lower type, i.e. col1 | Yes | No | This is a feature in 15.0
333 | Disable min-max optimization for all cases | Yes | No | No longer supported
364 | Use range density instead of total density | Yes | No | No longer supported
370 | Use min-max index only as alternative to the table scan for single table queries. Do not perform aggregation optimization for joins | Yes | No | No longer supported
396 | Use min-max optimization for single table queries | Yes | No | No longer supported
526 | Print semi-graphical execution operator tree when showplan is turned on | No | Yes | New feature in 15.0
If you are running with a trace flag not listed here, contact Technical Support for assistance.
Additionally, the 300 series diagnostic trace flags (302, 310, 311, etc.) are being discontinued and will be
deprecated in a future release. They have been replaced with showplan options which provide better
diagnostic output as well as more readable output.
Run Update Index Statistics
Prior to the upgrade, you will likely want to run update statistics using a higher than default step count and
possibly using a non-default histogram tuning factor. A good beginning point might be to increase the
default step count to 200 and the default histogram tuning factor to 5, allowing update statistics to create
between 200 and 1,000 range cells as necessary. For extremely large tables, you may wish to use a step
count of 1,000 or higher (depending on your configured histogram tuning factor setting - note that in
15.0.1, the default has changed to 20). While not always possible, ideally you should have one histogram
step for every 10,000 data pages with a maximum of 2,000 histogram steps (beyond this is likely not
practical and would likely have only intangible benefits). However, this likely is only necessary for indexes
that are used to support range scans (bounded or unbounded) vs. direct lookups. The more histogram steps,
the more procedure cache will be needed during optimization, so it likely is best to gradually increase the
number of steps/histogram tuning factor combination until performance is desirable and then stop.
Additionally, it is strongly suggested that you run update index statistics instead of plain update statistics
(there is a discussion on this later in the section on missing statistics and showplan options). For existing
tables with statistics, you may wish to first delete the statistics so that the update statistics command is not
constrained to the existing step counts. While not required, it has been observed during previous upgrades
(e.g. 11.9 to 12.0) that table statistics inherited the histogram cell values when running update statistics
alone, resulting in skewed histograms in which the last cell contained all the new data. For large tables, you
could run update statistics with sampling to reduce the execution time, as well as temporarily increasing the
number of sort buffers (number of sort buffers is a dynamic configuration variable in 12.5) to 5,000,
10,000 or even 20,000; however, the larger the setting, the more procedure cache will be required,
especially if running concurrent update statistics or create index commands.
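The suggestions above might translate into commands along the following lines. Treat this as a sketch under assumptions: trade_detail and trade_history are placeholder table names, the values are the starting points mentioned in the text rather than tuned recommendations, and the availability of the two histogram configuration parameters and of sampling depends on your 12.5.x release.

```sql
-- server-wide defaults suggested above as a starting point
sp_configure 'number of histogram steps', 200
go
sp_configure 'histogram tuning factor', 5
go
-- temporarily raise the sort buffers for the statistics run
sp_configure 'number of sort buffers', 5000
go
-- drop the old histograms so the new step count is not constrained by them
delete statistics trade_detail
go
update index statistics trade_detail using 200 values
go
-- alternative for a very large table: more steps, with sampling to shorten the run
update index statistics trade_history using 1000 values with sampling = 10 percent
go
```

Remember to return number of sort buffers to its previous value once the statistics run is finished.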
The reason this is suggested as part of the pre-upgrade steps is that it can be factored in during the weeks of
testing leading up to the upgrade itself. While a more thorough explanation as to why this may be
necessary will be discussed in the maintenance changes section, ASE 15.0 is more susceptible to statistics
issues than previous releases due to now having multiple algorithms vs. a single one. As a result, running
update statistics after the upgrade would normally be recommended to avoid query performance issues,
but since neither the statistics values nor structures change as a result of the upgrade, this step could be
accomplished before the actual upgrade. The notable exception to this would be tables that are going to be
partitioned, which will require dropping and recreating the indices, which will recreate the statistics in any
case.
Installation & Upgrade
The installation guide available online along with the downloadable PDF version contains the full details
for installing or upgrading ASE. This section mentions common installation problems noted to date.
Installing the Software
Previous releases of Sybase software were typically distributed via CD-ROM, which also was the primary
installation media for the first host installed on (other hosts may be installed from an image at larger sites).
However, with the widespread use of Electronic Software Distribution, ASE 15 is likely going to be first
installed from a software download from the Sybase Product Download Center (SPDC). The downloaded
software is still in the format of a CD image; consequently, you may burn it to a CD-ROM for installation if
necessary. To avoid installation problems, consider the following points:
• Installation directly from the downloaded media without using a CD is supported. However, make
  sure the download location has a short path (e.g. /sybase/downloads or c:\sybase\downloads) and
  that the path does not contain any special characters such as the space character. This is especially
  true of Windows systems; avoid directories such as C:\Program Files\Sybase\downloads due to
  the space between "Program" and "Files". Failing to do this or using the wrong JRE version will
  likely result in installation failures citing "class not found" errors.
• Additionally, the software ships with the correct JRE for the InstallShield GUI. If you have other
  Java environments installed (JDK or JRE) and have environment variables referencing these
  locations (e.g. JAVA_HOME, or if you have java in your path for frequent compilations), unset
  these environment variables in the shell you start the installation from.
• Use the GNU gunzip utility to uncompress the CD images on Unix systems. The CD images
  downloaded were compressed using GNU zip. Decompressing them with standard unix
  compression utilities often results in corrupted images. Please use gunzip, which is freely
  available from the GNU organization.
• Use the GNU tar utility to extract the archive files for the CD image. Similar to the above, the file
  archive was built using a form of the GNU tar utility. Using standard unix tar may not extract the
  files correctly.
• Many of the hardware vendors support a variety of mount options for their CD-ROM drives.
  Make sure you are using the correct mount options specified for your platform in the Sybase
  installation manual.
Alternative to Dump/Load for ASE 12.5.1+
For customers upgrading from ASE 12.5.1 or higher and planning on building a new server vs. upgrade-in-
place, there is an alternative to dump/load which may prove to be much faster. The crucial element in this
alternative is that the new server must have access to the physical devices or copies of them. Users with
SAN technologies may find this process much faster even if migrating to new hardware at the same time.
The general sequence of steps is as follows:
1. The existing 12.5.x database is either quiesced to a manifest file (during testing) or unmounted
from the server (final).
2. If the new 15.0 server is on another host, the disk devices need to be unmounted from the current host
and mounted to the new host machine (or copied using SAN utilities).
3. The manifest file needs to be copied to the new ASE 15.0 location.
4. The 12.5.x database is then simply mounted by the new ASE 15.0 server using the manifest file
with corrections to the new device paths as necessary.
5. The online database command is issued; the database upgrade will begin as soon as the database
is brought online.
The following example scenario demonstrates this technique. The assumption is that you wish to upgrade a
database 'testdb' and will be using a manifest file named 'testdb_manifest.mfst'.
1. Quiesce the 12.5.x database to a manifest file using the SQL similar to the following:
quiesce database for_upgrd hold testdb
for external dump
to "/opt/sybase/testdb_manifest.mfst"
with override
go
2. Copy the devices using disk copy commands (i.e. dd), SAN utilities, or standard filesystem
command (if file system devices are used).
3. When the device copy is finished, release the quiesce by a SQL statement similar to:
quiesce database for_upgrd release
go
4. If this is the final upgrade (vs. testing), shut down ASE 12.5 to prevent further changes.
5. Move the device copies to the new host machine and mount them as appropriate.
6. In ASE 15.0, list the physical to logical device mappings using a SQL statement similar to:
mount database all from "/opt/sybase/testdb_manifest.mfst" with listonly
go
7. From the above, look at the physical to logical device mappings and determine the new
physical device mappings that correspond to the logical devices. Then mount the database in
ASE 15.0 using SQL similar to:
mount database all from "/opt/sybase/testdb_manifest.mfst" using
"/opt/sybase/syb15/data/GALAXY/test_data.dat" = "test_data",
"/opt/sybase/syb15/data/GALAXY/test_log.dat" = "test_log"
go
8. Bring the database online and start the upgrade process by using the normal online database
command similar to:
online database testdb
This technique has a restriction in that the databases need to be aligned with the devices. If a device
contains fragments of more than one database, you may have to move/upgrade them all at the same time.
Cross Platform Dump/Load
In ASE 12.5.x, Sybase introduced a cross-platform dump/load capability (aka XPDL) that allowed dumps
taken from a quiesced server on one platform to be loaded on another platform - regardless of endian
architecture and 32-bit/64-bit codelines. While there are a few limitations, most popular platforms are
covered. If moving to a new platform while upgrading to ASE 15.0, there are a few things to keep in mind
as part of the upgrade when using XPDL.
• While most data is converted between endian structures, the index statistics appear not to be.
  As a result, after a successful XPDL load sequence, in addition to running the sp_post_xpload
  procedure, you will need to (re)run update index statistics on all tables (see earlier
  recommendations for using higher step values).
• As documented, large binary fields and image data may not be converted between endian
  platforms. For example, if the tables contain a binary(255) column, it is not likely that the
  data will be converted between endian structures, while a binary(8) may be. If you know you
  changed endian architectures, you may have to write a routine to manually swap the bytes,
  after ensuring that the values are indeed wrong as a result of the platform change.
  Unfortunately, the exact byte swapping to perform will largely depend on how the original
  data was encoded.
• If you are storing parsed XML (either in varbinary or image datatype), you will need to
  reparse the original XML (hopefully this is in a text column) vs. attempting to swap the byte
  ordering.
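The post-load steps from the first point above can be sketched as follows. The database and table names are placeholders, and the step count is simply the starting value suggested earlier in this paper; the update index statistics command would be repeated for each table.

```sql
use testdb
go
-- check and rebuild indexes after the cross-platform load
sp_post_xpload
go
-- regenerate the statistics, one table at a time
update index statistics customers using 200 values
go
```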
Obviously this may impact the time it takes to perform the migration - resulting in longer application
outage for non-replicated systems. Additionally, you may also want to time both the XPDL and a bcp
implementation (or sybmigrate) to determine which might be faster given your volume of data. One
consideration is that if bcp is nearly the same speed, by using bcp or sybmigrate, you can change the page
size of the server. For example, on Linux, a 4K page size has considerable performance gains over a 2K
page size - and most SAN implementations are tuned to 4K frames.
Installation & Upgrade FAQ
Every time I start the installation GUI, it fails with a "class not found" error. Why?
There could be several causes for this. The most likely cause is that the downloaded media path
contains a space or other special character. A second common cause is that a different JRE is being used
than the one shipped with the installation media.
Where do I get the product documentation? Is SyBooks available?
Sybase is discontinuing use of the DynaText-based SyBooks in favor of PDF format and Eclipse-based
documentation. Currently, you can download the PDF versions from the Sybase Product Download Center.
The online documentation using the Eclipse viewer is also available at
http://sybooks.sybase.com/nav/detail.do?docset=741, which can be found by traversing www.sybase.com:
Support & Services tab, select "Adaptive Server Enterprise" and your language, then select "Adaptive
Server Enterprise 15.0".
When I try to install, I get a file missing error with a path like md5 and a string of numbers
Most likely this happens when the installation media was created from a download image vs. a shipped
product CD. As a result, there are several possible causes for this:
• gunzip vs. unix compress: The CD images downloaded were compressed using GNU zip.
  Decompressing them with standard unix compression utilities often results in corrupted
  images. Please use gunzip, which is freely available from the GNU organization.
• GNU tar vs. unix tar: Similar to the above, the file archive was built using a form of the GNU tar
  utility. Using standard unix tar may not extract the files correctly.
• Incomplete download image: During the download, the download itself may not have been
  complete, or in a subsequent ftp from the download machine, the archive file may not be
  complete. You may want to download the image again.
If none of the above three work, contact Sybase Technical Support for assistance.
When I try to install from the CD, it won't run
Many of the hardware vendors support a variety of mount options for their CD-ROM drives. Make sure
you are using the correct mount options specified for your platform in the Sybase installation manual.
What should I do if the upgrade fails?
Sybase has provided a utility that will upgrade an existing ASE by automatically performing the upgrade
steps. If the upgrade should fail for any reason, customers have the option to manually complete the
upgrade after resolving the problem. Should this be necessary, contact Sybase Technical Support for
assistance in completing the upgrade. This is particularly true if upgrading via dump/load. There have
been known issues in which ASE 15.0 configuration settings could prevent an upgrade from completing
when loading from an older dump.
Partitioned Tables
One of the key new features in ASE 15.0 is semantic partitioning. As the name implies, this feature
allows data to be distributed to different partitions according to the data value vs. just the round-robin
partitioning that earlier versions of Sybase ASE provided. A full description of this feature is available in
Chapter 10 of the ASE 15.0 Transact-SQL User's Guide. The purpose of this section is to highlight some
of the "gotchas" or unexpected behavior when using partitions that some customers have already noticed.
The behaviors are documented; the purpose of this document is to help customers understand the
rationale.
Partitions & Primary Keys/Unique Indices
As documented in the Transact-SQL User's Guide, unique indexes, including primary keys, may not be
enforceable if the partition keys are not the same as the index/primary key columns or a subset of them.
As a result, up through ASE 15.0 ESD #2, attempts to create a primary key constraint or a unique local
index on columns not used for partition keys will fail with an error similar to:
1> use demo_db
1>
2> create table customers
3> (
4> customer_key int not null,
5> customer_first_name char(11) null ,
6> customer_last_name char(15) null ,
7> customer_gender char(1) null ,
8> street_address char(18) null ,
9> city char(20) null ,
10> state char(2) null ,
11> postal_code char(9) null ,
12> phone_number char(10) null ,
13> primary key (customer_key)
14> )
15> lock datarows
16> with exp_row_size = 80
17> partition by range (state) (
18> ptn1 values <= ('F'),
19> ptn2 values <= ('O'),
20> ptn3 values <= ('S'),
21> ptn4 values <= (MAX)
22> )
Msg 1945, Level 16, State 1:
Server 'CHINOOK', Line 2:
Cannot create unique index 'customers_1308528664' on table 'customers'. The table partition condition
and the specified index keys make it impossible to enforce index uniqueness across partitions.
Msg 2761, Level 16, State 4:
Server 'CHINOOK', Line 2:
Failed to create declarative constraints on table 'customers' in database 'demo_db'.
Note that in the above case, since an error of level 16 is returned, the command fails; consequently, the table
is NOT created (despite the fact that the error message seems to indicate only that the primary key
constraint was not created, rather than the entire table). While a future ESD may eliminate this restriction,
it is one to be very much aware of as the most common partitioning schemes for range partitions typically
do not include primary key attributes. To see how this can happen, consider the following scenario.
Most customers are considering partitioning their tables for one of two reasons: 1) to allow more efficient
and practical use of the parallel query feature; 2) to allow easier DBA tasks. Note that the two are not always
mutually exclusive. However, there is one aspect to partitioning for DBA tasks that can cause unexpected
behavior. The most common method of partitioning for DBA tasks is to partition based on a date or day,
possibly a modulo day number calculated as an offset. This is typically done to allow fast archiving of
older data. Often, this date is either not in the primary key at all or is only one column in the primary
key. Another example is when the table is partitioned according to a lower cardinality division such as
State/Country vs. a unique key.
Now, let's assume we have a table containing customers (uniquely identified by cust_id) in the United
States and to divide them up by sales region or for another reason, we partitioned them by state. Consider
the following diagram:
Figure 2 - Partitioned Table and Primary Key Enforcement Issue
Even though we are attempting to insert the same exact cust_id value (12345), because we are inserting
into different state partitions, the insert succeeds. Why? Because the various index partitions act almost
as if they are independent; consequently, when the values are being inserted, they can't tell if the value
already exists in another partition. This explains the warning you receive when you partition a table on a
column list that does not include the primary keys or attempt to create a unique local index on columns that
are not used for partition keys.
So, why didn't Sybase enforce uniqueness? The answer is speed. Let's say for example that we have a
50 million row table. The primary key and any non-clustered index would likely need 7 levels of indexing
to find a data leaf node from the root node of the index. When partitioned (assuming 50 partitions and an
even distribution of values), each partition would only have 1 million rows, which likely would need 5
levels of indexing. In an unpartitioned table, the unique value check would only take 7 IOs to read to the
point of insertion to determine if a row with that value already exists. However, for a partitioned index, it
would have to traverse all 5 levels for all 50 partitions, or 5 x 50 = 250 IOs in all.
The workaround to this problem is simple: create a global unique index to enforce uniqueness instead of a
local index or primary key constraint (all primary key constraints are partitioned according to the table
schema). Since a global index is unpartitioned, uniqueness can still be enforced.
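A minimal sketch of this workaround, using a trimmed-down version of the customers table from the earlier example, might look like the following. The index name is a placeholder; the key point is that the primary key constraint is left out of the table definition and that an index created without the local index clause is global.

```sql
create table customers
(
    customer_key       int      not null,
    customer_last_name char(15) null,
    state              char(2)  null
)
lock datarows
partition by range (state) (
    ptn1 values <= ('F'),
    ptn2 values <= ('O'),
    ptn3 values <= ('S'),
    ptn4 values <= (MAX)
)
go
-- no "local index" clause, so the index is global (unpartitioned)
create unique nonclustered index customers_pkey_idx
    on customers (customer_key)
go
```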
Global vs. Local Indexes
The aspect that is often overlooked is whether an index on a partitioned table should be created as a local
index or a global index. Some initial guidelines are:
• If the Pkey is not a superset of all the partition keys, use a unique global index.
• If the index does not contain ALL the partition keys, either
  o Use a global index
  o Add the partition keys and use a local index (preferred)
• You may need both a global and local index on the same columns.
• If partitioning for maintenance reasons, keep the number of global indexes to an absolute
  minimum (with zero being preferred).
The reason for these rules is that the optimizer can only consider partition elimination when it knows which
partitions are affected, namely by inferring this from the partition key column(s) that appear in the where
clause. Take, for example, the following table:
create table trade_detail (
trade_id bigint not null,
trade_date datetime not null,
customer_id bigint not null,
symbol char(5) not null,
shares bigint not null,
price money not null
)
partition by range (trade_date)
(
Jan01 values <= ('Jan 1 2006 11:59:59pm'),
Jan02 values <= ('Jan 2 2006 11:59:59pm'),
...
)
Assuming that the trade_id is the primary key, then we would need a global index such as:
create unique nonclustered index trade_detail_pkeyidx
on trade_detail (trade_id)
The question is, what if instead of the above index, we created an index such as the following:
create unique nonclustered index trade_detail_idx
on trade_detail (trade_id, trade_date) local index
Ignoring the uniqueness enforcing issue, the question is what would happen for queries similar to the
below?
select *
from trade_detail
where trade_id between 123400 and 123500
In early versions of 15.0, the answer unfortunately may have been a table scan.
However, in ASE 15.0.1, the optimizer will use the index partitions as part of the partial index usage
strategy and simply check each partition. With a large number of partitions, this could take a while
compared to a global index on the same columns.
Consequently, a good mix of indexes for the above table might be:
-- global index to enforce primary key uniqueness and assoc queries
create unique nonclustered index trade_detail_pkey
on trade_detail (trade_id)
-- local index in case the trade date is supplied
create unique nonclustered index trade_date_id_idx
on trade_detail (trade_date, trade_id) local index
-- local index on customer and trade_date
create nonclustered index trade_cust_date_idx
on trade_detail (customer_id, trade_date) local index
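To illustrate why the local index on (trade_date, trade_id) is worth having, consider a variation of the earlier query that also supplies the partition key. This is a sketch using the example table above; the date range is chosen to fall entirely within the Jan02 partition, which gives the optimizer the information it needs to consider partition elimination.

```sql
-- supplying the partition key lets the optimizer consider only the Jan02 partition
select *
from trade_detail
where trade_id between 123400 and 123500
  and trade_date between 'Jan 2 2006 12:00:00am' and 'Jan 2 2006 11:59:59pm'
go
```

Running the query with showplan turned on is a simple way to confirm which partitions and which index are actually used.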
Another aspect to remember is that you may need to reorder some of the columns in a local index. For
example, consider the following index on the above table:
-- local index variation of the above
create nonclusterd index trade_symbol_date_idx
on trade_detail (symbol, customer_id, trade_date) local index
In a partitioned table if only the symbol or trade_date are provided (but not customer_id), the optimizer can
use the trade_date values to perform partition elimination, but it cant use it for query access. As a result, it
has to use strictly the symbol column - which is likely not to discrete for the table - resulting in a partition
scan. Simply by moving the trade_date forward as in:
-- local index variation of the above
create nonclustered index trade_symbol_date_idx
on trade_detail (symbol, trade_date, customer_id) local index
The index is now usable. Note that it is likely that even in an unpartitioned table, the earlier index might
not get used.
The reason it is suggested that global indexes be kept to a minimum for partitioned tables has more to do
with DBA tasks than query performance. If you are partitioning for DBA maintenance reasons, you will
likely be dropping/truncating partitions - especially if implementing a rolling partition scheme. If only
local indexes exist, these tasks only take a few seconds. However, because a global index only has a single
index tree, when a partition is dropped or truncated, a global index needs to be rebuilt after deleting all the
associated index rows corresponding to the partition rows - a task that could take hours.
In addition to primary key enforcement, other cases when global indexes might be required include queries
using aggregates (especially if grouped) that do not reference the partition columns in the where clause - as
well as other possibilities. It is highly recommended that you thoroughly test your application if you
partition tables that previously were unpartitioned.
Multiple/Composite Partition Keys & Range Partitioning
As described in the Transact-SQL User's Guide, both the range and the hash partitioning schemes allow
users to specify more than one column (up to 31) as partition keys, creating a composite partition key. For
hash partitions, it behaves as would be expected. However, for range partitions, the behavior can be
unexpected, especially when the partition keys are numeric. The reason is that most users assume
that ASE uses all of the keys each time to determine where the row needs to be stored. However,
in actuality, ASE uses the fewest partition keys in sequence until it can determine the appropriate partition.
This rule is documented as:
if key1 < a, then the row is assigned to p1
if key1 = a, then
    if key2 < b or key2 = b, then the row is assigned to p1
if key1 > a or (key1 = a and key2 > b), then
    if key1 < c, then the row is assigned to p2
    if key1 = c, then
        if key2 < d or key2 = d, then the row is assigned to p2
    if key1 > c or (key1 = c and key2 > d), then
        if key1 < e, then the row is assigned to p3
        if key1 = e, then
            if key2 < f or key2 = f, then the row is assigned to p3
            if key2 > f, then the row is not assigned
This can be summarized by the following points:
If value < key1, then current partition
If value = key1, then compare to key2
If value > key1, then check next partition range
To see how this works, let's assume we have a table of 1.2 million customers. Since we want to partition to
aid both parallel query and maintenance, we are going to attempt to partition it by the quarter and then the
customer id. This way, we can archive the data every quarter as necessary. Note, however, that the table
has a month column vs. quarter, but since we know that the quarters end every third month, we try the
following partition scheme (Tip: this example is borrowed from the Sybase IQ sample database telco_facts
table in case you wish to try this).
alter table telco_facts_ptn
partition by range (month_key, customer_key)
(p1 values <= (3, 1055000) on part_01,
p2 values <= (3, 1100000) on part_02,
p3 values <= (6, 1055000) on part_03,
p4 values <= (6, 1100000) on part_04,
p5 values <= (9, 1055000) on part_05,
p6 values <= (9, 1100000) on part_06,
p7 values <= (12, 1055000) on part_07,
p8 values <= (12, 1100000) on part_08)
After the partitioning, we notice that instead of evenly distributing the 1.2 million rows (150,000 rows to
each partition), the odd partitions contain 250,000 rows while the even partitions only contain 50,000 rows.
What happened? The answer is that for months 1 & 2 (Jan, Feb), when ASE compared the data values
to the first key in the first partition, they were less than the key (1 < 3 and 2 < 3); therefore, it immediately put
all the Jan & Feb data into the first partition regardless of the customer_key value. Only when March data
was being entered did ASE note that the values were equal to key1 (3=3) and therefore it needed to
compare the customer_key value with key2. Consequently, the even partitions would only have data in which
the month was equal to the partition key and the customer_key value was greater than the customer_key
value for the partition key before it.
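The skew can be reproduced with a small simulation. The following is a minimal sketch in Python, not ASE code. It assumes a scaled-down, synthetic data set (12 months with 100 hypothetical customer keys each, and boundaries of 50 and 100 standing in for 1055000 and 1100000) and models the documented rule as a left-to-right tuple comparison against each partition's upper bound.

```python
from bisect import bisect_left
from collections import Counter

# Upper bounds (month_key, customer_key) for partitions p1..p8.
# Scaled-down stand-ins for the values in the alter table example (assumption).
BOUNDS = [(3, 50), (3, 100), (6, 50), (6, 100),
          (9, 50), (9, 100), (12, 50), (12, 100)]

def assign_partition(month_key, customer_key):
    """Return the 1-based partition number, or None if the row exceeds every bound."""
    # Tuples compare left to right, so customer_key is consulted only
    # when month_key ties the first key of a bound.
    idx = bisect_left(BOUNDS, (month_key, customer_key))
    return idx + 1 if idx < len(BOUNDS) else None

# Synthetic data: every month has customer keys 1..100 (assumption).
counts = Counter(assign_partition(m, c)
                 for m in range(1, 13)
                 for c in range(1, 101))

print(sorted(counts.items()))
# [(1, 250), (2, 50), (3, 250), (4, 50), (5, 250), (6, 50), (7, 250), (8, 50)]
```

Under these assumptions the odd partitions receive 250 rows each and the even partitions 50, the same 5:1 ratio described above.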
Semantic Partitions & Data Skew
In the earlier releases of ASE, the segment-based round-robin partitioning scheme supported parallel query
capability. However, if the partition skew exceeded a specific ratio, the optimizer would consider the
partitioning too unbalanced to provide effective parallel query support and process the query in serial
fashion. In order to prevent this, DBAs using parallel query often had to monitor their partition skew and
attempt to rebalance it by dropping and recreating the clustered index or other techniques.
With semantic partitioning, data skew is no longer a consideration. Rather than evaluating the depth of the
partition for parallel query optimization, the ASE 15.0 optimizer considers the type of partition, the query
search arguments, etc. In fact, for semantic partitions, due to the very nature of unknown data distribution,
there likely will be data skew, but since a hash, list or range partition signals where the data resides, the
skew is unimportant.
Partitioned Tables & Parallel Query
Partitioned tables allow more effective parallel query than previous releases of ASE. However, in some
cases this could lead to worse performance than expected. For example, in previous releases of ASE, the
number of partitions typically created were fairly small and often within a small multiple (i.e. 2) of the
number of ASE engines. Additionally, the max parallel degree and max scan parallel degree were also
tuned to the number of engines. While the latter may still be the case, in many cases, customers are now
looking at creating hundreds of partitions, largely due to the desire to partition on a date column for
archive/data retention and maintenance operations.
If the partitioning is simply to aid maintenance operations, parallel query can be disabled by setting max
parallel degree and max scan parallel degree to DEFAULT (1). However, at query time, the proper way to
disable it is to use the set option set parallel_query 0.
In early releases of ASE 15.0, in order to perform partition elimination for each thread of the parallel query,
the partition columns were added to the query parameters. As a result, earlier releases of ASE 15 could
have parallel query performance degradation in which queries that had been covered by an index were no
longer covered, because the partition columns were automatically added. For example, a query such
as:
-- assume table is partitioned on a column SaleDate
-- and assume it has an index on StockTicker
Select count(*)
From trades
Where StockTicker='SY'
Would get expanded to:

-- after expansion for parallel query.
Select count(*)
From trades
Where StockTicker='SY'
And SaleDate between <range1> and <range2>
This specific problem was fixed in ESD #2; however, it likely was the cause of the problem for global vs.
local index selection as mentioned above. If you are planning on using the parallel query features within ASE 15.0,
you will need to carefully test each query and compare query plans. This can be done via a technique
mentioned later in this document.
Partitioning Tips
If you are planning on implementing partitioning, the following tips are offered for your consideration.
Best Method for Partitioning
There are three basic methods for creating partitions:
1. Use the alter table command

2. Bcp out the data; drop/recreate the table, bcp the data back in
3. Create a new table as desired; use select/into existing; and rename the old/new tables.
The first two are likely familiar to most readers. The third is a new feature of ASE 15.0 that allows a
query to perform a select/into to an existing table provided that the table doesn't have any indices or
constraints (and that the target table is not the table being selected from). The syntax is identical to a typical
select/into for a #temp table - except that the table must already exist (and its structure must match the query
output columns).
Which one of these methods works best for your situation will depend on hardware capacity, system
activity and the number of rows in the table - as well as whether any of the following are considerations:
You also wish to alter the table to use datarows locking

You wish to add a column - either derived or denormalized - that will be used as the partition
key.
Considering the first point, it is likely that any existing history table is still in allpages locking. Extremely
large tables with 10s-100s of millions of rows are often a source of contention within ASE. Most often
the reason is that these tables were created prior to upgrading to ASE 11.9.2 and the DOL locking
implementation. By then, altering the locking scheme on such large tables often was impractical, not only
because of the table copy step, but also because of the associated index maintenance. However, if you are now
faced with having to drop the indices anyhow in order to partition the table, you may want to take
advantage of the opportunity and also alter the table to datarows or datapages locking. Unfortunately, you cannot
partition and change locking in a single alter table (and thereby copy the table only once) as the alter table
command only allows one operation at a time. However, altering the locking will likely run much faster
than partitioning, as it is a straightforward table copy vs. the partition determination and row movement
that partitioning requires.
To get an idea of how the three methods compare, the following test was run on a small 32-bit dual
processor (single core) Windows host machine using a table of over 168,000,000 rows (19 columns wide).
The machine used a mix of SATA hard drives and U160 SCSI internal disks, 4GB of RAM total (2GB for
ASE).
Figure 3 - Comparison of Partitioning Methods
One interesting item to note is that the two alter tables were considerably faster than the bcp in/out strategy
- which most DBAs would not have suspected. A large part of the reason is that the alter
table lock datarows command only took 13 minutes - considerably shorter than the partitioning phase.
The other advantage of the select/into/existing approach is if you need to add a derived or denormalized
column that will be used for partitioning - for example, adding a column to denormalize a column such as
order_date from the parent orders table to the order_items table to facilitate partitioning and archiving
old orders and order items. Again, this would normally require a third alter table command, followed by an
update. It can be done in the same select/into/existing statement via a standard join or a sql computed
column (using a case statement or other expression).
To further improve performance, partitions can be created using parallel worker threads. In order to work,
parallelism must be enabled at the server configuration level (number of worker threads/max parallel
degree/max scan parallel degree) and parallel sort must be enabled for the database (sp_dboption select
into/bulkcopy/pllsort, true). Enabling the latter will also help with re-creating the indices after the
partition step is complete.
Re-Creating Indexes on Partitioned Tables
In order to partition a table, you will first need to drop all the indices. Not only is this necessary as the
indices themselves will need to be partitioned, but it also is necessary from a performance standpoint as
dynamically maintaining the index leaf nodes during massive page changes will not be required. Since you
have to drop the indexes anyhow, you may wish to:
Re-create them as local/partitioned indexes

Re-create them with larger statistics steps
Re-creating indexes on large tables is a very time consuming task. However, now that the table is
partitioned, worker threads can take advantage of the partitions and be more efficient than a single process.
Consequently, you may find it much faster to re-create the indexes using parallel threads; however, you may
wish to constrain the number of consumers to the lesser of the number of partitions or twice the number of engines
(consumers = min(# of partitions, # engines * 2)).
Partitioning and Relationships
Some of the largest tables in any OLTP system are the event tables that track occurrences of business
transactions - sales, shipments, orders, etc. Often there are one or two other large tables such as customers
or products that are related to these at a higher level. At a detailed level, sales, etc. events are often stored
in one or more tables with a main table representing the header and the details being stored in other
tables. In addition, audit or history tables often contain orders of magnitude more data than their active
counterparts.
As a result, it is best if a common partitioning key can be used between all the tables. This common key
should also collate all the data for a single event to the same partition. This may require altering the
schema to carry the partitioning key to the related tables. For example, a common partitioning scheme
for ease of maintenance is based on a range partition scheme in which each range represents a single day,
week or month. While the sales table contains the date and therefore is easily partitioned, the sales
details tables may not. Unfortunately, these tables are often much larger than the parent as the number of
child rows is an order of magnitude higher. Consequently, the tables that would most benefit from
partitioning are excluded from early attempts. However, if the sales date was carried to each of the child
tables containing the sales details, these could be partitioned as well.
One of the biggest reasons for doing this (besides ease of maintenance) is that if you are using parallel
query, by partitioning on the same partition keys, the likelihood is that a parallel query can use a vector join
instead of the traditional M*N explosion of worker threads required.
As a result, before implementing partitioned columns, you may wish to carefully review the partition keys
on related tables. If you need to carry a partition key from a parent to a child table, an application change
may be required to ensure that this key's value is kept in sync. As with any application change, this may
take a while to implement and test.
Which Partitioning Scheme to Use
There are several aspects to determining which partitioning scheme is the best choice. You should start by
asking yourself the following questions:
Is the primary goal of partitioning the data to:
a) Reduce the maintenance time for update stats, reorgs, etc.?

b) Improve parallel query optimization?
c) Improve insert/point query performance?
Is the desired partition data field monotonically increasing such as transaction id, transaction
date, etc. and is there a need for range scans within this increasing field?
Are there natural data boundaries (i.e. weeks/months) that can be used to determine the boundaries
for data partitions?
What is the order of magnitude of the number of discrete values for the partitioning columns (i.e.
10s, 100s, etc.)? Especially the leading column of the list of partition columns.
The first point is extremely important. If partitioning for maintenance, it is more than likely that you will
be using range partitioning on a date column. While other partitioning schemes can be used, they generally
will require maintenance, such as update statistics, to be run periodically. With range partitioning, much of
the maintenance on most of the data can be eliminated.
The second point is interesting from a different perspective. First, if attempting to evenly distribute the
data, a hash partitioning scheme is the apparent choice. However, similar to the previous example, a
monotonic sequence also becomes a natural fit for range partitioning when partitioning for maintenance. If
you think about it, date datatypes are a special form of a monotonic sequence in that they are internally
represented as a numeric offset from a particular reference date. As the natural sequence
of the date increases, the internal values increase in order as well. Order numbers, transaction ids, trade
numbers, etc. are similar; consequently, both date and sequential numbers may use hash or range
partitioning depending on the goal.
The last two points are focused more on how evenly distributable the data will be across the partitions.
Unlike indexes where it is highly recommended to put the most distinct column first (a legacy
recommendation that is no longer necessary if you use update index statistics instead of update statistics),
the first column of a partition key sequence does not need to be the most unique. However, if it is very low
cardinality (especially when less than the number of partitions), you may want to order the partition
sequence differently. Keep in mind that the access method will still be via index order - the order of the
partition columns is independent.
The following sections will describe each of the partition schemes and which ones are best for particular
situations.
Range Partitions
Range partitions will most likely be the most common form of partitioning, particularly for those wishing
to partition on date ranges for maintenance purposes. While range partitions are good choices for 10s to
low-mid 100s of partitions, the algorithm for finding which partition to use is a binary search. From early
computer science classes, we understand that a binary search needs at most log2(n) splits -
meaning that the most iterations used to find the correct value will be log(n)/log(2), rounded up, where n is
the number of elements and log() is the normal base 10 logarithmic function. For 1,000 values, this max
would be 10 splits while for 100 it would be 7. Put another way, retaining 30 days of data online using a
range partition would result in 5 splits for the partition search, while retaining 3 years (1095 days) would
result in 11 splits - roughly doubling the partition search time. Consequently, for really large numbers of partitions
(1,000s or more), the search time to find the desired partition may adversely affect insert speeds.
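The split counts quoted above can be checked with a few lines of Python. This is only an arithmetic sketch of the ceil(log2(n)) bound, not a model of the ASE internals. The function name is a hypothetical one chosen for illustration.

```python
def max_binary_search_splits(n):
    """Worst-case number of halvings needed to isolate one of n partitions.

    Equivalent to ceil(log2(n)), computed with integers to avoid
    floating point rounding.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()

for n in (30, 100, 1000, 1095):
    print(n, max_binary_search_splits(n))
# 30 5
# 100 7
# 1000 10
# 1095 11
```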
From a query optimization perspective, range partitions are also best used if range scans (i.e. column
between <value> and <value>) or unbounded ranges (i.e. column > value) are frequently used, particularly
when within a single range. However, this points out a situation in which the granularity of the partitioning
vs. the query performance could need balancing. If extremely narrow partition values are used (i.e. one
partition per day), queries with bounded or unbounded ranges that span more than one day could perform
slower than when a wider partition granularity (week) is used. As an extension of this, the server sort order
could have an impact as well. For example, a server using nocase sort order and using a range
partition on a character attribute such as name will have a different data distribution than a partitioning
scheme in a case sensitive sort order system. This does not imply that it would be better to change to a case
insensitive sort order if your range partitioning scheme involves character data, since case insensitive
comparisons take longer than case sensitive comparisons. However, it may mean that the range specification
needs careful consideration - for example, the expression col <= 'a' could have substantially different
meanings and result in completely different data distributions.
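To make the sort order point concrete, the sketch below evaluates the same bound, col <= 'a', under two comparison rules. It is an illustration in Python rather than ASE: code point comparison stands in for a case sensitive (binary) sort order and case-folded comparison stands in for a nocase sort order. The sample names are hypothetical.

```python
names = ["Zebra", "apple", "Baker", "a", "Adams"]

# Code point comparison: every upper case letter sorts before 'a',
# so almost all of the sample values satisfy the bound.
binary_match = [n for n in names if n <= "a"]

# Case-folded comparison: only values that sort at or before 'a'
# regardless of case satisfy the bound.
nocase_match = [n for n in names if n.casefold() <= "a".casefold()]

print(binary_match)   # ['Zebra', 'Baker', 'a', 'Adams']
print(nocase_match)   # ['a']
```

The same partition bound therefore captures four of the five sample rows under one rule and only one under the other.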
Range partitions are also likely the best choice for table maintenance if the intention is to partition on a date
column. However, there are some challenges to this. First, most existing systems have such dates
implemented as datetime; consequently, the partition specification may have to include an end of day time
such as 11:59:59pm. Secondly, unless the date column is part of the primary key, you may have to use a
global index to enforce primary key uniqueness. This global index may have an impact on some of the
maintenance functions - in particular, reorg commands for that index may not have the same benefits. Range
partitions have other benefits such as being able to drop a partition (ASE 15.0.1).
In summary, range partitions should be considered when:
The number of partitions will likely be less than 500.

Range scans are common on the partition key attributes.
The ability to add/drop partitions is important.
Reduction of time for data maintenance activities such as update statistics, reorg or other
activities is desired.
Partition elimination and parallel query performance is desirable.
Composite partition columns are required (vs. list partitioning).
Hash Partition
Hash partitioning is not a substitute for range partitioning. If the number of partitions for a range partition
starts to exceed several hundred, rather than attempting to use a hash partition instead, you should
repartition with broader ranges to reduce the number of partitions. The primary reason for this suggestion
is that the data distribution will be different for hash partitioning and some queries (such as range scans)
may perform worse when using hash partitions.
While it is true that hash partitions work the best when the number of partitions is large (1000s), they
also work very well at lower numbers (i.e. 10s) of partitions. One reason is that rather than a
binary search, the correct partition for a datarow is simply determined from the hash function - a fairly
fixed cost whether 10 partitions or 1000.
Hash partitioning is best when an even distribution of values is desired to best achieve insert performance
and performance of point queries (in which only a single row is returned). However, partition elimination
for hash partitioning can only be achieved for queries using equality predicates vs. range scans. Additionally,
if the number of partitions is small and the domain of the datatype used allows a wide range of values
(such as an int datatype or a varchar(30)), an even distribution of values may not occur as desired, as the
partition assignments are based on all the possible values for the domain vs. the actual number of discrete
values.
From a maintenance perspective, unless the hashed key is a date column, maintenance commands may have
to be run against the entire table as new data rows are scattered throughout the various partitions.
Additionally, a hash partition cannot be dropped, or else it would create a hole for future data rows
containing those hash values. Addition or deletion of partitions requires re-distribution of the data.
Good candidates for hash partition keys are unbounded data values such as sequential transaction ids,
social security numbers, customer ids, product ids - essentially any data column that is most often
accessed using an equality parameter and is most often a primary key or alternate primary key.
In summary, hash partitioning should be considered when:
The number of partitions exceeds practicality of list partitioning

Write-balancing/even data distribution is important for IO loading
Partitioning attributes are fairly discrete such as primary keys, pseudo keys or other distinct
elements. In fact, the cardinality of the data values should be in the same order of magnitude
as the domain of the datatypes (to ensure even data distribution).
Partition elimination and parallel query performance is desirable.
Range scans are rarely issued on the partition key attributes.
Reduction of total time on data maintenance operations such as update statistics, reorgs, etc. is
not a consideration (as these operations will still need to be performed on all partitions).
Composite partition columns are desired.
List Partition
List partitioning works best for small numbers of discrete values (10's), as list partitioning lookups are more
expensive compared to other partition types due to needing to search through the list of the elements as well as
increased memory consumption to hold the partition key values. Consequently, if range partitioning is an
option for larger quantities (above 30-40), then range partitioning should take precedence over list
partitioning if performance is a consideration. This cannot always be achieved.
Let's take the scenario that we wish to balance the I/O load of a customer enquiry system by partitioning by
state. If we look at North America, we have roughly 50 states in the US, 12-13 for Canada, slightly more
than 30 for Mexico, plus territories (Caribbean and Pacific Islands). The total number of states and
provinces is roughly 100. The natural tendency would be to use a list partition scheme to precisely
locate each state/province within its distinct partition. However, when rows are inserted, in order to find
the correct partition, the server will need to scan each of the list partition key values sequentially to
determine the correct partition. While this may be quick for states such as Alaska, Alabama, Alberta, etc.,
it may be a bit longer for states such as Wyoming. Even for states in the middle - Maryland, New York,
etc. - the number of comparisons would be at least 20 or so. By comparison, a binary search on 100 items
will locate the correct partition within 7 comparisons (remember from programming 101 - the
maximum number of iterations for a binary search is log2(n), rounded up). This means that for the more
populous state of California, either method is likely to perform about the same, while other states with large
metropolitan areas such as New York, Illinois or Ontario would benefit. To achieve this, the range partitioning
scheme would need to use distinct values and be listed in alphabetical order such as:
Partition by range (state_code) (
P1 values <= ('AB'), -- Alberta
P2 values <= ('AK'), -- Alaska
P3 values <= ('AL'), -- Alabama
P4 values <= ('AR'), -- Arkansas
P5 values <= ('BC'), -- British Columbia
P6 values <= ('CA'), -- California

)
As you can see, this is a bit different as the states/provinces are intermixed vs. the normal segmentation
by country.
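The difference between the two lookups can be sketched by counting comparisons. The Python below is an illustrative model only: it assumes the sequential scan described above for list partitions, uses a textbook binary search for range partitions, and substitutes 100 synthetic two-letter codes for the real state and province codes.

```python
def linear_probes(keys, target):
    """Comparisons made scanning a list of partition keys front to back."""
    for probes, key in enumerate(keys, start=1):
        if key == target:
            return probes
    raise KeyError(target)

def binary_probes(bounds, target):
    """Comparisons made by a binary search for the first bound >= target."""
    lo, hi, probes = 0, len(bounds), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if bounds[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return probes

# 100 synthetic codes in alphabetical order (hypothetical stand-ins).
codes = sorted(a + b for a in "ABCDEFGHIJ" for b in "ABCDEFGHIJ")

worst_linear = max(linear_probes(codes, c) for c in codes)
worst_binary = max(binary_probes(codes, c) for c in codes)
print(worst_linear, worst_binary)   # 100 7
```

Values near the front of the list cost about the same either way. The gap grows for values in the middle and at the end of the list.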
However, a list partitioning scheme may well work if we consider how the data is accessed. For example,
let's consider a sales system. Most corporations align their sales territories along geographic boundaries -
most often using political boundaries as the lines of demarcation due to differences in legal considerations
(think banking laws, tax requirements, etc). Most often, these are then grouped into regions - and more
importantly, there are some interesting data access patterns of note:
Regional management will often be performing reports across the various states within their
region.
Access times for the different regions will be offset due to the differences in local time.
As a result, a list partitioning scheme that might well work even better than a range partition is a list
partitioning based on regions. For example:
Partition by list (state_code) (
NewEngland values ('ME','MA','NH','RI','VT','CT','NY'),
MidAtlantic values ('PA','NJ','MD','DE','VA','WV'),
SouthEast,
GreatLakes,
GulfCoast,
GreatPlains
NorthWest
SouthWest
Maritimes
EasternCanada
WesternCanada

)
This has some definite advantages for query performance - and it also is easier for DBA maintenance as
update statistics, reorgs, etc. can start on the different partitions earlier. While the latter is also true of
individual partitions, the above approach is likely less error prone.
Of special consideration for list and range partitions is the handling of NULL values in partition keys.
While list partitions can specify a NULL value, range partitions cannot. For example:
Partition by list (
NullPtn values (NULL),
Ptn_1 values (),

)
Is legal, while:
Partition by range (
NullPtn values <= NULL,
Ptn_1 values <= <value>,

)
Is not. To get around this with range partitions, the trick is in realizing that ASE considers NULL to be less
than every constant value, consequently:
Partition by range (
NullPtn values <= .,
Ptn_1 values <= <value>,

)
Is legitimate. In summary, list partitions should be used when:
The number of partitions is less than 30 - preferably a dozen or fewer.

Composite partition keys are not required
Partition key values are discrete and data values are expected to be fairly well distributed
within the partition
Search arguments on the partition column predominantly use equality or in lists vs. range
scans.
Data values can be grouped within a partition when the collating sequence would not
support such grouping (think back to the state example) - particularly for query performance.
Round Robin
The segment/slice based partitioning scheme available since ASE 11.0 is still available in ASE 15.0, but it
has been named as round-robin to differentiate it from other partitioning implementations. Unlike the
others, round-robin partitioning is available without separate licensing from Sybase.
The primary goal of round-robin partitioning when it was introduced was to eliminate last page contention
in allpages locked tables (APL) and eliminate the I/O bottleneck of single/serial device access during
querying a table. By eliminating last page contention, high insert environments could have evenly
distributed inserts across the partitions. While this may appear similar to hash partitioning, there are a
number of differences which we will discuss in a minute. However, the main driver for round-robin
partitioning was eliminated when DOL locking was introduced in ASE 11.9.2, which resolved much of the page
contention issue and did a much better job, as partitioned tables in 11.0 and 11.5 still suffered index
contention. Round-robin partitions still provided some insert speed improvements - especially for bulk
operations such as bcp. However, compared to hash partitioning as an insert/write-balancing mechanism,
round-robin partitioning does have some noted differences.
The partition determination is based on a round-robin of the user's session - low concurrency
of user sessions (less than number of partitions) results in unbalanced partitions.
Because the partition determination is based on user session vs. data semantics, two users
inserting the identical values for the partition keys could save their data to different partitions.
As a result, the optimizer cannot perform partition elimination as a query optimization
technique, resulting in higher I/O and cpu costs.
Local indexes are not possible in round-robin partitioning for ASE 15.0. As a result, any
advantages in index tree height for insert activity as compared to hash partitioning are lost.
One advantage of round-robin partitions is that an update of a data value does not cause the
row to change partitions - an expensive update operation.
Another advantage is that with hash partitioning, two different data values could result in the
same hash key and force the writes to the same partition.
Altering the datatype of the hash partition key may result in significant re-distribution of data.
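The first two differences can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not a description of the ASE internals: it binds each session to the next partition in turn on its first insert, and it uses key modulo partition count as a hypothetical hash function.

```python
from itertools import cycle

NUM_PARTITIONS = 8

def round_robin_load(inserts):
    """Place (session_id, key) inserts by session, ignoring the key value."""
    next_partition = cycle(range(NUM_PARTITIONS))
    session_partition = {}
    placement = []
    for session_id, key in inserts:
        if session_id not in session_partition:
            # Assumed model: a session keeps the partition it is first given.
            session_partition[session_id] = next(next_partition)
        placement.append((key, session_partition[session_id]))
    return placement

def hash_load(inserts):
    """Place inserts by key value using a hypothetical modulo hash."""
    return [(key, key % NUM_PARTITIONS) for _, key in inserts]

# Two sessions each insert the same 100 key values.
inserts = [(s, k) for k in range(100) for s in ("s1", "s2")]
rr = round_robin_load(inserts)
hp = hash_load(inserts)

rr_partitions_used = len({p for _, p in rr})          # only 2 of 8 partitions
hp_partitions_used = len({p for _, p in hp})          # all 8 partitions
rr_homes_for_key_42 = {p for k, p in rr if k == 42}   # two different partitions
hp_homes_for_key_42 = {p for k, p in hp if k == 42}   # exactly one partition
```

With fewer sessions than partitions the round-robin model leaves most partitions empty, and the same key value ends up in two places, which is why partition elimination is not possible for it.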
Round robin partitions should be used when the following conditions are true:
Parallel query performance is not a high consideration.

Ease of data maintenance is not a high consideration
Maximum write-balancing is desired among low numbers of concurrent sessions
Dropping Partitions
In the GA release of ASE 15.0, dropping a partition was not supported; however, it was added in ASE
15.0.1 (available Q3 2006) for range and list partitions only. Prior to 15.0.1, in order to remove a partition,
the table had to be repartitioned using the same partitioning scheme, but without the undesired partition.
As a work-around, you can truncate all of the data in a partition, and with no realistic limit on the number
of partitions, this has the same effect.
However, this is not as simple as a task as it would seem. Along with dropping a partition, the main reason
you would be dropping a partition is that the partitioned data has been archived. However, this would
imply a need to add a partition later when un-archived in-between existing partitions vs. strictly at the end
(range and list partitions allow partitions to be added to the existing scheme, however, for range partitions,
the key value must be higher than previous partitions effectively adding the partition to the end).
Alternatively, DBAs might try to use drop/add partition to fix minor partitioning scheme problems without
re-partitioning the entire table. Understanding this, now consider the following scenario. Assume we have
created a range partitioned table on partition keys of 10, 20, 30, 40, 50, etc. Now, let's assume that we archive
the data in the first partition (<=10) and subsequently drop the partition - leaving us with partition ranges
of 20, 30, 40, 50, etc. Some time later, a user enters perfectly valid business data with a partition key value
of 5. Given the mechanics of range partitioning, the newly inserted data goes into the first partition (<=20).
At this point there is no problem. Now, due to a need to unarchive some of the data or just to rebalance the
partitioning as more data is added, a new partition is added with the same partition key as the original
(<=10). Immediately, the problems begin. The previously inserted data (i.e. 5) may be left stranded, as
local indexes, etc. would all point to the new first partition only. The only reasonable solution is to
relocate the rows with key value 5 when the new partition is added - effectively turning the add partition into a
repartitioning as data is relocated. This problem does not occur currently in ASE 15.0.1, as partitions
can only be added at the end of a range, where the assumption is that no data would need to be
migrated.
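The stranded-row scenario above can be illustrated with a small model. This is only a sketch of the "values <= N" routing rule as described in the text, not of ASE internals; the function name route and the variable names are hypothetical.

```python
import bisect

def route(bounds, key):
    """Return the upper bound of the range partition that would hold key.

    bounds is a sorted list of 'values <= N' upper bounds. This models only
    the routing rule described in the text, not how ASE stores the data.
    """
    i = bisect.bisect_left(bounds, key)
    if i == len(bounds):
        raise ValueError(f"key {key} is above the highest partition bound")
    return bounds[i]

original = [10, 20, 30, 40, 50]
after_drop = [20, 30, 40, 50]        # first partition archived and dropped
after_readd = [10, 20, 30, 40, 50]   # hypothetical re-add of the <=10 partition

stored_in = route(after_drop, 5)     # partition that receives the row while <=10 is gone
expected_in = route(after_readd, 5)  # partition a later lookup would go to

print(stored_in, expected_in)
```

In this model the row with key 5 is stored under the <=20 bound but would later be looked for under the <=10 bound, which is the mismatch that forces data relocation.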
Additionally, what is the impact of dropping a hash partition? The intent of a drop partition generally is to
remove all data with it. Again, the data removal isn't where the problem lies - the issue is what happens
when new data is inserted. In this case, consider a hash partition on an integer column using 10 partitions.
Dropping one of them removes the hash bucket for 1/10th of the possible data values. Let's assume that the
particular hash bucket removed held the hash keys for integers {5, 32, 41, etc.}. Now, a user inserts a value
of 32. What should happen? Should the hashing algorithm change to reflect the full domain across
the remaining 9 partitions? If this is the case, then the purpose of the drop partition is just to redistribute
the data instead of removing it (again, a repartition). Or perhaps the value should be rejected, as the
hash bucket no longer exists (much like attempting to insert an unlisted value in a list partition)? This
ambiguity is why dropping hash partitions is currently not supported in ASE
15.0.1.
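To see why re-hashing across the remaining partitions amounts to a repartition, consider the following sketch. It assumes a plain modulo as a stand-in for the hash function; the actual ASE hashing algorithm is not described in this paper, so bucket is purely hypothetical.

```python
def bucket(value, partitions):
    # Hypothetical stand-in for a hash function: plain modulo on the integer.
    return value % partitions

values = range(1000)

# Values whose bucket changes if the same function is applied over 9
# partitions instead of 10.
moved = [v for v in values if bucket(v, 10) != bucket(v, 9)]
moved_fraction = len(moved) / len(values)

print(f"{moved_fraction:.1%} of the values would change partition")
```

Under this assumption roughly nine values in ten land in a different bucket, so "dropping" one hash partition would in effect redistribute nearly the whole table.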
Dropping a list partition is fairly straightforward - with one little gotcha. If you drop a list partition and
then someone attempts to insert a value that was in that partition, they will get an error. Consequently, care
should be taken when dropping a list partition to make sure that the application will not break as a result.
However, a round-robin partition has an even worse problem than hash partitions. Consequently dropping
round-robin partitions is also not supported. As you can see, dropping a partition is a feature that goes
beyond simply removing the partition and all its data. As a work-around, simply truncating a partition
(truncate table command in ASE 15.0 now supports truncating just a single partition) may be a more usable
approach. However, even this is not without its issues. If using parallel query, the optimizer considers all
the partitions as part of the worker thread costing and scanning operations (the latter if partition elimination
is not available). As a result, the following guidance is provided regarding drop partition:
Dropping a list partition is fairly low risk and doesn't need much consideration - other than
ensuring that the application will not attempt to insert a value that would have gone into the
old partition.
Dropping a range partition is not advisable if intending on unarchiving data that used to be in
that partition into the existing schema. If the data may be unarchived into a different schema
for analysis, dropping a partition may still be an option.
Dropping partitions may be recommended when parallel query is enabled.
If you anticipate dropping a partition, you must either use a range or list partition. If you
suspect that you may need to re-add the partition after dropping it, a list partition is best -
assuming a small number of partition key values.
Dropping a partition will cause global indexes to be re-built. Unlike a local/partitioned index
in which the index partition can simply be removed, it is likely that the partition's data is
scattered throughout the index b-tree - including intermediate node values. As a result, you
may find it faster to drop global indexes prior to dropping a partition and recreate them
manually later.
Creating a Rolling Partition Scheme
One of the primary drivers for partitioning from a Sybase customer perspective is to ease the administrative
burden of very large tables during maintenance operations. For example, running update statistics, reorgs,
as well as data purge/archive operations on a 500 million row table can be extremely time consuming and
in many cases prohibitive until the weekend. As a result, many Sybase customers are looking at
implementing a rolling partition scheme in which each day's data is partitioned separately. This has some
immediate benefits:
Since most of the older data is fairly static, there is no need to run update statistics or reorg on
those partitions. In fact, these operations may only have to be run on the last week's
partitions - greatly reducing table maintenance time.
Data purge/archive operations become simplified - rather than a large delete that takes a long
time to run (and locks the entire table) or iterative batches of 1,000 deletes (or so) to avoid a
table lock, older partitions can simply be dropped.
The partition granularity will likely depend on the data retention policy. As discussed earlier in the
partitioning tips, if the data retention policy is 5 years online, having one partition per day is not likely very
efficient as this would require 1,825 partitions. As a result, the first decision will be how granular to make
the partitioning scheme. If keeping 5 years, possibly 1 partition per week (260 partitions) would be
advisable.
Regardless, the method for creating a rolling partition scheme as of ASE 15.0.1 is as follows:
The table must contain a date field (although any sequentially increasing column is usable).
The range partition is created using the full date boundary for day, week or month that was
decided for the granularity. Only enough partitions are created to support the retention policy
plus a few in advance.
With each purge cycle, the next set of partitions are created while the older ones are dropped.
For example, if using a rolling partition on day, but purging weekly, every week, 7 new
partitions will need to be created and the 7 old partitions dropped.
The number of partitions should be at least one purge cycle in advance. For example, if
partitioning by day/purging by the week, you should have 37 partitions initially - to ensure
rollover conditions during processing don't cause application errors if the partition isn't
available.
The partition name should be able to be autogenerated to facilitate scripting.
The earlier table example was a good example of a rolling partition based on a single day:
create table trade_detail (
trade_id bigint not null,
trade_date datetime not null,
customer_id bigint not null,
symbol char(5) not null,
shares bigint not null,
price money not null
)
partition by range (trade_date)
(
Jan01 values <= ('Jan 1 2006 11:59:59pm'),
Jan02 values <= ('Jan 2 2006 11:59:59pm'),
...
)
Note that the partition naming scheme also provides a clue about what data is contained in it - but more
importantly can be autogenerated so that purge scripts and maintenance commands can use the getdate()
function and derive which partitions should be currently active (and need update statistics run) or which
ones should be the ones to be dropped - and what partition should be added.
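As a rough illustration of how such names could be autogenerated by a maintenance script, the sketch below derives partition names and boundary literals from dates, and lists which daily partitions a weekly purge cycle would drop and add. The helper names (partition_name, boundary_literal, purge_cycle) are hypothetical, and the script only builds strings; it does not issue any ASE commands.

```python
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def partition_name(day):
    """Name in the style used above, e.g. Jan01 for January 1."""
    return f"{MONTHS[day.month - 1]}{day.day:02d}"

def boundary_literal(day):
    """Upper boundary for the day's partition, e.g. 'Jan 1 2006 11:59:59pm'."""
    return f"{MONTHS[day.month - 1]} {day.day} {day.year} 11:59:59pm"

def purge_cycle(oldest, newest, cycle_days):
    """Return (names to drop, names to add) for one purge cycle.

    oldest and newest are the first and last days that currently have a
    partition; cycle_days is the number of days purged per cycle.
    """
    to_drop = [partition_name(oldest + timedelta(days=i))
               for i in range(cycle_days)]
    to_add = [partition_name(newest + timedelta(days=i + 1))
              for i in range(cycle_days)]
    return to_drop, to_add

# 37 daily partitions: Jan 1 2006 through Feb 6 2006, purged weekly.
drop_names, add_names = purge_cycle(date(2006, 1, 1), date(2006, 2, 6), 7)
print(drop_names)
print(add_names)
```

One caveat of this naming style is that the names carry no year, so a retention period longer than a year would need a longer name (for example one that includes the year) to avoid collisions.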
Partitioning FAQ
The following questions have been common questions asked of Technical Support staff since ASE 15.0
went GA. This list is not meant to be all inclusive.
How do I move data in/out from individual partitions?
Currently, Sybase does not yet provide a partition merge/split feature comparable to Oracle's partition
exchange capability - although such a feature is high on the priority list for a future release. Additionally,
you cannot use a select statement to retrieve data out of a single partition by using the partition name -
however, you can use bcp to extract data from the partition, and you can select data out of a partition if you
specify a where clause that includes only that partition's key values. The bcp utility has also been enhanced
to support directly loading data into a partition - both of these commands (bcp in/out from a partition) are
expected to be used during data archival operations.
Another way to move data between partitions is to simply change the partition key values. If you update
the column used in a partition key to a value that is in a different partition, the row is physically relocated
to the other partition. This is achieved via a deferred update, in which the existing row is deleted from the
current partition, and inserted into the new partition. For example, if a table is partitioned by a "state"
column, updating the column by changing it from 'NY' to 'CA' would likely cause the partition to change.
What is recorded in the ASE transaction log is a deferred update consisting of the removal of the 'NY' row
and the insertion of the 'CA' row - the same as any other deferred update operation.
What happens if I change the value in the column used to partition?
If you update the column used in a partition key to a value that is in a different partition, the row is
physically relocated to the other partition. This is achieved via a deferred update, in which the existing row
is deleted from the current partition, and inserted into the new partition.
Can I select the data from a single partition?
Not directly. Using the partition key values in a where clause is the only supported method of doing this
today. However, you can bcp out from a specific partition which gives you the same effect.
Which partition schemes support composite partition keys?
Both range and hash partitioning support multiple partition keys.
Query Processing Changes
The ASE 15.0 documentation includes a whole new book just on Query Processing. Consequently, this
section is intended only to highlight features that might be useful during migration.
Query Processing Change Highlights
Adaptive Server introduced a new optimizer with release 15.0. The focus of the new optimizer was to
improve the performance of DSS style complex queries. Performance of these queries will likely improve
dramatically. However, OLTP DML statements and simple queries may not see any change in
performance - unless a new feature is used that allows an index to be used where before it couldn't (e.g.
a function-based index). Regardless, you should test all applications for the following issues before you
use the server in production:
Because of changes in the parser, some queries may return a general syntax error (message
102) instead of "Syntax error at line #" (message 156).
The maximum number of worktables per query increased from 14 to 46.
In the past, some sites used trace flag 291 to improve performance of joins between columns of
different datatypes. Continuing to use it with ASE 15.0 could result in wrong answers. ASE 15.0 has
improved algorithms to take care of joins between compatible but different data types. ASE
15.0 will use the appropriate indexes even if the SARGs are of different datatypes.
Most of these should be transparent from an application perspective. Some that may not be so transparent
are mentioned below.
Group By without Order By
One of the changes as a result of the new sorting methods is that a group by clause without an order by
clause may return the results in a different order than in previous releases.
Pre-ASE 15
Prior to ASE 15.0, queries in ASE that used a group by clause without an order by clause would return
the results sorted in ascending order by the group by columns. This was not deliberate, but rather was an
artifact that was the result of the grouping operation. Prior to ASE 15.0, group bys were processed by one
of two methods:
Creating a work table and sorting the work table via a clustered index to determine the vector
aggregates
Access in index order if the group by columns were covered by an index
Either way, the net effect was that the result set was returned sorted in grouped order if an order by clause
was not present. This is not in accordance with a strict interpretation of the ANSI SQL standard, as ANSI
dictates that the result set order can only be influenced by an order by clause.
The problem, of course, is that some developers were unaware that this behavior was the result of
unintentional behavior and opted not to use the order by clause when the desired result set ordering
matched the group by clause.
ASE-15.0
In ASE 15.0, new in-memory sorting/grouping methods (e.g. hash) do not physically sort the data during
the grouping process. As a result, this does not generate a sorted result set due to implicit sort operations.
This problem can be a bit difficult to spot as optimizations that still choose index access order or a sorted
work table will still return the rows in sorted order - however, this can vary within the same query
dependent upon the number of rows estimated to be in the result. Consequently, one time the problem may
be apparent and in other executions it may not be. In order to generate a sorted result set, queries will have
to be changed to include an order by clause.
Sybase is considering an enhancement in a future release that will revert the behavior (group by implicitly
orders the result). In ASE 15.0 ESD #2, traceflag 450 makes group by use the classic (non-hashed) sort
method, thus making the result set order predictable again - but at the possible cost of slower performance.
In a later release, a new optimization criteria language command 'set group_inserting {0|1}' is being
considered which will let you control this at the session level - and especially via login triggers - without
requiring trace flags.
Literal Parameterization and Statement Cache
One of the enhancements in CT-Lib in 10.x that dbLib does not support is the notion of dynamic SQL or
fully prepared SQL statements. When using fully prepared SQL statements, the statement with any
constant literals removed, would be compiled and optimized and stored in the ASE memory as a dynamic
procedure. Subsequent executions would simply execute the procedure by calling the statement by the
statement id and supplying the parameters for the various values. This implementation is also supported in
ODBC as well as JDBC, although JDBC requires the connection property DYNAMIC_PREPARE to be set
to true (default is false). For high volume OLTP applications, this technique has the fastest execution -
resulting in application performance improvements of 2-3x for C code and up to 10x for JDBC applications.
In ASE 12.5.2, Sybase implemented a statement cache which was designed to try to bring the same
performance advantage of not re-optimizing repetitive language statements issued by the same user.
However, one difference was that when the statement cache was introduced, the literal constants in the
query were hashed into the MD5 hashkey along with the table names, columns, etc. For example, the
following queries would result in two different hashkeys being created:
-- query #1
select *
from authors
where au_lname='Ringer'
-- query #2
select *
from authors
where au_lname='Greene'
The problem with this approach was that statements executed identically with only a change in the literal
values would still incur the expense of optimization. For example, updates issued against different rows in
the same table would be optimized over and over. As a result, middle tier systems that did not use the
DYNAMIC_PREPARE connection property and batch jobs that relied on SQL language vs. procedure
execution did not benefit from the statement cache nearly as much as they could have.
In ASE 15.0.1, a new configuration option "enable literal autoparam" was introduced. When enabled, the
constant literals in query texts will be replaced with variables prior to hashing. While this may result in a
performance gain for some applications, there are a few considerations to keep in mind:
Just like stored procedure parameters, queries using a range of values may get a bad query
plan. For example, a query with a where clause specifying where date_col between
<date 1> and <date 2>.
Techniques for finding/resolving bad queries may not be able to strictly join on the hashkey, as
the hashkey may represent multiple different query arguments.
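The effect of replacing literals before hashing can be sketched as follows. This is a simplified model only: the regular expression, the @p placeholder and the hashkey function are assumptions made for illustration, and ASE's actual normalization and hash inputs (which also include session settings) are not reproduced here.

```python
import hashlib
import re

# Hypothetical literal matcher: single-quoted strings or bare numbers.
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")

def hashkey(sql, autoparam):
    """Return an MD5 hex digest of the statement text.

    With autoparam=True, literals are replaced by a placeholder first, so
    statements that differ only in their constants hash to the same key.
    """
    text = " ".join(sql.split())          # collapse whitespace
    if autoparam:
        text = LITERAL.sub("@p", text)
    return hashlib.md5(text.encode("utf-8")).hexdigest()

q1 = "select * from authors where au_lname='Ringer'"
q2 = "select * from authors where au_lname='Greene'"

print(hashkey(q1, False) == hashkey(q2, False))  # literals are part of the key
print(hashkey(q1, True) == hashkey(q2, True))    # literals replaced before hashing
```

The same property explains the second consideration: once both statements share a key, the key alone no longer identifies which argument values were used.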
Note the following considerations about the statement cache:
Since we are creating a dynamic procedure, this counts as an open object. You will have to
increase the configuration "number of open objects" accordingly.
The following must match exactly: login, user_id(), db_id() and session state (see next bullet).
A number of set options affect the MD5 hash, including: forceplan, jtc, parallel_degree,
prefetch, quoted_identifier, sort_merge, table count, transaction isolation level, chained
(transaction mode). Using these options will cause a different hashkey and a new statement to
be cached vs. reusing an existing cached query plan.
Only impacts selects, updates, deletes, insert/selects. Does not affect:
    Insert values() - no where clause plus literals (planned to be added in 15.0.2)
    Dynamic SQL - query plan is already cached, so no need.
    Stored procedures - as above, query plan(s) are already cached, so no need.
    Statements within procedural constructs (if...else, etc.)
Statement cache is disabled if abstract plan dump/load is active.
The key is that for applications that have a high degree of repetitive statements, literal parameterization and
the statement cache could have a significant performance boost. This is also true for batch operations such
as purge routines that delete data. Without literal parameterization, caching an atomic delete operation was
futile, and an app server that was front-ending for a lot of users with different parameters also didn't get the
benefit - and the statement cache was simply getting thrashed as new statements would get added as old
ones got pushed out.
Even with literal parameterization, the number of distinct login names could affect the statement cache
sizing/turn-over. As noted above, one login will not use the cached query plan of another - and multiple
logins will incur a high number of cached queries for the same SQL text. This has its good and bad points -
on the good side, the range problem (between date1 and date2) would only affect one login - unlike a
procedure optimization which affects all logins.
Set tablecount deprecated/Increased Optimization Time
Adaptive Server release 15.0 increases the amount of time for query compilation because the query
processing engine looks for more ways to optimize the query. However, there is a new timeout mechanism
in the search engine that can reduce the optimization time if there is a cheap plan. In addition, more
sophisticated cost based pruning methods are implemented. As a result, the option "set tablecount" is
obsolete in 15.0. While the statement still executes, it has no effect on query timeout.
It should be noted that set tablecount was in effect a query optimization timeout mechanism. At the
default value of 4, the optimizer could consider a maximum of 4! (4 factorial) or 24 possible join
permutations (less non-joined possibilities). When set higher, the optimizer could consider increasing
possibilities - which would take longer to optimize. For example, a customer using ASE 11.9.3 with a
12-way join using a default tablecount would seemingly never return. Set to 12, however, the query
optimization would take 10 minutes and the execution would take 30 seconds. Analyzing trace 302 and
other outputs revealed that the optimizer found the best query plan well within the first minute of
optimization - the subsequent 9 plus minutes were spent exhausting all the other possible permutations
(12! is roughly 479 million, less non-intersecting joins).
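The permutation counts quoted above are simply factorials of the number of tables considered at once, which a few lines of arithmetic confirm. The variable names are illustrative only.

```python
import math

# Upper bound on join orders when n tables are considered at a time: n!
default_orders = math.factorial(4)       # set tablecount default of 4
twelve_way_orders = math.factorial(12)   # 12-way join with tablecount 12

print(default_orders)       # 24
print(twelve_way_orders)    # 479001600
```

Going from 4 to 12 tables multiplies the upper bound by roughly twenty million, which is why exhaustive search became impractical.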
ASE 15.0 with more effective cost pruning would likely avoid many of the exhaustive searches that ASE
11.9.3 conducted. However, still, the number of permutations would result in lengthy optimization times.
As a result, ASE 15.0 has a direct control that restricts query optimization time. This new timeout
mechanism is set via the new server configuration:
-- <n> is a number in the range of 0 - 4000 (as of 15.0.1)
sp_configure "optimization timeout limit", <n>
-- stored procedure timeout - default is 40 (as of 15.0 ESD #2)
sp_configure "sproc optimize timeout limit", <m>
as well as session and query limits. Note that the first example affects SQL language statements that need
to be optimized (vs. statement cache) and the second one addresses query optimization when a stored
procedure plan is compiled. The default values of 5 for queries and 40 for procedures are likely appropriate
for most OLTP/mixed workload environments, however, reporting systems with complicated queries may
benefit from increasing this value. The value itself is a percentage of time based on the estimated query
execution time based on the current shortest execution plan. The way it works is that the optimizer costs
each plan and estimates the execution time for each. As each plan is costed, if a lower cost plan is found
the timeout limit is re-adjusted. When the timeout limit is reached, the query optimization stops and the
most optimal plan determined by that point will be used. It should be noted that the query timeout limit is
not set until a fully costed plan is found.
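The timeout behaviour described above can be pictured with a toy model. This is a sketch under stated assumptions, not the optimizer's actual algorithm: the function optimize, the time units and the sample plans are all hypothetical. It only mirrors the description that the budget is a percentage of the estimated execution time of the best plan found so far, and that it is re-adjusted whenever a cheaper plan is costed.

```python
def optimize(candidates, timeout_pct):
    """Toy model of a percentage-based optimization timeout.

    candidates: list of (plan_name, estimated_exec_time, time_spent_costing).
    Returns (chosen_plan_name, timed_out).
    """
    best = None
    spent = 0.0
    for name, est_exec, cost_time in candidates:
        spent += cost_time
        if best is None or est_exec < best[1]:
            best = (name, est_exec)      # cheaper plan: the budget shrinks with it
        budget = best[1] * timeout_pct / 100.0
        if spent >= budget:
            return best[0], True         # stop and use the best plan so far
    return best[0], False                # search space exhausted before timeout

plans = [("A", 1000.0, 10.0), ("B", 400.0, 10.0), ("C", 100.0, 10.0)]

print(optimize(plans, 5))    # small budget: stops before plan C is costed
print(optimize(plans, 40))   # larger budget: all three plans are costed
```

In this model a limit of 5 stops the search once plan B is found, while a limit of 40 lets the search continue to the cheaper plan C, which is the trade-off behind raising the value for complex reporting queries.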
The result of this implementation is that queries involving more than 4 tables are likely to take longer to
optimize than previous releases as most developers did not use the set tablecount function. This lengthier
optimization time should be offset by improved execution times as better plans are found. However, in
some cases, the optimizer was picking the more optimal plan anyhow and as a result the query may take
marginally longer. Developers who used set tablecount to increase the optimization time may need to
increase the timeout limit at the server level to arrive at the same optimal plan. In addition to the server
level setting, a developer could increase the amount of time via the session level command:
set plan opttimeoutlimit <n>
Or at the individual query level by using the abstract plan notation of:
select * from <table> plan "(use opttimeoutlimit <n>)"
Considering that the most common occurrence of set tablecount was in stored procedure code, the logical
migration strategy would be to simply replace the set tablecount with set plan opttimeoutlimit. Some
amount of benchmarking may be needed to determine the optimal timeout value - or you could simply
select an arbitrary number such as 10 if you have mainly DSS queries and wish to give the optimizer more
time to find a better query plan.
If you suspect that a query optimization was timed out and a better strategy might have been available, you
can confirm this by using the showplan option 'set option show on' (discussed later). If a time out occurs,
you will see the following in the diagnostics output:
!! Optimizer has timed out in this opt block !!
You could raise the value of the timeout parameter globally for the server or for the session or for this
query only. Raising the value of timeout at a server or session level can hurt other queries due to increased
compilation time and may use more resources like procedure cache. So, be careful when you do this.
Enable Sort-Merge Join and JTC deprecated
The configuration option "enable sort-merge join and JTC" has been deprecated. As a result, you may see
merge joins being used when unexpected. Before attempting to force the old behavior, consider that the
new in-memory sort operations have greatly improved sort-merge join performance. The query
optimizer will not choose merge join where it is deemed inappropriate. If you do want to turn
"sort-merge" join off, you would have to do it at a session level using the command "set merge_join 0" or
use an optimization goal that disables merge join like "allrows_oltp".
One benefit from this is that Join Transitive Closure may now occur, helping ASE arrive at more efficient
join orders than were previously possible, since this setting was typically disabled at most customer sites
(due to the merge join cost). This may also result in most queries of three or more tables in which JTC is a
possibility using the N-ary Join Strategy vs. a strict Nested Loop Join - which may help in some
situations with query response times.
Set Forceplan Changes
In some legacy applications, developers used set forceplan on vs. the more flexible and more controllable
PLAN clause with an AQP. In the migration to ASE 15.0's optimizer, ASE attempts to migrate
proc/trigger statements that use set forceplan to an AQP implementation - however, this is not always
doable, resulting in errors or other issues when a set forceplan clause is encountered - particularly during
procedure compilation, which could result in the procedure failing to compile.
In many cases, forceplan was used to overcome the following situations:
Many (>4) tables were involved in the query and the optimization time when set tablecount
was used became excessive - the query plan was pre-determined typically by using set
tablecount one time, the query rewritten with the FROM clause ordered accordingly and then
forceplan used. Typically this is likely the case when the query has more than 6 tables in the
FROM clause. This problem has been mitigated in ASE 15 through the optimization timeout
mechanism.
Join index selectivity resulted in an incorrect join order. Frequently, this was a problem when
either there was a formula involved (such as convert()) in the join, or a variable whose value
was determined within the procedure vs. a parameter - hence value is unknown at
optimization time. While the index selectivity still could be an issue in ASE 15, the issue may
be mitigated entirely in ASE 15 due to the different join strategies (such as merge and hash
joins), optimizer improvements inherent in ASE 15, creating a function based index, or by
using update index statistics with a higher step count.
Developers should search their scripts (or syscomments system table) looking for occurrences of
forceplan and test the logic using the new optimizer. If a plan force is still required, developers have two
choices:
Use the PLAN clause as part of the SELECT statement to force the desired logic
Store the plan as an AQP and enable AQP load to not only facilitate that query but all similar
occurrences of that query.
The second option is especially useful if the query might be issued dynamically from an application vs.
being embedded within a stored procedure/trigger. To get a starting plan for the query, use the set option
show_abstract_plan command when executing the query to retrieve an AQP to start from and then modify it.
For example:
dbcc traceon(3604)
go
set option show_abstract_plan on
go
select * from range_test where row_id=250
go
set option show_abstract_plan off
go
dbcc traceoff(3604)
go
The same query with a plan clause would look like:

select * from range_test where row_id=250
plan '( i_scan range_test_14010529962 range_test ) ( prop range_test ( parallel 1
) ( prefetch 4 ) ( lru ) ) '
go
Note that if the table order is all that is important, a partial plan that only lists the table order would be all
that would be necessary.
Determining Queries Impacted During Migration
There are many ways to identify queries impacted by changes in the query processing engine during
migration. Each of the sections below describes different common scenarios and how to resolve them. In
each of the cases below, it is assumed that the MDA tables are installed and query monitoring enabled in
both the ASE 12.5 and 15.0 systems.
Things To Do Before You Start Testing
Before you begin to diagnose problems, you will need to ensure the following:
Make sure that the monitoring (MDA) tables are installed and you have access to them.
Have permission to run "sp_configure" command, if needed.
Have permission to turn on various "set options" in the query processor to get the diagnostics
output
Be able to turn on trace flag 3604/3605
Some of the outputs can be quite huge - so plan for file space
Practice capturing query plans using AQP and practice MDA queries in order to better
understand how the tables work as well as how large to configure the pipes.
Create a test database to be used as scratch space. You may be bcping output from one or
both servers into this database to perform SQL analysis. Likely it will be best to create this on
the 15.0 server to facilitate data copying.
Make sure you have plenty of free space (2GB+) for the system segment
Disable literal parameterization - possibly even statement caching as a whole for the testing
sessions
With regard to the last point, if literal parameterization is enabled for the statement cache, you may have to
disable it. With literal parameterization, the queries with different search parameters would return the same
hashkey. However, one might return 1 row and the other 1,000 rows. Depending on the clustered index,
the logical I/Os could be different between the two queries just due to the row count differences.
Obviously, queries that impact a larger difference in row counts will have even greater differences. As a
result, when attempting to find query differences between versions, make sure that you disable literal
parameterization. Not only will this allow joins on the hashkey to be accurate, but it also will allow a more
accurate comparison between 12.5 systems and 15.0 as 12.5 did not have the advantage of literal
parameterization. As mentioned, though, you may want to disable statement caching for your session
entirely - either via "set statement_cache off" or by zeroing the statement cache size for the entire server.
As noted in the discussion about statement caching, using abstract plan capture disables the statement cache
anyhow.
Abstract Query Plan Capture
Starting in ASE 12.0, Sybase has provided a facility called Abstract Query Plans (AQP, or sometimes
Query Plans on Disk - QPOD) to capture, load and reuse query plans from executing queries.
Documentation on AQPs is found in the ASE Performance & Tuning: Optimizer and Abstract Plans,
beginning with Chapter 14 in the ASE 12.5 documentation. By default, captured query plans are stored in
the sysqueryplans table using the ap_stdout group. The important consideration here is that both ASE
12.5.2+ and ASE 15.0 use a hashkey of the query, consequently, the query plans for identical queries can
be matched. The high level steps for this are as follows:
1. Enable query plan capture on the 12.5 server. This can be done at the session level with "set
plan dump on" or at the server level with "sp_configure 'abstract plan dump', 1"
2. Execute one module of the application to be tested.

3. Turn off AQP dump and bcp out sysqueryplans
4. Enable query plan capture on the 15.0 server.
5. Execute the same module as in #2
6. Turn off AQP dump in ASE 15
7. bcp the 12.5 data into the scratch database (create a table called queryplans_125)
8. Copy the 15.0 data into the scratch database (use select/into to create a table called
queryplans_150)
9. Create an index on hashkey, type and sequence for both tables
10. Run queries to identify plan differences
Sample queries include ones such as the following:

-- get a list of the queries that have changed - first
-- by finding all the queries with changes in the query plan text
select t.hashkey
into #qpchgs
from queryplans_125 t, queryplans_150 f
where t.hashkey = f.hashkey
and t.sequence = f.sequence
and t.type = 100 -- aqp text vs. sql
and f.type = 100 -- aqp text vs. sql
and t.text != f.text
-- supplemented by those with more text as detected by having
-- more chunks than the 12.5 version.
union all
select f.hashkey
from queryplans_150 f
where f.sequence not in (select t.sequence
from queryplans_125 t
where f.hashkey = t.hashkey)
-- and then supplemented by the opposite - find queryplans that
-- are shorter in ASE 15.0 than in 12.5
union all
select t.hashkey
from queryplans_125 t
where t.sequence not in (select f.sequence
from queryplans_150 f
where f.hashkey = t.hashkey)
go
-- eliminate duplicates
select distinct hashkey
into #qpchanges
from #qpchgs
go
drop table #qpchgs
go
-- get the sql text for the queries identified

select t.hashkey, t.sequence, t.text
from queryplans_125 t, #qpchanges q
where q.hashkey=t.hashkey
and t.type = 10 -- sql text vs. aqp
-- optionally get the aqp text for comparison
-- first find the ones in which the 15.0 QP may be longer
-- note that we need to have done the earlier query as
-- we use its results (#qpchanges) in a subquery
select t.hashkey, t.sequence, t.text, f.sequence, f.text
from queryplans_125 t, queryplans_150 f
where t.hashkey = f.hashkey
and t.sequence=*f.sequence
and t.hashkey in (select hashkey from #qpchanges)
union
-- now find the ones in which the 12.5 QP is longer - this
-- may cause duplicates with the above where exactly equal
select f.hashkey, t.sequence, t.text, f.sequence, f.text
from queryplans_125 t, queryplans_150 f
where f.hashkey = t.hashkey
and f.sequence*=t.sequence
and f.hashkey in (select hashkey from #qpchanges)
order by t.hashkey, t.sequence, f.sequence
Several items of note about the above:
• Often in applications, the exact same query may be executed more than once. If so, and the
query plans differ between 12.5 and 15.0, the query will result in multiple instances in the
above output.
• The above could be made fancier and more reusable by enclosing it inside stored procedures
that cursor through the differences, building the SQL text and AQP text into SQL variables
(i.e. declared as varchar(16000) - even on 2K page servers this works) and then outputting to
screen or inserting into a results table (possibly with the SQL/AQP text declared as text
datatype).
• The above queries are for demonstration purposes only - more refinement is possible to
eliminate duplicates, etc. The goal was to show what is possible - this task is automated via
DBExpert.
• The downside to this method is that execution metrics are not captured, so you can't
necessarily tell by looking at the output whether the query plan change affected the performance
characteristics.
• On the face of it, a large number of query plans will likely be changed due to the implementation
of Merge Joins, N-ary Nested Loop Joins, Hash Sorts, etc.
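The comparison that the sample queries perform can also be sketched outside the database. The following is a minimal illustration in Python, assuming the type 100 (AQP text) rows from each capture have already been loaded into dictionaries keyed by (hashkey, sequence). The function name and the sample plan strings are hypothetical placeholders, not output from a real server.

```python
def changed_hashkeys(plans_125, plans_150):
    """Return the hashkeys whose AQP text chunks differ between two captures.

    Each argument maps (hashkey, sequence) -> text chunk. This layout is an
    assumption that mirrors the type = 100 rows of queryplans_125/_150.
    """
    changed = set()
    # Chunks present in both captures but with different text.
    for key in plans_125.keys() & plans_150.keys():
        if plans_125[key] != plans_150[key]:
            changed.add(key[0])
    # Chunks present in only one capture (one plan has more chunks).
    for hashkey, _sequence in plans_125.keys() ^ plans_150.keys():
        changed.add(hashkey)
    return changed


# Hypothetical sample data for illustration only.
sample_125 = {
    (101, 0): "( i_scan idx1 t1 )",
    (102, 0): "( t_scan t2 )",
    (103, 0): "( nl_g_join ( t_scan a ) ",
    (103, 1): "( i_scan ix b ) )",
}
sample_150 = {
    (101, 0): "( i_scan idx1 t1 )",
    (102, 0): "( i_scan idx2 t2 )",
    (103, 0): "( nl_g_join ( t_scan a ) ",
}
print(sorted(changed_hashkeys(sample_125, sample_150)))
```

Using a set removes duplicates in the same step, which is the role the select distinct into #qpchanges plays in the SQL version.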
Using MDA Tables
The purpose of this section is not to serve as an introduction to the MDA tables - but as few customers have
taken advantage of this monitoring capability in the 4 years since its introduction, some background is
needed to explain how to use it to facilitate migration. This technique is a little harder than using
sysqueryplans, as a query is not uniquely identified within the MDA tables via the hashkey. However, it is a
lot more accurate in that it reports query performance metrics, has less impact on performance than other
methods, and is not all that difficult once understood. The general concept relies on monitoring index
usage, statement execution monitoring and stored procedure profiling during test runs and post-migration to
isolate query differences.
One fairly important aspect is that in ASE, the default value for enable xact coordination is 1 - and it is a
static switch requiring a server reboot. The reason why this is important is that in ASE 12.5.x, the MDA
tables use the loopback interface, which mimics a remote connection and therefore uses CIS. Because of the
configuration option enable xact coordination, CIS will invoke a transactional RPC to the loopback
interface. This could cause a lot of activity in the sybsystemdb database - possibly causing the transaction
log to fill. As a result, the following best practices are suggested (any one of the below fully resolves the
problem on its own):
• Enable the truncate log on checkpoint option for sybsystemdb
• Set enable xact coordination to 0
• Use the session setting set transactional_rpc off in any procedure or script that selects from
the MDA tables
The first two may cause problems if distributed transactions (XA, JTA or ODBC 2PC) are used by the
application (the configuration option enable DTM would have to be 1) - consequently, if using distributed
transactions, the safest choice is the third. If distributed transactions are being used, the first option does
expose a risk of losing transactions if the server should crash, as heuristic transaction completion likely
would no longer be possible on transactions whose log pages were truncated by a checkpoint but whose
data pages were not yet flushed to disk.
Monitoring Index Usage
Query plan differences that impact query performance typically manifest themselves through increased IO -
usually by a significant amount. To that end, the first MDA monitoring technique uses
monOpenObjectActivity to look for table scans as well as index efficiency/usage. The
monOpenObjectActivity table looks like the following:
monOpenObjectActivity
Column              Datatype       Key
DBID                int            <pk,fk>
ObjectID            int            <pk>
IndexID             int            <pk>
DBName              varchar(30)    <pk,fk>
ObjectName          varchar(30)    <pk>
LogicalReads        int
PhysicalReads       int
APFReads            int
PagesRead           int
PhysicalWrites      int
PagesWritten        int
RowsInserted        int
RowsDeleted         int
RowsUpdated         int
Operations          int
LockRequests        int
LockWaits           int
OptSelectCount      int
LastOptSelectDate   datetime
UsedCount           int
LastUsedDate        datetime
Figure 4 - MDA table monOpenObjectActivity in ASE 15.0.1
One of the differences between ASE 15.0 and 12.5.3 is noticeable in the above: ASE 12.5.3 didn't include
the DBName and ObjectName fields. Regardless, this table has a wealth of information that can be useful
during a migration. First, note that it is at the index level - consequently it is extremely useful for detecting
changes in query behavior with minimal impact on the system. To do this, however, you will need samples
from a 12.5.x baseline system and the ASE 15.0 migration system under roughly the same query load.
Of the key fields in the table, the UsedCount column is perhaps the most important for index usage. This
counter keeps track of each time an index is used as a result of the final query optimization and execution.
As demonstrated earlier, this can be useful in finding table scans using the query:
select *
from master..monOpenObjectActivity
where DBID not in (1, 2, 3) -- add additional tempdb dbids if using multiple tempdbs
and UsedCount > 0
and IndexID = 0
order by LogicalReads desc, UsedCount desc
Note that not all table scans can be avoided - the key is to look for significant increases in table scans. If you
have a baseline from 12.5 and have loaded both the 12.5 and 15.0 statistics into a database for analysis, a
useful query could be similar to:
-- in the query below, the two databases likely could have different
-- database ID's. Unfortunately 12.5.x doesn't include DBName - so
-- unless this was added by the user when collecting the data, we
-- will assume it is not available - which means we can't join
-- DBID nor DBName - so we need to hardcode the DBID values...
-- these values replace the variables at the first line of the
-- where clause below.
select f.DBName, f.ObjectName, f.IndexID,
LogicalReads_125=t.LogicalReads, LogicalReads_150=f.LogicalReads,
PhysicalReads_125=t.PhysicalReads, PhysicalReads_150=f.PhysicalReads,
Operations_125=t.Operations, Operations_150=f.Operations,
OptSelectCount_125=t.OptSelectCount, OptSelectCount_150=f.OptSelectCount,
UsedCount_125=t.UsedCount, UsedCount_150=f.UsedCount,
UsedDiff=t.UsedCount - f.UsedCount
from monOpenObjectActivity_125 t, monOpenObjectActivity_150 f
where t.DBID = @DBID_125 and f.DBID = @DBID_150
and t.ObjectID = f.ObjectID
and t.IndexID = f.IndexID
order by 14 desc -- order by UsedDiff descending
One consideration is that even if the processing load is nearly the same, it is likely that there will be some
difference. That is where the Operations field comes into play. Technically it is the number of operations
such as DML statements or queries that are executed against a table - however, it tends to run a bit high (by
3-5x) as it includes internal operations such as cursor openings, etc. For this query though, it can be used to
normalize the workloads by developing a ratio of operations between the two. As mentioned earlier,
differences in query processing could result in some differences:
• Merge Joins may increase the number of table scans - particularly in tempdb. They may also
result in fewer LogicalReads for some operations.
• Hash (in-memory) sorting may reduce the number of index operations by using an index to
find a starting point, vs. traversing the index iteratively when using an index to avoid sorting.
• If the table (IndexID=0 or 1) or a specific index shows a considerable increase in
LogicalReads and the indexes show nearly the same OptSelectCount but the UsedCount has
dropped, the issue might be that statistics are missing or not enough statistics are available
and the optimizer is picking a table scan or an inefficient index.
If updating statistics using update index statistics with a higher number of steps doesn't solve the problem,
then you are likely looking at a query optimization issue.
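The idea of normalizing the two workloads by a ratio of Operations can be illustrated with a small calculation. This is only a sketch in Python: the function name and the sample figures are hypothetical, and scaling the 15.0 UsedCount by the ratio of Operations is one plausible reading of the normalization described above, not a formula given in this paper.

```python
def normalized_used_diff(ops_125, used_125, ops_150, used_150):
    """Difference in UsedCount after scaling the 15.0 run to the 12.5 workload.

    ops_* are the Operations counters and used_* the UsedCount counters for
    the same object/index in each sample (assumed inputs).
    """
    if ops_150 == 0:
        raise ValueError("no operations recorded on the 15.0 side")
    # Scale the 15.0 counter by the workload ratio ops_125 / ops_150.
    scaled_used_150 = used_150 * ops_125 / ops_150
    return used_125 - scaled_used_150


# Hypothetical figures: the 15.0 run executed 20% more operations.
print(normalized_used_diff(1000, 400, 1200, 300))
```

With these figures the raw difference is 100, but after scaling the 15.0 run down to the 12.5 workload the index is used 150 fewer times, so the drop is larger than the raw counters suggest.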
Monitoring Statement Execution
Monitoring statement execution focuses on the MDA monSysStatement table and related pipe tables.
Consider the following diagram:
Figure 5 - Diagram of MDA Tables Relating SQL Query Execution Statistics, Plan & Text
Note that we are focusing on monSysStatement, monSysSQLText and monSysPlanText vs. the
monProcess* equivalent tables. The rationale is that monSys* tables are stateful and keep track of
historically executed statements whereas the monProcess* tables only record the currently executing
statement.
A second aspect to keep in mind is that the context of one connection's pipe state is retained for that
connection but is independent of other connections (allowing multi-user access to MDA historical data).
In other words, if one DBA connects and queries the monSysStatement table, they may see 100 rows.
Assuming more queries are executed and the DBA remains connected, the next time the DBA queries the
monSysStatement table, only the new rows added since they last queried will be returned. A different DBA
who connects at this point would see all the rows. This is important for the following reason: when
sampling the tables repeatedly using a polling process, it is tempting to disconnect between samples.
However, if this happens, upon reconnecting, the session appears as a new session and the state is lost -
consequently, the result set may contain rows already provided in the previous session.
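The per-connection behavior described above can be pictured with a toy model. The Python class below is purely illustrative and is not how ASE implements the pipes: it assumes a shared history list with one read position per connection, and it ignores the ring-buffer wraparound discussed next. All names are hypothetical.

```python
class StatementPipe:
    """Toy model of a history table with one read position per connection."""

    def __init__(self):
        self._rows = []        # statements recorded so far
        self._positions = {}   # connection id -> index of the next unread row

    def record(self, statement):
        self._rows.append(statement)

    def select(self, connection_id):
        """Return the rows this connection has not yet seen."""
        start = self._positions.get(connection_id, 0)
        self._positions[connection_id] = len(self._rows)
        return self._rows[start:]

    def disconnect(self, connection_id):
        """Dropping the connection discards its read position."""
        self._positions.pop(connection_id, None)


pipe = StatementPipe()
for n in range(100):
    pipe.record(n)
first = pipe.select("dba1")     # the first query sees 100 rows
for n in range(100, 120):
    pipe.record(n)
second = pipe.select("dba1")    # same connection: only the 20 new rows
other = pipe.select("dba2")     # a different connection sees all 120 rows
pipe.disconnect("dba1")
again = pipe.select("dba1")     # after reconnecting, all 120 rows come back
print(len(first), len(second), len(other), len(again))
```

The last call shows why a polling process should stay connected between samples: once the read position is gone, previously collected rows are returned again and would be double counted.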
A third consideration is the correct sizing of the pipes. If the pipes are too small, statements or queries may
be dropped from the ring buffer. To accommodate this, you either need to increase the size of the pipes
(the number of messages they can hold) or sample more frequently. For example, for a particular application, one module may submit
100,000 statements. Obviously setting the pipes to 100,000 may be impossible due to memory constraints.
However, if it is known that those 100,000 statements are issued over the period of an hour, then an
average of 1,667 queries per minute are issued. If guessing at a peak of double, that would mean
3,333 queries per minute. It might be useful to set the pipes to 5,000 and sample every minute to avoid
losing statements.
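The sizing arithmetic in the example above can be written out as a short calculation. The Python sketch below uses a hypothetical helper name, and the peak factor of 2 is the guess from the text rather than a measured value.

```python
def pipe_messages_needed(statements, period_minutes, peak_factor, sample_minutes):
    """Estimate how many statements can arrive between two samples at peak."""
    average_per_minute = statements / period_minutes
    return average_per_minute * peak_factor * sample_minutes


average = 100_000 / 60                                    # statements per minute
peak = pipe_messages_needed(100_000, 60, 2, 1)            # sampled every minute
peak_2min = pipe_messages_needed(100_000, 60, 2, 2)       # sampled every 2 minutes
print(round(average), round(peak), round(peak_2min))      # 1667 3333 6667
print(5000 >= peak, 5000 >= peak_2min)                    # True False
```

A pipe size of 5,000 leaves headroom when sampling once per minute, but the same size would no longer cover the estimated peak if the sampling interval were doubled.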
Perhaps the two most important points to remember about monSysStatement are:
1. Statements selected from monSysStatement are returned in execution order if no ORDER
BY clause is used. One trick is to add an identity() if selecting into a temp table - or insert
into a table containing an identity column - to keep the execution order intact.
2. The line numbers for procedures and triggers actually refer to the physical lines in the
creation script - starting from the end of the previous batch - including blank lines, comments,
etc. If a command spans more than one line, only the first line will show up during
monitoring. A line number of 0 implies that ASE is searching for the next executable line
(i.e. jumping from a large if statement - or skipping a large number of declares at the top that
are more compiler instructions vs. execution instructions).
Overall, the high level steps are as follows:
1. Configure the statement pipe, sql text pipe, and statement plan text pipe as necessary for the
ASE 12.5 server.
2. Create a temporary repository in tempdb by doing a query similar to the following:
create table tempdb..mdaSysStatement (
row_id numeric(10,0) identity not null,
SPID smallint not null,
KPID int not null,
DBID int null,
ProcedureID int null,
ProcName varchar(30) null,
PlanID int null,
BatchID int not null,
ContextID int not null,
LineNumber int not null,
CpuTime int null,
WaitTime int null,
MemUsageKB int null,
PhysicalReads int null,
LogicalReads int null,
PagesModified int null,
PacketsSent int null,
PacketsReceived int null,
NetworkPacketSize int null,
PlansAltered int null,
RowsAffected int null,
ErrorStatus int null,
StartTime datetime null,
EndTime datetime null
)
3. Repeat for monSysSQLtext and monSysPlanText as desired.
4. Begin a monitoring process that once per minute inserts into the tables created above from the
respective MDA tables. For example, the following query could be placed in a loop with a
waitfor delay "00:01:00" or similar logic.
insert into mdaSysStatement (SPID, KPID, DBID, ProcedureID,
ProcName, PlanID, BatchID, ContextID, LineNumber,
CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,
PagesModified, PacketsSent, PacketsReceived,
NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,
StartTime, EndTime)
select SPID, KPID, DBID, ProcedureID,
ProcName=object_name(ProcedureID, DBID),
PlanID, BatchID, ContextID, LineNumber,
CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,
PagesModified, PacketsSent, PacketsReceived,
NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,
StartTime, EndTime
from master..monSysStatement
5. Execute one module of the application to be tested.
6. Stop the application and halt the monitoring.
7. bcp out the MDA collected data from tempdb.
8. Repeat steps 1-5 for ASE 15.0
9. Create a second set of tables in the scratch database - one each for ASE 15.0 and 12.5. For
example: mdaSysStmt_125 and mdaSysStmt_150.
10. Load the tables from the collected information - either by bcp-ing back in or via insert/select
Since the monitoring captures all statements from all users, the next step is to isolate each of the
specific user's queries and re-normalize using a new identity column. For example, the following query
could be used to build the new table to be used to compare query execution:
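select exec_row=identity(10), SPID, KPID, DBID, ProcedureID,
ProcName, PlanID, BatchID, ContextID, LineNumber,
CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,
PagesModified, PacketsSent, PacketsReceived,
NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,
StartTime, EndTime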
into monSysStmt_150
from mdaSysStmt_150
where SPID = 123
and KPID = 123456789
order by row_id
go
create unique index exec_row_idx
on monSysStmt_150 (exec_row)
go
Consequently, if the same exact sequence of test statements is issued against both servers, the exec_row
columns should match. Consider the following query:
-- Get a list of SQL statements that executed slower in 15.0 compared to 12.5
select f.exec_row, f.BatchID, f.ContextID, f.ProcName, f.LineNumber,
CPU_15=f.CpuTime, CPU_125=t.CpuTime,
Wait_15=f.WaitTime, Wait_125=t.WaitTime,
Mem_15=f.MemUsageKB, Mem_125=t.MemUsageKB,
PhysIO_15=f.PhysicalReads, PhysIO_125=t.PhysicalReads,
LogicalIO_15=f.LogicalReads, LogicalIO_125=t.LogicalReads,
Writes_15=f.PagesModified, Writes_125=t.PagesModified,
ExecTime_15=datediff(ms,f.StartTime,f.EndTime)/1000.00,
ExecTime_125=datediff(ms,t.StartTime,t.EndTime)/1000.00,
DiffInMS= datediff(ms,f.StartTime,f.EndTime)-
datediff(ms,t.StartTime,t.EndTime)
into #slow_qrys
from monSysStmt_150 f, monSysStmt_125 t
where f.exec_row = t.exec_row
and (datediff(ms,f.StartTime,f.EndTime) > datediff(ms,t.StartTime,t.EndTime))
order by 20 desc, f.BatchID, f.ContextID, f.LineNumber
Of course, it is always nice to tell the boss how many queries were faster - which is quite easy to accomplish
with the above - simply swap the last condition with a less than to get:
-- Get a list of SQL statements that executed faster in 15.0 compared to 12.5
select f.exec_row, f.BatchID, f.ContextID, f.ProcName, f.LineNumber,
CPU_15=f.CpuTime, CPU_125=t.CpuTime,
Wait_15=f.WaitTime, Wait_125=t.WaitTime,
Mem_15=f.MemUsageKB, Mem_125=t.MemUsageKB,
PhysIO_15=f.PhysicalReads, PhysIO_125=t.PhysicalReads,
LogicalIO_15=f.LogicalReads, LogicalIO_125=t.LogicalReads,
Writes_15=f.PagesModified, Writes_125=t.PagesModified,
ExecTime_15=datediff(ms,f.StartTime,f.EndTime)/1000.00,
ExecTime_125=datediff(ms,t.StartTime,t.EndTime)/1000.00,
DiffInMS= datediff(ms,f.StartTime,f.EndTime)-
datediff(ms,t.StartTime,t.EndTime)
into #fast_qrys
from monSysStmt_150 f, monSysStmt_125 t
where f.exec_row = t.exec_row
and (datediff(ms,f.StartTime,f.EndTime) < datediff(ms,t.StartTime,t.EndTime))
order by 20 desc, f.BatchID, f.ContextID, f.LineNumber
The only gotcha with this technique is that it is only accurate to within 100ms - the reason is that it is based
on the length of a CPU tick within ASE, which defaults to 100ms. Statements that execute in less than 100ms
will show up as 0. This could be problematic where an insert that used to take 10ms now takes 20ms -
especially if that insert is executed 1,000,000 times during the day.
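To see why this granularity matters, the example above can be worked through numerically. The Python sketch below assumes, for illustration only, that reported times are simply truncated to whole 100ms ticks. The text states only that sub-100ms statements show up as 0, so the exact rounding behavior is an assumption, and the helper name is hypothetical.

```python
TICK_MS = 100  # default tick length cited in the text


def reported_ms(actual_ms, tick_ms=TICK_MS):
    """Truncate an elapsed time to whole ticks (assumed reporting model)."""
    return (actual_ms // tick_ms) * tick_ms


executions = 1_000_000
before_ms, after_ms = 10, 20

# Both versions of the insert are reported as 0 ms per execution.
print(reported_ms(before_ms), reported_ms(after_ms))

# Yet the real cost added over the day is substantial.
hidden_seconds = executions * (after_ms - before_ms) / 1000
print(hidden_seconds)
```

The per-statement comparison shows no difference at all, while the accumulated extra time is 10,000 seconds, or close to three hours of additional elapsed time per day.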
Stored Procedure Profiling
Tracking stored procedure profiles from within the MDA tables can be a bit tricky. Typically, DBAs like
to track the stored procedures by reporting those most often executed, those that take the longest to execute
or those that have deviated from an expected execution norm. Traditionally, this has been done using
Monitor Server in conjunction with Historical Server and custom views. Unfortunately, up through ASE
15.0.1, there isn't a direct equivalent within the MDA tables - although a planned enhancement for 15.0.2 is
to include two tables to track this (one aggregated by procedure and the other for each execution of a
procedure). The problem of course is that this will only be in 15.0.2; consequently, comparing with 12.5.x
will not be possible.
The monSysStatement table reports the statement level statistics on a line-by-line basis - which in a sense is
even handy, as when a proc execution is slower, the exact line of the procedure where the problem occurred
is easily spotted. Aggregating it to a procedure level for profiling is a bit trickier. Consider the following
stored procedure code (and the comments), which does a top 10 analysis:
-- Because of the call to object_name(), this procedure must be executed in the same server as the
-- stored procedures being profiled. If the repository database schema included the ProcName, this
-- restriction could be lifted.
create procedure sp_mda_getProcExecs
@startDate datetime=null,
@endDate datetime=null
as begin
-- okay the first step is to realize that this only analyzes proc execs...
-- something else should have been collecting the monSysStatement activity.
-- In addition, the logic of the collector should have done something like a
-- row numbering scheme - for example:
--
-- select row_id=identity(10), * into #monSysStatement from master..monSysStatement
-- select @cur_max=max(row_id) from repository..mdaSysStatement
-- insert into repository..mdaSysStatement
-- (select row_id+@cur_max, * from #monSysStatement)
--
-- But now that we can assume everything was *safely* collected into this repository
-- database and into a table called mdaSysStatement, we can run the analysis
-- The result sets are:

--
-- 1 - The top 10 procs by execution count
-- 2 - The top 10 procs by elapsed time
-- 3 - The top 10 procs by Logical IOs
-- 4 - The top 10 proc lines by execution count
-- 5 - The top 10 proc lines by elapsed time
-- 6 - The top 10 proc lines by Logical IOs
-- The reason for the second set is that we are using these to key off of where a
-- proc may be going wrong - i.e. if a showplan changes due to a dropped index,
-- we will get the exact line #'s affected vs. just the proc.
-- The first step is to put the result set in execution order by SPID/KPID vs.
-- just execution order...we need to do this so our checks for the next line
-- logic to see when a proc begins/exits works....
select exec_row=identity(10), SPID, KPID, DBID, ProcedureID, PlanID,

BatchID, ContextID, LineNumber, CpuTime, WaitTime,
MemUsageKB, PhysicalReads, LogicalReads, PagesModified,
PacketsSent, PacketsReceived, NetworkPacketSize,
PlansAltered, RowsAffected, ErrorStatus, StartTime, EndTime
into #t_exec_by_spid
from mdaSysStatement
where (((@startDate is null) and (@endDate is null))
or ((StartTime >= @startDate) and (@endDate is null))
or ((@startDate is null) and (EndTime <= @endDate))
or ((StartTime >= @startDate) and (EndTime <= @endDate))
)
order by SPID, KPID, BatchID, row_id
-- then we kinda do it again to get rid of the identity - the reason why is that
-- sometimes a union all will fail if the join involves row=row+1 and row is numeric...
select exec_row=convert(int,exec_row), SPID, KPID, DBID, ProcedureID, PlanID,
BatchID, ContextID, LineNumber, CpuTime, WaitTime,
MemUsageKB, PhysicalReads, LogicalReads, PagesModified,
PacketsSent, PacketsReceived, NetworkPacketSize,
PlansAltered, RowsAffected, ErrorStatus, StartTime, EndTime
into #exec_by_spid
from #t_exec_by_spid
order by exec_row
drop table #t_exec_by_spid
create unique index exec_row_idx on #exec_by_spid (SPID, KPID, BatchID, exec_row)
-- then we need to find the proc exec statements - the way we will find this is by
-- finding either where:
--
-- 1) The first line of a batch is a valid procedure
-- 2) When the next line of a batch has a higher context id and the procedureID
-- changes to a valid procedure
-- Do part #1 - find all the procs that begin a batch
select SPID, KPID, BatchID, exec_row=min(exec_row)
into #proc_begins
from #exec_by_spid
group by SPID, KPID, BatchID
having exec_row=min(exec_row)
and ProcedureID!=0
and object_name(ProcedureID,DBID) is not null
-- Union those with procs that occur after the first line...
select e.SPID, e.KPID, e.ProcedureID, e.DBID,
e.BatchID, e.ContextID, e.LineNumber, e.exec_row, e.StartTime
into #proc_execs
from #exec_by_spid e, #proc_begins b
where e.SPID=b.SPID
and e.KPID=b.KPID
and e.BatchID=b.BatchID
and e.exec_row=b.exec_row
union all
-- e1 is the proc entry - e2 is the previous context
select e1.SPID, e1.KPID, e1.ProcedureID, e1.DBID,
e1.BatchID, e1.ContextID, e1.LineNumber, e1.exec_row, e1.StartTime
from #exec_by_spid e1, #exec_by_spid e2
where e1.SPID=e2.SPID
and e1.KPID=e2.KPID
and e1.BatchID=e2.BatchID
and e1.ContextID = e2.ContextID + 1 -- Context should go up by 1
and e1.exec_row = e2.exec_row + 1 -- on the next line
and e1.ProcedureID != e2.ProcedureID -- and the proc differs
and e1.ProcedureID !=0 -- and the proc is not 0
and object_name(e1.ProcedureID,e1.DBID) is not null -- and proc is valid
-- we are finished with this one....

drop table #proc_begins
-- Okay, now we have to find where the proc exits....This will be one of:
--
-- 1 - The SPID, KPID, BatchID, and ContextID are the same as the calling
-- line, but the ProcedureID differs...
-- 2 - The max(LineNumber) if the above is not seen (meaning proc was only
-- statement in the Batch).
--
-- Do part #2 - find all the procs that end a batch
select SPID, KPID, BatchID, exec_row=max(exec_row)
into #proc_ends
from #exec_by_spid
group by SPID, KPID, BatchID
having exec_row=max(exec_row)
and ProcedureID!=0
and object_name(ProcedureID,DBID) is not null
-- in the above we are reporting the row after the proc exits while in the below
-- (after the union all) we are using the last line exec'd within the proc
select x.SPID, x.KPID, x.ProcedureID, x.DBID,
x.BatchID, begin_row=x.exec_row, end_row=b.exec_row,
x.ContextID, x.StartTime, e.EndTime
into #find_ends
from #exec_by_spid e, #proc_ends b, #proc_execs x
where e.SPID=b.SPID and e.KPID=b.KPID and e.BatchID=b.BatchID
and e.SPID=x.SPID and e.KPID=x.KPID and e.BatchID=x.BatchID
and b.SPID=x.SPID and b.KPID=x.KPID and b.BatchID=x.BatchID
and e.exec_row=b.exec_row
union all
-- e1 is the next line, where e2 is proc return...we will use the e2 row_id
-- vs. e1 though so that later we can get the metrics right.
select x.SPID, x.KPID, x.ProcedureID, x.DBID,
x.BatchID, begin_row=x.exec_row, end_row=e2.exec_row,
x.ContextID, x.StartTime, e1.EndTime
from #exec_by_spid e1, #exec_by_spid e2, #proc_execs x
where e1.SPID=e2.SPID and e1.KPID=e2.KPID and e1.BatchID=e2.BatchID
and e1.SPID=x.SPID and e1.KPID=x.KPID and e1.BatchID=x.BatchID
and e2.SPID=x.SPID and e2.KPID=x.KPID and e2.BatchID=x.BatchID
and e1.ContextID = x.ContextID -- Context is same as calling line
and e1.exec_row = e2.exec_row + 1 -- on the next line
and e1.ProcedureID != e2.ProcedureID -- and the proc differs
and e2.ProcedureID !=0 -- and the exiting proc is not 0
and object_name(e2.ProcedureID,e2.DBID) is not null -- and exiting proc is valid
and e2.ProcedureID=x.ProcedureID -- and we are exiting the desired proc
and e2.DBID=x.DBID
and e2.exec_row > x.exec_row
drop table #proc_execs

drop table #proc_ends
-- the above could result in some overlaps...so let's eliminate them....
select exec_id=identity(10), SPID, KPID, ProcedureID, DBID, BatchID,
begin_row, end_row=min(end_row),
StartTime=min(StartTime), EndTime=max(EndTime)
into #final_execs
from #find_ends
group by SPID, KPID, ProcedureID, DBID, BatchID, begin_row
having end_row=min(end_row)
order by begin_row
drop table #find_ends
-- #final_execs contains a list of proc execs in order by SPID along
-- with the beginning and ending lines...we now need to get the
-- execution metrics for each. To do this, we rejoin our execs with
-- the original data to get all the metrics for the procs...
select f.exec_id, f.SPID, f.KPID, f.ProcedureID, f.DBID, f.BatchID, f.begin_row,
f.end_row, f.StartTime, f.EndTime, elapsedTotal=datediff(ms,f.StartTime,f.EndTime),
subproc=e.ProcedureID, subdbid=e.DBID, e.LineNumber, e.CpuTime, e.WaitTime,
e.MemUsageKB, e.PhysicalReads, e.LogicalReads, e.PagesModified,
e.PacketsSent, e.RowsAffected, elapsedLine=datediff(ms,e.StartTime,e.EndTime)
into #proc_details
from #final_execs f, #exec_by_spid e
where f.SPID=e.SPID and f.KPID=e.KPID and f.BatchID=e.BatchID
and e.exec_row between f.begin_row and f.end_row
drop table #final_execs

drop table #exec_by_spid
-- now we do the aggregation - first by each execution...so we can later

-- aggregate across executions....if we wanted to track executions by
-- a particular SPID, we would branch from here....
select exec_id, SPID, KPID, ProcedureID, DBID, BatchID,
elapsedTotal, CpuTime=sum(CpuTime),
WaitTime=sum(WaitTime), PhysicalReads=sum(PhysicalReads),
LogicalReads=sum(LogicalReads), PagesModified=sum(PagesModified),
PacketsSent=sum(PacketsSent), RowsAffected=sum(RowsAffected)
into #exec_details
from #proc_details
group by exec_id, SPID, KPID, ProcedureID, DBID, BatchID, elapsedTotal
-- then we do the aggregation by Proc line....this is to spot the bad lines

select DBID, ProcedureID, LineNumber, num_execs=count(*),
elapsed_min=min(elapsedLine),
elapsed_avg=avg(elapsedLine),
elapsed_max=max(elapsedLine),
elapsed_tot=sum(elapsedLine),
CpuTime_min=min(CpuTime),
CpuTime_avg=avg(CpuTime),
CpuTime_max=max(CpuTime),
CpuTime_tot=sum(CpuTime),
WaitTime_min=min(WaitTime),
WaitTime_avg=avg(WaitTime),
WaitTime_max=max(WaitTime),
WaitTime_tot=sum(WaitTime),
PhysicalReads_min=min(PhysicalReads),
PhysicalReads_avg=avg(PhysicalReads),
PhysicalReads_max=max(PhysicalReads),
PhysicalReads_tot=sum(PhysicalReads),
LogicalReads_min=min(LogicalReads),
LogicalReads_avg=avg(LogicalReads),
LogicalReads_max=max(LogicalReads),
LogicalReads_tot=sum(LogicalReads),
PagesModified_min=min(PagesModified),
PagesModified_avg=avg(PagesModified),
PagesModified_max=max(PagesModified),
PagesModified_tot=sum(PagesModified),
PacketsSent_min=min(PacketsSent),
PacketsSent_avg=avg(PacketsSent),
PacketsSent_max=max(PacketsSent),
PacketsSent_tot=sum(PacketsSent),
RowsAffected_min=min(RowsAffected),
RowsAffected_avg=avg(RowsAffected),
RowsAffected_max=max(RowsAffected),
RowsAffected_tot=sum(RowsAffected)
into #line_sum
from #proc_details
where LineNumber > 0
group by ProcedureID, DBID, LineNumber
drop table #proc_details
select ProcedureID, DBID, num_execs=count(*),

elapsed_min=min(elapsedTotal),
elapsed_avg=avg(elapsedTotal),
elapsed_max=max(elapsedTotal),
CpuTime_min=min(CpuTime),
CpuTime_avg=avg(CpuTime),
CpuTime_max=max(CpuTime),
WaitTime_min=min(WaitTime),
WaitTime_avg=avg(WaitTime),
WaitTime_max=max(WaitTime),
PhysicalReads_min=min(PhysicalReads),
PhysicalReads_avg=avg(PhysicalReads),
PhysicalReads_max=max(PhysicalReads),
LogicalReads_min=min(LogicalReads),
LogicalReads_avg=avg(LogicalReads),
LogicalReads_max=max(LogicalReads),
PagesModified_min=min(PagesModified),
PagesModified_avg=avg(PagesModified),
PagesModified_max=max(PagesModified),
PacketsSent_min=min(PacketsSent),
PacketsSent_avg=avg(PacketsSent),
PacketsSent_max=max(PacketsSent),
RowsAffected_min=min(RowsAffected),
RowsAffected_avg=avg(RowsAffected),
RowsAffected_max=max(RowsAffected)
into #exec_sum
from #exec_details
group by ProcedureID, DBID
drop table #exec_details
-- now we need to get the top 10 by exec count

set rowcount 10
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #exec_sum
order by num_execs desc
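-- now get the top 10 by average elapsed time
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #exec_sum
order by elapsed_avg desc
-- now get the top 10 by average logical IOs
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #exec_sum
order by LogicalReads_avg desc
-- now let's do the same - but by Proc LineNumber
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #line_sum
order by num_execs desc
-- now get the top 10 by average elapsed time
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #line_sum
order by elapsed_avg desc
-- now get the top 10 by average logical IOs
select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,
num_execs, elapsed_min, elapsed_avg, elapsed_max,
CpuTime_min, CpuTime_avg, CpuTime_max,
WaitTime_min, WaitTime_avg, WaitTime_max,
PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,
LogicalReads_min, LogicalReads_avg, LogicalReads_max,
PagesModified_min, PagesModified_avg, PagesModified_max,
PacketsSent_min, PacketsSent_avg, PacketsSent_max,
RowsAffected_min, RowsAffected_avg, RowsAffected_max
from #line_sum
order by LogicalReads_avg desc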
set rowcount 0
drop table #exec_sum

drop table #line_sum
return 0
end
go
Although it appears complicated, the above procedure could easily be modified to produce a complete
procedure profile of executions for the 12.5 and 15.0 systems and then compare the results, reporting on
both the procedures that differed and the offending lines. This proc was left intact for the simple reason
that if you don't have any ASE 12.5 statistics, the above proc could be used as is to track proc executions in
12.5 or 15.0 regardless of migration status.
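The heart of the procedure is the rule for spotting where a procedure execution begins within a batch. That rule can be shown in isolation with a short Python sketch. It assumes the statements of a single SPID/KPID/BatchID are already in execution order, reduces each row to a (ContextID, ProcedureID) pair, and omits the object_name() validity check. The function name and the sample rows are hypothetical.

```python
def find_proc_entries(rows):
    """Return the indexes of rows that begin a procedure execution.

    rows: statements of one SPID/KPID/BatchID in execution order, each a
    (context_id, procedure_id) tuple; procedure_id 0 means no procedure.
    """
    entries = []
    for i, (context_id, procedure_id) in enumerate(rows):
        if procedure_id == 0:
            continue
        if i == 0:
            entries.append(i)   # the first line of the batch is a procedure
            continue
        prev_context, prev_procedure = rows[i - 1]
        if context_id == prev_context + 1 and procedure_id != prev_procedure:
            entries.append(i)   # context went up by 1 and the proc changed
    return entries


# Hypothetical batch: proc 501 is called, calls proc 777, returns, and is
# later called a second time from the batch.
batch = [(0, 0), (1, 501), (1, 501), (2, 777), (2, 777), (1, 501), (0, 0), (1, 501)]
print(find_proc_entries(batch))
```

For this sample the entries are found at positions 1, 3 and 7. The return from proc 777 to proc 501 at position 5 is not counted because the context goes down rather than up.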
You can obtain the SQL text for the procedure execution, which shows what the parameters are, by joining on
the SPID, KPID, and BatchID. The only difference is that the SQL text is collected as one large entity
(later split into 255 byte rows) vs. individual statements. Consequently, if a SQL batch executes a proc
more than once, you will have to use the ContextID to identify which set of parameters belongs to
which proc execution.
Using Sysquerymetrics
A variation of both approaches that uses exact query matching based on the hashkey but also yields execution statistics
is to use sysquerymetrics in ASE 15.0 instead of sysqueryplans. Details on how to use sysquerymetrics are
provided in a later section (Diagnosing Issues in ASE 15.0: Using sysquerymetrics & sp_metrics).
In brief, sysquerymetrics provides a number of performance metric columns including logical I/Os,
CPU time, elapsed time, etc. The columns in sysquerymetrics include:
Field       Definition
uid         User ID
gid         Group ID
id          Unique ID
hashkey     The hashkey over the SQL query text
sequence    Sequence number for a row when multiple rows are required for the SQL text
exec_min    Minimum execution time
exec_max    Maximum execution time
exec_avg    Average execution time
elap_min    Minimum elapsed time
elap_max    Maximum elapsed time
elap_avg    Average elapsed time
lio_min     Minimum logical IO
lio_max     Maximum logical IO
lio_avg     Average logical IO
pio_min     Minimum physical IO
pio_max     Maximum physical IO
pio_avg     Average physical IO
cnt         Number of times the query has been executed
abort_cnt   Number of times a query was aborted by Resource Governor as a resource limit was exceeded
qtext       Query text
Note, however, that this displays the query text and not the query plan. Consequently, you first identify the
slow queries in 15.0 and then compare the plans. To do this, you begin much the same way as you do with
the AQP capture method, but add the sysquerymetrics capture detailed later:
1. Enable query plan capture on the 12.5 server. This can be done at the session level with set
plan dump on or at the server level with sp_configure "abstract plan dump", 1
2. Execute the application module to be tested
3. Turn off AQP dump and bcp out sysqueryplans
4. Enable query plan capture on the 15.0 server.

5. Enable metrics capture on the 15.0 server either at the server level with sp_configure "enable
metrics capture", 1 or at the session level with set metrics_capture on
6. Execute the same module as in #2
7. Turn off AQP dump in ASE 15
8. bcp in the 12.5 data into the scratch database (create a table called queryplans_125)
9. Copy the ASE 15.0 AQP data into the scratch database (use select/into to create a
table called queryplans_150)
10. Create an index on hashkey, type and sequence for both AQP tables.
11. Either use sp_metrics to back up the sysquerymetrics data or copy it to a table in the scratch
database as well.
12. Run queries to identify plan differences
The SQL queries now change slightly from the AQP method. First we find the list of query plans that have
changed and then compare that list to the queries that execute slower than a desired number.
-- get a list of the queries that have changed
select t.hashkey
into #qpchgs
and t.sequence = f.sequence
and t.text != f.text
union all
select f.hashkey
where f.sequence not in (select t.sequence
union all
select t.hashkey
where t.sequence not in (select f.sequence
go
-- eliminate duplicates
select distinct hashkey
into #qpchanges
from #qpchgs
go
drop table #qpchgs
go
-- now get a list of the slow queries in ASE 15.0 that are a member
-- of the list of changed queries
select hashkey, sequence, exec_min, exec_max, exec_avg,
elap_min, elap_max, elap_avg, lio_min,
lio_max, lio_avg, pio_min, pio_max, pio_avg,
cnt, weight=cnt*exec_avg,
qtext
from <db>..sysquerymetrics -- database under test
where gid = <gid> -- group id sysquerymetrics backed up to
and elap_avg > 2000 -- slow query is defined as avg elapsed time > 2000
and hashkey in (select hashkey from #qpchanges)
Note that this doesn't tell you whether the query ran faster or slower in 15.0 - it merely identifies queries
in 15.0 that exceed some arbitrary execution statistic and that have a query plan change from 12.5.
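The set logic behind the changed-plan list can also be sketched outside the server. The Python below is only an illustration, not part of the original procedure: it assumes the rows of queryplans_125 and queryplans_150 have already been exported as (hashkey, sequence, text) tuples, the function name changed_hashkeys is hypothetical, and the plan texts are made-up sample data. A hashkey present on both servers is reported when any sequence row differs in text or exists on only one side.

```python
from collections import defaultdict


def changed_hashkeys(rows_125, rows_150):
    """Return hashkeys found on both servers whose AQP text differs."""

    def by_key(rows):
        # Group the text rows of each plan: {hashkey: {sequence: text}}
        plans = defaultdict(dict)
        for hashkey, sequence, text in rows:
            plans[hashkey][sequence] = text
        return plans

    old, new = by_key(rows_125), by_key(rows_150)
    changed = set()
    for hashkey in old.keys() & new.keys():
        # A differing text row, or a sequence number present on only one
        # side, both show up as unequal {sequence: text} mappings.
        if old[hashkey] != new[hashkey]:
            changed.add(hashkey)
    return changed


# Made-up sample rows (hashkey, sequence, text) for illustration only.
rows_125 = [(101, 0, "( i_scan idx_a t1 )"), (202, 0, "( t_scan t2 )")]
rows_150 = [(101, 0, "( i_scan idx_a t1 )"), (202, 0, "( i_scan idx_b t2 )"),
            (303, 0, "( t_scan t3 )")]
print(sorted(changed_hashkeys(rows_125, rows_150)))  # [202]
```

Hashkey 303 is not reported because it has no 12.5 counterpart to compare against; the resulting set plays the role of #qpchanges when filtering the slow queries.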
Another very crucial point is that we are comparing AQPs rather than strictly comparing showplans. The
rationale behind this is that comparing the AQPs is less error prone than comparing showplan text.
Changes in formatting (i.e. the vertical bars to line up the levels in ASE 15), new expanded explanations
(DRI checks), and other changes from enhancements in ASE showplan will result in a lot more false
positives of query plan changes. There still is a good probability that some queries will be reported as
different just due to differences in AQP extensions, but it will be far fewer than with comparing showplan
texts. If you are using a tool that uses showplan text, you may want to consider changing the logic -
especially if it is an in-house tool - to use the AQP for the queries instead.
Diagnosing and Fixing Issues in ASE 15.0
This section discusses some tips and tricks to diagnosing problems within ASE 15 as well as some
gotchas.
Using sysquerymetrics & sp_metrics
The sysquerymetrics data is managed via the stored procedure sp_metrics. The ASE 15.0 documentation
contains the syntax for this procedure; however, its use needs a bit of clarification. The key point to
realize is that the query metrics currently being collected are stored in sysquerymetrics with a gid=1. If you
save previous metric data via sp_metrics backup, you must choose an integer higher than 1 that is not
already in use. Typically this is done by simply getting the max(gid) from sysquerymetrics and adding 1.
The full sequence for capturing query metrics is similar to the following:
-- set filter limits as desired (as of 15.0 ESD #2)
exec sp_configure "metrics lio max", 10000
exec sp_configure "metrics pio max", 1000
exec sp_configure "metrics elap max", 5000
exec sp_configure "metrics exec max", 2000
go
--Enable metrics capture

set metrics_capture on
go
-- execute test script

-- go
--Flush/backup metrics & disable

exec sp_metrics 'flush'
go
select max(gid)+1 from sysquerymetrics
go
-- if above result is null or 1, use 2 or higher
exec sp_metrics 'backup', '#'
go
set metrics_capture off
go
--Analyze
select * from sysquerymetrics where gid = #
go
--Drop
exec sp_metrics 'drop', '2', '5'
go
Query metrics can be further filtered by deleting rows with metrics less than those of interest. Remember
that, because this is a system table, you need to turn on allow updates first. Additionally, when dropping
metrics, the begin range and end range must exist. For example, in the above, if the first metrics group was
3, the drop would fail with an error that the group 2 does not exist.
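The two gid rules described here (pick the next unused group above 1 for a backup; make sure both ends of a drop range exist) are easy to get wrong in a script. The following Python sketch is illustrative only: it assumes the list of gids has already been fetched from sysquerymetrics, and both function names are hypothetical helpers rather than anything supplied with ASE.

```python
def next_backup_gid(existing_gids):
    """Pick the group id to pass to sp_metrics 'backup'."""
    # gid 1 holds the metrics currently being collected, so a backup
    # group must be 2 or higher and must not already be in use.
    return max(list(existing_gids) + [1]) + 1


def drop_range_is_valid(existing_gids, begin, end):
    """Both ends of the range must exist, per the text above."""
    gids = set(existing_gids)
    return begin in gids and end in gids


print(next_backup_gid([]))         # 2 (no metrics yet)
print(next_backup_gid([1]))        # 2 (only the active group exists)
print(next_backup_gid([1, 2, 3]))  # 4
print(drop_range_is_valid([1, 3, 4, 5], 2, 5))  # False: group 2 is missing
```

Checking the range before calling sp_metrics 'drop' avoids the "group does not exist" failure described above.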
The biggest gotcha with sysquerymetrics is that similar to AQP capture, sysquerymetrics can consume a
large amount of space in the system segment. For example, in an internal Sybase system, enabling metrics
capture for an hour consumed 1.5GB of disk space. There is a work-around to this if you want to reduce
the impact:
1. Run consecutive 10 or 15 minute metrics capture

2. At end of each capture period:
a. Select max(gid) from sysquerymetrics
b. sp_metrics 'backup', 'gid'
3. Start next capture period
4. Filter previous results (assuming configuration parameters were set higher)
a. i.e. Delete rows with <10000 lio_min, etc.
The second problem with sysquerymetrics is that query parsing/execution will take longer as the query plan
(and execution statistics) are recorded to sysqueryplans. This impact is not recorded as part of the
execution statistics, but may show up when timing batches of SQL between two different servers. The
overhead seen at some customer sites is approximately 1 ASE clock tick (100ms by default) per query - which
can manifest itself as a performance degradation of 10 seconds with 100 queries in a batch. As a result,
batch processing timings should be taken with metrics capture off.
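The 10 second figure follows directly from the per-query overhead. As a quick check of the arithmetic, the Python sketch below multiplies it out; the one-tick-per-query cost is the approximate figure quoted above, not a measured constant, and the function name is hypothetical.

```python
CLOCK_TICK_MS = 100  # default ASE clock tick length cited in the text


def capture_overhead_seconds(queries_in_batch, ticks_per_query=1):
    """Estimate the extra batch time caused by metrics capture."""
    # Assumes roughly one clock tick of overhead per query, as observed
    # at some customer sites; real overhead will vary.
    return queries_in_batch * ticks_per_query * CLOCK_TICK_MS / 1000


print(capture_overhead_seconds(100))  # 10.0
```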
Using sysquerymetrics to support regression testing
Earlier, we saw how sysquerymetrics can be used to help identify queries with changed query plans in 15.0.
Another useful technique is to use sysquerymetrics to support regression testing after changing the
optimization goals, parallel resources/partitioning for a table or employing other ASE 15.0 features such as
function-based indices. At a high level, the steps are as follows:
1. Enable metrics capture for the first/reference run.

2. Run the desired module of the application
3. Backup the reference metrics to gid=2
4. Make the desired configuration changes
5. Repeat steps 1-3, backing up to the next higher gid
Queries that were impacted can be identified by joining sysquerymetrics on the hashkey. For example,
consider the following query:
select r.hashkey, r.exec_avg, m.exec_avg, r.elap_avg, m.elap_avg,
r.lio_avg, m.lio_avg, r.pio_avg, m.pio_avg, r.qtext
from sysquerymetrics r, sysquerymetrics m
where r.gid=2 and m.gid=<#> -- substitute # of current gid
and r.hashkey=m.hashkey
and ((r.exec_avg + (r.exec_avg * 0.1) < m.exec_avg)
or (r.elap_avg + (r.elap_avg * 0.1) < m.elap_avg)
or (r.lio_avg + (r.lio_avg * 0.1) < m.lio_avg)
or (r.pio_avg + (r.pio_avg * 0.1) < m.pio_avg))
You can adjust the condition above to define what you consider a regression. In the above example, we add a
10% tolerance factor to the reference times to avoid impacts by the checkpoint process, etc. (Note: In our
example, we have only one user, and hence we have not added r.uid = m.uid. This can be changed in a real
application if multiple users executing the same module are to be tested.)
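The tolerance test in the query is simple enough to reproduce in a script, for instance when the two metric sets have been exported for offline comparison. The Python sketch below is an illustration under that assumption; the function name regressed and the sample figures are made up, and only the four averaged columns used in the query above are checked.

```python
TOLERANCE = 0.10  # the 10% allowance used in the query above
METRICS = ("exec_avg", "elap_avg", "lio_avg", "pio_avg")


def regressed(reference, current, tolerance=TOLERANCE):
    """Return the metrics where the current run exceeds reference + tolerance."""
    return [m for m in METRICS
            if reference[m] + reference[m] * tolerance < current[m]]


# Made-up rows for one hashkey: reference run (gid=2) vs. current run.
reference = {"exec_avg": 100, "elap_avg": 200, "lio_avg": 5000, "pio_avg": 40}
current = {"exec_avg": 105, "elap_avg": 260, "lio_avg": 5400, "pio_avg": 40}
print(regressed(reference, current))  # ['elap_avg']
```

Only elap_avg is flagged: 260 exceeds 200 plus 10%, while the 5% and 8% increases in exec_avg and lio_avg stay inside the tolerance.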
Using Showplan options
As mentioned earlier, diagnostic trace flags 302, 310, etc. are being deprecated. These trace flags often
produced somewhat cryptic output, the extent of the output could not be controlled and, generally, very little
of the output was usable by the average user. In ASE 15.0, they are being replaced with Showplan options
using the following syntax.
set option <show> <normal/brief/long/on/off>
The list of options includes:
Option               Description
show                 show optional details
show_lop             Shows logical operators used.
show_managers        Shows data structure managers used.
show_log_props       Shows the logical managers used.
show_parallel        Shows parallel query optimization.
show_histograms      Shows the histograms processed.
show_abstract_plan   Shows the details of an abstract plan.
show_search_engine   Shows the details of a search engine.
show_counters        Shows the optimization counters.
show_best_plan       Shows the details of the best QP plan.
show_code_gen        Shows the details of code generation.
show_pio_costing     Shows estimates of physical I/O
show_lio_costing     Shows estimates of logical input/output.
show_elimination     Shows partition elimination.
show_missing_stats   Shows columns with missing stats
A couple of notes about usage:
Some of these require traceflag 3604 (or 3605 if log output desired) for output to be visible
Generally, you should execute 'set option show on' first before other more restrictive options.
This allows you to specify brief or long to override the more general 'show' level of detail.
These may not produce output if query metrics capturing is enabled.
Consider the following scenarios. Let's assume that we want to see the query plan, while making sure we
haven't missed any obvious statistics, and we want to view the abstract plan (so we can do a 'create plan'
and fix the app later if need be). For this scenario, the command sequence would be:
set showplan on
set option show_missing_stats on
set option show_abstract_plan on
go
Let's see how this looks in action by just getting the missing statistics:
1> set option show_missing_stats long
2> go
1> dbcc traceon(3604)
2> go
DBCC execution completed. If DBCC printed error messages, contact a user with
System Administrator (SA) role.
1> select * from part, partsupp

2> where p_partkey = ps_partkey and p_itemtype = ps_itemtype
3> go
NO STATS on column part.p_partkey
NO STATS on column part.p_itemtype
NO STATS on column partsupp.ps_itemtype
NO STATS on density set for E={p_partkey, p_itemtype}
NO STATS on density set for F={ps_partkey, ps_itemtype}
Now let's do something a bit more common. Let's attempt to debug index selection, io costing, etc. the way
we used to with trace flags 302, 310, 315, etc.
dbcc traceon(3604)
set showplan on
set option show on -- get index selectivity
set option show_missing_stats on -- highlight missing statistics
set option show_lio_costing on -- report logical I/O cost estimates
go
To see how these would work in real life, let's take a look at a typical problem of whether to use update
statistics or update index statistics, using show_missing_stats and show_lio_costing as the means to
determine which one is more appropriate.
A key point is that show_missing_stats only reports when there are no statistics at all for the column. If
there are density stats as part of an index but not for the column itself, the column will be considered to not
have statistics - which can be extremely useful for identifying why you should run update index
statistics vs. just update statistics. For example, consider the following simplistic example from
pubs2:
use pubs2
go
delete statistics salesdetail
go
update statistics salesdetail
go
dbcc traceon(3604)
set showplan on
--set option show on
set option show_lio_costing on
go
select *
from salesdetail
where stor_id='5023'
and ord_num='NF-123-ADS-642-9G3'
go
set showplan off
go
set option show_missing_stats off
--set option show off
dbcc traceoff(3604)
go
NO STATS on column salesdetail.ord_num

Beginning selection of qualifying indexes for table 'salesdetail',
Estimating selectivity of index 'salesdetailind', indid 3

stor_id = '5023'
ord_num = 'NF-123-ADS-642-9G3'
Estimated selectivity for stor_id,
selectivity = 0.4310345,
Estimated selectivity for ord_num,
selectivity = 0.1,
scan selectivity 0.07287274, filter selectivity 0.07287274
8.453237 rows, 1 pages
Data Row Cluster Ratio 0.9122807
Index Page Cluster Ratio 0
Data Page Cluster Ratio 1
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)

in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'salesdetailind' on table 'salesdetail' = 1.741512
Estimating selectivity for table 'salesdetail'

Table scan cost is 116 rows, 2 pages,
The table (Allpages) has 116 rows, 2 pages,

Data Page Cluster Ratio 1.0000000
stor_id = '5023'
selectivity = 0.1,
Search argument selectivity is 0.07287274.
The Cost Summary for best global plan:
PARALLEL:
number of worker processes = 3
max parallel degree = 3
min(configured,set) parallel degree = 3
min(configured,set) hash scan parallel degree = 3
max repartition degree = 3
resource granularity (percentage) = 10
FINAL PLAN ( total cost = 61.55964)

Path: 55.8207
Work: 61.55964
Est: 117.3803
QUERY PLAN FOR STATEMENT 1 (at line 1).
1 operator(s) under root
The type of query is SELECT.
ROOT:EMIT Operator
|SCAN Operator
| FROM TABLE
| salesdetail
| Index : salesdetailind
| Forward Scan.
| Positioning by key.
| Keys are:
| stor_id ASC
| ord_num ASC
| Using I/O Size 4 Kbytes for index leaf pages.
| With LRU Buffer Replacement Strategy for index leaf pages.
| Using I/O Size 4 Kbytes for data pages.
| With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 61.
Now, observe the difference in the following (using update index statistics).
use pubs2
go
delete statistics salesdetail
go
update index statistics salesdetail
go
dbcc traceon(3604)
set showplan on
--set option show on
go
select *
from salesdetail
where stor_id='5023'
and ord_num='NF-123-ADS-642-9G3'
go
set showplan off
go
set option show_missing_stats off
--set option show off
dbcc traceoff(3604)
go
Beginning selection of qualifying indexes for table 'salesdetail',
Estimating selectivity of index 'salesdetailind', indid 3

stor_id = '5023'
scan selectivity 0.04101932, filter selectivity 0.04101932
Index Page Cluster Ratio 0
Data Page Cluster Ratio 1

Data Page LIO for 'salesdetailind' on table 'salesdetail' = 1.41739
Estimating selectivity for table 'salesdetail'


stor_id = '5023'
The Cost Summary for best global plan:
PARALLEL:
number of worker processes = 3
max parallel degree = 3
min(configured,set) parallel degree = 3
min(configured,set) hash scan parallel degree = 3
max repartition degree = 3
resource granularity (percentage) = 10

Path: 55.54325
Work: 60.17239
Est: 115.7156
ROOT:EMIT Operator
|SCAN Operator
| FROM TABLE
| salesdetail
| Index : salesdetailind
| Forward Scan.
| Positioning by key.
| Keys are:
| stor_id ASC
| ord_num ASC
| Using I/O Size 4 Kbytes for index leaf pages.
| With LRU Buffer Replacement Strategy for index leaf pages.
| With LRU Buffer Replacement Strategy for data pages.
Note the difference in rows estimated (4 vs. 8), selectivity (0.04 vs. 0.07), and the number of estimated
I/Os (60 vs. 61). While not significant, you also have to remember that this was pubs2 which had a
whopping 116 total rows. This should underscore the need to have updated statistics for all the columns in
an index vs. just the leading column. One aspect of this is that if you drop and recreate indexes, the
statistics collected during index creation are the same as for update statistics - you may want to
immediately run update index statistics afterwards.
Showplan & Merge Joins
In addition to new operators as well as new levels of detail discussed above, showplan also now includes
some additional information that can help diagnose index issues when a merge join is picked. Consider the
following query:
select count(*)
from tableA, tableB
where tableA.TranID = tableB.TranID
and tableA.OriginCode = tableB.OriginCode
and tableB.Status <> 'Y'
Since tableA doesn't have any SARG conditions, the entire table is likely required - and a table scan would
be expected. Assuming a normal pkey/fkey relationship, however, we would expect tableB to use the index
on the fkey relationship {TranID, OriginCode}. Now, let's consider the following showplan output for the
query:
ROOT:EMIT Operator
|SCALAR AGGREGATE Operator

| Evaluate Ungrouped COUNT AGGREGATE.
|
| |MERGE JOIN Operator (Join Type: Inner Join)
| | Using Worktable2 for internal storage.
| | Key Count: 1
| | Key Ordering: ASC
| |
| | |SCAN Operator
| | | FROM TABLE
| | | tableB
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 16 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
| |
| | |SORT Operator
| | | Using Worktable1 for internal storage.
| | |
| | | |SCAN Operator
| | | | FROM TABLE
| | | | tableA
| | | | Table Scan.
| | | | Forward Scan.
| | | | Positioning at start of table.
| | | | Using I/O Size 16 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.
Note that the merge join is only reporting a single join key - despite the fact the query clearly has two!!!
The likely cause of this is our favorite problem with update statistics vs. update index
statistics - with no statistics on OriginCode (assuming it is the second column in the index or
pkey/fkey constraint), the optimizer automatically estimates the number of rows that qualify for the second
join column (OriginCode) by using the magic numbers (10% in this case due to equality). Whether it was
the distribution of data or other factors, the result was an unwanted table scan of tableB. The reason for
this was that in processing the merge join, the outer table was built specifying only a single join key - then
the inner table was sorted by that join key and then the SARG and the other join condition were evaluated
as part of the scan. The optimizer estimated tableB would have fewer rows, likely due to the SARG
condition (Status <> 'Y') - but it is hard to tell from the details that we have.
Now, let's see what happens if we run update index statistics - forcing statistics to be collected on
all columns in the index vs. just the first column.
ROOT:EMIT Operator

|
| | Key Count: 2
| | Key Ordering: ASC ASC
| |
| | |SORT Operator
| | |
| | | | FROM TABLE
| | | | tableA
| | | | Table Scan.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | tableB
| | | Index : tableB_idx
| | | Forward Scan.
| | | Positioning at index start.
| | | Using I/O Size 16 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
Ah ha!!! Now we have 2 columns in the merge join scan! As a result, we see that the join order changed,
and tableB now uses an index. Result: a much faster query.
Since a merge join could be much faster than a nested loop join when no appropriate indexes are available
(in this case - no SARG on tableA, but also see examples later in discussion on tempdb and merge joins),
applications migrating to ASE 15.0 may see a number of bad queries picking the merge join due to poor
index statistics. The initial gut reaction for DBAs will likely be to attempt to turn off merge join. While
a nested loop join may be faster than a bad merge join, it is also likely that it will be worse than a good
merge join. As a result, a key to diagnosing merge join performance issues is to check the number of keys
used in the merge vs. the number of join clauses in the query.
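That check can be automated when many showplans have to be reviewed. The Python sketch below is one possible approach, not a Sybase tool: it assumes the showplan text has been captured to a string, that the "Key Count:" line follows its MERGE JOIN operator line as in the output above, and that the number of join clauses is supplied by the reviewer. The function names are hypothetical.

```python
import re


def merge_join_key_counts(showplan_text):
    """Return the Key Count reported by each MERGE JOIN operator."""
    counts = []
    in_merge_join = False
    for line in showplan_text.splitlines():
        if "MERGE JOIN Operator" in line:
            in_merge_join = True
        elif in_merge_join:
            match = re.search(r"Key Count:\s*(\d+)", line)
            if match:
                counts.append(int(match.group(1)))
                in_merge_join = False
    return counts


def under_keyed(showplan_text, join_clauses):
    """Key counts lower than the number of join clauses in the query."""
    return [c for c in merge_join_key_counts(showplan_text) if c < join_clauses]


# Excerpt of the first showplan shown above.
SHOWPLAN = """\
| |MERGE JOIN Operator (Join Type: Inner Join)
| | Using Worktable2 for internal storage.
| | Key Count: 1
| | Key Ordering: ASC
"""
print(under_keyed(SHOWPLAN, join_clauses=2))  # [1]
```

A non-empty result is a hint to look at the statistics on the trailing join columns before reaching for a switch that disables merge joins.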
Set Statistics Plancost Option
In addition to the optional level of details, ASE 15.0 includes several new showplan options that can make
analysis easier. The first is the 'set statistics plancost on/off' command. For example, given the following
sequence:
dbcc traceon(3604)
go
set statistics plancost on
go
select S.service_key, M.year, M.fiscal_period,count(*)
from telco_facts T,month M, service S
where T.month_key=M.month_key
and T.service_key = S.service_key
and S.call_waiting_flag='Y'
and S.caller_id_flag='Y'
and S.voice_mail_flag='Y'
group by M.year, M.fiscal_period, S.service_key
order by M.year, M.fiscal_period, S.service_key
go
The output is a lava query tree as follows:

Emit
(VA = 7)
12 rows est: 1200
cpu: 500
/
GroupSorted
(VA = 6)
12 rows est: 1200
/
NestLoopJoin
Inner Join
(VA = 5)
242704 rows est: 244857
/ \
Sort IndexScan
(VA = 3) month_svc_idx (T)
72 rows est: 24 (VA = 4)
lio: 6 est: 6 242704 rows est: 244857
pio: 0 est: 0 lio: 1116 est: 0
cpu: 0 bufct: 16 pio: 0 est: 0
/
NestLoopJoin
Inner Join
(VA = 2)
72 rows est: 24
/ \
TableScan TableScan
month (M) service (S)
(VA = 0) (VA = 1)
24 rows est: 24 72 rows est: 24
lio: 1 est: 0 lio: 24 est: 0
pio: 0 est: 0 pio: 0 est: 0
This is effectively a much better replacement for 'set statistics io on'. In fact, for parallel queries, as of
ASE 15.0.1 and earlier, set statistics io on will cause errors, so set statistics plancost on should be used
instead. Note the cpu and sort buffer (bufct) figures reported for the Sort operator above. These are
associated with the new in-memory sorting algorithms. This option lets us compare the estimated logical I/O,
physical I/O and row counts with the actual ones evaluated at each operator. If you see that the estimated
counts are totally off, then the optimizer estimates are completely out of whack. Oftentimes, this may be
caused by missing or stale statistics. Let us take the following query to illustrate this fact. The query is
also being run with the show_missing_stats option.
2> go
1> set option show_missing_stats on

2> go
1> set statistics plancost on

2> go
1> select
2> l_returnflag,
3> l_linestatus,
4> sum(l_quantity) as sum_qty,
5> sum(l_extendedprice) as sum_base_price,
6> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
7> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
8> avg(l_quantity) as avg_qty,
9> avg(l_extendedprice) as avg_price,

10> avg(l_discount) as avg_disc,
11> count(*) as count_order
12> from
13> lineitem
14> where
15> l_shipdate <= dateadd(day, 79, '1998-12-01')
16> group by
17> l_returnflag,
18> l_linestatus
19> order by
20> l_returnflag,
21> l_linestatus
22> go
==================== Lava Operator Tree ====================
Emit
(VA = 4)
4 rows est: 100
cpu: 800
/
Restrict
(0)(13)(0)(0)
(VA = 3)
4 rows est: 100
/
GroupSorted
(VA = 2)
4 rows est: 100

/
Sort
(VA = 1)
60175 rows est: 19858
lio: 2470 est: 284
pio: 2355 est: 558
cpu: 1900 bufct: 21
/
TableScan
lineitem
(VA = 0)
60175 rows est: 19858
lio: 4157 est: 4157
pio: 1205 est: 4157
============================================================
NO STATS on column lineitem.l_shipdate
(4 rows affected)
As you can see, the estimated number of rows is incorrect at the scan level. The query does have a
predicate
l_shipdate <= dateadd(day, 79, '1998-12-01')
and if there are no statistics on l_shipdate, the optimizer falls back on a magic value. In this case, the
magic value gives us an estimated row count of 19858 rows, which is way off from the actual row count of
60175 rows. That may explain why we decided to sort: the sort looks cheaper when the number of rows
streamed into the sorter is estimated to be almost a third of the actual count. The other interesting thing
to note is that the row reduction at the GroupSorted operator is significant. It went down from 60175 to 4
rows. Hence the advantages of the GroupSorted algorithm can easily be outweighed by a hash based grouping
algorithm, which would probably be all cached in memory. Based on the hint that we had from the
show_missing_stats option, we decide to run update statistics on the l_shipdate column.
1> update statistics lineitem(l_shipdate)
2> go
1>
2> select
3> l_returnflag,
4> l_linestatus,

13> from
14> lineitem
15> where
17> group by
18> l_returnflag,
19> l_linestatus
20> order by
21> l_returnflag,
22> l_linestatus
Emit
(VA = 4)
4 rows est: 100
cpu: 0
/
Restrict
(0)(13)(0)(0)
(VA = 3)
4 rows est: 100
/
Sort
(VA = 2)
4 rows est: 100
lio: 6 est: 6
pio: 0 est: 0
cpu: 800 bufct: 16
/
HashVectAgg
Count
(VA = 1)
4 rows est: 100
lio: 5 est: 5
pio: 0 est: 0
bufct: 16
/
TableScan
lineitem
(VA = 0)
60175 rows est: 60175
lio: 4157 est: 4157
pio: 1039 est: 4157
Well, we now see that the estimated row count for the TableScan operator is the same as the actual. This is
great news and also, our query plan has changed to use the HashVectAgg (hash based vector aggregation)
instead of the Sort and GroupSorted combination that was used earlier. This query plan is way faster than
what we got earlier. But we're not done. If you look at the output of the HashVectAgg operator, the
estimated rowcount is 100, whereas the actual row count is 4. We could further improve the statistics,
though this is probably already our best plan. Since the grouping columns are l_returnflag and
l_linestatus, we decide to create a density on the pair of columns.
1> use tpcd
2> go
1> update statistics lineitem(l_returnflag, l_linestatus)
2> go
1>
2> set showplan on
1>
2> set statistics plancost on
3> go
1>
2> select
3> l_returnflag,
4> l_linestatus,
13> from
14> lineitem
15> where
17> group by
18> l_returnflag,
19> l_linestatus
20> order by
21> l_returnflag,
22> l_linestatus
ROOT:EMIT Operator
|RESTRICT Operator
|
| |SORT Operator
| |
| | |HASH VECTOR AGGREGATE Operator
| | | GROUP BY
| | | Evaluate Grouped COUNT AGGREGATE.
| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.
| | | Evaluate Grouped COUNT AGGREGATE.
| | |
| | | | FROM TABLE
| | | | lineitem
| | | | Table Scan.
| | | | With MRU Buffer Replacement Strategy for data pages.
Emit
(VA = 4)
4 rows est: 4
cpu: 0
/
Restrict
(0)(13)(0)(0)
(VA = 3)
4 rows est: 4
/
Sort
(VA = 2)
4 rows est: 4
lio: 6 est: 6
pio: 0 est: 0
cpu: 700 bufct: 16
/
HashVectAgg
Count
(VA = 1)
4 rows est: 4
lio: 5 est: 5
pio: 0 est: 0
bufct: 16
/
TableScan
lineitem
(VA = 0)
60175 rows est: 60175
lio: 4157 est: 4157
pio: 1264 est: 4157
Look at the estimated row count for the HashVectAgg. It is now the same as the actual row count.
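Spotting badly estimated operators by eye gets tedious on larger trees. As a rough aid, the Python sketch below scans captured plancost output for the "N rows est: M" pairs and reports those that differ by at least a chosen factor. It is only an illustration: the function name and the factor of 2 are arbitrary assumptions, and it does not attempt to attach operator names, since the tree layout places sibling operators on the same line.

```python
import re

ROWS = re.compile(r"(\d+)\s+rows\s+est:\s*(\d+)")


def misestimates(plancost_text, factor=2.0):
    """Return (actual, estimated) row pairs that differ by >= factor."""
    flagged = []
    for actual, est in ROWS.findall(plancost_text):
        actual, est = int(actual), int(est)
        if actual == est:
            continue
        low, high = sorted((actual, est))
        # A zero on one side only is always treated as a mismatch.
        if low == 0 or high / low >= factor:
            flagged.append((actual, est))
    return flagged


# Row counts taken from the first lineitem run shown earlier.
SAMPLE = """\
Emit
4 rows est: 100
TableScan
lineitem
60175 rows est: 19858
"""
print(misestimates(SAMPLE))  # [(4, 100), (60175, 19858)]
```

Run against the final tree, where every estimate matches the actual count, the function returns an empty list.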
Query Plans As XML
In ASE 15.0, you can also obtain query plans in XML. This is useful if building an automated tool to
display the query plan graphically such as the dbIsql Plan Viewer. However, another useful technique is to
use an XML query plan to find the last time statistics were updated for a particular table. In the past, the
only means to do this was by using the optdiag utility or by directly querying the systabstats/sysstatistics
system tables. Consider the following example:
$SYBASE/ASE-15_0/bin/optdiag statistics le_01.dbo.part -Usa -P
Server name: "tpcd"
Specified database: "le_01"

Specified table owner: "dbo"
Specified table: "part"
Specified column: not specified
Table owner: "dbo"

Table name: "part"
...................................................
Statistics for column: "p_partkey"
Last update of column statistics: Sep 13 2005 7:51:39:440PM
Range cell density: 0.0010010010010010

Total density: 0.0010010010010010
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Histogram for column: "p_partkey"

Column datatype: integer
Requested step count: 20
Actual step count: 20
Sampling Percent: 0
Step Weight Value
1 0.00000000 <= 0
2 0.05205205 <= 52
.......................................................
Statistics for column: "p_brand"
Last update of column statistics: Sep 13 2005 7:51:39:440PM
Range cell density: 0.0010010010010010

Total density: 0.0010010010010010
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
You can also find out when statistics were last updated by using the query itself. Only the statistics for
the columns deemed useful for the query are reported, along with when they were last updated (whereas
optdiag shows statistics for all indices whether used by the query or not, and requires multiple executions
if more than one table is involved). Consider the following example:
1> set plan for show_final_plan_xml to message on
2> go
1> select count(*) from part where p_partkey > 20
2> go
-----------
979
1> select showplan_in_xml(-1)
2> go
-----------
979
<?xml version="1.0" encoding="UTF-8"?>
<query>
<planVersion> 1.0 </planVersion>
<statementNum>1</statementNum>
<lineNum>1</lineNum>
<text>
<![CDATA[
SQL Text: select count(*) from part where p_partkey > 20
]]>
</text>
<objName>part</objName>
<columnStats>
<column>p_partkey</column>
<updateTime>Sep 13 2005 7:51:39:440PM</updateTime>
</columnStats>
You can also get the above information using the show_final_plan_xml option. Note how the "set plan"
uses the "client" option and traceflag 3604 to get the output on the client side. This is different from how
you need to use the "message" option of "set plan".
2> go
DBCC execution completed. If DBCC printed error messages, contact a user with
System Administrator (SA) role.
1> set plan for show_final_plan_xml to client on
2> go
1> select * from part, partsupp
2> where p_partkey = ps_partkey and p_itemtype = ps_itemtype
3> go
<?xml version="1.0" encoding="UTF-8"?>
<query>
<planVersion> 1.0 </planVersion>
<optimizerStatistics>
<statInfo>
<objName>part</objName>
<missingHistogram>
<column>p_itemtype</column>
</missingHistogram>
<missingDensity>
<column>p_itemtype</column>
</missingDensity>
</statInfo>
<statInfo>
<objName>partsupp</objName>
<missingHistogram>
<column>ps_partkey</column>
<column>ps_itemtype</column>
</missingHistogram>
<missingDensity>
<column>ps_partkey</column>
<column>ps_itemtype</column>
</missingDensity>
</statInfo>
</optimizerStatistics>
The useful aspect of this is that it is a single step operation. Normally, a textual showplan does not provide
this information; consequently, users then need to perform the second step of either using optdiag or
querying the system tables to obtain the last time statistics were updated. By using the XML output, all the
information is available in a single location, which, when added to the ease of parsing XML output vs.
textual, allows tool developers to provide enhanced functionality with ASE 15 that was not easily available
in previous releases.
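To give a feel for how little code such a tool needs, the Python sketch below pulls the missing-statistics information out of a plan document using only the standard library. It is a sketch under assumptions: the element names are taken from the sample output above, the sample document is abbreviated and closed so that it is well-formed, and the function name missing_stats is hypothetical.

```python
import xml.etree.ElementTree as ET


def missing_stats(plan_xml):
    """Map each table to the columns lacking histograms or densities."""
    root = ET.fromstring(plan_xml)
    report = {}
    for info in root.iter("statInfo"):
        table = info.findtext("objName").strip()
        report[table] = {
            "histogram": [c.text.strip()
                          for c in info.findall("missingHistogram/column")],
            "density": [c.text.strip()
                        for c in info.findall("missingDensity/column")],
        }
    return report


# Abbreviated version of the XML shown above, closed to be well-formed.
SAMPLE_PLAN = """<query>
<planVersion> 1.0 </planVersion>
<optimizerStatistics>
<statInfo>
<objName>part</objName>
<missingHistogram><column>p_itemtype</column></missingHistogram>
<missingDensity><column>p_itemtype</column></missingDensity>
</statInfo>
<statInfo>
<objName>partsupp</objName>
<missingHistogram>
<column>ps_partkey</column>
<column>ps_itemtype</column>
</missingHistogram>
</statInfo>
</optimizerStatistics>
</query>"""

for table, cols in missing_stats(SAMPLE_PLAN).items():
    print(table, cols)
```

The same approach extends to the columnStats/updateTime elements if the goal is to report when statistics were last refreshed.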
Fixing SQL Queries using AQP
As mentioned earlier, Sybase introduced AQP in version 12.0. The goal was to allow customers to modify
a query to use a specific query plan without having to alter the code. The general technique, which is
transparent to applications, consists of the following steps:
1. Identify the problem queries in the new version

2. Turn AQP capture on in the previous release and run the queries.
3. Extract the AQP from the ap_stdout group
4. Review and modify the AQPs as necessary (i.e. adjust for partial plan use, etc.)
5. Use create plan to load them into the default ap_stdin group in the newer release
6. Enable abstract plan loading on the new release (sp_configure "abstract plan load", 1)
7. Adjust the abstract plan cache to avoid io during optimization (sp_configure "abstract plan
cache")
8. Retest the queries in the new version to see if the AQP is used
9. Adjust the AQP as necessary to leverage new features that may help the query perform even
better
10. Work with Sybase Technical Support on resolving original optimization issue
Note that this technique is especially useful for getting past a problem that might delay an upgrade beyond
the window of opportunity, or for getting past the show-stoppers that occasionally occur. As mentioned
earlier, more documentation is available in the Performance and Tuning Guide on Optimization and
Abstract Plans. This method also replaces the need to use set forceplan on and, in particular, allows more
control than the join-order enforcement that forceplan implements. By way of illustration, consider the
following use cases.
Forcing an Index Using AQP
Most of us are aware that an index can be forced by adding an index hint to the query itself. For example,
consider the following query and its plan:
1> select count(*) from orders, lineitem where o_orderkey = l_orderkey
2> go

ROOT:EMIT Operator
|NESTED LOOP JOIN Operator (Join Type: Inner Join)

|
| |SCAN Operator
| | FROM TABLE
| | orders
| | Table Scan.
| | Forward Scan.
| | Positioning at start of table.
| |SCAN Operator
| | FROM TABLE
| | lineitem
| | Table Scan.
| | Forward Scan.
This is an example where the lineitem table is being scanned without an index. This may not be the best
available query plan. The query might run faster if it used the index on lineitem called l_idx1. This can be
done by rewriting the query as follows using the legacy force option.
1> select count(*) from orders, lineitem (index l_idx1) where o_orderkey =
l_orderkey
2> go
ROOT:EMIT Operator
|NESTED LOOP JOIN Operator (Join Type: Inner Join)

|
| |SCAN Operator
| | FROM TABLE
| | orders
| | Table Scan.
| | Forward Scan.
|
| |SCAN Operator
| | FROM TABLE
| | lineitem
| | Index : l_idx1
| | Forward Scan.
| | Positioning by key.
| | Keys are:
| | l_orderkey ASC
Though force options are simple, you will soon realize that you cannot do everything with them. In
addition, as we mentioned, they require changing the application code, which, even where it is possible,
takes much longer than simply using an AQP. Let us take the same problem, but this time use an
AQP. To make it easier, we will let ASE generate the AQP for us, edit that, and use it to force an index.
1> set option show_abstract_plan on
2> go
2> go
3> go
The Abstract Plan (AP) of the final query execution plan:
( nl_join ( t_scan orders ) ( t_scan lineitem ) ) ( prop orders ( parallel 1 ) (
prefetch 2 ) (lru ) ) ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )
To experiment with the optimizer behavior, this AP can be modified and then passed to the optimizer using
the PLAN clause:
SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )
Now that we have a starting AP, we can replace the table scans (t_scan) with index accesses and try our
modified plan (indented for readability). To start with, we specify the tables using (prop tablename), as
in:
2> plan
3> "( nl_join
4> ( t_scan orders )
5> ( t_scan lineitem )
6> )
7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )
8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )
What we need to do is to be able to force the index. The option to force an index scan is (i_scan <index
name> <table name>). Let us then rewrite the query with the new AP.
2> plan
3> "( nl_join
5> ( i_scan l_idx1 lineitem )
6> )
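For sites that need to make the same edit to many captured plans, the substitution described above can be scripted. The following Python sketch is only an illustration: the helper name force_index_scan is hypothetical, and it assumes the AP text uses the ( t_scan <table name> ) form shown in the generated plan above.

```python
import re


def force_index_scan(abstract_plan, table, index):
    """Replace ( t_scan <table> ) with ( i_scan <index> <table> ) in an AP string.

    Hypothetical helper: it only rewrites the plan text and does not
    validate the result against ASE.
    """
    pattern = r"\(\s*t_scan\s+" + re.escape(table) + r"\s*\)"
    replacement = "( i_scan {} {} )".format(index, table)
    # A function is used as the replacement so the text is inserted literally.
    return re.sub(pattern, lambda match: replacement, abstract_plan)


# Join portion of the starting AP generated by ASE in the example above.
START_AP = "( nl_join ( t_scan orders ) ( t_scan lineitem ) )"

if __name__ == "__main__":
    print(force_index_scan(START_AP, "lineitem", "l_idx1"))
```

The rewritten text can then be supplied through the plan clause or loaded with create plan, as described in the steps earlier in this section.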
Forcing a join order
Join orders can be forced in the legacy style by using the command set forceplan on. In this case, we
may want to force the join order so that the lineitem table is outer to the orders table, which essentially
boils down to switching the join order of these two tables. This can be achieved through the legacy option
by switching the order of the tables in the from clause of the query.
1> set forceplan on
2> go
1> select count(*) from lineitem, orders where o_orderkey = l_orderkey
2> go
Since we are on the subject of using APs and we know how to get a starting AP, let us see how we can
take an existing AP and modify it. Our starting AP looks like the following:
2> plan
3> "( nl_join
6> )
What we need to do is to be able to force the join order. This can be done by switching the join order in the
AP.
2> plan
3> "( nl_join
4> ( t_scan lineitem)
6> )
Forcing a different join strategy
Things get much more difficult when you have to force a different join strategy. There are session-level
options with which you can turn join strategies on or off, but the best way is to use an AP. Let us start with
the session-level options and say you want to see whether a merge join performs better than a nested loop
join:
1> set nl_join 0
2> go
1> select * from orders, lineitem where o_orderkey = l_orderkey
2> go
( m_join ( i_scan l_idx1 lineitem ) ( sort ( t_scan orders ) ) ) ( prop lineitem (
parallel 1 ) (prefetch 2 ) ( lru ) ) ( prop orders ( parallel 1 ) ( prefetch 2 ) (
lru ) )
the PLAN clause: SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )
ROOT:EMIT Operator
|MERGE JOIN Operator (Join Type: Inner Join)

| Using Worktable2 for internal storage.
| Key Count: 1
| Key Ordering: ASC
|
| |SCAN Operator
| | FROM TABLE
| | lineitem
| | Index : X_NC1
| | Forward Scan.
| | Positioning at index start.
| | Index contains all needed columns. Base table will not be read.
| |SORT Operator
| |
| | |SCAN Operator
| | | FROM TABLE
| | | orders
| | | Table Scan.
| | | Forward Scan.
You can selectively turn join strategies on/off as needed at a session level. The ones that you can
experiment with are:
set nl_join [0 | 1]
set merge_join [0 | 1]
set hash_join [0 | 1]
Using APs, you can start off with the one produced by the optimizer as shown before.
2> plan
3> "( nl_join
6> )
Then you can modify the AP to change the join algorithm from nested loop to merge. Note that a merge join
needs ordering on the joining column. To get the required ordering, use an AP construct called enforce,
which generates the required ordering on the joining column.
2> plan
3> "( m_join
4> (enforce ( t_scan orders ))
5> (enforce ( t_scan lineitem ))
6> )
ROOT:EMIT Operator
|MERGE JOIN Operator (Join Type: Inner Join)

| Using Worktable2 for internal storage.
| Key Count: 1
| Key Ordering: ASC
| |SORT Operator
| | |SCAN Operator
| | |FROM TABLE
| | |orders
| | |Table Scan.
| | |Forward Scan.
| | |Positioning at start of table.
| | |
| | |SCAN Operator
| | | FROM TABLE
| | | lineitem
| | | Table Scan.
| | | Forward Scan.
Note the fact that orders table is sorted, while the lineitem table is not. The enforce construct will find out
the right ordering required to do the join.
Forcing different subquery attachment
Changing subquery attachment is applicable only to correlated subqueries that cannot be flattened.
The main reason to change the subquery attachment is usually to reduce the number of times a subquery
is evaluated. Let us take a three-table join as shown below and use the same trick as before to get the AP.
Only the skeletal plan output is shown; it shows that the subquery is attached after the join of the three
outer tables is performed. This is highlighted in the showplan output.
1> select count(*)
2> from lineitem, part PO, customer
3> where l_partkey = p_partkey and l_custkey = c_custkey
4> and p_cost = (select min(PI.p_cost) from part PI where PO.p_partkey =
PI.p_partkey)
5> go

( scalar_agg ( nested ( m_join ( sort ( m_join ( sort ( t_scan customer ) ) ( sort
(t_scan lineitem ) ) ) ) ( i_scan part_indx (table (PO part ) ) ) ( subq (
scalar_agg ( t_scan (table (PI part ) ) ) ) ) ) ) (prop customer ( parallel 1 ) (
prefetch 2 ) ( lru ) ) ( prop (table (PO part)) ( parallel 1 ) (prefetch 2 ) ( lru
) ) (prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) ) ( prop (table(PI part))
( parallel 1 ) (prefetch 2 ) ( lru ) )
the PLAN clause: SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )
ROOT:EMIT Operator

|
| |SQFILTER Operator has 2 children.
| | |MERGE JOIN Operator (Join Type: Inner Join)
| | |
| | | |SORT Operator
| | | | Using Worktable4 for internal storage.
| | | |
| | | | |MERGE JOIN Operator (Join Type: Inner Join)
| | | | |
| | | | | |SORT Operator
| | | | | | Using Worktable1 for internal storage.
| | | | | |
| | | | | | |SCAN Operator
| | | | | | | FROM TABLE
| | | | | | | customer
| | | | | | | Table Scan.
| | | | | |SORT Operator
| | | | | | Using Worktable2 for internal storage.
| | | | | |
| | | | | | | FROM TABLE
| | | | | | | lineitem
| | | | | | | Table Scan.
| | | | FROM TABLE
| | | | part
| |
| | Run subquery 1 (at nesting level 1).
| |
| | QUERY PLAN FOR SUBQUERY 1 (at nesting level 1 and at line 4).
| |
| | Correlated Subquery.
| | Subquery under an EXPRESSION predicate.
| |
| | |SCALAR AGGREGATE Operator
| | | Evaluate Ungrouped MINIMUM AGGREGATE.
| | |
| | | | FROM TABLE
| | | | part
| |
| | END OF QUERY PLAN FOR SUBQUERY 1.
Another thing that may help your understanding is the operator tree. You can get it by using trace flag 526
or set statistics plancost. Trace flag 526 outputs the lava operator tree without the cost (as illustrated
below). However, having the cost available is useful for determining whether the AP you are forcing is
efficient, so set statistics plancost on is the recommended approach. Note the location of the aggregate
on PI (highlighted).
                      Emit
                    (VA = 12)
                       /
                  ScalarAgg
                    Count
                  (VA = 11)
                     /
                 SQFilter
                 (VA = 10)
                /         \
        MergeJoin          ScalarAgg
        Inner Join            Min
        (VA = 7)            (VA = 9)
        /       \              /
     Sort     TableScan    TableScan
   (VA = 5)   part(PO)     part(PI)
              (VA = 6)     (VA = 8)
      /
  MergeJoin
  Inner Join
  (VA = 4)
   /     \
 Sort     Sort
(VA = 1) (VA = 3)
  /        /
TableScan TableScan
customer  lineitem
(VA = 0)  (VA = 2)
============================================================
This query plan may not be optimal. The subquery depends on the outer table part (PO), which means
that it can be attached anywhere after this table has been scanned. Let us assume that the correct join order
has part (PO) as the outermost table, followed by lineitem and then customer as the innermost
table. Let us also assume that we need to attach the subquery to the scan of table PO. This can be achieved
by starting with the AP produced in the previous example and then modifying it to suit our needs.
1> select count(*)
2> from lineitem, part PO, customer
3> where l_partkey = p_partkey and l_custkey = c_custkey

4> and p_cost = (select min(PI.p_cost) from part PI where PO.p_partkey =
PI.p_partkey)
5> plan
6> "(scalar_agg
7> (m_join
8> (sort
9> (m_join
10> (nested
11> (scan (table (PO part)))
12> (subq (scalar_agg (scan (table (PI part)))))
13> )
14> (sort
15> (scan lineitem)
16> )
17> )
18> )
19> (sort
20> (scan customer)
21> )
22> )
23> )
24>(prop customer ( parallel 1 ) ( prefetch 2 ) ( lru ) )
25>(prop (table (PO part)) ( parallel 1 ) (prefetch 2 ) ( lru ) )
26>(prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )
27>(prop (table(PI part)) ( parallel 1 ) (prefetch 2 ) ( lru ) )
28> go
                    Emit
                  (VA = 12)
                     /
                ScalarAgg
                  Count
                (VA = 11)
                   /
               MergeJoin
               Inner Join
               (VA = 10)
              /         \
          Sort           Sort
        (VA = 7)       (VA = 9)
           /              /
      MergeJoin       TableScan
      Inner Join      customer
      (VA = 6)        (VA = 8)
     /        \
 SQFilter      Sort
 (VA = 3)    (VA = 5)
 /      \        /
TableScan ScalarAgg TableScan
part(PO)    Min     lineitem
(VA = 0)  (VA = 2)  (VA = 4)
            /
        TableScan
        part(PI)
        (VA = 1)
============================================================
ROOT:EMIT Operator

|
| | Key Count: 1
| |
| | |SORT Operator
| | |
| | | |MERGE JOIN Operator (Join Type: Inner Join)
| | | | Key Count: 1
| | | | Key Ordering: ASC
| | | |
| | | | |SQFILTER Operator has 2 children.
| | | | |
| | | | | |SCAN Operator
| | | | | | FROM TABLE
| | | | | | part
| | | | | | PO
| | | | | | Table Scan.
| | | | | | Forward Scan.
| | | | |
| | | | | Run subquery 1 (at nesting level 1).
| | | | |
| | | | | QUERY PLAN FOR SUBQUERY 1 (at nesting level 1 and at line 4).
| | | | |
| | | | | Correlated Subquery.
| | | | | Subquery under an EXPRESSION predicate.
| | | | |
| | | | | |SCALAR AGGREGATE Operator
| | | | | | Evaluate Ungrouped MINIMUM AGGREGATE.
| | | | | |
| | | | | | | FROM TABLE
| | | | | | | part
| | | | | | | PI
| | | | | | | Table Scan.
| | | | | | | Forward Scan.
| | | | | END OF QUERY PLAN FOR SUBQUERY 1.
| | | |
| | | | |SORT Operator
| | | | | Using Worktable1 for internal storage.
| | | | |
| | | | | | lineitem
| | | | | | Table Scan.
| | |SORT Operator
| | |
| | | | FROM TABLE
| | | | customer
| | | | Table Scan.
Note the shift in the location of the subquery as highlighted above.
Query Processing FAQ:
Is there a switch or trace flag that re-enables the old 12.5 optimizer?
This is a common myth that seems to surface with every new release. It was given further credence
because, during the early beta stages of ASE 15, both optimizers were available for testing purposes. As
with any GA release, only a single codeline is released into the product. Certain new features or changes
may be influenced by a trace flag; however, in many cases doing so impacts queries that were helped by the
change. Consequently, using the AQP feature is a more targeted approach.
Is there a tool that can automatically find queries that were impacted?
Yes. DBExpert 15.0 includes the migration analyzer, which compares the query plans and execution
statistics between queries in 12.5 and 15.0, identifying which queries were changed and what the impact
was.
Should I still use trace flags 302, 310, etc. to diagnose optimizer issues?
No. These trace flags have been replaced by showplan options. In addition to not requiring sa_role,
the new options provide much clearer output as well as the ability to control the amount of detailed output.
In a future release, trace flags 302, 310, etc. will be fully deprecated and inoperable.
Why doesn't set statistics io on work in parallel mode?
In early ASE 15.0 releases, this was not supported; support was added in ASE 15.0 ESD #2
(released in May 2006). However, a more accurate picture can be obtained by using set statistics plancost
on, which shows the parallel access nodes in the lava operator tree.
Query result not sorted when doing group by queries.
In pre-15.0 releases, the grouping algorithm creates a work table with a clustered index, and grouping
works by insertion into that work table. In 15.0, the grouping algorithm has been changed to use a
hash-based strategy, which does not generate a sorted result set. To get a sorted result set, queries
have to be changed to include an order by clause. Note that this change is in line with ANSI SQL
standards.
I have heard that joins between columns of different compatible data types have been solved in
15.0. Do we still need trace flag 291 in 15.0?
No. You must not use trace flag 291; if you do, you could get wrong answers. ASE 15.0 has improved
algorithms to take care of joins between compatible but different data types. There is no problem using
indices even if the SARGs are of different datatypes.
My customer uses "set tablecount". Is that still supported in ASE 15.0 ?
No. The option "set tablecount" is obsolete in 15.0. This is because we can use a more
sophisticated cost based pruning and optimization timeout mechanism.
I see merge join being chosen despite "enable sort-merge join and JTC" turned off.
In general, "enable sort-merge join and JTC" is not supported in 15.0. If you do want to turn "sort-merge"
join off, you have to do it at a session level using the command "set merge_join 0" or use an
optimization goal that disables merge join, such as "allrows_oltp". However, the reason that ASE is picking
a merge join (when it appears it shouldn't) over a nested-loop join is likely that it cannot use an
index because of poor or missing statistics, or because not all the necessary columns are covered. Before
attempting to disable the merge join, you may want to look at the query diagnostics to determine the best
course of action. Arbitrarily disabling merge joins may detrimentally affect queries that can benefit from
them, and resolving the underlying cause may achieve much better performance than simply disabling
merge joins.
When I add an order by clause to my query, it runs orders of magnitude slower.
This is possible if you have DSYNC turned on for tempdb. In one instance, we measured run times of
20 minutes with the DSYNC option turned off versus 7 hours with it turned on. In general, you
should try to turn DSYNC off. Here is a sample set of commands that will help you turn DSYNC off
for your database. You need to adapt them to your environment. In the first case, we show how the
default tempdb is created on new devices with DSYNC turned off.
USE master
Go
DISK INIT name = 'tempdbdev01', physname = '/tempdb_data' , size = '4G', dsync = 'false'
Go
DISK INIT name = 'tempdblogdev01', physname = '/tempdb_log', size = '4G', dsync = 'false'
Go
ALTER DATABASE tempdb ON tempdbdev01 = '4G' LOG ON tempdblogdev01 = '4G'
Go
USE tempdb
Go
EXEC sp_dropsegment 'logsegment', 'tempdb', 'master'
go
EXEC sp_dropsegment 'system', 'tempdb', 'master'
go
EXEC sp_dropsegment 'default', 'tempdb', 'master'
go
If you already have devices established for tempdb, you merely have to turn the DSYNC
property off. You will have to restart ASE.
EXEC sp_deviceattr 'tempdbdev01', 'dsync', 'false'
Go
EXEC sp_deviceattr 'tempdblogdev01', 'dsync', 'false'
go
I think we have discovered a bug in the optimizer, what information should we collect?
Query processing problems can be of several types. In this case, you are seeing either a stack trace in the
errorlog (possibly coupled with a client disconnect) or degraded performance (either from an unknown
cause, or where a correct plan can be obtained by forcing the query). It is imperative that you isolate the
problem query. Once you have done that, collect the following output (remember, Sybase has
non-disclosure agreements with all customers, which may alleviate concerns about providing
business-sensitive data):
Preferably, Sybase would like to have a full database dump. However, if not available, the
full schema of the tables involved, stored procedure source code, and bcp extraction of the
data is the next best option. While some issues may be resolvable without this
information, such fixes are not always guaranteed as they are based on "guesses" of what is
happening. By having a copy of the data, the exact data cardinality, data volumes, data
selectivity that go into the optimizer costing algorithms is available not only for problem
detection but also for testing the resolution ensuring that when a fix is provided, you can be
certain that it does fix the problem.
Get the output of ddlgen. If you can't provide the data, then the full schema for all tables,
indices, triggers and procedures involved will be needed.
Get the output of optdiag including simulate mode if used. If the data can not be provided,
optimizer engineering will need to have some notion of the data volumes, cardinalities and
selectivity involved. If you were able to influence the optimizer using simulated statistics,
send those as well.
Force the good plan and collect the following information on it as well as the bad plan
o set statistics plancost on
o set statistics time on
o set option show long (this output could be huge. You do not need the output of trace
flags 302/310 anymore; however, you may need trace flag 3604 enabled)
o set showplan on (with traceflag 526 turned on)
We are getting the wrong answers from ASE to certain queries
There are two situations here: with or without parallelism involved. Without parallelism, Sybase
engineering would ideally like to be able to determine the problem using your data, so once again,
whatever you can provide in terms of database dumps or bcp'd extracts of the data is most helpful.
In addition, collect the output after enabling the following options and running the query:
set option show_code_gen on

dbcc traceon(201)
If the problem occurs only in parallel queries and not when the query is run serially, add the following
to the options listed above:
set option show long

set option show_parallel long
If proper partition elimination is not happening, add the following option to those above.
set option show_elimination long
Storage & Disk IO Changes
One of the other major changes in ASE 15.0 was in the area of storage and disk i/o. The two main features
were the Very Large Storage Support (VLSS) and DIRECTIO implementations. As a result of these
changes, there will likely be impacts on maintenance applications and tools - especially DBA scripts that
monitor space usage. These impacts will be discussed later, while this section focuses on the higher-level
changes brought on by these two changes.
Very Large Storage Support
In pre-15.0 releases of Adaptive Server, a virtual page was described internally as a 32-bit integer: the first
byte holds the device number (the vdevno) and the succeeding three bytes describe the page offset within
the device in units of 2K bytes (the virtual page number). This is illustrated below:
Figure 6 - ASE logical page & device bitmask prior to ASE 15.0
This architecture limited the number of devices to 256 and the size of each device to 32 gigabytes, which
makes a maximum storage limit of 8 terabytes in the entire server. While the upper bound of the storage
size typically was not an issue, several customers were hitting issues with the limit on the number of
devices - largely because the device size restriction forced more devices to be created than would have been
necessary had ASE supported larger device sizes.
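The limits quoted above follow directly from the 32-bit layout. The Python sketch below decodes a virtual page value and recomputes the limits. It assumes that the "first byte" is the high-order byte of the integer and that sizes are binary (1K = 1024 bytes); the function name split_virtual_page is hypothetical.

```python
PAGE_UNIT = 2 * 1024  # bytes per virtual page unit (2K)


def split_virtual_page(vpage):
    """Split a 32-bit virtual page value into (vdevno, page offset).

    Assumption: the device number occupies the high-order byte.
    """
    vdevno = (vpage >> 24) & 0xFF   # first byte: device number
    offset = vpage & 0xFFFFFF       # remaining three bytes: page offset
    return vdevno, offset


max_devices = 2 ** 8                              # 256 devices
max_device_bytes = (2 ** 24) * PAGE_UNIT          # 32 gigabytes per device
max_server_bytes = max_devices * max_device_bytes  # 8 terabytes per server

if __name__ == "__main__":
    print(split_virtual_page(0x03000010))
    print(max_device_bytes // 1024 ** 3, "GB per device")
    print(max_server_bytes // 1024 ** 4, "TB per server")
```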
ASE 15.0 separated the device number and logical page IDs into two separate fields, allowing the server to
address 2 billion devices, each containing 2 billion logical pages. This means that ASE now supports
devices up to 4TB in size, and a single database could theoretically have 2 billion 16K pages for a total of
32TB. The server has a maximum of 1 exabyte based on a theoretical limit of 32,767 databases of 32TB each.
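These figures can be checked with a few lines of arithmetic. The Python sketch below assumes that "2 billion" means 2^31, that device sizes are counted in 2K units as in earlier releases, and that the database limit is 32,767 (the count needed to arrive at roughly 1 exabyte); all three are assumptions made for the calculation.

```python
KB = 1024
TB = 1024 ** 4
EB = 1024 ** 6

PAGES = 2 ** 31                 # "2 billion", assumed to be a signed 32-bit range
MAX_DATABASES = 32767           # assumed database limit

device_bytes = PAGES * 2 * KB       # device size counted in 2K units
database_bytes = PAGES * 16 * KB    # database built from 16K logical pages
server_bytes = MAX_DATABASES * database_bytes

if __name__ == "__main__":
    print(device_bytes // TB, "TB per device")
    print(database_bytes // TB, "TB per database")
    print(round(server_bytes / EB, 5), "EB per server")
```

Under these assumptions the server total comes out just under 1 exabyte, which matches the rounded figure in the text.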
On a more practical level, the question that may be asked is how many devices a database should
have at a minimum. This question often arises when DBAs are working with storage administrators who
are trying to over-simplify administration by striping all the available devices into one single large device,
which is a suboptimal configuration. The number of devices to create for ASE
should be based on the following criteria:
Identify all the tables you expect to have heavy IO on (likely about 10-20 per production
system). These tables should be created on separate devices that map to separate LUNs along
with separate devices/LUNs for the indexes on these tables. This does not mean one device
per index, but likely one device for all the indices for one table at a minimum.
Other tables can be spread across those devices or other devices as desired. However, the
transaction log for each database should have a separate device as it will likely be the heaviest
hit device within OLTP databases.
As far as system databases, each tempdb should have 2 or more devices. Other devices as
necessary for master, sybsystemprocs, etc.
There are several reasons behind these suggestions. First, there is a single pending IO queue
and a single device semaphore for each device within ASE. If all the tables are on a single device, the
writes will be queued in order, possibly delaying unrelated transactions. If the IO is delayed, it needs to
grab the device semaphore, which could lead to higher internal contention within ASE than necessary.
The second reason for this suggestion is that by separating databases on different devices/LUNs when
multiple databases are within the same server, DBAs can work with storage administrators to exploit SAN
utilities to quickly create copies of a database for development testing or for HA implementations using
quiesce database or mount/unmount database commands. Since the device copying/relocation is done at
the physical drive level, having multiple databases spanning the same devices could result in wasted disk
space when the device is copied to the development or DR site.
DIRECTIO Support & FileSystem Devices
A commonly asked question is whether Sybase recommends file systems or raw partition devices. The
answer is that it depends on a number of factors such as OS file system implementation as well as the
application IO profile, available OS resources such as memory & cpu and OS tuning for file system cache
limits and swapping preferences. Generally, in the past, Sybase has stated that file system devices behave
well for read operations particularly large reads where the file system read-ahead can outpace even ASE
asynchronous prefetch capabilities, whereas raw partitions did better for write activity especially in high
concurrency environments.
In ASE 12.0, Sybase introduced the device 'dsync' attribute, which implemented DSYNC I/O for file
system devices. A common misconception was that this bypassed the file system buffer to ensure
recoverability. In actuality, it still used the filesystem buffer, but forced a flush after each file system write.
This double buffering in both ASE and the file system cache plus the flush request caused slower response
times for writes to file system devices than raw partitions. In ASE 15.0, Sybase has added DIRECTIO
support via the 'directio' device attribute to overcome this problem. Internal tests have shown a substantial
performance improvement in write activity with devices using DIRECTIO vs. devices using DSYNC.
Currently, as of ASE 15.0 GA release, DIRECTIO is only supported on the following platforms.
Sun Solaris
IBM AIX
Microsoft Windows
Other operating systems may be added in later releases or updates to ASE 15.0. There are several very
important points about DIRECTIO that should be considered:
On operating systems that support it, you may have to tune the OS kernel, mount the file
system with special options (i.e. the forcedirectio mount option in Solaris) as well as make
sure the OS patch levels are sufficient for high volume of DIRECTIO activity.
DIRECTIO and DSYNC are mutually exclusive. If using DSYNC currently and you wish to
use DIRECTIO, you will first have to disable the 'dsync' attribute. Note that changing a
device attribute requires rebooting the ASE.
Make sure that the memory requirements for filesystem caching, as well as the CPU requirements
for changes in OS processing of IO requests, do not detract from ASE resources. In fact,
when using file system devices, it is likely that you will need to leave at least 1 or 2 cpus
available to the OS to process the write requests (reverting to the old N-1 advice for the
maximum number of engines).
It is extremely important that, before switching a device between raw, DSYNC and DIRECTIO, you
test your application while scaling the user load to 25%, 50%, 75% and 100%. The reason for testing
multiple scenarios is that you can see whether the scaling is linear or degrades as you get closer to 100%
load. If performance starts flattening and you later need to increase the load because of a larger user
population, you may need to add more resources to ASE.
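One simple way to judge whether scaling is linear is to compare the throughput at each load level against what perfect scaling from the lowest level would predict. The Python sketch below does this; the function name scaling_efficiency and the sample throughput numbers are hypothetical and are not measurements from any ASE system.

```python
def scaling_efficiency(measurements):
    """Return {load_percent: efficiency} relative to the lowest load level.

    measurements maps load percentage to measured throughput
    (for example, transactions per second). A value of 1.0 means
    perfectly linear scaling; lower values mean throughput is flattening.
    """
    base_load = min(measurements)
    base_throughput = measurements[base_load]
    return {
        load: (throughput / base_throughput) / (load / base_load)
        for load, throughput in sorted(measurements.items())
    }


# Hypothetical test results for one device configuration.
sample = {25: 1000.0, 50: 1950.0, 75: 2700.0, 100: 3200.0}

if __name__ == "__main__":
    for load, efficiency in scaling_efficiency(sample).items():
        print(load, round(efficiency, 3))
```

Running the same calculation for each device configuration (raw, DSYNC, DIRECTIO) makes it easier to see which one degrades first as the load approaches 100%.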
Tempdb & FileSystem devices
For sites running ASE with tempdb on a filesystem on SMP hardware, DIRECTIO may offer a benefit
even over having dsync off. Again, please test your application at varying loads with different tempdb
device configurations. It is likely that dsync off will have an advantage on smaller SMP systems with
lower user concurrency in tempdb or with fewer, larger temp tables, while enabling directio may have an
advantage on larger SMP systems or on systems with high concurrency in tempdb, smaller temp tables and
higher write activity. Note that as of ASE 12.5.0.3, you can have multiple tempdbs. Consequently, it may be
beneficial to have one tempdb running on file system devices with dsync off for the batch reports at night,
whereas the tempdbs for daytime OLTP processing may use raw partitions or file system devices with
DIRECTIO.
In addition, having dsync on for tempdb may still dramatically degrade the performance of queries that
involve sorting, such as queries with an 'order by' clause. In one instance, we measured run times of
20 minutes with the DSYNC option turned off versus 7 hours with it turned on. In general, you should try
to turn DSYNC off or use DIRECTIO. Here is a sample set of commands that will help you turn DSYNC
off for your database. You need to adapt them to your environment. In the first case, we show how the
default tempdb is created on new devices with DSYNC turned off.
USE master
Go
DISK INIT name = 'tempdbdev01', physname = '/tempdb_data' , size = '4G', dsync = 'false'
Go
DISK INIT name = 'tempdblogdev01', physname = '/tempdb_log', size = '4G', dsync = 'false'
Go
ALTER DATABASE tempdb ON tempdbdev01 = '4G' LOG ON tempdblogdev01 = '4G'
Go
USE tempdb
Go
EXEC sp_dropsegment 'logsegment', 'tempdb', 'master'
go
EXEC sp_dropsegment 'system', 'tempdb', 'master'
go
EXEC sp_dropsegment 'default', 'tempdb', 'master'
go
If you already have devices established for tempdb, you merely have to turn the DSYNC
property off. You will have to restart ASE.
EXEC sp_deviceattr 'tempdbdev01', 'dsync', 'false'
Go
EXEC sp_deviceattr 'tempdblogdev01', 'dsync', 'false'
go
DIRECTIO can be enabled in a similar fashion, either by using disk init or by using the sp_deviceattr
stored procedure. Any change to the device attribute requires an ASE restart.
Running tempdb on a Unix filesystem device without DIRECTIO may also cause considerable swapping
and other activity due to UFS cache utilization. Without DIRECTIO, you will use the UFS cache
whether DSYNC is on or off. Consequently, you should make sure that ASE + UFS cache size is less than
85% of physical memory. You should also make sure the UFS cache is constrained to the desired size
rather than left at the default, which is often unconstrained. Constraining the UFS cache size typically
requires using the operating system's kernel tuning facilities, so you will need to work with your system
administrator.
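The 85% guideline is easy to check when planning memory. The Python sketch below is a minimal illustration: the function name memory_headroom_ok and the sample sizes are hypothetical, and the actual ASE memory and UFS cache limits must come from your own configuration.

```python
GB = 1024 ** 3


def memory_headroom_ok(ase_bytes, fs_cache_bytes, physical_bytes, limit=0.85):
    """Return True if ASE memory plus file system cache stays below the limit.

    The default limit of 0.85 reflects the 85% guideline above.
    """
    return (ase_bytes + fs_cache_bytes) < limit * physical_bytes


if __name__ == "__main__":
    # Hypothetical server with 64 GB of physical memory.
    print(memory_headroom_ok(40 * GB, 8 * GB, 64 * GB))
    print(memory_headroom_ok(50 * GB, 8 * GB, 64 * GB))
```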
Additionally, if using SAN disk block replication, do NOT under any circumstances include tempdb
devices in the disk replication volume group. Doing so will cause extreme performance degradation to
queries involving sort operations as well as normal temp table write activity. If the storage admins insist on
block replicating the entire SAN, purchase a couple of high speed internal disks and place them directly in
the server cabinet (or in a direct attached storage bay).
Changes to DBA Maintenance Procedures
The purpose of this section is to highlight changes in ASE 15.0 that affect DBA utilities, as well as to point
out how some of the changes discussed earlier may impact DBA procedures. Perhaps the biggest impact is
due to the VLSS implementation and table partitioning, and their respective effects on DBA scripts
for space utilization.
Space Reporting System Functions
Perhaps the most noticeable change is that the system functions that formerly were used to report space
usage have been deprecated and replaced with partition-aware variants. The following table lists these
functions, along with the syntax changes.
ASE 12.5                                     ASE 15.0
data_pgs(object_id, {doampg | ioampg})       data_pages(dbid, object_id [, indid [, ptn_id]])
used_pgs(object_id, doampg, ioampg)          used_pages(dbid, object_id [, indid [, ptn_id]])
reserved_pgs(object_id, {doampg | ioampg})   reserved_pages(dbid, object_id [, indid [, ptn_id]])
rowcnt(sysindexes.doampg)                    row_count(dbid, object_id [, ptn_id])
ptn_data_pgs(object_id, partition_id)        (data_pages())
The big change that is apparent in the above functions is the replacement of sysindexes.doampg or ioampg
with indid and partition_id. The change that is not as apparent is that whereas before the functions were
used with scans of sysindexes, now they are used with scans of syspartitions likely joined with
sysindexes. For example:
-- ASE 12.5 logic to report the space used by nonclustered indices
select name, indid, used_pgs(id, doampg, ioampg)
from sysindexes
where id=object_id('authors')
and indid > 1
In ASE 15.0, this changes to the following:
-- ASE 15.0 logic to report the space used by nonclustered indices
select i.name, p.indid, used_pages(db_id(), p.id, p.indid)
from sysindexes i, syspartitions p
where i.id=object_id('authors')
and i.indid > 1
and p.indid > 1
and p.id=i.id
and p.id=object_id('authors')
and p.indid=i.indid
order by p.indid
This query looks redundant until you realize that storage is now linked to the syspartitions table rather
than sysindexes (which is the logical place for it). If you are trying to gauge space utilization on a
partition basis, you would likely run queries such as:
-- ASE 15.0 logic to report the space used by nonclustered indices on a partition basis
select p.name, i.name, p.indid, used_pages(db_id(), p.id, p.indid, p.partitionid)
from sysindexes i, syspartitions p
where i.id=object_id('authors')
and i.indid > 1
and p.indid > 1
and p.id=i.id
and p.id=object_id('authors')
and p.indid=i.indid
order by p.partitionid, p.indid
Note that the deprecated ASE 12.x functions still execute, but they always return a value of 0. Part of the
reason is that they rely on sysindexes.doampg and sysindexes.ioampg, which are no longer maintained.
While syspartitions has similar structures in the columns datoampage and indoampage, these values are
kept on a per-partition basis; consequently, index space usage would have to be aggregated for
partitioned tables.
Sysindexes vs. syspartitions & storage
As mentioned earlier in the discussion of deprecated and new space reporting functions
(data_pgs() vs. data_pages()), disk space allocations in ASE 15.0 are associated with syspartitions rather
than with sysindexes, as was previously the case. The following table identifies the ASE 12.5 space
pointers in sysindexes and the equivalent locations in ASE 15.0 syspartitions.
Space association    ASE 12.5 sysindexes    ASE 15.0 syspartitions
Unique row           id + indid             id + indid + partitionid
First page           first                  firstpage
Root page            root                   rootpage
Data OAM page        doampg                 datoampage
Index OAM page       ioampg                 indoampage
Custom DBA scripts or previous dbcc commands that used these locations will need to be changed to
reflect the current implementation. Published dbcc commands have already been modified. If you used
previously undocumented dbcc commands (such as dbcc pglinkage()), these will likely fail, as they have
been either deprecated or simply not maintained. Such commands were typically only used to perform
detailed problem diagnosis/data recovery steps and should not have been used on a recurring basis. If a
data corruption problem occurs, contact Sybase Technical Support for the correct procedures under ASE 15.0.
Continuing to use undocumented dbcc commands from previous releases in ASE 15.0 is likely to cause
corruption simply due to the change in space association noted here.
VDEVNO column
Another major system change in 15.0 is the lifting of some of the storage limitations. Prior to ASE 15.0,
ASE had a 256 device limit and a 32GB/device size limit. This was driven by the fact that ASE's storage is
organized by virtual pages using a 4 byte bit mask. The high order byte was used to denote the device id
(or vdevno), and consequently the 8 bits limited ASE to 256 devices (8 bits as a signed integer). The
remaining 3 bytes were used to track the actual virtual page numbers, which, considering a 2K page (the
storage factor in sysdevices), provides a limit of 32GB per device (2^24 pages x 2K). Note that
theoretically a 16K server could have larger devices, but due to the default of 2K and the implementation
of sysdevices, a limit of 32GB was imposed. This 32-bit bitmask logical page id resembled the following:
Figure 7 - Pre-ASE 15.x Logical PageID and Vdevno
As illustrated above, prior to ASE 15.0, the vdevno was the high-order byte of the 32 bit page
implementation. Deriving the vdevno often meant that users were using calculations based on 2^24 to
isolate the high-order byte. This technique also had an implied limitation of 255 devices. As an additional
problem, DBAs attempting to associate device fragments from master..sysusages with master..sysdevices
had to join using a between clause based on the high and low virtual page numbers.
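The pre-15.0 layout described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (high-order byte = vdevno, low-order 3 bytes = virtual page number, 2K pages); the function names are hypothetical and nothing here is ASE code.

```python
PAGE_BITS = 24                       # low-order 3 bytes hold the virtual page number
PAGE_MASK = (1 << PAGE_BITS) - 1     # 0x00FFFFFF
PAGE_SIZE_BYTES = 2 * 1024           # 2K storage factor assumed from the text


def make_virtual_page(vdevno, page):
    """Pack a device number and page number into one 32-bit value."""
    return (vdevno << PAGE_BITS) | (page & PAGE_MASK)


def split_virtual_page(vpage):
    """Return (vdevno, page) from a packed 32-bit virtual page number."""
    vpage &= 0xFFFFFFFF              # treat the value as unsigned 32-bit
    return vpage >> PAGE_BITS, vpage & PAGE_MASK


# Largest device that 24 bits of 2K pages can address: 2^24 * 2K = 32GB
max_device_bytes = (1 << PAGE_BITS) * PAGE_SIZE_BYTES

print(split_virtual_page(make_virtual_page(3, 1000)))
print(max_device_bytes // 1024 ** 3, "GB per device")
```

Shifting right by 24 bits is the same operation as the integer division by 2^24 that older DBA scripts used.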
In ASE 15.0, the virtual page number is now two 32-bit integers: one for the device number (vdevno) and
one for the page id itself. The vdevno is also present in sysusages and sysdevices. As a result, DBA scripts
that were previously used to calculate space consumption need to be modified. Consider the following
examples:
-- ASE 12.5 implementation
select d.name, u.size
from sysusages u, sysdevices d
where u.vstart >= d.low
and u.vstart <= d.high
and u.dbid = <database id>
-- ASE 15.0 implementation

select d.name, u.size
from sysusages u, sysdevices d
where u.vdevno = d.vdevno
and u.dbid = <database id>
In addition to the new columns, ASE 15.0 loosens the rule controlling the assignment of vdevno
numbers. Prior to 15.0, the vdevno passed to disk init had to be less than the configured number of
devices. In 15.0 it can be any value up to 2,147,483,647, independent of the currently configured number
of devices. This may or may not be useful for customers. For example, customers can group ASE virtual
devices on the same physical disk by vdevno range:
vdevno 1000 to 1999 on physical disk 1
vdevno 2000 to 2999 on physical disk 2
The same sort of coding can also be used for databases:
vdevno 1000 to 1999 belong to db A
vdevno 2000 to 2999 belong to db B
DBA Maintenance FAQ
Why did Sybase change the data_pgs(), etc. function names? Couldn't they have been left intact
and the new parameters made optional to preserve compatibility?
No. The problem was more than just the parameters. The bigger issue was that the previous functions were
always embedded in queries accessing sysindexes, whereas in ASE 15.0, with the space association being
moved to syspartitions, the queries themselves would have to be rewritten. Additionally, the columns
used for the functions, such as doampg, were no longer appropriate, as the new partitioning scheme
would have multiple OAM pages, one for each partition. As a result, Sybase could not even "maintain"
the sysindexes columns to ensure compatibility. It was therefore decided that, rather than have unreliable
results, the functions would be deprecated and return a value of 0. This would allow applications to
continue working vs. failing outright. Since new functions would be needed, it was also an opportunity
for Sybase to make the function parameters more consistent and easier to use (i.e. using indid and
partitionid vs. the OAM page pointers).
Update Statistics & datachange()
This section highlights some of the differences and questions around update statistics in ASE 15.0.
Automated Update Statistics
ASE 15.0 includes the ability to automate the task of updating statistics only when necessary. This feature
is mentioned in this document because many people are confused by it: they believe that automated
update statistics means the server automatically maintains statistics with each DML operation, and
therefore that running update statistics is no longer necessary. This doesn't happen for a variety of
reasons. The main one is that it could slow down OLTP operations. Even if done outside the scope of the
transaction, since multiple users would likely be attempting to modify the statistics for the same exact
rows in systabstats, the contention would effectively single-thread the system. Another reason is that
incrementally adding rows to a large table might not accurately update the statistics, as the newly
computed density might have the same exact value due to precision loss. For example, take a 1,000,000
row table and add 100,000 rows to it. Let's assume that the table contains a date column and all of the
100,000 rows have today's date. If we add the rows one at a time, the range cell density wouldn't change,
as each new row only amounts to 1/1,000,000th of the table, or .000001. This would be especially true if
fewer than 6 digits of precision were used, as the 6th digit would always be truncated. However, 100,000
rows adds 10% to the overall table size.
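The precision-loss argument can be illustrated with a small Python sketch. It is a deliberately simplified model, assuming a density value that is truncated to 5 decimal digits; the variable names are hypothetical and this is not how ASE stores its statistics internally.

```python
from decimal import Decimal, ROUND_DOWN

TABLE_ROWS = 1_000_000
NEW_ROWS = 100_000
DIGITS = 5                     # assumed precision: fewer than 6 digits


def truncate(value, digits):
    """Drop everything past the given number of decimal digits."""
    quantum = Decimal(1).scaleb(-digits)        # digits=5 -> 0.00001
    return value.quantize(quantum, rounding=ROUND_DOWN)


# One row at a time: each row contributes 1/1,000,000 = 0.000001,
# which truncates to zero, so the running total never moves.
per_row = truncate(Decimal(1) / Decimal(TABLE_ROWS), DIGITS)
incremental_total = per_row * NEW_ROWS

# All rows at once: 100,000/1,000,000 survives truncation.
batch_total = truncate(Decimal(NEW_ROWS) / Decimal(TABLE_ROWS), DIGITS)

print("incremental:", incremental_total)
print("batch:      ", batch_total)
```

The same 100,000 rows are invisible when accounted for individually but show up as a 10% change when measured in one pass, which is what a scheduled update statistics does.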
Instead, the functionality is accomplished using the Job Scheduler template included with ASE 15.0. The
Job Scheduler (JS) engine was a new component added in ASE 12.5.1 (12.5.2 for Windows) that allows
DBAs to schedule recurring jobs from a central management server for multiple ASEs in the environment,
including support for different versions. More documentation on setting up the JS engine and the Sybase
Central interface is found in the ASE Job Scheduler User's Guide. The JS template for update statistics
allows DBAs to very quickly set thresholds for particular tables so that update statistics is run only if
necessary during the scheduled time. A screen snapshot of this template is shown below.
Datachange() function
The automated update statistics feature is based on the new system function datachange(). This function, as
documented, returns the percentage of data modified within the table. As a result, existing update statistics
DBA scripts can be modified to take advantage of this by simple logic such as:
declare @datachange float
select @datachange = datachange("authors", null, null)
if @datachange > 50
begin
    update statistics authors
end
Several notes:
The percentage returned is based on the number of DML operations and the table size. Each
insert or delete counts as 1, while updates count as 2 (as if it were a delete followed by an
insert).
The percentage returned is based on the number of rows remaining in the table. For example,
deleting 500 rows from a 1,000 row table will result in datachange() returning 100%.
The datachange() function parameters are tablename, partition, and column in that order. This
allows DBAs to detect just the change in particularly volatile fields and update the index
statistics for just specific indices vs. all or for a specific partition.
The rationale for reporting a percentage instead of the number of rows is that the number of rows does not
really provide useful information by itself and would only be useful when compared in the context to the
size of the table. If datachange()=5,000, this could be very significant if the table contains 5,100 rows or
insignificant if it contains 500 million. By using a percentage, it makes it easier to establish relative
thresholds in maintenance scripts, etc.
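The counting rules in the notes above can be modelled as follows. This Python sketch only mirrors the rules as stated in this section (inserts and deletes count as 1, updates as 2, divided by the rows remaining); the function name is hypothetical and the server's actual computation may differ in detail.

```python
def datachange_pct(rows_remaining, inserts=0, deletes=0, updates=0):
    """Model of the percentage described above.

    rows_remaining is the row count after the DML has been applied.
    """
    if rows_remaining <= 0:
        raise ValueError("rows_remaining must be positive")
    changes = inserts + deletes + 2 * updates   # an update counts as delete + insert
    return 100.0 * changes / rows_remaining


# Deleting 500 rows from a 1,000 row table leaves 500 rows: 500/500 = 100%
print(datachange_pct(rows_remaining=500, deletes=500))

# The same 5,000 changes are significant on 5,100 rows, negligible on 500 million
print(round(datachange_pct(5_100, inserts=5_000), 2))
print(datachange_pct(500_000_000, inserts=5_000))
```

Because updates are counted twice and the denominator is the current row count, the value can exceed 100%.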
Update Statistics Frequency and ASE 15
Because ASE 15.0 now has several different algorithms for sorting, grouping, unions, joining and other
operations, it is more important that ASE 15 have more up-to-date statistics and more column statistics
than previous releases. This does not necessarily imply that ASE 15 needs to have update statistics run
when 5% of the table has changed whereas in 12.5 you used to wait for 10%. What is implied is that in many
situations, customers would run update statistics on fairly static data such as data in reporting systems.
While ASE 12.5 might not have picked a bad plan, simply because with a single algorithm it was tough to
do so, ASE 15 lacking current statistics may pick one of the new algorithms, which may prove to be
disastrously slow as the actual data volume far exceeds the projected volume based on the stale statistics.
Similarly, as you saw earlier (in the discussion on showplan options in Diagnosing and Fixing Issues in
ASE 15), the availability of column statistics for all the columns in an index vs. just the density
statistics can have a considerable impact on row estimates, which play a crucial role not only in index
selectivity, but also in parallel query optimization decisions, join selectivity, etc. Because of this,
when migrating to ASE 15, the following advice about statistics is provided:
Use update index statistics instead of update statistics.

Use a higher step count and histogram factor to provide more accurate statistics and skew
coverage.
For really large tables, consider partitioning the tables and running update index statistics on
each partition (see the next section).
For extremely large tables or tables not being partitioned, consider running update index
statistics with sampling.
When running update index statistics, consider raising the number of sort buffers via
sp_configure to orders of magnitude higher than for normal execution. This can be done
dynamically and, while it does require more proc cache (or fewer parallel executions of update
index statistics), it can have a profound effect on the time it takes to run update statistics on
really large tables.
Use datachange() to determine when update statistics is necessary. On volatile indexes,
consider running update index statistics on that index individually or perhaps just on the
volatile column vs. all the indexes in the table. As the need to run update index statistics is
more realistically arrived at, the tables could be cycled through the different maintenance
periods vs. doing them all.
Update Statistics on Partitions
The ability to run update statistics on a specific partition should by itself substantially reduce the time
needed to run update statistics. Most large tables that take a long time to run update statistics contain a
lot of historical data. Few, if any, of these historical rows change, yet update statistics must scan them
all. Additionally, when viewed from the entire table's perspective, even 1 million rows added to a 500
million row table is only 0.2%, which would suggest that statistics do not need to be updated. However,
these are likely the rows most often used, and their distribution heuristics are not included in the range
cell densities, etc. As a result, query optimization likely suffers.
If you partition the data, particularly on a date range for the scenario above (e.g. by month), then
older/static data can be skipped when running update statistics (after the first time). Thereafter, you
can use the datachange() function to check the amount of change within the current partition and run
update statistics as necessary. Note that all partitions have a name. If you did not supply one (i.e. you
used the hash partition short cut syntax), ASE provides a default name for you, similar to tempdb tables.
Use sp_help to identify the system supplied names. An example of datachange() based on a system supplied
name and focusing on a specific column is illustrated below:
1> select datachange("mytable","part_1360004845", "p_partkey")
2> go
---------------------------
100.000000
Obviously, this would be a good candidate for running update statistics on the partition listed above.
update statistics mytable partition part_1360004845 (p_partkey)
go
Note the syntax change (the partition clause) allowing update statistics to focus on a specific partition
(and in this case, a specific column).
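To see why a per-partition check catches what a table-level check misses, consider the 500 million row scenario above. The following Python sketch uses made-up partition names and row counts (only the 1 million / 500 million figures come from the text), and the 50% threshold is borrowed from the earlier datachange() example.

```python
# Hypothetical layout: two static historical partitions plus the current month.
partitions = {
    "p_hist_1":  {"rows": 250_000_000, "changed": 0},
    "p_hist_2":  {"rows": 249_000_000, "changed": 0},
    "p_current": {"rows": 1_000_000,   "changed": 1_000_000},
}


def pct(changed, rows):
    return 100.0 * changed / rows


total_rows = sum(p["rows"] for p in partitions.values())
total_changed = sum(p["changed"] for p in partitions.values())
table_level_pct = pct(total_changed, total_rows)

THRESHOLD_PCT = 50.0
needs_update = [
    name for name, p in partitions.items()
    if pct(p["changed"], p["rows"]) > THRESHOLD_PCT
]

print("table level:", table_level_pct, "%")
print("partitions needing update statistics:", needs_update)
```

At the table level the change is diluted to 0.2% and would never cross a sensible threshold, while the current partition alone reports 100%.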
The reduction in time for update statistics may allow you to create statistics or histograms on columns
that were avoided previously due to time constraints. For example:
update statistics mytable (col1, col2)
This creates a histogram on {col1} and densities on the combined {col1,col2} pair. If you think having a
histogram on {col2} will help, updating statistics on a partition basis might now give you the time to do
so. Consequently, the command may now look like:
update statistics mytable partition part_1360004845 (col1, col2)
go
update statistics mytable partition part_1360004845 (col2)
go
Needs Based Maintenance: datachange() and derived_stat()
Prior to ASE 15.0, DBAs often did table level maintenance blindly or in reactionary mode. For
example, they just arbitrarily ran update statistics every weekend or whenever users started complaining
about query performance. Obviously, datachange() will help reduce time spent updating statistics
needlessly.
However, DBAs often dropped and recreated indices on some tables on a regular basis as well, due to
table fragmentation. In a sense, datachange() is the perfect complement to the derived_stat() function
that was added in ASE 12.5.1. The syntax for the derived_stat function has been updated in ASE 15 to also
work on partitioned tables:
derived_stat(object_name | object_id,
index_name | index_id,
[partition_name | partition_id,]
statistic)
The values for statistic are:
Value                                 Returns
data page cluster ratio or dpcr       The data page cluster ratio for the object/index pair
index page cluster ratio or ipcr      The index page cluster ratio for the object/index pair
data row cluster ratio or drcr        The data row cluster ratio for the object/index pair
large io efficiency or lgio           The large I/O efficiency for the object/index pair
space utilization or sput             The space utilization for the object/index pair
The statistic values match those returned from optdiag, and consequently provide useful information for
determining when a reorg might be needed. For instance, some DBAs have found that a reorg should be
done on an index when the index page cluster ratio (ipcr) changes by 0.1 from a known starting point.
One change in this function from 12.5.1 is the addition of the partition argument, which might throw
existing queries off slightly. As a result, the following rules apply:
If four arguments are provided, derived_stat uses the third argument as the partition, and
returns derived statistics on the fourth argument.
If three arguments are provided, derived_stat assumes you did not specify a partition, and
returns derived statistics on the third argument.
This technique provides compatibility for existing scripts that used the derived_stat function.
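The three-versus-four argument rule is easy to misread, so here is a small Python sketch of the dispatch it describes. The function name and the returned dictionary are purely illustrative; they model how the arguments are interpreted, not how ASE implements derived_stat.

```python
def interpret_derived_stat_args(*args):
    """Map positional arguments to their meaning under the rules above."""
    if len(args) == 4:
        obj, index, partition, statistic = args
    elif len(args) == 3:
        obj, index, statistic = args
        partition = None                 # no partition was specified
    else:
        raise TypeError("expected 3 or 4 arguments, got %d" % len(args))
    return {"object": obj, "index": index,
            "partition": partition, "statistic": statistic}


# Pre-15.0 style call: the third argument is the statistic
print(interpret_derived_stat_args("mytable", 2, "ipcr"))

# 15.0 style call: the third argument is the partition, the fourth the statistic
print(interpret_derived_stat_args("mytable", 2, "part_1360004845", "ipcr"))
```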
By combining datachange() and derived_stat(), and monitoring performance on index selection via
monOpenObjectActivity, DBAs could develop fairly accurate trigger points at which specific values for
data rows modified or cluster ratio changes resulting from data modifications would trigger the need for
running a reorg on a table or index. This can also be combined with queries on systabstats columns such
as forwrowcnt, delrowcnt, and emptypgcnt to fine tune which of the reorgs are really required.
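One way to picture such trigger points is a small decision function. The Python sketch below is an assumption-laden illustration: the 50% statistics threshold comes from the earlier datachange() example, the 0.1 ipcr delta from the rule of thumb quoted above, and the function and action names are hypothetical. Real thresholds would have to be tuned per table.

```python
def maintenance_actions(datachange_pct, baseline_ipcr, current_ipcr,
                        stats_threshold_pct=50.0, ipcr_delta=0.1):
    """Return the maintenance steps suggested by the two measurements."""
    actions = []
    if datachange_pct > stats_threshold_pct:
        actions.append("update statistics")
    if abs(baseline_ipcr - current_ipcr) >= ipcr_delta:
        actions.append("reorg")
    return actions


# Heavily modified table whose index clustering has also degraded
print(maintenance_actions(60.0, baseline_ipcr=0.95, current_ipcr=0.80))

# Mostly static table: nothing to do in this maintenance window
print(maintenance_actions(10.0, baseline_ipcr=0.95, current_ipcr=0.93))
```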
Update Statistics FAQ
Why doesn't the server just automatically update the statistics as a result of my DML statement,
eliminating the need to run update statistics entirely?
There are several reasons; the main one is that this could slow down OLTP operations. Even if
done outside the scope of the transaction, since multiple users would likely be attempting to modify the
statistics for the same exact rows in systabstats, the contention would result in effectively single
threading the system.
Why does datachange() report a percentage (%) vs. the number of rows modified?
Reporting the number of rows modified provides no context to the size of the table. If datachange()=5,000,
this could be very significant if the table contains 5,100 rows or insignificant if it contains 500 million.
By using a percentage, it makes it easier to establish relative thresholds in maintenance scripts, etc.
Computed Columns/Function-Based Indices
The following section highlights some of the behavior nuances of using computed columns.
Computed Column Evaluation
The most important rule to consider when understanding the behavior of computed columns is when they
are evaluated, especially if trying to predetermine the likely output of non-deterministic columns. The
evaluation rules are as follows:
Non-materialized (virtual) computed columns have their expression evaluated during query
processing. Consequently, the value reflects the state of the current user's session.
Materialized (physical) computed columns have their expression evaluated only when a
referenced column is modified.
For example, consider the following table:

create table test_table (
rownum int not null,
status char(1) not null,
-- virtual columns
sel_user as suser_name(),
sel_date as getdate(),
-- materialized columns
cr_user as suser_name() materialized,
cr_date as getdate() materialized,
upd_user as (case when status is not null
then suser_name() else 'dbo' end)
materialized,
upd_date as (case when status is not null
then getdate() else 'jan 1 1970' end)
materialized
)
This table has 3 pairs of computed columns that are evaluated differently.
sel_user/sel_date: These are virtual columns, which have their expression evaluated when someone
queries the table.
cr_user/cr_date: Although these are physical/materialized columns, since they do not reference any other
columns, their expression will only be evaluated when rows are inserted. They will not be affected by
updates.
upd_user/upd_date: These columns reference the status column, although the status column does not
determine the value. As a result, these columns will only be changed if the status column is modified,
which effectively means inserts and updates that set the status column to any value.
As a result, the last two computed column pairs (cr_user/cr_date and upd_user/upd_date) are unaffected by
queries. So although they are based on non-deterministic functions, the values are consistent for all
queries.
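The three evaluation behaviors can be mimicked with a toy Python model. Everything here is hypothetical scaffolding: the class, the user names, and the integer "clock" standing in for getdate() exist only to make the timing of each evaluation visible.

```python
import itertools


class ModelRow:
    """Toy model of one row of test_table; not ASE code."""

    def __init__(self, status, user, clock):
        self.status = status
        # Materialized, no referenced columns: evaluated once, at insert.
        self.cr_user, self.cr_date = user, next(clock)
        # Materialized, references status: evaluated at insert ...
        self.upd_user, self.upd_date = user, next(clock)

    def update_status(self, status, user, clock):
        # ... and again whenever the referenced column is modified.
        self.status = status
        self.upd_user, self.upd_date = user, next(clock)

    def select(self, user, clock):
        # Virtual columns: evaluated at query time for the querying session.
        return {"sel_user": user, "sel_date": next(clock),
                "cr_user": self.cr_user, "cr_date": self.cr_date,
                "upd_user": self.upd_user, "upd_date": self.upd_date}


clock = itertools.count(1)                 # stand-in for getdate()
row = ModelRow("A", "alice", clock)        # insert by alice
first = row.select("bob", clock)           # query by bob
row.update_status("B", "carol", clock)     # status updated by carol
second = row.select("dave", clock)         # query by dave

print(first)
print(second)
```

Only sel_user/sel_date differ with each query; cr_user/cr_date stay fixed after the insert, and upd_user/upd_date move only when status is written.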
Non-Materialized Computed Columns & Invalid Values
As mentioned above, non-materialized computed columns have their expressions evaluated only at query
time, not during DML operations. This can lead to query problems if the formula used to create the
expression is not validated prior to creating the computed column. Consider the following:
create table t (a int, b compute sqrt(a))
go
insert t values (2)
insert t values (-1)
insert t values (3)
go
select * from t
go
1> select * from t

2> go
a b
----------- --------------------
2 1.414214
Domain error occurred.
The computed column b is not evaluated until queried; hence the domain error isn't noted until the select
statement. This can be especially nefarious if the select statement is embedded within a trigger.
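The same deferred failure can be reproduced outside the database. In this Python sketch (an analogy only, with hypothetical helper names), inserts store the raw value without touching the expression, and the square root is computed lazily when rows are read back.

```python
import math

table_t = []                      # stored values of column a


def insert(a):
    table_t.append(a)             # b is neither computed nor validated here


def select_all():
    for a in table_t:
        yield a, math.sqrt(a)     # evaluated per row at query time


for value in (2, -1, 3):
    insert(value)                 # all three inserts succeed

returned, error = [], None
try:
    for row in select_all():
        returned.append(row)
except ValueError as exc:         # math.sqrt rejects negative input
    error = exc

print("rows returned before failure:", returned)
print("query failed:", error is not None)
```

As in the isql output above, the first row comes back and the query then aborts on the row holding -1; the third row is never reached.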
Application and 3rd Party Tool Compatibility
In addition to the storage changes and procedures for updating statistics, there are a number of other
system changes related to new tables to support encrypted columns and other features. However, since
these are described in What's New, this section will instead focus on temp table changes and third party
tool compatibility.
#Temp Table Changes
There are two main changes that affect temporary tables to be considered. First we will discuss the temp
table naming changes and then take a look at query optimization behaviors:
#Temp Table Naming
Prior to ASE 15.0, the 30 character limit meant that user temporary tables in tempdb were named with 12
distinct characters plus a 17 byte hash separated by an underscore. Names shorter than 12 characters were
padded with underscores to achieve a length of 12 characters. For example:
select *
into #temp_t1
from mytable
where
Would likely result in a temp table with a name of #temp_t1______0000021008240896.
In ASE 15, this padding with underscores is no longer implemented. Additionally, the limitation of 12
distinct characters has been lifted, along with the limitation of 30 characters for object names. This can
cause some slight differences in temp table behavior. These should not affect existing applications,
although applications built on ASE 15 may not be backward compatible with 12.5. Consider the following
scenarios, in each of which two temp tables are created in the same session:
-- This fails in ASE 12.5, but succeeds in 15.0
-- The reason is that in 12.5, the automatic padding with underscore
-- results in tables with the same name.
create table #mytemp ()
create table #mytemp___ ()
go
-- This also fails in ASE 12.5, but succeeds in 15.0
-- The reason is that in ASE 12.5, the names are truncated to 12 characters
create table #t12345678901 ()
create table #t1234567890123 ()
-- The following refer to the same table in ASE 12.5, but different
-- tables in 15.0; the reason is identical to the above (name truncation)
select * from #t12345678901
select * from #t1234567890123456
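The collisions in these scenarios follow directly from the pre-15.0 padding and truncation rule. The Python sketch below models only the significant prefix of the name, assuming the # counts as the first of 13 significant characters (which is consistent with the padded name #temp_t1______0000021008240896 shown earlier); the hash suffix is ignored and the function name is hypothetical.

```python
PREFIX_LEN = 13   # assumption: '#' plus 12 distinct characters


def legacy_prefix(name):
    """Pad with underscores or truncate to the significant pre-15.0 prefix."""
    return name[:PREFIX_LEN].ljust(PREFIX_LEN, "_")


pairs = [
    ("#mytemp", "#mytemp___"),                    # padding collision
    ("#t12345678901", "#t1234567890123"),         # truncation collision
    ("#t12345678901", "#t1234567890123456"),      # same table in 12.5
]

for a, b in pairs:
    print(a, b, "-> same 12.5 name:", legacy_prefix(a) == legacy_prefix(b))
```

In ASE 15 the full names are kept, so each pair refers to two distinct tables.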
#Temp Table Query Optimization
In ASE 12.x, particularly with enable sort-merge join and JTC disabled, joins involving temporary tables
(particularly between two or more temporary tables) would use the single remaining join strategy of
Nested Loop Join (NLJ). Generally, queries such as this are contained within stored procedures. Consider
the following two examples (note the single line difference disabling merge join):
create procedure temp_test1
@book_type varchar(30),
@start_date datetime,
@end_date datetime
as begin
select * into #sales from sales
select * into #salesdetail from salesdetail
-- List the title, price, qty for business books in 1998

select t.title, t.price, s.stor_id, s.ord_num, sd.qty, total_sale=t.price * sd.qty
from #sales s, #salesdetail sd, titles t
where t.type=@book_type
and s.stor_id=sd.stor_id
and s.ord_num=sd.ord_num
and s.date between @start_date and @end_date
return 0
end
go
create procedure temp_test2
@book_type varchar(30),
@start_date datetime,
@end_date datetime
as begin
select * into #sales from sales
select * into #salesdetail from salesdetail
set merge_join 0
-- List the title, price, qty for business books in 1998
select t.title, t.price, s.stor_id, s.ord_num, sd.qty, total_sale=t.price * sd.qty
from #sales s, #salesdetail sd, titles t
where t.type=@book_type
and s.stor_id=sd.stor_id
and s.ord_num=sd.ord_num
and s.date between @start_date and @end_date
return 0
end
go
Now then, we are all familiar with the fact that if ASE lacks any information on a #temp table, it assumes
10 rows per page and 10 pages total, or 100 rows in size. As a result, ASE would consider the join
between the two #temp tables a likely candidate for a merge join, as the sort expense is not that high and
is certainly cheaper than an n*m table scan. To see how this works, let's take a look at the showplan and
logical I/O costings for each one. First we will look at the showplan and statistics io output for the
query allowing merge joins (some of the output deleted for clarity):
1> exec temp_test1 'business', 'Jan 1 1988', 'Dec 31 1988 11:59pm'
-- some showplan output removed for clarity/space
ROOT:EMIT Operator
|RESTRICT Operator
|
| |NESTED LOOP JOIN Operator (Join Type: Inner Join)
| |
| | |MERGE JOIN Operator (Join Type: Inner Join)
| | | Key Count: 2
| | | Key Ordering: ASC ASC
| | |
| | | |
| | | | |SCAN Operator
| | | | | FROM TABLE
| | | | | #sales
| | | | | s
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 32 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |
| | | | | #salesdetail
| | | | | sd
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Table Scan.
| | | Forward Scan.
-- actual I/Os for statement #4 as reported by set statistics io on.
Table: #sales01000310024860367 scan count 1, logical reads: (regular=1 apf=0 total=1), physical reads:
(regular=0 apf=0 total=0), apf IOs used=0
Table: #salesdetail01000310024860367 scan count 1, logical reads: (regular=2 apf=0 total=2), physical
reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: titles scan count 11, logical reads: (regular=11 apf=0 total=11), physical reads: (regular=0
apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 28.
Total writes for this command: 0
(44 rows affected)
So as you can see, we scanned each of the two temp tables 1 time to create Worktable1 and Worktable2 and
performed a merge join, and then did a NLJ on the table scan with titles (no index on titles.type) for a
total I/O cost of 28. Let's compare this to when the merge join is disabled:
1> exec temp_test2 'business', 'Jan 1 1988', 'Dec 31 1988 11:59pm' with recompile
-- some showplan output removed for clarity/space
ROOT:EMIT Operator
|RESTRICT Operator
|
| |SEQUENCER Operator has 2 children.
| |
| | |STORE Operator
| | | Worktable1 created, in allpages locking mode, for REFORMATTING.
| | | Creating clustered index.
| | |
| | | |INSERT Operator
| | | | The update mode is direct.
| | | |
| | | | | sd
| | | |
| | | | TO TABLE
| | | | Worktable1.
| |
| | |N-ARY NESTED LOOP JOIN Operator has 3 children.
| | |
| | | | FROM TABLE
| | | | titles
| | | | t
| | | | Table Scan.
| | |
| | | | FROM TABLE
| | | | #sales
| | | | s
| | | | Table Scan.
| | |
| | | | FROM TABLE
| | | | Worktable1.
| | | | Using Clustered Index.
| | | | Positioning by key.
-- actual I/Os for statement #5 as reported by set statistics io on.
Table: #salesdetail01000130025150907 scan count 1, logical reads: (regular=2 apf=0 total=2), physical
Table: Worktable1 scan count 12, logical reads: (regular=157 apf=0 total=157), physical reads:
Table: titles scan count 1, logical reads: (regular=1 apf=0 total=1), physical reads: (regular=0 apf=0
total=0), apf IOs used=0
Table: #sales01000130025150907 scan count 4, logical reads: (regular=4 apf=0 total=4), physical reads:
(44 rows affected)
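The totals for the two runs can be cross-checked from the logical reads listed in the two statistics io outputs. The Python sketch below assumes each logical read is weighted as 2 cost units; that weight is inferred from the first run (14 logical reads against a reported total of 28) rather than taken from this document, and physical reads are ignored because they are 0 wherever they are shown.

```python
LOGICAL_READ_WEIGHT = 2   # assumption inferred from 14 reads -> cost 28

# Logical reads per table, copied from the two outputs above
merge_join_reads = {"#sales": 1, "#salesdetail": 2, "titles": 11}
nested_loop_reads = {"#salesdetail": 2, "Worktable1": 157, "titles": 1, "#sales": 4}


def io_cost(reads):
    """Total I/O cost for a statement with no physical reads."""
    return LOGICAL_READ_WEIGHT * sum(reads.values())


merge_cost = io_cost(merge_join_reads)
nlj_cost = io_cost(nested_loop_reads)

print("merge join cost: ", merge_cost)
print("nested loop cost:", nlj_cost)
print("difference:      ", nlj_cost - merge_cost)
```

Most of the extra cost in the second run comes from the 157 logical reads against the reformatted Worktable1.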
As you can see, the merge join reduced the I/O cost of this query by 300. If we had enabled set option
show_lio_costing, we would have seen in each case the following output:
Estimating selectivity for table '#sales'

date >= Jan 1 1988 12:00:00:000AM
date <= Dec 31 1988 11:59:00:000PM
Estimated selectivity for date,
selectivity = 0.25,
using table prefetch (size 32K I/O)
Estimating selectivity for table '#salesdetail'


Search argument selectivity is 1.
Beginning selection of qualifying indexes for table 'titles',
This is fine for small temp tables; the problems develop when the temp tables are much larger. For
instance, we will use the same queries, but this time in the pubstune database that is often used by
Sybase Education in performance and tuning classes. The differences in row counts between the two
databases are:
Table          pubs2    pubstune
titles            18       5,018
sales             30     132,357
salesdetail      116   1,350,257
To demonstrate the differences in optimization, we will run 4 tests using 4 stored procedures:
Default optimization (likely will use merge join)

Disabled merge join (forcing NLJ)
Default optimization with update statistics
Default optimization with create index
In the sample data, the year 2000 recorded ~26,000 sales.
Default Optimization (Merge Join)
For this test, we will use the following stored procedure:

create procedure temp_test_default
@begin_date datetime,
@end_date datetime
as begin
select * into #sales

from sales
where date between @begin_date and @end_date
select sd.* into #salesdetail

from salesdetail sd, #sales s
where sd.stor_id=s.stor_id
and sd.ord_num=s.ord_num
select @@rowcount
select t.title_id, t.title, s.stor_id, s.ord_num, s.date, sd.qty,

total_sale=(sd.qty*t.price)-(sd.qty*t.price*sd.discount)
into #results
from titles t, #sales s, #salesdetail sd
where s.stor_id=sd.stor_id
and sd.title_id=t.title_id
select count(*) from #results
drop table #sales

drop table #salesdetail
drop table #results
return 0
end
go
The script to run the test is:

dbcc traceon(3604)
go
set showplan on
go
set option show_lio_costing on
go
set statistics io on
go
exec temp_test_default 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
dbcc traceoff(3604)
set showplan off
set option show_lio_costing off
set statistics io off
go
As expected, the estimates for #sales and #salesdetail are the typical 10 rows/page and 10 pages (100 rows).
Not too surprisingly, it picked a merge join, as per the showplan below (note the estimated I/Os at the
bottom):
The type of query is INSERT.
ROOT:EMIT Operator
|INSERT Operator
| The update mode is direct.
|
| | Key Count: 1
| |
| | |SORT Operator
| | |
| | | | Key Ordering: ASC ASC
| | | |
| | | | |
| | | | | | #sales
| | | | | | s
| | | | | | Table Scan.
| | | | | | Positioning at start of table.
| | | | | | Using I/O Size 32 Kbytes for data pages.
| | | | | | With LRU Buffer Replacement Strategy for data pages.
| | | |
| | | | |
| | | | | | #salesdetail
| | | | | | sd
| | | | | | Table Scan.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Table Scan.
| | | Forward Scan.
|
| TO TABLE
| #results
The actual I/Os processed for the three way join are shown below. The number at the bottom is the number
of rows in the final result set (which also happens to be the number of rows in #salesdetail):
Table: #sales01000130014531845 scan count 1, logical reads: (regular=222 apf=0 total=222), physical
Table: #salesdetail01000130014531845 scan count 1, logical reads: (regular=2784 apf=0 total=2784),
physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: #results01000130014531845 scan count 0, logical reads: (regular=262042 apf=0 total=262042),
-----------
261612
Note that the largest share of the 531,220 I/Os (262,042, roughly half) relates to the #results table; the
others are incurred performing the actual joins and sorting for the work tables.
Merge Join Disabled (NLJ)
Now let's disable the merge join, which forces an NLJ (we are using the default optimization goal rather
than allrows_dss, which would allow a hash join). The script changes slightly to:
dbcc traceon(3604)
go
set showplan on
go
-- set option show on
-- go
set option show_lio_costing on
go
set statistics io on
go
set merge_join 0
go
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
dbcc traceoff(3604)
set showplan off
set option show_lio_costing off
set statistics io off
go
Once again, the estimates for #sales and #salesdetail are the same.



The showplan predictably picks up the N-ary NLJ - after first reformatting #sales as seen below (but note
the difference in logical I/Os at the bottom):
ROOT:EMIT Operator
|SEQUENCER Operator has 2 children.

|
| |STORE Operator
| | Worktable1 created, in allpages locking mode, for REFORMATTING.
| | Creating clustered index.
| |
| | |INSERT Operator
| | | The update mode is direct.
| | |
| | | | FROM TABLE
| | | | #sales
| | | | s
| | | | Table Scan.
| | |
| | | TO TABLE
| | | Worktable1.
|
| |INSERT Operator
| | The update mode is direct.
| |
| | |N-ARY NESTED LOOP JOIN Operator has 3 children.
| | |
| | | | FROM TABLE
| | | | #salesdetail
| | | | sd
| | | | Table Scan.
| | |
| | | | FROM TABLE
| | | | titles
| | | | t
| | | | Index : titleidind
| | | | Keys are:
| | | | title_id ASC
| | |
| | | | FROM TABLE
| | | | Worktable1.
| |
| | TO TABLE
| | #results
| | Using I/O Size 32 Kbytes for data pages.
However, notice the difference in I/Os!!

Table: Worktable1 scan count 261612, logical reads: (regular=815090 apf=0 total=815090), physical
Table: titles scan count 261612, logical reads: (regular=784836 apf=0 total=784836), physical reads:
-----------
261612
The problem is that the optimizer picked #salesdetail as the outer table, then reformatted #sales and finally
joined with titles. Since there were 261,612 rows in #salesdetail, this caused the I/O count to jump from
roughly 500K to 3.7M, about a sevenfold increase.
This suggests that even for larger #temp table joins (#sales had 26,000 rows, #salesdetail had 260,000
rows), a merge join in ASE 15.0 may still be significantly faster than the NLJ processing typical of ASE 12.5.
Default Optimization with Update Statistics
Normally we associate update statistics with indexes, but what if the table doesn't have any indexes? Does
it still help the optimizer? To test this, we will alter the earlier proc as follows:
create procedure temp_test_stats
@end_date datetime
as begin

from sales
update statistics #sales

update statistics #salesdetail
-- these two statements are fakes - just to force the update stats to take
-- effect since it appears not to work immediately for the following query
select rowcnt=@@rowcount
into #rowcnt
drop table #rowcnt

into #results
drop table #sales

drop table #results
return 0
end
go
Now if we run it with the default optimization for ASE 15.0 (merge join enabled, same as earlier), let's
first take a look at the index selectivity outputs for the #sales and #salesdetail tables.

ord_num = ord_num
selectivity = 0.1,
stor_id = stor_id
selectivity = 0.1,


title_id = title_id
Estimated selectivity for title_id,
So running update statistics by itself to try to tell the optimizer the number of pages/rows doesn't work. Or
so we think. But then we notice that the optimizer re-optimizes, re-resolving the query because of the
update statistics, and we see the following near the bottom:

ord_num = ord_num
selectivity = 0.1,
stor_id = stor_id
selectivity = 0.1,


ord_num = ord_num
selectivity = 0.1,
stor_id = stor_id
selectivity = 0.1,
So it does pick up the statistics for cost optimization, but the estimated I/Os in the showplan do not
change, as is evident below:
ROOT:EMIT Operator
|INSERT Operator
|
| | Key Count: 1
| |
| | |SORT Operator
| | |
| | | |
| | | | |
| | | | | | #sales
| | | | | | s
| | | | | | Table Scan.
| | | |
| | | | |
| | | | | | #salesdetail
| | | | | | sd
| | | | | | Table Scan.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Table Scan.
| | | Forward Scan.
|
| TO TABLE
| #results
This is identical to the default merge join processing from earlier, including the estimated I/Os. The
actual I/Os for the main query:
-----------
261612
Not surprisingly, the I/Os didn't change: without any indexes, we still have to access the tables the same way.
Default Optimization with Create Index
Now let's test the always controversial question of whether ASE uses indexes created on #temp
tables. To test this, we will alter the earlier proc as follows:
create procedure temp_test_idx
@end_date datetime
as begin

from sales
create unique index sales_idx on #sales (stor_id, ord_num)

with statistics using 1000 values

create unique index salesdetail_idx on #salesdetail (stor_id, ord_num, title_id)

create index salesdetailtitle_idx on #salesdetail (title_id)


into #results
drop table #sales

drop table #results
return 0
end
go
Now if we run it with the default optimization for ASE 15.0 (merge join enabled), let's first take a look at
the index selectivity outputs for the #sales and #salesdetail tables.
(some output removed)
Estimating selectivity of index 'sales_idx', indid 2

ord_num = ord_num
stor_id = stor_id
selectivity = 0.1,
scan selectivity 3.793051e-005, filter selectivity 3.793051e-005
restricted selectivity 1
unique index with all keys, one row scans
(some output removed)
Estimating selectivity of index 'salesdetail_idx', indid 2

ord_num = ord_num
stor_id = stor_id
selectivity = 0.1,
scan selectivity 4.946968e-005, filter selectivity 4.946968e-005
restricted selectivity 1
Index Page Cluster Ratio 0.9985192


Data Page LIO for 'salesdetail_idx' on table '#salesdetail' = 1.55072
Estimating selectivity of index 'salesdetailtitle_idx', indid 3

scan selectivity 1, filter selectivity 1
261612 rows, 1156 pages
Index Page Cluster Ratio 0.9990109
using index prefetch (size 32K I/O)

Data Page LIO for 'salesdetailtitle_idx' on table '#salesdetail' = 229869

Path: 338222.5
Work: 441325.4
Est: 779547.9
So the optimizer does at least consider the indexes during optimization. The showplan is now:
ROOT:EMIT Operator
|INSERT Operator
|
| | Key Count: 1
| |
| | |SORT Operator
| | |
| | | |
| | | | | sd
| | | | | Index : salesdetail_idx
| | | | | Positioning at index start.
| | | | | Using I/O Size 32 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | |
| | | | | #sales
| | | | | s
| | | | | Index : sales_idx
| | | | | Positioning at index start.
| | | | | Using I/O Size 32 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Table Scan.
| | | Forward Scan.
|
| TO TABLE
| #results
So it still is going to use a merge join, but now it estimates the I/Os at 27,000 (vs. 46). The actual I/Os
are still approximately the same (~500K) as illustrated below:
-----------
261612
Let's see what happens when we disable the merge join and force an NLJ, since NLJs typically perform
index traversals. The showplan becomes:
ROOT:EMIT Operator
|INSERT Operator
|
| |N-ARY NESTED LOOP JOIN Operator has 3 children.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | #sales
| | | s
| | | Table Scan.
| | | Forward Scan.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | #salesdetail
| | | sd
| | | Index : salesdetail_idx
| | | Forward Scan.
| | | Positioning by key.
| | | Keys are:
| | | stor_id ASC
| | | ord_num ASC
| | | Using I/O Size 4 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Using Clustered Index.
| | | Index : titleidind
| | | Forward Scan.
| | | Positioning by key.
| | | Keys are:
| | | title_id ASC
|
| TO TABLE
| #results
Since we have to process every row in the outer table (#sales) the table scan is not surprising. However, we
do note that #salesdetail does indeed pick up the index for the join!!! Note, however, the estimated I/O cost
of 1.4 million, with an actual I/O cost of:
Table: #salesdetail01000290015809592 scan count 26364, logical reads: (regular=120303 apf=0
total=120303), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: titles scan count 261612, logical reads: (regular=784836 apf=0 total=784836), physical reads:
-----------
261612
It is interesting to compare this to the 3.7 million I/Os for the NLJ without the indexes. The use of an
index does indeed reduce the I/O cost for the NLJ by nearly 40%; however, it is still 4 times as many I/Os as
the merge join. The obvious question is: the I/Os are cheaper, but what about execution times? Running
the same tests with diagnostics off but with set statistics time on reveals the
following results (times are for the main query only, not the entire proc execution):
Method                       CPU time (ms)   Elapsed time (ms)   Diff %*
Default proc (merge-join)           21,900              21,873   (ref)
Default proc w/ NLJ                 35,500              35,643   162%
Proc w/ upd stats SMJ               19,900              20,000   (ref)
Proc w/ upd stats NLJ               32,200              33,906   162%
Proc w/ index SMJ                   16,300              16,953   (ref)
Proc w/ index NLJ                   23,300              23,250   143%
* The difference is measured compared to the SMJ run in the same proc
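Timings like those in the table could be collected with a script along the following lines. This is a minimal sketch rather than the exact script used for the tests: it reuses the exec call and the set merge_join 0 setting from the earlier test scripts, and it assumes that set merge_join 1 re-enables merge joins for the session (if in doubt, run the second test in a fresh session instead).

```sql
-- Sketch only: time the proc with merge joins enabled (the default), then disabled.
set statistics time on
go
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
-- Force the NLJ, as in the earlier script.
set merge_join 0
go
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
-- Assumption: 1 turns merge joins back on for this session.
set merge_join 1
go
set statistics time off
go
```

The Diff % column in the table corresponds to the NLJ CPU time divided by the SMJ CPU time for the same proc, for example 35,500 / 21,900, or about 162%.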
Still the merge join (SMJ) has better overall execution times. The last run is a bit misleading - while the
query execution times appear to be quite a bit better with the indexes vs. those without, the create index
times were about 10 seconds - resulting in the same overall time. The reason the query time is likely faster
is simply due to the fact that the create index statements forced the tables to be cached for the subsequent
query (whereas update stats with no indexes does not have the same effect).
From all of this, we can conclude that most applications with joins involving more than one #temp table
should see some improvement from the enablement of merge joins in ASE 15.0. While there
may be exceptions, those can be dealt with on an individual basis, allowing the others to benefit from the
improved execution times.
3rd Party Tool Compatibility
Third party applications or custom developed applications that access the database for normal DML and
query operations should largely remain unaffected by an upgrade to ASE 15.0. However, third party tools
or custom applications that perform DDL or interrogate the system catalogs may encounter problems due to
the system table changes, the deprecated functions, etc. For example, previous releases of Embarcadero's
DBArtisan had a widely liked feature in which not only did the object browser display the table names, but
it also listed the number of rows in the table. This row count was derived by using the rowcnt() function,
which has now been deprecated. As a result, users of DBArtisan may be surprised when running older
versions of the tool to see that their ASE 15.0 server is "missing all the data" as DBArtisan will report 0
rows for every table. Remember, in deprecating the functions, Sybase left the function intact, but also had
it return a constant value of 0 along with a warning that the function was deprecated.
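For custom scripts that relied on rowcnt(), a replacement query might look like the sketch below. It assumes the ASE 15.0 row_count() built-in, called here with a database ID and an object ID; the column alias approx_rows is purely illustrative. Check the signature against the reference manual for your exact release before relying on it.

```sql
-- Sketch: approximate row counts for all user tables in the current database.
-- row_count(dbid, objid) is assumed here as the ASE 15.0 replacement for rowcnt().
select name, row_count(db_id(), id) as approx_rows
from sysobjects
where type = 'U'
order by name
go
```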
Users of third party system administration/DBA utilities should contact the vendors to see when they will
be releasing an ASE 15.0 compatible version.
Application/3rd Party Tool Compatibility FAQ
What changes were made to the system tables in ASE 15.0?
Quite a few. Most notably, the system catalogs in ASE 15.0 were updated to reflect the changes to support
semantic partitions as well as column encryption. As a result, new system tables were added, many were
extended to include partition information and new objects were added to the object classes. A full
description of the changes is contained in the ASE 15.0 What's New Guide.
dbISQL & Sybase Central
In ASE 15.0, a new client utility was introduced for ASE customers: dbIsql. The reason is that dbIsql
provides a common client utility for all the Sybase servers (previously it was used for ASA and ASIQ) and
can connect to OpenServer-based applications such as Sybase Replication Server. Additionally, as a
Java-based client, it can run on Linux and other Unix platforms and is a more mature product than the jisql
utility previously shipped with ASE. Given the growth of Linux desktops in corporate environments, it
was important to provide a GUI client that ran on multiple platforms.
Sybase Central 4.3 ships with ASE 15.0. However, this is not the same build as the Sybase Central 4.3 that
may already be installed on your PC from other installations. This current build supports plug-ins for
SySAM, Unified Agent Framework (UAF) and includes new functionality not available in previous
releases (starting, stopping, pinging remote servers, remote errorlog viewing, automatic server detection,
server groups, etc.). It also has a new location specifically in %SYBASE%\Shared\Sybase Central 4.3
for Windows users vs. the former location directly under %SYBASE%. This can lead to a number of
problems if the previous versions are left intact as program launch icons, path settings and CLASSPATH
may point to the previous location. The best advice is to rename the old installation directory and then after
a period of time (when you are sure all the other product plug-ins are compatible with the newer release)
completely delete it. Product plug-ins are individually available from www.sybase.com.
dbISQL & Sybase Central FAQ
What happened to SQL Advantage? Does it still work?
SQL Advantage is no longer being maintained by Sybase. While it can still connect to ASE 15.0, some of
the newer API features in OCS 15.0 are not available to it; consequently, it may be limited in functionality.
If you have problems with SQL Advantage on a machine on which you recently installed the ASE 15.0 PC
Client software, the likely cause is that it is finding OCS 15.0 in the path ahead of OCS 12.5 and, with the
renaming of some of the dll libraries, it fails. The workaround is to create a shell script that has all of the
OCS 12.5 library directories in the PATH and none of the OCS 15.0 libraries, and launch SQL Advantage
from the shell script.
I can't get dbIsql to launch; it always fails with class not found or other errors.
Both the GA and ESD 1 batch scripts contained an error in the call to dbIsql's class file. In some cases the
"-Dpath" option was garbled, although the %path% for the option to set was present. To fix, navigate to
%SYBASE%\dbisql\bin, edit the dbisql.bat file and make sure the execution line reads as follows (note the
line breaks below are due to document formatting. Pay particular attention to the dashes ("-") at the ends of
some lines which are the switch character when copy/pasting this into your script):
REM optionally trim the path to only what is necessary
set PATH="c:\sybase\ASE-15_0\bin;c:\sybase\OCS-15_0\bin;.;"
REM this should all be on one line; line breaks are due to document formatting
"%JRE_DIR%\bin\java" -Disql.helpFolder="%DBISQL_DIR%\help" -
Dsybase.jsyblib.dll.location="%SYBROOT%\Shared\win32\\" -
Djava.security.policy="%DBISQL_DIR%\lib\java.policy" -Dpath="%path%" -classpath
"%DBISQL_DIR%\lib;%isql_jar%;%jlogon_jar%;%jodbc_jar%;%xml4j_jar%;%jconn_jar%;%dsparser_jar%;%helpmana
ger_jar%;%jcomponents_jar%;%jh_jar%;%jsyblib_jar%;%planviewer_jar%;%sceditor_jar%;%uafclient_jar%;%jin
icore_jar%;%jiniext_jar%;%jmxremote_jar%;%jmxri_jar%;%commonslogging_jar%;%log4j_jar%"
sybase.isql.ISQLLoader -ase %*
Optionally, you can remove the -Dpath argument entirely as it is not necessary.
Another common problem affects customers on Windows machines running Sybase ASA tools (from either
Sybase ASA or Sybase IQ): the ASA installer installs its version of dbisql as a startup process for quicker
launches. You will need to open the Task Manager to kill the process and then remove it from the startup
list before attempting to run the ASE version of dbISQL (aka Interactive SQL).
Sybase Central frequently crashes or won't start up
Make sure that you are using the current build of Sybase Central (4.3.0.2427 as of ASE 15.0 ESD2). The
most likely cause is that you are launching an old version of Sybase Central (i.e. 4.1) or an older build of
4.3 that is not compatible with the ASE 15.0 plug-in. Make sure you are launching the version out of
%SYBASE%\Shared\Sybase Central 4.3 (to make sure of this, open a DOS window, navigate to the
directory and execute scjview.bat).
Additionally some issues have been noted when using Sybase Central with the UAF plug-in which uses
java security policies and VPN software that does TCP redirection, such as VSClient from InfoExpress.
If this is the problem, exit the VPN client application entirely and try Sybase Central again.
In any case, all Sybase Central crashes create a stack trace in a file named scj-errors.txt (if more than one,
the files will be numbered sequentially after the first crash, i.e. scj-errors-2.txt) located in the Sybase Central
home directory. If reporting problems to Sybase Technical Support, include this file as it identifies all the
plug-in versions as well as all the jar versions used by Sybase Central and the plug-ins.
Appendix A - Common Performance Troubleshooting Tips

1. Delete the statistics and run update index statistics on the table. If necessary, gradually
increase the step count and/or decrease the histogram tuning factor from the default of 20 to a
number that keeps the total number of steps below 2,000. A good target is to keep the
statistics in the range 200 ≤ (histogram tuning factor * index steps) ≤ 2,000.
2. Before increasing the step count, check the show_lio_costing and show_histograms output to see
whether the I/O costing is out of line with expected values.
3. If a merge join is selected, make sure the merge key count is the same as the join key count (via
showplan). If not, run update index statistics.
4. Install MDA tables and use monOpenObjectActivity and monSysStatement to identify query
related problems.
5. Use a different optimization strategy, such as allrows_oltp.
6. Watch the procedure cache using the new MDA tables to see whether the increased proc
cache requirements for sorting (due to merge sorts, etc.) reduce the number of procs cached,
and increase the procedure cache as necessary.
7. Use plancost along with showplan when troubleshooting query plans.
8. For quick ad-hoc query problems, consider using the QP Metrics features to identify the
queries involved.
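Tips 1, 5 and 7 above could be tried from an isql session roughly as sketched below. The table name salesdetail is borrowed from the earlier test case, and the step count of 50 is only an illustrative value (with the default histogram tuning factor of 20 it stays inside the 200 to 2,000 range from tip 1). Treat this as a starting point rather than a recommended setting.

```sql
-- Tip 1: drop and rebuild the statistics; "using 50 values" is an illustrative step count.
delete statistics salesdetail
go
update index statistics salesdetail using 50 values
go
-- Display the current histogram tuning factor before changing anything.
sp_configure 'histogram tuning factor'
go
-- Tip 5: try a different optimization goal for this session only.
set plan optgoal allrows_oltp
go
-- Tip 7: show plancost output alongside showplan for the queries under test.
set showplan on
go
set statistics plancost on
go
```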
Sybase Incorporated
Worldwide Headquarters
One Sybase Drive
Dublin, CA 94568, USA
Tel: 1-800-8-Sybase, Inc.
www.sybase.com
Copyright 2000 Sybase, Inc. All rights reserved. Unpublished rights reserved under U.S. copyright laws. Sybase and the Sybase logo are
trademarks of Sybase, Inc. All other trademarks are property of their respective owners. ® indicates registration in the United States.
Specifications are subject to change without notice. Printed in the U.S.A.
