
Best Practice

Subject: Sybase ASE Monitoring


Author(s): Tom Oorebeek, Staff DBA, Sybase IT

Reviewer(s): Hema Seshadri, Sr. DBA Manager, Sybase IT

Abstract:

In today’s information-driven world, high availability and optimal performance of database
infrastructure are more important than ever. Many businesses rely on being able to retrieve
real-time data from their databases, and monitoring this critical infrastructure helps maximize uptime.
Monitoring is important not just from an overall availability and performance perspective, but also from
the perspective of the end user, so that business productivity is not compromised. This document
discusses best-practice ideas on which aspects of a production ASE environment should
be monitored.

Sybase, Inc. 2009 Page 1 of 14


Table of Contents
Introduction
1 Best Practices: ASE Monitoring
1.1 ASE versions
1.2 Operating systems
1.3 Tools
1.4 Monitoring aspects
1.5 What next?
2. ASE resources
2.1 The ASE itself
2.2 Licenses
2.3 Database Availability
2.4 Data Storage
2.5 Disk Space
2.6 User Activity
2.7 Error Logs
2.8 Blocking
2.9 Data Consistency
Appendix A
Sample dataserver startup script



Introduction

- This Best Practices document is intended to help a Sybase DBA understand the various
aspects of ASE monitoring and to provide a quick start for setting up such monitoring.



1 Best Practices: ASE Monitoring

1.1 ASE versions

This document applies to all currently supported ASE versions. Features that are specific to a
particular version are identified as such where they are described.

1.2 Operating systems

The monitoring aspects are generic across H/W platforms, although the sample scripts in Appendix A
were written for the UNIX platform.

1.3 Tools
Various tools are available on the market for monitoring an ASE, e.g. Sybase SCC, Sybase Central,
sysmon, MDA Tables, DB Artisan, DB Virtualizer, or Nimbus. This document focuses on what to
monitor rather than recommending a tool or method for monitoring. Where necessary, references are made to
automatic/unattended monitoring through scripts and cron jobs (Unix environments only).

ASE ships with components such as Historical Server and Monitor Server, but these are
separate products with their own manuals and are therefore not described here.

1.4 Monitoring aspects


What aspects of an ASE should be monitored? How often? Although the type, frequency and
threshold levels for monitoring a system will vary from environment to environment, some aspects are
essential to all ASE environments.

- ASE Availability
- Licenses
- Database availability
- Data storage
  o Devices
  o Database segments
  o Log
- User Activity
- Error logs
- Blocking
- Data Consistency

1.5 What next?

You follow the best practices and set up extensive monitoring. Now what? Whether you choose to
page an on-call DBA, email the DBA or an entire alias, or fix the problem proactively is up to
you. This document does not spell out the actions to take after an alert from your monitoring tool, although
certain recommendations may be made.

Typical Sybase environments have multiple ASEs that cater to a variety of business applications. It is
recommended to buy or build a tool that provides an enterprise view of these systems, with the ability
to drill down to issues for any given system.
A dashboard that is tied in to the monitoring system allows the monitoring
script/feature to display the server status based on the alerts just received. For example, a
server-down status might mark the server icon red, an 1105 error (a segment is out of space) might
mark it orange, and a server with no monitoring alerts would stay green. As alerts are attended to,
the server status would change accordingly.
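
The color mapping described above could be sketched as a small shell function. This is only an illustration under stated assumptions: the function name server_status_color and the alert keywords (SERVER_DOWN, a bare 1105) are hypothetical and do not come from any Sybase tool. It reads the pending alert lines for one server on standard input and prints a single color.

```sh
#!/bin/sh
# Hypothetical helper: read the pending alert lines for one server on
# stdin and print the dashboard color. The alert keywords are assumptions.
server_status_color() {
    awk '
        /SERVER_DOWN/                  { down = 1 }   # server not responding
        /(^|[^0-9])1105([^0-9]|$)/     { full = 1 }   # segment or log full
        END {
            if (down)      print "red"
            else if (full) print "orange"
            else           print "green"
        }'
}
```

A server-down alert takes precedence over an 1105 alert, and no input at all yields green, which matches the idea that the icon changes back once alerts have been attended to.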



2. ASE resources

2.1 The ASE itself

a. Is the server up and running?


Ping the database server frequently to ensure it is alive and responding to commands. Check
for the process at the O/S level and also run a simple command against the database, ensuring
it returns the expected results. This confirms the basic availability of your server. Page the on-call
DBA if the server process does not show up or respond.
Do the same for your Backup Server process.
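
The O/S-level half of this check might look like the following sketch. The function name dataserver_listed is hypothetical, and it assumes the dataserver was started with -s<servername>, as in the sample startup script in Appendix A. It reads a process listing on standard input so that it can be tried without a running ASE.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the process listing on stdin contains a
# dataserver that was started with -s<server>. Assumes the -s option is
# followed by a space or the end of the line.
dataserver_listed() {
    server=$1
    grep dataserver | grep -v grep | grep -E -- "-s$server( |\$)" >/dev/null
}
```

A typical call would be: ps -ef | dataserver_listed PROD_ASE || echo "PROD_ASE process not found". This only proves that the process exists; the simple query against the database is still needed to prove that the server responds.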

b. If it was just rebooted,


- Did it start up correctly with all options enabled?
- Is the license valid?
- Did it come up with async I/O?
- Did all the databases come up correctly?
- Check for anything abnormal

Most ASE startup messages are standard server messages, such as the opening of the
allocated devices and databases. Depending on configuration and/or traceflags, others
show extra information about various aspects of the running ASE, such as
connections (successful and failed logins), licenses etc.
See the sample dataserver startup script in Appendix A for built-in error checking.

Once a server has started, its errorlog should be checked for startup behavior and
non-standard messages. While the server is running, the errorlog should be scanned
automatically and periodically, and any non-standard messages emailed to the DBA(s)
concerned.

To decide whether a message is important enough to report, one could set up search strings
per type of message, such as errors (fatal or not), warnings and operational
messages.
A separate category consists of messages that would be picked up as errors or warnings
but can be skipped during normal processing, for example messages that contain
object names with the word error or msg in them.

Be sure to include keywords such as Error/error, stack, infected, warning, msg, instead,
will shutdown, failed and any other keywords/phrases applicable to your environment.
Filter out errors you know to be informational. See your ASE Error Messages Guide for
details on informational errors. Error numbers of the pattern 6?? (6 followed by two digits)
tend to be severe. Set up special alerts for these, for example email messages with the
word ‘PANIC’ or ‘SEVERE ALERT’ in them.
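
A keyword scan with an exclusion list could be sketched as follows. The keyword list is taken from the text above; the function name scan_errorlog and the idea of a separate skip file (one pattern per line for messages known to be informational) are assumptions made for this illustration.

```sh
#!/bin/sh
# Hypothetical helper: print errorlog lines that match the alert keywords,
# minus any lines matching a pattern in the optional skip file.
scan_errorlog() {
    logfile=$1
    skipfile=${2:-}      # optional: one pattern per line for messages to ignore
    keywords='[Ee]rror|stack|infected|warning|msg|instead|will shutdown|failed'
    if [ -n "$skipfile" ] && [ -s "$skipfile" ]; then
        grep -E "$keywords" "$logfile" | grep -v -E -f "$skipfile"
    else
        grep -E "$keywords" "$logfile"
    fi
}
```

The output could then be mailed to the DBA(s), for example only when it is non-empty. Keeping the exclusions in a file rather than in the script makes it easier to maintain the list per environment.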

2.2 Licenses



Many Sybase products are offered in several editions and license types. Some features are not part of
the core product and as such require active licenses. Most customers use SySAM to manage
licenses. Some may have a central licensing server.

Monitor your license server process. Ensure it is up and running, and look for errors. If you use standalone
licensing, check for keywords such as ‘grace’, ‘will shutdown’, ‘Failed to obtain’ and ‘expire’.

You can specify the email address to which licensing warnings and errors are sent during ASE installation,
or by using the sp_lmconfig command after installation.

2.3 Database Availability


The server itself may be up and running, but are the individual databases available to the users? For
example, after a scheduled downtime during which a particular database was taken offline to users, set up
monitoring that will alert a DBA if a database is unintentionally left in dbo use only or read-only mode.
SQL queries may also be timing out on a particular user database.

Another example is a database that is available to users but is in log suspend or is
refusing connections for some reason.
- Set up alerts for free log space per user database. Set up your alerts such that they escalate
in priority if the situation does not improve.
- You may set your last-chance thresholds to dump the log or abort the transaction, but there
could be scenarios where the last-chance threshold (LCT) action does not have a chance to kick in.
Be sure to check for 1105 and log suspend errors.
In addition to alerts, one should check the general response time and alert the DBA if a simple
query such as ‘sp_who’ does not return in a reasonable amount of time. Be sure to use a
non-privileged account when checking database availability as discussed in this section.
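
The response-time check might be wrapped as in the sketch below. The function name timed_check is hypothetical, and the sketch assumes a date command that supports +%s (seconds since the epoch), which is common but not guaranteed on every Unix. It only measures how long a command took; it does not interrupt a command that hangs.

```sh
#!/bin/sh
# Hypothetical helper: run a command and classify the outcome by exit
# status and elapsed wall-clock seconds. It does not kill a hung command.
timed_check() {
    limit=$1; shift
    start=$(date +%s)
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$rc" -ne 0 ]; then
        echo "FAILED"
    elif [ "$elapsed" -gt "$limit" ]; then
        echo "SLOW"
    else
        echo "OK"
    fi
}
```

For example, timed_check 10 /dba/jobs/check_sp_who PROD_ASE would report SLOW if the check took more than 10 seconds, where check_sp_who stands for a hypothetical wrapper script that logs in with a non-privileged account and runs sp_who.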

2.4 Data Storage


Device space and database sizes are more or less static and pre-allocated. Information about these is
provided by the standard procedures, sp_helpdevice (with ASE 15.x also showing free space) and
sp_helpdb (also showing free space per fragment).

However, these standard procedures generally either do not show the details required for continuous
monitoring or show too much detail, such as per-fragment figures, when we are only interested in
overall data, log and/or index segment and device space usage.

The most important space metrics to monitor are the current level of segment (data and/or log) space
usage and its growth per day, week or month. Keeping historical space and growth information allows for
better resource (disk space) capacity planning and allocation.
See the devusage and listseg current space usage monitoring scripts in the list of useful scripts in
Appendix A.
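
Growth over time can be derived from a simple history file. The sketch below is not the devusage or listseg script; it assumes a hypothetical history file to which a daily job appends one line per segment in the form "date database segment used_mb", in chronological order, and it reports the difference between the first and the last sample for each database/segment pair.

```sh
#!/bin/sh
# Hypothetical helper: report growth in MB per database/segment between
# the first and last samples in a history file with the assumed layout
#   YYYYMMDD database segment used_mb
growth_report() {
    awk '
        {
            key = $2 "/" $3
            if (!(key in first)) first[key] = $4   # oldest sample
            last[key] = $4                         # newest sample so far
        }
        END {
            for (key in first)
                printf "%s %d\n", key, last[key] - first[key]
        }' "$1" | sort
}
```

Restricting the input to the last week or month (for example with grep on the date column) gives the growth for that period.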

Devices

In general database devices are created on raw partitions and are therefore not monitored with O/S
commands. However, in case an ASE also uses filesystems to store (some of) its database devices,
these filesystems should also be monitored. Examples are the devices for tempdb and/or development
and test ASEs.

Device files are more or less static, but ASE 12.5 introduced the disk resize command, so even
existing devices can grow and fill up the filesystem they are stored on. Monitor growth on these
devices.

2.5 Disk Space

Depending on the Operating System being used, disk space usage can be monitored using the df
(Solaris, AIX) or bdf (HP-UX) commands. Important directories to monitor at the O/S level for a
Sybase installation are:

- The Sybase software installation tree (“Sybase directory”)
- Errorlog directory
- Possible device directories
- Database dump directories (if dumping to disk)
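
A threshold check on these directories could be sketched as follows. It assumes the POSIX output format of df -P -k (six columns, with the capacity in the fifth and the mount point in the sixth), an awk that supports -v (nawk on older Solaris), and mount points without spaces. The function name fs_over_threshold is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: read "df -P -k" output on stdin and print every
# mount point whose used capacity is at or above the given percentage.
fs_over_threshold() {
    limit=$1
    awk -v limit="$limit" '
        NR > 1 {                       # skip the header line
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $6, pct "%"
        }'
}
```

For example: df -P -k /sybase /dba | fs_over_threshold 90. On HP-UX the bdf output would need to be checked against these column assumptions first.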

The Sybase software directory is more or less static, apart from the ever-growing errorlog and the
configuration files. As a best practice, consider separating these files from the main
installation directory for easier maintenance.

Each ASE has its own startup files, such as the RUNSERVER startup script, configuration files and its
errorlog. Keeping these files under their own folder structure enables better handling and monitoring,
and also makes future upgrades of the ASE software easier.
As an example, one could store the software and startup files using the following tree setup (using
PROD_ASE as the example server name):

Directory:              Contents:
----------------------  ----------------------------------------------------------
/sybase/PROD_ASE        Startup script(s) and sub directories.
/sybase/PROD_ASE/cfg    Configuration files
/sybase/PROD_ASE/log    ASE error logs

/sybase/sybase_12.5.4   Version 12.5.4 software tree.
/sybase/sybase_15.0.3   Version 15.0.3 software tree.
/sybase                 Link to the actual version directory currently being used.

Database & Transaction Log dumps


To provide for easy recovery, it is recommended that you back up your Sybase databases periodically.
Backups may be done directly to tape, or first to disk filesystems that are later backed up at the
O/S or network level. With the latter method, both the software being used (Sybase directory, startup
directories, scripts and log files) and the database dumps can be stored on one tape or
combination of tapes.

Directories used for database dumps have to be checked at least once a day, after the database
dumps have been made, because the dump files grow as the data volume
grows. For performance reasons it is possible to dump databases to parallel stripes, meaning multiple
dump directories (and/or filesystems) can be used for one database dump. Be sure to monitor space
in each of the dump directories, so your nightly backups succeed.
Transaction log dumps are taken periodically to allow point-in-time recovery. Keep these transaction log
dump files in a separate directory from your full backups and monitor this directory for free space as
well.
Set up your monitoring to confirm successful daily backups. Besides a lack of disk space,
there could be other failures, including disk failures or errors. Check the Backup Server
errorlog for error keywords.
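
One simple way to confirm that a dump was written recently is to look for a non-empty dump file that is less than a day old. The sketch below assumes dumps are written to disk with a recognizable file name pattern; the function name recent_dump_exists and the *.dmp pattern in the example are hypothetical. It complements, and does not replace, the check of the Backup Server errorlog.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the dump directory holds at least one
# non-empty file matching the pattern that was modified in the last 24 hours.
recent_dump_exists() {
    dumpdir=$1
    pattern=$2
    found=$(find "$dumpdir" -type f -name "$pattern" -mtime -1 -size +0 2>/dev/null | head -n 1)
    [ -n "$found" ]
}
```

For example: recent_dump_exists /dumps/PROD_ASE '*.dmp' || echo "No recent dump found for PROD_ASE", where /dumps/PROD_ASE is a placeholder for your own dump directory.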



2.6 User Activity
User activity is significant to system performance and health. It needs to be monitored from a historical
perspective as well as in real time, to identify performance problems as they happen. Gather information on
active user sessions, CPU cycles, cache hits, and disk I/O over a period of time. This will help you set
up alert thresholds.

User connections

Monitor the total number of currently connected clients/users on a regular basis. This will help
establish a baseline. Regularly checking whether a server is at or near its maximum number of user
connections helps prevent users from being locked out (signaled by Error 1601 messages in the log). One
way to check the number of current connections is the standard procedure sp_who, but scripts and
procedures that provide more detail and summaries are referenced in this document. See the sp_w
procedure and the db_spy and cnt_sessions scripts in Appendix A.
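
Once the current and maximum number of connections are known, the headroom check itself is simple arithmetic. The sketch below is not the cnt_sessions script; the function name connection_alert and the default warning level of 90% are assumptions. How the two numbers are obtained (for example by counting sessions and reading the configured number of user connections) is left to your environment.

```sh
#!/bin/sh
# Hypothetical helper: compare current connections with the configured
# maximum and print OK or WARN. The 90% default is an assumed threshold.
connection_alert() {
    current=$1
    max=$2
    warn_pct=${3:-90}
    used_pct=$((current * 100 / max))
    if [ "$used_pct" -ge "$warn_pct" ]; then
        echo "WARN: $current of $max user connections in use (${used_pct}%)"
    else
        echo "OK: $current of $max user connections in use (${used_pct}%)"
    fi
}
```

Logging the output of each run also builds the baseline mentioned above, which in turn helps in choosing a sensible warning percentage.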

CPU, Disk I/O, cache hits

A standard ASE procedure that can be used to monitor for contention and bottlenecks in real-time is
sp_sysmon. Running this procedure on a regular basis with a reasonable timeframe (e.g. 5-15
minutes per run) gives us information about various ASE counters and behavior, for that timeframe.
When combined with information gathered on active processes (see sample db_spy script in
Appendix A), one should be able to pinpoint server or process misbehavior to certain processes and/or
configuration issues. Several third party monitoring tools use this method of gathering information in
their monitoring.

For details on using and interpreting sp_sysmon, please see the standard documentation set:
Performance and Tuning Guide, chapter: Monitoring Performance with sp_sysmon.

There are tools available on the O/S side of the house as well to check for disk I/O bottlenecks on
devices that house the database and log files.

See sample script run_sysmon in Appendix A.

Problem SQL/stack trace


Typical problems faced by DBAs include sudden slow performance, hung queries, queries consuming
100% CPU and the like. Tools such as the MDA tables can be a big help here. Since problems don’t
always occur when a DBA is at their desk, it helps to collect historical information that can be used as
a baseline. The MDA tables collect, among other things, the following information, which can help you
monitor a system as well as find the root cause of a problem:
• Process Activity: CPU usage, IO activity, resource usage
• Resource usage: Data cache, procedure cache, engines
• Object usage: Tables, partitions, indexes, stored procedures
• Query history: SQL text, statement metrics, query plans, errors

See Practical Use of MDA tables for examples and more information.

2.7 Error Logs

All logfiles associated with your ASE (ASE errorlog, Backup Server log, output logs of monitoring or
maintenance jobs) should be scanned for errors, warnings and other important messages. At a minimum this
scanning should report the existence of any messages (found/not found or success/failure). One step
further would be to also report the object (e.g. database or segment) the messages relate to. See
Unix Scripts in Appendix A for a template script for error checking.
Be sure to include keywords such as Error/error, stack, infected, warning, msg, instead, will
shutdown, failed and any other keywords/phrases applicable to your environment. Filter out errors
you know to be informational. See your ASE Error Messages Guide for details on informational errors.
Error numbers of the pattern 6?? (6 followed by two digits) tend to be severe. Set up special alerts for
these, for example email messages with the word ‘PANIC’ or ‘SEVERE ALERT’ in them.

Be sure to monitor space for your ASE errorlogs. In the default setup, ASE always appends to the
current errorlog, so this file keeps growing and can become too large to search or to open in an editor.
It is therefore advisable to rename the old errorlog (adding a date/timestamp) at every ASE reboot,
so that the ASE creates a new errorlog each time it starts.

2.8 Blocking

Most blocking conflicts are temporary in nature and resolve themselves within a very short
period of time. However, poor application design or careless ad hoc user transactions
can cause massive blocking that impacts multiple users of the database. At such times, the server may
be up and running, but for all practical purposes it is unavailable to the user.

Set up your monitoring to check for blocking that is non-transient. Any process (spid) that blocks others
for more than, say, 2 minutes should be watched. Depending on the tolerance level of your user base,
set up the alert to page the on-call DBA upon reaching a certain threshold. Often, agreements
with the business allow spids that have been blocking for more than ‘x’ minutes to be killed by the
monitoring script.
See script sp_block in Appendix A for an example.
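
The threshold part of such a check could be sketched as a filter over query output. This is not the sp_block procedure; the function name long_blockers is hypothetical. It assumes the input has three whitespace-separated columns per row (blocked spid, blocking spid, seconds blocked), for example produced by a query on master..sysprocesses that selects spid, blocked and time_blocked for rows where blocked > 0; verify those column names against your ASE version.

```sh
#!/bin/sh
# Hypothetical helper: read rows of "spid blocker seconds_blocked" on stdin
# and print those blocked longer than the limit. Header and separator
# lines are skipped because their first field is not numeric.
long_blockers() {
    limit=$1
    awk -v limit="$limit" '
        NF >= 3 && $1 ~ /^[0-9]+$/ && $3 + 0 > limit + 0 {
            printf "spid %s blocked by %s for %s seconds\n", $1, $2, $3
        }'
}
```

With a limit of 120 this corresponds to the 2-minute example above. Any output could then be used to page the on-call DBA; killing the blocking spid automatically should only be added where the business has agreed to it.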

2.9 Data Consistency

Data consistency must be checked periodically. dbcc checkdb, dbcc checkcatalog and dbcc
checkalloc are options that may take a while to run on large databases, possibly blocking regular user
access to the objects currently being checked.

dbcc checkstorage is an alternative that combines many dbcc checks, as is archive database access (ADA),
which allows dbcc checks to run offline. Neither will fix the errors reported. Set up your monitoring to
check first that the job ran and completed successfully, and second to report on any faults found.



Appendix A
Each sample script below executes the tasks it is written for, stores the output of the job in the script’s
logfile, writes status information to a central logfile (per server) and where possible also writes timing
and status information to a central database.

Sample directory structure

Using a standard setup of script and log directories makes it easier to use the scripts and copy them to
other hosts/ASEs.

Directory: Contents:
-------------- -----------------------------------------------------
/dba/jobs DBA scripts for standard ASE tasks and monitoring.
/dba/input DBA input files, used by the DBA scripts
/dba/output DBA output files, like *.bcp etc.
/dba/log DBA script logfiles and central server logfile(s).
/dba/sysmon DBA script logfiles for a specific job, e.g.: run_sysmon

The above mentioned directories are used in the script samples, shown below.

Sample dataserver startup script:

#!/bin/sh
# -------------------------------------------------------------------------------
# Start dataserver: PROD_ASE
# -------------------------------------------------------------------------------
. /sybase/sybase_12.5/SYBASE.sh             # Set Sybase environment

SERVER=PROD_ASE                             # Name of the ASE
MASTER=/dev/rdsk/sybase/PROD_ASE.master     # Its master device
CFGDIR=/sybase/$SERVER/cfg                  # Configuration directory
LOGDIR=/sybase/$SERVER/log                  # Log directory
CFGFIL=$CFGDIR/$SERVER.cfg                  # Configuration file
LOGFIL=$LOGDIR/errorlog                     # Errorlog
TRACES=""                                   # Traceflags
TRACES="$TRACES -T1204"                     # Print deadlock info in log
# TRACES="$TRACES -T4013"                   # Show login records in log
#                                           # (now config setting)
# -------------------------------------------------------------------------------
# Check if server is already running (prevent logfile rename)
# -------------------------------------------------------------------------------
if [ `/usr/bin/ps -ef | grep dataserver | grep $SERVER | wc -l` -gt 0 ]
then
    echo "WHOA ... $SERVER is already running !!!"
    exit 1
else
    # ------------------------------------------------------------
    # Check Async IO setting
    # ------------------------------------------------------------
    asyncERR=`grep -c "allow sql server async i/o = 0" $CFGFIL`
    if [ $asyncERR -gt 0 ]
    then
        echo "------------------------------------------------------------"
        echo " ERROR: Async IO not configured !!!"
        echo "------------------------------------------------------------"
        exit 1
    fi

    [ -f $LOGFIL ] && { mv $LOGFIL $LOGFIL.`date +%y%m%d.%H%M` ; }   # Rename log

    # ------------------------------------------------------------
    # Start ASE, specifying special directories and options
    # ------------------------------------------------------------
    $SYBASE/$SYBASE_ASE/bin/dataserver -s$SERVER \
        -e$LOGFIL \
        -d$MASTER \
        -c$CFGFIL \
        -i$SYBASE \
        -M$CFGDIR \
        $TRACES > /dev/null &
fi



Filename: db_spy
Purpose: Save current processing information.

#!/bin/sh
# --------------------------------------------------------------------------------------
# db_spy Collect ASE information about currently running processes
# (RUNSQL, USRPWD and LOGFIL are expected to be set before this point)
# --------------------------------------------------------------------------------------
$RUNSQL <<-EOF | egrep -v "return status" >> $LOGFIL
$USRPWD
SET NOCOUNT ON
go
print " -------------------------"
print " Current processes"
print " -------------------------"
EXEC sp_who
go
print " -------------------------"
print " Current blocked processes"
print " -------------------------"
EXEC sp_block -- Shows blocking info
go
print " -------------------------"
print " Current locks"
print " -------------------------"
EXEC sp_lock
go
print " -------------------------"
print " Heavy hitters"
print " -------------------------"
EXEC sp_hogs -- Shows cpu and IO info from sysprocesses
go
print " -------------------------"
print " Monitor info"
print " -------------------------"
EXEC sp_monitor -- Standard proc
go
EOF



Filename: run_sysmon
Purpose: Runs sp_sysmon for the given nr of times and duration

#!/bin/sh
# ------------------------------------------------------------------------------
# Runs sp_sysmon X times against the given server for a given time period
# ------------------------------------------------------------------------------
SCRIPT=`basename $0`
SERVER=`echo $1 | tr "[a-z]" "[A-Z]"`       # Server name
RUNMAX="$2"                                 # nr of times to run
PERIOD="$3"                                 # Duration to run in hh:mm:ss format

[ $# -lt 3 ] && { echo "------------------------------------------------------"
                  echo "Usage: $SCRIPT server times period "
                  echo " "
                  echo "Where: server = name of the dataserver to connect to "
                  echo "       times  = nr of times to run sp_sysmon "
                  echo "       period = hh:mm:ss "
                  echo "------------------------------------------------------"
                  exit 1 ; }

. /sybase/sybase_15/SYBASE.sh               # Set SYBASE environment

SYBUSR=`getserverusr $SERVER`
SYBPWD=`getserverpwd $SERVER`
FILNAM="/dba/sysmon/sysmon.out.$SERVER.${RUNMAX}x$PERIOD"
RUNSQL="$SYBASE/$SYBASE_OCS/bin/isql -U$SYBUSR -S$SERVER"

RUNCNT=1
while [ $RUNCNT -le $RUNMAX ]
do
    OUTFIL=$FILNAM.$RUNCNT.`date +"%y%m%d.%H%M"`
    if [ $RUNMAX -ge 10 -a $RUNCNT -lt 10 ]
    then
        OUTFIL=$FILNAM.0$RUNCNT.`date +"%y%m%d.%H%M"`
    fi

    # The password must be the first line of the here-document
    $RUNSQL <<- ENDSQL | egrep -v "Password|return status = 0" > $OUTFIL
$SYBPWD
exec sp_echotime "`basename $OUTFIL`"
print 'Server: %1!', @@servername
print 'Version: %1!', @@version

exec sp_sysmon "$PERIOD", @dumpcounters='Y'

exec sp_echotime "`basename $OUTFIL`"
go
ENDSQL

    compress $OUTFIL
    RUNCNT=`expr $RUNCNT + 1`
done
