What are the Differences Between SQL Server 2000 and SQL Server 2005? – Part I
I've been asked this question every time there's a new version, and yet I've never been able to give what I think is a nice, concise, logical answer that satisfies the asker. It's probably my inability to easily form words in my mouth and get them out in the proper order, so I decided it might make sense to do this on paper (metaphorically speaking) and help others out.
Like many of you, I usually get this question from someone outside of SQL Server: a Windows admin, a network guy, etc., someone who has little contact with SQL Server. Or maybe it's someone who's been stuck admin'ing a SQL Server instance.
In any case, I wanted to try to explain this concisely for the non-DBAs. As I began this project, however, I soon realized that it's not easy to give a good general answer. As with everything else in SQL Server, it seems that "it depends" is the best general answer, so I broke this up into a few areas. This part will look at the administrative differences, and the next will cover more of the development differences.
Why Upgrade?
This is an interesting question and one I've been asked quite a bit in the 18 months since SQL Server 2005 was released. The short answer is that if SQL Server 2000 meets your needs, then
there's no reason to upgrade. SQL Server 2000 is a strong, stable platform that has worked well for
millions of installations. If it meets your needs, you are not running up against the limits of the
platform, and you are happy with your system, then don't upgrade.
However, there is a caveat to this. First, the support timeline for SQL Server 2000 shows mainstream support ending next year, in April 2008. I can't imagine that Microsoft wouldn't extend that, given the large number of SQL Server 2000 installations, but with the next version of SQL Server likely to come out next year, I can see this being the point at which you can no longer call for regular support. The extended support timeline continues through 2013, but that's an expensive option.
The other consideration is that with a new version coming out next year, you might want to just
start making plans to upgrade to that version even if you're happy with SQL Server 2000. If the plan
is to release a new version every 2-3 years, you'll need to upgrade at least every 5-6 years to
maintain support options.
In any case, if the application you are upgrading is a third-party product, be sure it is supported on SQL Server 2005.
Lastly, if you have multiple servers and are considering new hardware for more than one of them, it might make sense to look at buying one large 64-bit server and performing some consolidation. If you are worried about conflicts, I might recommend waiting for the next version of SQL Server, as I have heard rumors of switches to help govern resource usage in Katmai (SQL Server 2008).
A quick summary of the differences:
Conclusion
These seem to be the major highlights from my perspective as an administrator. While there are
other improvements, such as the schema changes flowing through replication, I'm not sure that
they represent compelling changes for the non-DBA.
In the next article, I'll examine some of the changes from a developer perspective and see if any of
those give you a reason to upgrade.
And I welcome your comments and thoughts on this as well. Perhaps there are some features I've
missed in my short summary.
What are the Differences Between SQL Server 2000 and SQL Server
2005? – Part II
In part I of this series I looked at the administrative differences and in this part I'll cover some of the
development differences between the versions. I'm looking to make a concise, short list of things
you can tell a developer who is interested, but not necessarily knowledgeable about SQL Server, to
help them decide which version might be best suited to meet their needs.
And hopefully help you decide if an upgrade is worth your time and effort.
One short note here. As I was working on this, it seemed that a great many features could go in the BI or security space instead of administration or development. This list may not be comprehensive, but I'm looking to show things from the main database developer's perspective.
Why Upgrade?
This is an interesting question. As with Part I of this series, I'm not completely sure how to answer it. From an administrator's standpoint, if your server is running well, there's no reason to upgrade. As a developer, however, it's a bit more complicated.
Developers, almost by definition, are looking to change things on a regular basis: fixing things, enhancing them, or rebuilding them. In the first or even the second case, it may not make much sense to upgrade if your application is working well. In the latter case, I'd think hard about upgrading, because a rebuild, or re-architecture, takes a lot of time and resources. If you're investing in a new application, or a new version of an application, then it might make sense to upgrade and take advantage of the features of SQL Server 2005.
I'm guessing that many of these features will be around through at least the next two versions of
SQL Server. While I can see there being a radical rewrite after Katmai (SQL Server 2008), I can't
imagine that many things won't still be around in the version after that. They may get deprecated
after that, but they should be there for that version, which should see support through 2018 or
2019.
If you are struggling with ETL, trying to implement messaging, or web services, then it also might
make sense to upgrade your database server to SQL Server 2005.
A quick summary of the differences:
Conclusion
These are the highlights that I see as a developer and that are of interest. There are other features
in the security area, scalability, etc. that might be of interest, but I think these are the main ones.
I welcome your comments and thoughts on this as well. Perhaps there are some features I've missed in my short summary that you might point out; let me know if you think it makes sense to discuss some of the security changes. As for the BI features, hopefully one of you will send me the differences in an article of your own.
• Write comments in your stored procedures, triggers and SQL batches generously,
whenever something is not very obvious. This helps other programmers understand
your code clearly. Don't worry about the length of the comments, as it won't impact the
performance, unlike interpreted languages like ASP 2.0.
• Do not use SELECT * in your queries. Always write the required column names after the
SELECT keyword.
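An illustrative sketch of this tip (table and column names are hypothetical):

```sql
-- Avoid: returns every column, including ones the application never uses
SELECT * FROM Customers

-- Prefer: name only the columns you actually need
SELECT CustomerID, FirstName, LastName FROM Customers
```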
• Try to avoid server side cursors as much as possible. Always stick to a 'set-based
approach' instead of a 'procedural approach' for accessing and manipulating data.
Cursors can often be avoided by using SELECT statements instead.
If a cursor is unavoidable, use a WHILE loop instead. I have personally tested and
concluded that a WHILE loop is always faster than a cursor. But for a WHILE loop to
replace a cursor you need a column (primary key or unique key) to identify each row
uniquely. I personally believe every table must have a primary or unique key.
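A minimal sketch of the cursor-replacing WHILE loop described above, assuming an Employees table with a unique EmpID key (names are illustrative):

```sql
-- Walk the table one row at a time using the unique key,
-- instead of opening a server-side cursor
DECLARE @EmpID INT

SELECT @EmpID = MIN(EmpID) FROM Employees

WHILE @EmpID IS NOT NULL
BEGIN
    -- Process the current row here
    PRINT 'Processing employee ' + CAST(@EmpID AS VARCHAR(10))

    -- Move to the next row in key order
    SELECT @EmpID = MIN(EmpID) FROM Employees WHERE EmpID > @EmpID
END
```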
• Avoid creating temporary tables while processing data as much as possible, as
creating a temporary table means more disk I/O. Consider using advanced SQL, views,
table variables (SQL Server 2000 and later), or derived tables instead of temporary tables.
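A sketch of the table variable alternative mentioned above (table and column names are illustrative):

```sql
-- Table variable instead of CREATE TABLE #Temp
DECLARE @TopEarners TABLE (EmpID INT, Salary MONEY)

INSERT INTO @TopEarners (EmpID, Salary)
SELECT EmpID, Salary FROM Employees WHERE Salary > 100000

SELECT EmpID FROM @TopEarners
```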
• Try to avoid wildcard characters at the beginning of a word while searching using the
LIKE keyword, as that results in an index scan, which defeats the purpose of an index.
A search with a leading wildcard results in an index scan, while one without a leading
wildcard can use an index seek.
Also avoid searching using not equals operators (<> and NOT) as they result in table
and index scans.
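The scan-versus-seek contrast described above might look like this, assuming an index exists on the column being searched (names are illustrative):

```sql
-- Leading wildcard: the index cannot be used to seek,
-- so this results in an index scan
SELECT CustomerID FROM Customers WHERE LastName LIKE '%son'

-- No leading wildcard: this can use an index seek
SELECT CustomerID FROM Customers WHERE LastName LIKE 'John%'
```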
• Use 'Derived tables' wherever possible, as they perform better. Consider the following
query to find the second highest salary from the Employees table:
SELECT MIN(Salary)
FROM Employees
WHERE EmpID IN
(
SELECT TOP 2 EmpID
FROM Employees
ORDER BY Salary DESC
)
The same query can be re-written using a derived table, as shown below, and it
performs twice as fast as the above query:
SELECT MIN(Salary)
FROM
(
SELECT TOP 2 Salary
FROM Employees
ORDER BY Salary DESC
) AS A
This is just an example, and your results might differ in different scenarios depending
on the database design, indexes, volume of data, etc. So, test all the possible ways a
query could be written and go with the most efficient one.
• Prefix the table names with the owner's name, as this improves readability and avoids
any unnecessary confusion. Microsoft SQL Server Books Online even states that
qualifying table names with owner names helps in execution plan reuse, further
boosting performance.
• Use SET NOCOUNT ON at the beginning of your SQL batches, stored procedures and
triggers in production environments, as this suppresses messages like '(1 row(s)
affected)' after executing INSERT, UPDATE, DELETE and SELECT statements. This
improves the performance of stored procedures by reducing network traffic.
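A typical placement, sketched with a hypothetical procedure:

```sql
CREATE PROC GetCustomerOrders
    @CustomerID INT
AS
BEGIN
    -- Suppress '(n row(s) affected)' messages to cut network traffic
    SET NOCOUNT ON

    SELECT OrderID, OrderDate
    FROM Orders
    WHERE CustomerID = @CustomerID
END
```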
• Use the more readable ANSI-standard join clauses instead of the old-style joins. With
ANSI joins, the WHERE clause is used only for filtering data, whereas with the older-style
joins the WHERE clause handles both the join condition and the filtering.
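The contrast might look like this (table and column names are illustrative):

```sql
-- Old-style join: the WHERE clause mixes the join condition
-- with the filter
SELECT o.OrderID, c.LastName
FROM Orders o, Customers c
WHERE o.CustomerID = c.CustomerID
  AND o.OrderDate >= '2005-01-01'

-- ANSI join: the join condition lives in ON,
-- and WHERE is used only for filtering
SELECT o.OrderID, c.LastName
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID
WHERE o.OrderDate >= '2005-01-01'
```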
Do not prefix your stored procedure names with "sp_". The sp_ prefix is reserved for
system stored procedures that ship with SQL Server. Whenever SQL Server encounters a
procedure name starting with sp_, it first tries to locate the procedure in the master
database, then it looks at any qualifiers (database, owner) provided, then it tries dbo as
the owner. So you can save time in locating the stored procedure by avoiding the
"sp_" prefix.
Views are generally used to show specific data to specific users based on their interest.
Views are also used to restrict access to the base tables by granting permission only on
views. Yet another significant use of views is that they simplify your queries. Incorporate
your frequently required, complicated joins and calculations into a view so that you don't
have to repeat those joins/calculations in all your queries. Instead, just select from the
view.
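As a sketch of that last point, a frequently needed join can be wrapped in a view once and then selected from everywhere (names are illustrative):

```sql
-- Encapsulate the join and the calculation once
CREATE VIEW OrderTotals
AS
SELECT o.OrderID, c.LastName, o.Quantity * o.UnitPrice AS Total
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID
GO

-- Queries elsewhere stay simple
SELECT OrderID, Total FROM OrderTotals WHERE LastName = 'Smith'
```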
Do not let your front-end applications query or manipulate the data directly using SELECT or
INSERT/UPDATE/DELETE statements. Instead, wrap that access in stored procedures and let
the applications call those procedures.
If you have a choice, do not store binary or image files (Binary Large Objects or BLOBs)
inside the database. Instead, store the path to the binary or image file in the database and use
that as a pointer to the actual binary file stored elsewhere on a server. Retrieving and
manipulating these large binary files is better performed outside the database, and after all, a
database is not meant for storing files.
Avoid dynamic SQL statements as much as possible. Dynamic SQL tends to be slower than
static SQL, as SQL Server must generate an execution plan at runtime every time. IF and CASE
statements come in handy to avoid dynamic SQL. Another major disadvantage of dynamic
SQL is that it requires users to have direct access permissions on all the accessed objects,
like tables and views. Generally, users are given access to the stored procedures that
reference the tables, but not to the tables themselves; in that case, dynamic SQL will not work.
Consider the following scenario: a user named 'dSQLuser' is added to the pubs database
and is granted access to a procedure named 'dSQLproc', but not to any tables in the
pubs database. The procedure dSQLproc executes a direct SELECT on the titles table, and that
works. The second statement runs the same SELECT on the titles table using dynamic SQL, and it
fails with a permissions error:
sp_addlogin 'dSQLuser'
GO
sp_defaultdb 'dSQLuser', 'pubs'
USE pubs
GO
sp_adduser 'dSQLUser', 'dSQLUser'
GO
CREATE PROC dSQLProc
AS
BEGIN
SELECT * FROM titles WHERE title_id = 'BU1032' --This works
DECLARE @str CHAR(100)
SET @str = 'SELECT * FROM titles WHERE title_id = ''BU1032'''
EXEC (@str) --This fails
END
GO
GRANT EXEC ON dSQLProc TO dSQLuser
GO
Now login to the pubs database using the login dSQLuser and execute the procedure dSQLproc
to see the problem.
Consider the following drawbacks before using the IDENTITY property for generating
primary keys. IDENTITY is very much SQL Server specific, and you will have problems porting
your database application to another RDBMS. IDENTITY columns have other inherent
problems as well: they can run out of numbers at some point, depending on the data type
selected; numbers can't be reused automatically after deleting rows; and replication and
IDENTITY columns don't always get along well.
So, come up with an algorithm to generate a primary key in the front-end or from within the
inserting stored procedure. There still could be issues with generating your own primary keys
too, like concurrency while generating the key, or running out of values. So, consider both
options and go with the one that suits you best.
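One common way to generate keys inside the inserting stored procedure is a key table updated atomically. This is only a sketch of the idea (table and column names are illustrative), and it still needs careful thought about concurrency under load:

```sql
-- One row per table whose keys we hand out
CREATE TABLE KeyValues (TableName VARCHAR(50) PRIMARY KEY, NextKey INT)

-- Inside the inserting procedure: increment and read in a single
-- atomic UPDATE, so two sessions cannot receive the same key
DECLARE @NewKey INT
UPDATE KeyValues
SET @NewKey = NextKey = NextKey + 1
WHERE TableName = 'Employees'
```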
• Minimize the use of NULLs, as they often confuse the front-end applications, unless
the applications are coded intelligently to eliminate NULLs or convert the NULLs
into some other form. Any expression that deals with NULL results in a NULL
output. ISNULL and COALESCE functions are helpful in dealing with NULL values.
Here's an example that explains the problem:
Consider a Customers table that stores the names of the customers, where the middle
name can be NULL, and insert a customer named Tony Blair into the table without a
middle name.
The following SELECT statement returns NULL, instead of the customer name:
SELECT FirstName + ' ' + MiddleName + ' ' + LastName FROM Customers
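The table and INSERT the example describes might look like this (column definitions are assumptions), along with the ISNULL fix:

```sql
CREATE TABLE Customers
(
    FirstName VARCHAR(20),
    MiddleName VARCHAR(20) NULL,  -- middle name may be absent
    LastName VARCHAR(20)
)

-- Tony Blair has no middle name
INSERT INTO Customers (FirstName, LastName) VALUES ('Tony', 'Blair')

-- Returns NULL: the NULL middle name propagates through the concatenation
SELECT FirstName + ' ' + MiddleName + ' ' + LastName FROM Customers

-- Returns 'Tony Blair': ISNULL replaces the NULL piece with ''
SELECT FirstName + ISNULL(' ' + MiddleName, '') + ' ' + LastName
FROM Customers
```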
Always use a column list in your INSERT statements. This helps avoid problems when
the table structure changes (like adding or dropping a column): an INSERT without a
column list depends on the number and order of the columns, so it breaks, or silently
puts values into the wrong columns, as soon as the structure changes. An INSERT written
with an explicit column list keeps working.
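A sketch of the problem and the fix (table and column names are illustrative):

```sql
CREATE TABLE Products (ProductID INT, Name VARCHAR(50))

-- Without a column list: relies on column count and order,
-- so it breaks if a column is later added or dropped
INSERT INTO Products VALUES (1, 'Widget')

-- With a column list: keeps working after, say,
-- ALTER TABLE Products ADD Price MONEY
INSERT INTO Products (ProductID, Name) VALUES (2, 'Gadget')
```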
Perform all your referential integrity checks and data validations using constraints (foreign
key and check constraints) instead of triggers, as constraints are faster. Limit the use of
triggers to auditing, custom tasks and validations that cannot be performed using constraints.
Constraints save you time as well, as you don't have to write code for these validations; the
RDBMS does all the work for you.
Always access tables in the same order in all your stored procedures and triggers. This
helps avoid deadlocks. Other things to keep in mind to avoid deadlocks: keep your
transactions as short as possible; touch as little data as possible during a transaction;
never, ever wait for user input in the middle of a transaction; do not use higher-level
locking hints or restrictive isolation levels unless they are absolutely needed; and make
your front-end applications deadlock-intelligent, that is, able to resubmit the transaction
in case the previous one fails with error 1205. In your applications, process all the results
returned by SQL Server immediately, so that the locks on the processed rows are released
and there is no blocking.
Offload tasks, like string manipulations, concatenations, row numbering, case conversions,
type conversions etc., to the front-end applications if these operations are going to consume
more CPU cycles on the database server. Also try to do basic validations in the front-end itself
during data entry. This saves unnecessary network roundtrips.
Do not call functions repeatedly within your stored procedures, triggers, functions and
batches. For example, you might need the length of a string variable in many places in your
procedure; don't call the LEN function every time it's needed. Instead, call LEN once and
store the result in a variable for later use.
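The tip above, sketched (variable names are illustrative):

```sql
DECLARE @Name VARCHAR(100), @NameLen INT
SET @Name = 'some string value'

-- Call LEN once and reuse the result...
SET @NameLen = LEN(@Name)

IF @NameLen > 50 PRINT 'long name'
IF @NameLen < 5 PRINT 'short name'
-- ...instead of calling LEN(@Name) in every comparison
```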
Make sure your stored procedures always return a value indicating their status. Standardize
on the return values of stored procedures for success and failures. The RETURN statement is
meant for returning the execution status only, but not data. If you need to return data, use
OUTPUT parameters.
If your stored procedure always returns a single row resultset, consider returning the
resultset using OUTPUT parameters instead of a SELECT statement, as ADO handles output
parameters faster than resultsets returned by SELECT statements.
Always check the global variable @@ERROR immediately after executing a data
manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction
in case of an error (@@ERROR will be greater than 0 in case of an error). This is important,
because, by default, SQL Server will not rollback all the previous changes within a transaction if
a particular statement fails. This behavior can be changed by executing SET XACT_ABORT ON.
The @@ROWCOUNT variable also plays an important role in determining how many rows were
affected by a previous data manipulation (also, retrieval) statement, and based on that you
could choose to commit or rollback a particular transaction.
In SQL Server 2005, TRY/CATCH handles this more cleanly:
BEGIN TRY
BEGIN TRAN
-- Your code
COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN
END CATCH
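On servers before SQL Server 2005, where TRY/CATCH is not available, the @@ERROR pattern described above can be sketched like this (table names are illustrative):

```sql
BEGIN TRAN

UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1
-- Check @@ERROR immediately: it is reset by the very next statement
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRAN
    RETURN
END

UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRAN
    RETURN
END

COMMIT TRAN
```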
To make SQL statements more readable, start each clause on a new line and indent where
needed.
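For example, a multi-clause query formatted this way (names are illustrative):

```sql
SELECT c.CustomerID,
       c.LastName,
       SUM(o.Quantity * o.UnitPrice) AS OrderTotal
FROM Customers c
INNER JOIN Orders o
    ON o.CustomerID = c.CustomerID
WHERE c.Country = 'UK'
GROUP BY c.CustomerID,
         c.LastName
ORDER BY OrderTotal DESC
```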
Your code might work fine on a server with a case-insensitive sort order, but it will fail on a
case-sensitive SQL Server if your code is not consistent in case. For example, if you create a
table in a SQL Server database that has a case-sensitive or binary sort order, all references
to the table must use the same case that was specified in the CREATE TABLE statement. If
you name the table 'MyTable' in the CREATE TABLE statement and use 'mytable' in a
SELECT statement, you get an 'object not found' error.
Do not use column numbers in the ORDER BY clause; ordering by column names is more
readable and does not break when the SELECT list changes.
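The two variants might look like this (names are illustrative):

```sql
-- Harder to read, and breaks if the column order changes
SELECT FirstName, LastName FROM Customers ORDER BY 2, 1

-- More readable, and robust to changes in the SELECT list
SELECT FirstName, LastName FROM Customers ORDER BY LastName, FirstName
```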
Well, this is all for now, folks. I welcome your feedback on this.