
Pitfalls of using T-SQL cursors

Conventional wisdom states that cursors are evil! A cursor is a memory-resident set of
pointers that reference data in your result set or, more accurately, the data in the base
tables. Cursors have a bad reputation because they are typically the favorite hammer of
every junior SQL developer in search of a nail, because they tend to perform poorly, and
because they deplete system resources.
We all have stories about cursors we had to rewrite as set-based operations, after which
batches ran considerably faster, or about mysterious server crashes that went away once a
query was rewritten without a cursor. Then why is it that when you query syscomments
for text like '%cursor%' in SQL Server 2005 you find no fewer than 213 distinct Microsoft-shipped
procedures and functions that use cursors? In SQL Server 2000, there are 245
distinct procedures or functions using cursors. Either Microsoft's SQL developers are
following poor coding practices, or cursors really do have a place in SQL development.
What do you think?
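If you want to reproduce a rough version of that count yourself, a query along the following
lines will do it. This is only a sketch for SQL Server 2000 run in the master database; the
LIKE match also catches comments that merely mention the word, and in SQL Server 2005 many
system objects live in the resource database, so treat the number as approximate.

USE master
GO
SELECT COUNT(DISTINCT o.name) AS cursor_objects
FROM syscomments c
JOIN sysobjects o ON o.id = c.id
WHERE o.xtype IN ('P', 'FN', 'IF', 'TF')        -- procedures and functions
  AND OBJECTPROPERTY(o.id, 'IsMSShipped') = 1   -- Microsoft-shipped objects only
  AND c.text LIKE '%cursor%'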
I personally think cursors do have a place.
1. They are necessary for some dynamic operations that can't be accomplished
with set-based operations.
2. They are simple to understand, which makes them ideal for quick-and-dirty
programming and the tool of choice for junior SQL developers.
3. They outperform while loops when you need row-by-row processing.
4. They are ideal for scrolling through a portion of a large result set.
5. By default, they provide a window into your tables or result set, which
maximizes concurrency for all applications. This window reflects updates that
occur as the cursor iterates through the result set. The cursor itself holds a
shared lock only briefly as it fetches the next row.
In this feature, I'll explain how cursors work, when to use them and when to avoid them.

How cursors work


To visualize how a cursor works, think of the apparatus a skyscraper window washer uses
to travel up and down the building, stopping at each floor to wash each window. For most
cursor types, key data is brought into memory and the cursor navigates this key data on a
row-by-row basis, much as the window washer goes floor by floor.

A cursor requires two operations: a placement operation that moves the cursor to a row in
the result set, and a retrieval operation, called a fetch, that returns the data underlying
that row. A set operation accomplishes both with a single statement:
select * from TableName where pk=1
Keeping in mind the window-washer analogy, let's walk through the steps you would take
in using a cursor.
Cursors support two sets of syntax: SQL-92 and T-SQL Extended Syntax. For the most part I
will use the T-SQL Extended Syntax and reference the SQL-92 syntax for comparison. There
are no new cursor features in SQL Server 2005.
First you create the cursor using a declare statement, which involves setting the cursor
options and specifying the results set.
Cursor options
There are four cursor types:
STATIC
The STATIC cursor copies the data in the result set into tempdb, and DML against the
underlying tables is not reflected in the cursor's data. Subsequent fetch statements are
served from the copy in tempdb. This is perfect when the data underlying your cursor is
static or your cursor has no real-time requirements. For example, most cursors Microsoft
ships are declared as static because the operation being carried out only needs a
point-in-time view of the data. In other words, it does not need to know about new rows
-- the data is static.
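A minimal sketch of that point-in-time behavior, using a throwaway table of my own
(#static_demo is not from the article): an update made after the cursor is opened never
shows up in the fetched rows.

CREATE TABLE #static_demo (pk INT PRIMARY KEY, col VARCHAR(20))
INSERT INTO #static_demo VALUES (1, 'original')

DECLARE @col VARCHAR(20)
DECLARE static_demo CURSOR STATIC FOR
    SELECT col FROM #static_demo

OPEN static_demo                              -- the snapshot is copied to tempdb here
UPDATE #static_demo SET col = 'changed'       -- happens after the snapshot was taken

FETCH NEXT FROM static_demo INTO @col
SELECT @col AS fetched_value                  -- still returns 'original'

CLOSE static_demo
DEALLOCATE static_demo
DROP TABLE #static_demo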
KEYSET
The KEYSET cursor is implemented by copying primary key data into tempdb. As the
cursor moves through the result set it sees modifications to the underlying data, but it
does not see newly inserted rows. If you fetch a row that no longer exists, nulls are
returned and @@FETCH_STATUS is set to -2. The order of the cursor data is also
maintained. Fetch statements go against the underlying base tables, driven by the keyset
cached in tempdb. KEYSET cursors take more time to open than DYNAMIC cursors, but they
have lower resource requirements.
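The -2 behavior is easy to see with a small sketch, again using a throwaway table of my
own: delete a row after the keyset has been built, then fetch it.

CREATE TABLE #keyset_demo (pk INT PRIMARY KEY, col VARCHAR(20))
INSERT INTO #keyset_demo VALUES (1, 'row one')
INSERT INTO #keyset_demo VALUES (2, 'row two')

DECLARE @pk INT, @col VARCHAR(20)
DECLARE keyset_demo CURSOR KEYSET FOR
    SELECT pk, col FROM #keyset_demo

OPEN keyset_demo                               -- keys for both rows are cached in tempdb
DELETE FROM #keyset_demo WHERE pk = 2          -- remove a row behind the cursor's back

FETCH NEXT FROM keyset_demo INTO @pk, @col     -- row 1: @@FETCH_STATUS = 0
FETCH NEXT FROM keyset_demo INTO @pk, @col     -- row 2 is gone: @@FETCH_STATUS = -2
SELECT @@FETCH_STATUS AS fetch_status

CLOSE keyset_demo
DEALLOCATE keyset_demo
DROP TABLE #keyset_demo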
DYNAMIC
The DYNAMIC cursor is similar to the KEYSET cursor in that it sees data modifications in
the underlying base tables, but it also sees newly inserted and deleted rows. It does not
preserve order, which can lead to the Halloween problem as illustrated in script 4. Fetch
statements go against the underlying base tables, and the key data cached in tempdb is
refreshed with each modification to key data in the base tables. It is the most expensive
cursor type to implement.

FAST_FORWARD
A FAST_FORWARD cursor provides optimal performance but only supports the NEXT
argument, which fetches the next row. Other cursor types can also fetch the prior row
(using the PRIOR argument), the first row (FIRST), the last row (LAST), the nth row
(ABSOLUTE n), or a row n rows away from the current cursor position (RELATIVE n).
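Here is a quick sketch of those scrolling verbs against the pubs sample database, using a
SCROLL STATIC cursor of my own; a FAST_FORWARD cursor would reject everything here
except FETCH NEXT.

DECLARE @au_lname VARCHAR(40)
DECLARE scroll_demo CURSOR SCROLL STATIC FOR
    SELECT au_lname FROM pubs.dbo.authors ORDER BY au_lname

OPEN scroll_demo
FETCH FIRST       FROM scroll_demo INTO @au_lname   -- first row
FETCH LAST        FROM scroll_demo INTO @au_lname   -- last row
FETCH ABSOLUTE 10 FROM scroll_demo INTO @au_lname   -- tenth row
FETCH RELATIVE -3 FROM scroll_demo INTO @au_lname   -- back up three rows from there
FETCH PRIOR       FROM scroll_demo INTO @au_lname   -- the row before that
FETCH NEXT        FROM scroll_demo INTO @au_lname   -- and forward again
CLOSE scroll_demo
DEALLOCATE scroll_demo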
These cursor types, together with the other options you can specify in the declare
statement, control the following:
Scope or visibility
Is the cursor visible only within the batch, stored procedure or trigger that declared it
(LOCAL), or to any batch on the connection (GLOBAL)? In either case, a cursor is never
visible outside its own connection.
Scrollability
Can the fetch statement fetch only the next row, or can it fetch in any direction and by a
specific number of rows? A forward-only cursor performs faster than a cursor that can
move in any direction. The two options are FORWARD_ONLY and SCROLL (any number of
rows, in any direction).
Membership
Which rows are members of your cursor? Can the cursor see changes happening in the
underlying results set, and can it see newly inserted/deleted rows?
Updatability
Can you update or delete the rows in the underlying result set? To update the tables
underlying your cursor, you must have the following:
1. A primary key on the base tables underlying your cursor. Otherwise you will get the
message:
Server: Msg 16929, Level 16, State 1, Line 1
The cursor is READ ONLY.
The statement has been terminated.
2. A cursor defined as KEYSET or DYNAMIC (STATIC and FAST_FORWARD cursors are
read-only).
3. The WHERE CURRENT OF syntax to update or delete a row, as in the sketch below. Please
refer to this script for an illustration of cursor updatability functions.
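Here is the sketch referred to above, assuming the pubs sample database; the cursor walks
California authors and updates each row it is positioned on with a placeholder phone
number of my own choosing.

DECLARE @au_id VARCHAR(11)
DECLARE author_upd CURSOR KEYSET FOR
    SELECT au_id FROM authors WHERE state = 'CA'
    FOR UPDATE OF phone                       -- the only column this cursor may update

OPEN author_upd
FETCH NEXT FROM author_upd INTO @au_id
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE authors
    SET phone = '408 555-0100'                -- placeholder value
    WHERE CURRENT OF author_upd               -- updates only the row just fetched
    FETCH NEXT FROM author_upd INTO @au_id
END
CLOSE author_upd
DEALLOCATE author_upd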
You retrieve rows from the cursor using fetch statements. You should always check the
value of @@FETCH_STATUS and continue only while it is 0. @@FETCH_STATUS can have
three values:

0 - the row was successfully returned
-1 - the fetch statement read beyond the last row in the cursor
-2 - the row no longer exists in the result set
Fetch statements are analogous to our window washer moving down the side of the
skyscraper. With a fetch, the logical operations are position and then retrieve -- twice as
many operations as a set statement, where a single INSERT, UPDATE or DELETE does the
whole job.
Finally, clean up after your cursor with the CLOSE MyCursorName statement and then
release its resources with the DEALLOCATE MyCursorName statement. Note that a quick
way to return to the beginning of a FAST_FORWARD cursor is to close and reopen it.
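Putting the whole lifecycle together, a minimal read-only skeleton looks like this
(MyTable, MyColumn and the cursor name are placeholders, not anything from the article):

DECLARE @value VARCHAR(50)

DECLARE MyCursorName CURSOR LOCAL FAST_FORWARD FOR
    SELECT MyColumn FROM MyTable              -- restrict columns and rows to what you need

OPEN MyCursorName
FETCH NEXT FROM MyCursorName INTO @value

WHILE @@FETCH_STATUS = 0                      -- 0 means the last fetch succeeded
BEGIN
    -- row-by-row work goes here
    PRINT @value
    FETCH NEXT FROM MyCursorName INTO @value
END

CLOSE MyCursorName                            -- release the result set and its locks
DEALLOCATE MyCursorName                       -- release the cursor's data structures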

Cursor advantages
So when and why should you consider using cursors? Here I'll spotlight cursor
advantages.
Row-by-row operations
Cursors are best used for row-by-row operations that can't be accomplished with set-based
operations -- for example, when you need to fire a stored procedure once per row in a
table.
Here is an example of a cursor used in the Microsoft-shipped routine sp_helppublication
(this is an excerpt; the variables and the #accessiblepubs temp table are declared earlier
in the procedure):
DECLARE hC CURSOR LOCAL FAST_FORWARD FOR
    SELECT pubid, name FROM syspublications WHERE name like @publication
OPEN hC
FETCH hC INTO @pubid, @pubname
WHILE (@@fetch_status <> -1)
BEGIN
    IF is_member(N'db_owner') <> 1
    BEGIN
        exec @retcode = sp_MSreplcheck_pull @publication = @pubname,
            @raise_fatal_error = 0, @given_login = @username
    END
    IF (is_member(N'db_owner') = 1) OR (@retcode = 0 AND @@error = 0)
        INSERT INTO #accessiblepubs values(@pubid)
    FETCH hC INTO @pubid, @pubname
END
CLOSE hC
DEALLOCATE hC

This same logic can be written as a single while loop, but the cursor is the better
performer here. Once you need more than one level of nested looping, however, a while
loop is usually the better choice.
Here is the above batch rewritten as a while loop:
DECLARE @count int
DECLARE @pubid int
DECLARE @publication varchar(10)
DECLARE @pubname varchar(10)
DECLARE @retcode int
SET @publication='pubs'
DECLARE @username sysname
SET @username=suser_name()
SELECT @count=count(pubid) FROM pubs.dbo.syspublications WHERE name like @publication
WHILE (@count > 0)
BEGIN
    IF is_member(N'db_owner') <> 1
    BEGIN
        SELECT @pubid=pubid, @pubname=name FROM syspublications WHERE name like @publication
        exec @retcode = sp_MSreplcheck_pull @publication = @pubname,
            @raise_fatal_error = 0, @given_login = @username
    END
    IF (is_member(N'db_owner') = 1) OR (@retcode = 0 AND @@error = 0)
        INSERT INTO #accessiblepubs values(@pubid)
    SELECT @count=@count-1
END

Quick and dirty


SQL developers are often under the gun to write code fast. Writing a cursor requires less
mental effort than writing its set-based equivalent. Unfortunately these shortcuts often
remain in production and cause problems further down the line. (Thanks to SQL MVPs Itzik
Ben Gan and Erland Sommarskog for the above two observations.)
Cursors are faster than using while loops. Here is an example illustrating the timings.
USE PUBS
GO
CREATE TABLE NUMBERS (pk INT NOT NULL IDENTITY PRIMARY KEY, charcol CHAR(20))
DECLARE @intcol INT
SET @intcol=1
WHILE @intcol<8001
BEGIN
    INSERT INTO NUMBERS (charcol) VALUES (@intcol)
    SELECT @intcol=@intcol+1
END
DECLARE @PK VARCHAR(20)
DECLARE @datetime DATETIME
SET @datetime=GETDATE()

DECLARE test CURSOR FOR SELECT charcol FROM NUMBERS
OPEN test
FETCH NEXT FROM test INTO @PK
WHILE (@@FETCH_STATUS = 0)
BEGIN
    SELECT @PK
    FETCH NEXT FROM test INTO @PK
END
CLOSE test
DEALLOCATE test
SELECT DATEDIFF(ms, @datetime, GETDATE())
--3786 ms
GO
DECLARE @counter INT
DECLARE @datetime DATETIME
SET @datetime=GETDATE()
SET @counter=1
WHILE (@counter < 8001)
BEGIN
    SELECT charcol FROM NUMBERS WHERE pk=@counter
    SELECT @counter=@counter+1
END
SELECT DATEDIFF(ms, @datetime, GETDATE())
--4676 ms almost a full second longer

This is a trivial example, but it does illustrate the performance advantage of cursors over
while loops. In our case of 8,000 rows there is nearly a one-second speed advantage in
using a cursor over a while loop.
Scrolling
Classic ADO made use of cursors for scrolling or paging through a result set on the
server, using server-side cursors. These cursors provided performance benefits over pure
T-SQL implementations for paging through a result set. For more information on pure T-SQL
implementations, consult ASPFAQ.COM. In those tests the T-SQL approach provides
marginally better performance, but cursors are faster for larger result sets. In SQL Server
2005 you can also use Common Table Expressions (CTEs) for server-side paging.
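A sketch of that CTE approach, pairing the CTE with ROW_NUMBER() and assuming the pubs
authors table and a page size of 20 (the variable names are my own):

DECLARE @PageNumber INT, @PageSize INT
SELECT @PageNumber = 2, @PageSize = 20;       -- which page to return

WITH NumberedAuthors AS
(
    SELECT au_id, au_lname, au_fname,
           ROW_NUMBER() OVER (ORDER BY au_lname, au_fname) AS RowNum
    FROM authors
)
SELECT au_id, au_lname, au_fname
FROM NumberedAuthors
WHERE RowNum BETWEEN (@PageNumber - 1) * @PageSize + 1
                 AND @PageNumber * @PageSize
ORDER BY RowNum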
By default cursors will query the base tables with each fetch and, as such, they are always
current with the most recently updated values. Here is an example:
CREATE TABLE test
(pk INT NOT NULL IDENTITY PRIMARY KEY, charcol CHAR(30))
GO
DECLARE @int INT
SET @int=1
WHILE @int < 6
BEGIN
    INSERT INTO test (charcol) VALUES ('this is a test ' + convert(VARCHAR(2), @int))
    SELECT @int=@int+1
END
GO

DECLARE @pk INT
SET @pk=1
DECLARE @charcol varchar(30)
DECLARE testcursor CURSOR FOR SELECT pk, charcol FROM test
OPEN testcursor
FETCH testcursor INTO @pk, @charcol
WHILE @@fetch_status=0
BEGIN
    SELECT @charcol
    UPDATE test SET charcol=convert(VARCHAR(2),pk)+' '+charcol
    SELECT charcol FROM test
    FETCH testcursor INTO @pk, @charcol
END
CLOSE testcursor
DEALLOCATE testcursor

Notice how each pass through the loop updates charcol, and the cursor picks up those
updates from the underlying base table on subsequent fetches.
With advantages like these, you may wonder what the disadvantages of using cursors are.
Please, oh please, oh please keep reading.

Cursor disadvantages
Resources consumed by cursors
As I mentioned earlier, a cursor is a memory-resident set of pointers -- meaning it
occupies memory that would otherwise be available to other processes on your system.
Poorly written cursors can completely deplete available memory. (See this example.)
If you are using AWE (Address Windowing Extensions) on 32-bit SQL Server 2000, the
cursor takes memory from the pool used by locks, cached procedure plans and user
connections, which can add pressure to that already limited memory region. This is not a
problem in SQL Server 2005 or in 64-bit SQL Server 2000.
Speed and performance issues
Cursors can be faster than a while loop, but they carry more overhead than set-based
operations. The real problem with cursor speed is that, in many cases, the operation can
be written far more efficiently as a set operation or perhaps as a while loop. It's these
cursor rewrites that lead to the impression that cursors are evil or cursed.
Another factor affecting cursor speed is the number of rows and columns brought into the
cursor. Time how long it takes to open your cursor and how long the fetch statements take.
If the open is lengthy, look carefully at your cursor logic: see if you can remove columns
from the declare statement, and change the where clause in the declare statement to return
only the rows the cursor needs. If the fetch statements themselves are lengthy or consume
too much IO or CPU, look at the cursor declare statement and ensure you have optimal
indexes in place on your base tables or temporary tables.
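One simple way to capture those timings, following the same DATEDIFF pattern as the
earlier examples (the table, column and filter here are placeholders of my own):

DECLARE @value VARCHAR(50), @t DATETIME
SET @t = GETDATE()

DECLARE timing_demo CURSOR LOCAL FAST_FORWARD FOR
    SELECT MyColumn FROM MyTable WHERE SomeFilter = 1   -- placeholder query

OPEN timing_demo
SELECT DATEDIFF(ms, @t, GETDATE()) AS open_ms           -- cost of populating the cursor

SET @t = GETDATE()
FETCH NEXT FROM timing_demo INTO @value
WHILE @@FETCH_STATUS = 0
BEGIN
    FETCH NEXT FROM timing_demo INTO @value
END
SELECT DATEDIFF(ms, @t, GETDATE()) AS fetch_ms          -- cost of walking every row

CLOSE timing_demo
DEALLOCATE timing_demo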
Wrong tool for the wrong task
Cursors are frequently the wrong tool for the wrong task. They're used for quick-and-dirty
programming when a developer does not have a good understanding of set operations -- or
they're used for the wrong task entirely.
For example, an operation is sometimes best done client side rather than server side.
Server-side cursors were supported in ADO; code that required a read-only view of data
used static server-side cursors. ADO.NET uses the data reader or data adapter, which
operates disconnected on the client side. It grabs a result set, brings the data to the
client and typically disconnects from the server immediately, resulting in greater
performance and scalability. Similarly, if your requirements call for a snapshot of the
data and you don't need a window into real-time updates, use ADO.NET's DataReader to pull
the data back to your client and cache it there. The client can then page through the
result set on the Web page or Web server, as opposed to paging through it on the SQL
Server and consuming server resources.
Before you use a cursor, evaluate how the data will be consumed. Sometimes business
cases can be made to have the data manipulated on the middle tier as opposed to the data
tier.
Subtle errors
Cursors sometimes introduce subtle errors. We have already looked at a few:
- Failing to check the value of @@FETCH_STATUS
- Improper indexes on the base tables referenced by your result set or FETCH statements
- Too many columns dragged around in memory that are never referenced in the subsequent
cursor operations (probably the result of legacy code)
- A WHERE clause that brings too many rows into the cursor, only for them to be filtered
out again by the cursor logic
However, there are also subtle errors that a cursor itself can introduce. For example, in
the Halloween problem a cursor update operation changes the order of rows in the
underlying base tables, so the same rows are retrieved in the cursor's result set again
and updated multiple times.
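To make that concrete, here is the shape of the pattern to watch for, using a throwaway
table of my own: a DYNAMIC cursor walking an index on the very column the loop updates.
Whether rows actually get revisited depends on the index and access path the engine
chooses, so treat this as an illustration of the risky pattern rather than a guaranteed
reproduction.

CREATE TABLE #payroll (emp_id INT PRIMARY KEY, salary MONEY)
CREATE INDEX ix_salary ON #payroll (salary)    -- the cursor orders by this column

INSERT INTO #payroll VALUES (1, 1000)
INSERT INTO #payroll VALUES (2, 1050)
INSERT INTO #payroll VALUES (3, 1080)

DECLARE @emp_id INT
DECLARE pay_cursor CURSOR DYNAMIC FOR
    SELECT emp_id FROM #payroll ORDER BY salary

OPEN pay_cursor
FETCH NEXT FROM pay_cursor INTO @emp_id
WHILE @@FETCH_STATUS = 0
BEGIN
    -- A 10% raise can push the current row past rows the cursor has not yet
    -- reached in salary order, so a dynamic cursor may meet that row again.
    UPDATE #payroll SET salary = salary * 1.1 WHERE emp_id = @emp_id
    FETCH NEXT FROM pay_cursor INTO @emp_id
END
CLOSE pay_cursor
DEALLOCATE pay_cursor
DROP TABLE #payroll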

Cursor example
For the rest of this discussion I will be looking at the following cursor:

DECLARE authors_cursor CURSOR FOR
    SELECT authors.au_id, au_lname, au_fname, phone, address, city, state, zip, contract, title
    FROM authors
    JOIN titleauthor ON titleauthor.au_id = authors.au_id
    JOIN titles ON titleauthor.title_id = titles.title_id
    ORDER BY authors.au_id
This is a poorly written cursor: it returns columns I am unlikely to need, and it has no
where clause, so chances are I won't need every row either. The memory structures created
for a cursor contain tracking information about the data in the base tables. With the
exception of a STATIC cursor, changes that occur in the base tables underlying your result
set are reflected in the fetch statements used to iterate from one row to another.
STATIC cursors copy all rows into tempdb. The cursor's internal pointers reference this
read-only copy in tempdb and do not reference the base tables, so changes made to the base
tables are not "seen" by the cursor fetches.
KEYSET cursors copy only key information into tempdb, so they will not see newly inserted
rows the way DYNAMIC cursors do. If your fetch statement tries to fetch a row that has
been deleted from the base tables, it returns nulls and @@FETCH_STATUS is set to -2.
All this tracking activity requires system overhead, and STATIC cursors require the most.
This reduces the resources available to your system as a whole and consequently degrades
your SQL Server performance. A cursor can also lower concurrency on the base tables. Some
DBAs and developers will slurp the data into a temporary table and build the cursor off
that, placing indexes on the temporary table so the cursor does not have to do a complete
table scan (see the sketch below).
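That temporary-table pattern looks roughly like this; the filter, columns and index are
placeholders for whatever your cursor actually needs (here I borrow the pubs authors
table):

-- Slurp only the rows and columns the cursor needs into a temp table...
SELECT au_id, au_lname
INTO #cursor_work
FROM pubs.dbo.authors
WHERE state = 'CA'

-- ...index the temp table so the cursor's fetches do not scan it...
CREATE CLUSTERED INDEX ix_cursor_work ON #cursor_work (au_id)

-- ...then build the cursor over the temp table instead of the base tables.
DECLARE work_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT au_id, au_lname FROM #cursor_work
-- OPEN / FETCH / CLOSE / DEALLOCATE as usual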
By default a cursor is created as DYNAMIC, OPTIMISTIC and FORWARD_ONLY.
OPTIMISTIC concurrency means that if someone else updates a row between the time the
cursor fetches it and the time you update it through the cursor, your update will fail. In
other words, the cursor is "optimistic" that the row it is about to update has not been
changed by another process. FORWARD_ONLY means you can only use FETCH NEXT
statements; FETCH PRIOR will result in the following error:
Server: Msg 16911, Level 16, State 1, Line 1
fetch: The fetch type prior cannot be used with forward only cursors.

Many DBAs do not test for @@FETCH_STATUS = -2, which leads to errors in logic.
Summary
Cursors are frequently misused, and SQL lint programs (like the ones presented in Linchi
Shea's book, Real World SQL Server Administration with Perl) flag them. Quest Software's
SQL Spotlight also looks for cursors in T-SQL code, as cursors are typically not the best
code solution and often degrade system performance. Make sure you know what your cursors
are doing, and choose the correct cursor options rather than relying on the defaults.
Ideally, you should use a STATIC cursor over a minimal result set slurped into tempdb with
appropriate indexes. Finally, consider how the data is consumed or manipulated. It is
often better to send the result set to the consuming application and have the iterative
operation performed there.
