Change How You Access Data
ADO.NET 2.0 lets you do data paging, make asynchronous database calls, leverage Multiple Active Result Sets, and more.
by Peter DeBetta

August 20, 2004

Technology Toolbox: ADO.NET 1.1 and 2.0, SQL Server 2005

Editor's Note: This article is excerpted from Chapter 6, "Client-Side ADO.NET," of Peter DeBetta's upcoming book, Introducing Microsoft SQL Server 2005 for Developers [Microsoft Press, ISBN: 073561962X]. It has been edited for length and format to fit the magazine. You can read a PDF of the full chapter here.

On initial glance, ADO.NET 2.0 looks much like its predecessor—ADO.NET 1.1. All the familiar objects are there; all the functionality you expect is still there. So what has changed? You might enjoy the built-in capability to do data paging, or perhaps Multiple Active Result Sets (MARS) appeals to you more, or maybe asynchronous database calls pique your interest—and those are just the features that are not specific to Microsoft SQL Server 2005 (code-named Yukon). ADO.NET 2.0 also includes features that are specific to SQL Server 2005—such as user-defined data types (UDTs), which are also called user-defined types, that are natively supported by the data access client.

ADO.NET 2.0 can really take your data access coding to a new level. In this article, I'll disclose information about the myriad of new features and changes to the existing features. You'll then want to further explore this latest version of data access technology.

A number of ADO.NET 2.0 features are related directly to SQL Server 2005. One new feature—server-side cursors through the SqlResultSet object—is meant to be used in the context of stored procedures and other database objects that are written in managed code. You could use SqlResultSet on the client, but you might pay a performance penalty for doing so without receiving any meaningful benefit. With few exceptions, server-side cursors belong on the server.

ADO.NET 2.0 also offers features designed specifically to work with SQL Server 2005. You can create a UDT in managed code and utilize it freely in your database. Suppose you did just that, creating a table that contains a column defined as a point UDT:

CREATE TABLE Location
(
        LocationID int Identity(1, 1) NOT NULL PRIMARY KEY,
        Description varchar(150) NOT NULL,
        Coordinate point NOT NULL
)

You can then create a stored procedure that selects a particular record from the table based on the LocationID field:

CREATE PROCEDURE prApp_Location_Select
   @LocationID int
AS
SELECT  LocationID, Description, Coordinate
FROM    Location
WHERE   LocationID = @LocationID
GO

This all seems innocent enough until you consider what happens when you call this stored procedure from the client. How can you use the Coordinate field on the client if the type definition exists in an assembly that is literally embedded in the database? As you're about to see, there are several answers to this question.

Access UDTs in SQL Server
If you're using the SqlClient to provide connectivity to SQL Server, you're in luck because it natively supports those UDTs you created in your SQL Server database. There are two methods for using these types: You can use an early-bound technique or an assembly in the client code. Either method can be a reasonable choice depending on the purpose and ultimate use of your software.

Perhaps you're creating a complete software solution for a client or for internal use using SQL Server and .NET. In situations such as these, when you or your group have control over all aspects of the development, you can easily use the same assembly code in and out of the database as needed. This early-bound technique is marvelous: It's easy to use, and it's cleaner from a coding standpoint.

To use this technique, you must use the same assembly in your data access code as you do in the database itself. This is simple enough to do because you had to create the assembly before registering, thus loading it into the database in the first place. You must reference that same assembly in your .NET client project. You then have the type at your disposal for use in your project.

An example of a UDT being used in client code shows a column (zero-based index value of 4) as the Email struct UDT (see Listing 1). You should address the column as a UDT by using the type itself. Almost any other attempt to address this UDT column will lead to either a compile or runtime failure.

Check out other attempts to read data from the UDT column (see Listing 2). The GetString method on line 8 of Listing 2 compiles without any problem, but it throws an InvalidCastException when it's executed because an Email is not a String. The second example on line 10 uses the Address property of the Email type directly against the column of the DataReader. This seems like it should work, but the compiler won't even compile the code because a DataReader column doesn't have any such property.

On the other hand, the last three lines not only compile, but also execute without a hitch. The example on line 14 simply casts the column to the UDT before trying to use the Address property. This is essentially the same code as that in Listing 1, but instead of creating the type ahead of time, the data is being cast to the type as needed. Lines 12 and 13, however, do no such cast to the Email type, yet they both work without any problems. The details of why is beyond the scope of this article, but it should suffice to say that it works because the ToString method is common to all objects, regardless of their type, including (and required for) all UDTs created for SQL Server 2005.

Accessing UDT Bytes
You can start with a couple of assumptions when it comes to accessing UDT bytes. First, assume the type is not available for use on the client. Second, assume that dynamically downloading the assembly that contains the UDT is not feasible, perhaps because it is part of a large assembly and you want to avoid the additional overhead of downloading such an assembly, or maybe security prevents the type from being downloaded. What do you do if you want to call a stored procedure that returns a resultset with a column that is in fact a UDT, and you don't have that type on the client?

Not to worry—you can access the UDT in the same way as the serialization methods of the UDT. More specifically, the column data is in the form of raw bytes, so you can read these raw bytes using the same technique as the Read method of the UDT itself (see Listing 3).

This code uses the same technique to read raw bytes and convert them into a string as the Read method of the Email type itself. The code varies in what it does with these values, but not in how it extracts them. A byte array is read from the raw bytes of the column that has the UDT. This is fed into a MemoryStream, which in turn is fed into a BinaryReader that calls its ReadByte and ReadString methods to evaluate the content of the UDT.

This technique is a viable solution, but it requires additional code management at the lines-of-code level (not at the assembly level), which many don't prefer. I suggest you use the early-bound technique and use the assembly in the client code. You still have to manage code in multiple locations, but at least it's a single unit of code—an assembly—that you can distribute to both server and client, as needed.

Caution: Using this technique of direct byte access requires that you know the method of serialization that the type is using. But because the implementation of the type's serialization can be changed in the database, your client code could all of a sudden be not only inaccurate, but also possibly dangerous, because any changes saved by this now rogue code could corrupt your data.

Page Data Results
Shortly after I devised a means of paging data in ASP.NET using data from SQL Server, I started digging into the new features of ADO.NET 2.0 and discovered that one of these new features was the capability to page data. As I learned about this nifty little feature, I realized that it will make obsolete many of the data-paging methods currently in use in the development community, including the data-paging methods I had devised. However, it will simplify development of Web-based applications that need to be able to page data results.

ExecutePageReader, a new method of the Command object, is not yet available to all variations of the Command object—including the OdbcCommand and OleDbCommand objects. As of this writing, it is available only to the SQL Server data access objects, specifically SqlCommand. Keep in mind, however, that this beta software will be enhanced before its final public release.

ADO.NET is doing something under the covers that I usually advise against when developing data-layer software: It opens a server-side cursor against the entire set of data, positions to the rows to fetch, fetches those rows, then closes that server-side cursor. Server-side cursors can affect your application adversely in several aspects, but it is used effectively in this case, and its implementation is hidden to prevent any misuse of the technology.

ExecutePageReader takes three parameters: One defines it behavior, one indicates the row position, and one tells it how many rows to fetch:

public SqlDataReader ExecutePageReader 
(CommandBehavior behavior, Int32 startRow , Int32 pageSize)

The first parameter, behavior, is one of the System.Data.CommandBehavior enumeration values. This enumeration is a part of ADO.NET 1.1 already and behaves in the same fashion in ADO.NET 2.0. The startRow parameter is a zero-based position that indicates where to start fetching data. No data is returned if this value exceeds the actual number of rows. The pageSize parameter tells the database how many rows should be fetched. If fewer rows are available, the remaining rows are returned.

A code sample shows a variation on a paging methodology published in asp.netPro magazine in June 2003 by Jeff Prosise (see Additional Resources) that you can accomplish currently in ADO.NET 1.1 (see Listing 4).

This method uses dynamically created T-SQL, a feature I often tout as a no-no, but the manner in which it is implemented protects it from SQL injection attacks because of the strong typing of the method parameters.

This method dynamically constructs a T-SQL statement that fetches a page of data from the requested table. The paging work is being done on the server, so it is often more efficient than other paging techniques. That was true until ADO.NET 2.0 came along. Now, the technique shown in Listing 4 will never be as efficient as this code sample, which uses the new ExecutePageReader method (see Listing 5).

It is true that both techniques can potentially access the entire set of data. However, the former technique must perform several sorts of the data, whereas ExecutePageReader sorts once and uses a forward-only, read-only server-side cursor to get to the data. It outperforms the more complex and bulky T-SQL statement of the former technique.

This is not necessarily the best way to implement a data-paging solution—other methodologies can be more efficient. Unfortunately, some of these techniques involve complex code and the need to make structural changes to the database design to get the job done right. ExecutePageReader, on the other hand, lets you create a quick and efficient paging solution while keeping the code simple—and without having to make any design changes to your database.

Make Asynchronous Calls
The concept of making asynchronous calls is not new to the world of application development, especially .NET development, but it is new to ADO.NET 2.0. Objects that support asynchronous execution offer several methods in which to implement this coding technique. I'll describe two of these methods: asynchronous polling and asynchronous callback.

The polling technique works by starting an asynchronous call and then testing to see whether the job is complete. If the job is not complete, it can do other tasks and then check again. You can accomplish this by using a loop statement that checks the completion state of the asynchronous job. This is comparable to a family car trip: A child keeps asking, "Are we there yet?" and watches a DVD or plays with a portable game system between each inquiry. Sound horrific? Jeffrey Richter thought polling wasted processor time that could otherwise be used by other processes.

Jeffrey and I tried to defend asynchronous polling as a solution and find a viable example that used this technique. However, for each example we conjured up, we realized it would be better implemented using callbacks. After a lengthy chat, we found that we couldn't find a good example of using polling because every polling example we thought of was better if it were written using callbacks. By the end of the discussion, I was in agreement with Jeffrey's first assessment: Polling is quite often the wrong technique to implement, and you should use it only if you cannot use callbacks to achieve the same results.

There are times when you will have no other choice but to use polling as a technique (client demands, lack of database support for this technique, and so on), but using callbacks instead of polling is almost always the better choice.

To continue the car trip analogy, the child makes one simple request at the beginning of the trip: "Let me know when we get there." You only have to tell your child when you actually arrive; in the meantime, the child keeps busy with DVDs or games. This is in essence how the callback technique works: You make a command request; when the command finishes, it notifies you in another method. Many factors can affect the decision to use callbacks (including the complexity of implementing a solution), but using asynchronous callbacks is the preferred technique when using asynchronous methodologies. When you decide that using callbacks is what you need, the sample code shows you how to do just that (see Listing 6).

A detailed explanation of the way callbacks work in .NET is beyond the scope of this article, but you should know that this technique can be useful when you need to have other completely different code segments running independently of the request for data. The classic example occurs when a client application needs to fetch data that might take time to retrieve and allows the user to perform other tasks while waiting for the command to finish its job on the database server.

The ability to execute commands asynchronously has many excellent benefits, but, as with other features I've discussed, it is only one type of solution. A prototype application, for example, is easier to implement by using synchronous calls. And although synchronous calls are certainly easier to implement in an application than asynchronous calls, a well-written application should use asynchronous calling techniques.

Access Multiple Resultsets
Multiple Active Result Sets (MARS) lets you have concurrent access to more than one resultset on the same connection. To think that this feature is similar to ADO.NET 1.1's current ability to get multiple resultsets using the NextResult method of a DataReader is selling MARS short. MARS isn't about retrieving sequential sets of results from a database server. It allows you to have multiple resultsets, each acting independently of the others as if they were all using separate connections, when in fact they're all using the same connection to get the job done.

For SQL Server 2005 beta 2, the connection string requires the additional setting async=true to use MARS. If this setting is not present, an InvalidOperationException exception is thrown when attempting to fetch data from the second SqlCommand.

It might not seem like such a big deal. However, consider a Web application with thousands or more users, each of whom needs to deal with simultaneous multiple sets of data that would normally require multiple connections. Then take into consideration the "cost" of a connection. It can make a major difference in the performance of the Web application if the simultaneous access of multiple resultsets can use a single connection.

Caution: Having access to multiple active sets of data on a single connection doesn't mean you can open multiple instances of a SqlDataReader using a single SqlCommand as the source. You can associate multiple SqlCommand objects with a single SqlConnection, but you must associate each SqlDataReader with a single SqlCommand.

You can also use this feature in conjunction with asynchronous commands. Imagine some process that needs to hook into the same SQL Server multiple times to process two distinct queries. This code demonstrates the concept—although the actual tables don't exist (see Listing 7). It starts two separate asynchronous commands on the same connection.

Both commands are executing on the same connection, so a negligible amount of time is needed to make the "second" connection. By synchronously executing the second command on the same connection as the first command that is executing asynchronously, you effectively get an execution time that is equivalent to the length of the query that takes more time to execute. Using some simplified accounting methods, if you assume that the first command requires one second to execute and the second requires a half second to execute, executing these two commands asynchronously on the same connection requires about one second because the second command executes within the timeframe of the first. The same process executed completely synchronously would take one and a half seconds. That's a potential saving of 33.33 percent over the equivalent synchronously executing code.

ADO.NET has certainly moved up to a higher plateau. The data access capabilities I've discussed in this article are a portion of the new features you'll see in version 2.0. These features will enable you to create more robust applications that will perform better as well.

You might think that you should be zealous in using all these new features. However, as far as real development projects are concerned, remember the old adage: Everything in moderation.

About the Author
Peter DeBetta teaches exclusively for Wintellect, in addition to consulting and developing enterprise-level software solutions for his own company, DeBetta Software.