Optimize Oracle Data Access
Use anonymous blocks and multiple parameter sets with the ODP.NET data provider to reduce round trips between client and database.
by Bob Beauchemin
November 2003 Issue
Technology Toolbox: VB.NET, C#, Oracle
You have two basic ways to improve data-access code performance in your database applications. The most straightforward way is to optimize the SQL statements that execute on the database server. A more subtle way, from a database client programming perspective, is to decrease the number of round trips from the client to the database server and do more database work in each round trip. I'll explain two techniques you can use to achieve this goal.
One way to decrease round trips is to use batches of statements known in Oracle as anonymous blocks (see Figure 1). All sets of PL/SQL statements are called blocks. You can catalog blocks to the database by name (for example, stored procedures or user-defined functions), or execute them directly from a client program without cataloging them. These directly executed blocks have no name in the database; that is, they're anonymous.
Another way to decrease round trips is to use parameterized queries or stored procedures with multiple sets of parameters (see Figure 2). Take this simple SQL statement:
SELECT * from emp
WHERE deptno = :deptno
AND sal > :sal;
You can execute the preceding statement with one instance of the parameters—for example, :deptno 10 and :sal 50000. However, you can also pass multiple instances of the parameters to the database—say, (10,50000), (20,40000), and (40,60000). Passing multiple parameter sets causes the same query to execute three times on the database server in a single round trip. Both the ODP.NET data provider and the DataDirect Technologies data provider for Oracle support multiple parameter sets; ODP.NET, DataDirect's provider, and Microsoft OracleClient support anonymous blocks. I'll show you how to use anonymous blocks and multiple parameter sets with ODP.NET version 1.0 in C#.
Suppose you want to add a new department and its employees in as few round trips as possible. This is easy to test because the Oracle test schema (named scott, after one of Oracle's first employees) is set up already. Another example of the same problem is adding an order and a number of line items. The "add 1-n items" problem pops up often in use cases for relational databases. You can solve this problem in two round trips by using multiple parameter sets to add the employees after you add the department. Suppose you have two packaged stored procedures: adddept and addemp (see Listing 1). You start using multiple parameter sets with ODP.NET by defining and adding the parameters to the collection as though you were using a "normal" OracleParameterCollection (see Listing 2).
The trick to using multiple parameter sets is to set the Parameter.Value for each parameter to a .NET array of the appropriate data type. Create an array for each parameter value instead of using a single parameter value for each parameter. The array's size is the number of employees you want to add:
Decimal[] empno = new Decimal[3];
String[] ename = new String[3];
// set the values into the arrays
Then, set each parameter value to the array rather than setting it to the single values:
insert_emp_cmd.Parameters[0].Value =
empno;
insert_emp_cmd.Parameters[1].Value =
ename;
Finally, you must set the OracleCommand's ArrayBindCount value to tell it how many parameter sets to use. Then, execute the command as usual:
insert_emp_cmd.ArrayBindCount = 3;
int rowcount =
insert_emp_cmd.ExecuteNonQuery();
The rowcount variable in the preceding code fragment returns the total number of affected rows (three) that the command inserts.
Use Scalar Parameters in PL/SQL
Note that although you pass an array as a parameter in ODP.NET client code, you're not using the VARRAY data type in Oracle. The implementation in ODP.NET of multiple parameter sets by using .NET arrays exists because ADO.NET doesn't define a standard way to use multiple parameter sets. The stored procedures in Listing 1 use "normal" scalar parameters, not VARRAYs.
Using multiple parameter sets accomplishes the operation in two round trips. You can optimize the operation a little differently by using a dynamically constructed anonymous block. You simply use string concatenation to construct it. You use CommandType.Text to pass an anonymous block into the database engine as a single string. You must bracket the anonymous block starts by BEGIN-END statements because it's really an ordinary PL/SQL block. You need to construct the block dynamically to solve the 1-n item problem, but you can hard-code PL/SQL blocks into programs:
// C# definition for an anonymous block
string some_block =
" BEGIN" +
" insert into samp values(:1, :2);" +
" insert into samp values(:3, :4);" +
" END;";
In the example, you start the anonymous block by inserting the department row because there's only one department:
static string ablock = " BEGIN " +
"payrollmaint.adddept(:1, :2, :3); ";
The string is static in this example because you use it from static methods. Note that you're taking advantage of the fact that ODP.NET supports positional parameters as well as named parameters. The parameter names don't match as in the first example, but they are simple monotonically increasing numbers.
You add the statements to add employees to the block one employee at a time. This approach doesn't use multiple parameter sets as in the first example, but it concatenates each command in turn to the long anonymous block. Using positional parameters is useful here because you execute the same stored procedure multiple times; you can't use the same parameter names in the collection more than once:
for (int i=0;i<3;i++)
{
ablock +=
String.Format(
"payrollmaint.addemp(:{0}, :{1},
:{2}, :{3}, :{4}, :{5},
:{6}, :{7}); ",
cmd.Parameters.Count + 1,
cmd.Parameters.Count + 2,
cmd.Parameters.Count + 3,
cmd.Parameters.Count + 4,
cmd.Parameters.Count + 5,
cmd.Parameters.Count + 6,
cmd.Parameters.Count + 7,
cmd.Parameters.Count + 8);
// code to add parameters
// code to set parameters
}
Note that you could optimize the code even further in this case by using a .NET Stringbuilder class rather than string concatenation. I'll leave that to you as an exercise. When you're finished adding command text to the anonymous block, you finish it off with the END statement and assign it to the CommandText property:
ablock += " END;";
cmd.CommandText = ablock;
int rowcount = cmd.ExecuteNonQuery();
Voilà—you've inserted one department and 1-n employees in a single database round trip. The code is a little ugly and difficult to maintain, so for the sake of code maintainability, you might consider spending the extra round trip that multiple parameters require.
Both examples I've presented use error-handling code in the client code rather than catching the exceptions in the stored procedures using PL/SQL's structured exception handling. The OracleException and OracleError classes in ODP.NET return the entire PL/SQL error message; OracleError also exposes the name of the stored procedure in which the error occurs. If you prefer more user-friendly messages, you can catch the error inside the stored procedure and reformat the text accordingly. Don't forget to rethrow the exception if you consume it inside the stored procedure.
Prevent Partial Failures
Both examples do multiple operations, so it's possible to produce partial-failure scenarios—for example, when one of the employees you add has an invalid MANAGER field value. This would prevent this employee from being inserted, but the rest of the department would be inserted correctly. You should wrap the entire statement or anonymous block inside a local transaction to prevent partial failures. The entire set of rows is rolled back if any statement in the block fails. You could use more sophisticated exception handling inside the stored procedure to determine exactly which row causes the failure.
The anonymous block uses dynamic SQL composed on the client to decrease the number of database round trips. When you combine dynamic SQL with user input, the possibility exists—especially in Web applications—that nefarious users will tack extra code onto the end of user input to use the dynamic SQL in ways you didn't intend. For example, suppose you use this anonymous block:
string ablock =
" BEGIN" +
" select * from emp where empno = ";
ablock += some_user_input;
ablock += " END";
If you take user input from a textbox and concatenate blindly whatever the user enters, a nefarious user could enter this command in the textbox:
8304;drop table emp;
If this data wound up as the some_user_input variable in the preceding code, the resulting dynamic SQL would list all employees named Smith and drop the table—not exactly what you intended.
A few simple precautions can prevent these kinds of problems. First, be sure to edit all user input before including it in dynamic SQL. In Oracle, semicolons cause problems with the block even if they're part of an innocent data entry. An easy way to prevent problems with dynamic SQL is always to use parameterized queries or stored procedures in it. Restricting edited user input to parameter values is also a good defense. Assume malicious users exist and include such input in your test suites. This problem won't occur at all if you use multiple parameter sets; the bad values are used only as parameters.
I've shown you two ways to optimize your client-side Oracle code for performance. You'll find it's a useful exercise in each case to use the Oracle trace facility to confirm that your carefully optimized client code is actually making a single database round trip. One round trip is assured with an anonymous block, but multiple parameter sets simply equate to multiple round trips with some databases and APIs. The TRC file in the code shows the exact trace option you need to use. You must read the raw trace file, but notice that four network round trips take place: one each to turn the trace on and off, one to add the department, and one to add multiple employees. Count the number of messages that read 'SQL*Net message to client' to confirm this.
The program sends a Transport Network Substrate (TNS)—the Oracle communications protocol encapsulated in the SQL*NET client libraries—message to the database, requesting that the database execute the same stored procedure three times with different parameters each time. This is a network optimization whose effectiveness depends on how long it usually takes for TNS messages to reach the server. If the client is co-located with the database, you'll have less of a gain than if the client is many miles away over a slow network connection. However, you can't lose in either case.
About the Author
Bob Beauchemin is an instructor, course author, and database curriculum course liaison for DevelopMentor (www.develop.com). Bob is the author of Essential ADO.NET (Addison-Wesley) and a speaker at VSLive! and other conferences. He has more than 25 years' experience as an architect, programmer, and administrator for data-centric distributed systems. Reach him at .
|