Search:
Locator+ Code:
FTP Home   Javapro Home   Archives   Customer Service   Site Map
 

Java in the
Database

Get the most from your database by extending it with Java

by Ken North

Java has increasingly become the tool of choice for extending database servers from Oracle, IBM, Sybase, and Informix. Since 1996 I've measured Java's database progress by polling developers attending my Java database seminars. In early 1996 many developers were asking how Java Database Connectivity (JDBC) compared with Open Database Connectivity (ODBC). Later that year developers were wondering if Java could handle porting a 10,000 seat client/server SQL application to an intranet architecture in 90 days. By the summer of 1997, more developers were writing applications and server-side code than applets. 1998 saw an increased interest in Java in the database. Throughout the entire period, 90-95 percent of those audiences were using one of the five best-selling DBMSs: Oracle, Microsoft SQL Server, Sybase, Informix, or IBM DB2.

Works With:
JDK 1.x, JDBC 1.x, Sybase or other Java-enabled database

the code
for this article, plus additional documentation for the code.

the code
for entire issue

Although my surveys weren't scientific, it appears that the first wave of Java database developers were primarily concerned with traditional types, transactions, and performance. The second wave were doing more server-side programming, but many were still using traditional SQL tables. I expect that the next wave will include Java developers using more complex types and multidimensional databases. This is because SQL vendors are moving to object-relational DBMSs that support database extensions (or plug-ins). I will discuss Java's role in extending databases, but to get started I'll provide some background information about extensible server architectures, active databases, and how we are using databases today.

OLTP, OLAP, OLCP-What's the difference?
Databases have a wide variety of uses and there are different requirements for online transaction processing (OLTP), online analytical processing (OLAP), and online complex processing (OLCP). Understanding these differences is important in selecting an appropriate architecture and DBMS, designing databases, and choosing an API. OLTP, OLCP, and OLAP databases each answer different questions. For example, to find orders for greater than 1,000 widgets you can use an OLTP query against an operational data store that contains current information. Amazon.com uses SQL queries against OLTP databases. OLTP programs place a premium on short-lived queries to support many concurrent database users. Transactions typically involve SQL tables containing rows of numbers and characters. If you want to find a picture containing a sunset image you can use an OLCP query against a database containing images. OLCP applications use databases that contain more complex types, such as audio, images, video, text, and spatial data. Finally, to get a list of the top-ten selling books in California for each month in the period 1995-1997, you can use an OLAP query against a multidimensional data source containing historical information. OLAP involves decision support queries against multidimensional data sets. OLAP programs typically sustain few users, doing long-running queries.

Java is a viable technology for OLTP, OLAP, or OLCP data access. However, the database design and data access APIs are often different for OLTP, OLAP, and OLCP. The JDBC API and SQL are well-established for OLTP, but developers creating decision support queries are likely to use an OLAP server with a different API. The major OLAP vendors endorse OLE DB for OLAP, and multidimensional expressions (MDX) that map to SQL. For OLCP, Java developers often use Java bindings for object DBMSs such as Gemstone and Versant. The OLCP picture is changing because SQL servers are becoming object-relational and JDBC 2.0 provides better support for object-relational databases.

DBMS Vendors Embrace Java  
Java's success has prompted Oracle, Microsoft, Sybase, Informix, and IBM to develop JDBC drivers for their databases. Oracle, Sybase, Informix, IBM, and Cloudscape are in a race to produce an SQL platform that best serves the needs of Java computing. However, SQL database vendors are not alone in their support for Java and JDBC. A variety of middleware vendors offer JDBC drivers and JDBC servers.

Click here.



SQL and Active Databases
To grasp the importance of the new Java capabilities, you need to understand some SQL DBMS concepts. SQL databases store logic and enforce rules about data. They are active databases because they can trigger events and block operations that compromise data integrity. Some SQL servers can be extended by programming and installing a dynamic link library (DLL), a shared library, or Java classes. The extensions are a part of the database, and they execute at the server. Java can also contribute to the database's role as a logic store and rule enforcer, and Java can add user-defined functions (UDFs), stored procedures, types, and behaviors to a database. Finally, Java extensions are accessible to any SQL client. The DBMS parses SQL statements and invokes Java classes, whether the client is Visual Basic, a middle-tier server, Netscape Communicator, or Internet Explorer.

Although many organizations are running legacy databases, SQL enjoys widespread support among Java database developers. Part of SQL's appeal is that it is an international standard that is updated every few years. Most SQL DBMSs conform to the SQL-92 standard (see for more information), but many vendors use proprietary SQL dialects for stored procedure programming. Oracle provides PL/SQL, Informix offers Stored Procedure Language (SPL), and Microsoft and Sybase support Transact-SQL (T-SQL) stored procedures. IBM DB2 Universal Database uses a different model, supporting external stored procedures written in programming languages such as Rexx, COBOL, C, and Java.

PL/SQL, T-SQL, and SPL are block-structured, procedural languages in their own right, but they aren't portable from one DBMS to another. On the other hand, Java represents the most portable programming solution available today. Java's portability is important to IBM, Informix, Oracle, and Sybase because they offer SQL servers for different operating systems. Processing statistical functions, string manipulation, and time-series analysis is more efficient when coded with a programming language like Java instead of extended SQL dialects.

Understanding the Architecture
There are several architectures used for Java programs that query a database with SQL. You can write an applet that downloads to the client when someone uses a Java-enabled browser. You can write Java programs, application and servlets, that do not execute in a browser context. Of course, before executing an application or servlet, you must pre-install the Java VM and runtime classes on the machine where your program will run.

To the database server, a client-side program, such as an applet or application, and a server-side program, such as a servlet. are all SQL clients. Some SQL servers are extensible with C/C++ and other languages, but Java is likely to become the standard for server-side programming because it's flexible and portable. In addition to applets, servlets, and middle-tier servers, Java developers can create classes to install in a database. Java-enabled browsers include the Java virtual machine and runtime classes. SQL servers that execute Java classes also include the Java VM and runtime. A database that embeds a VM is called a "Java-enabled" database. For example, a Sybase Java-enabled database includes JDBC classes, core classes for types, and other packages (see Table 1). Figure 1 illustrates a Sybase Central view of a Java-enabled database. The database manager views Java classes as schema objects just like tables, columns, views, triggers, and stored procedures.

 
Figure 1. The Java-Enabled Database Click here.
Embedding Java in Database Servers
Java-enabled SQL servers offer different capabilities, and not every vendor uses the same approach to embedding Java in the DBMS. Cloudscape is an embeddable DBMS that uses the Java VM of the host application or operating system. DB2 UDB uses different versions of the Java VM for different operating systems. Oracle and Sybase use their own Java VM across all operating system platforms. Sybase uses the same Java runtime and VM for both of its server products (Adaptive Server Anywhere and Adaptive Server Enterprise).

There are several benefits to using a VM and runtime classes optimized for use in a database server. Java in the database doesn't require classes, such as AWT, to support a GUI. A Java VM tailored to database server requirements should provide optimized garbage collection and thread management. Responsiveness for OLTP requires certain types of queries to be given priority for resources such as memory and threads. You also want the server to inhibit Java garbage collection when it is doing high-priority queries.

SQLJ and JDBC
SQLJ is a proposed standard for embedded SQL for Java. A SQLJ preprocessor reads a source file containing SQLJ code (.sqlj file) and emits a Java (.java) source file. SQLJ simplifies programming SQL databases and enables developers to use precompiled SQL. JDBC uses dynamic SQL, but SQLJ enables developers to use static SQL. SQLJ is important for developers writing Java stored procedures. Oracle included SQLJ with Oracle 8, IBM started shipping SQLJ with DB2 UDB version 5.2, and Informix and Sybase will release SQLJ in the first quarter of 1999.

JDBC consists of Java classes that abstract the process of writing SQL programs. JDBC provides connection objects, result set objects, metadata, and three types of statement objects. The first statement type supports ad hoc queries, the second supports prepared queries, and the third supports stored procedures. JDBC also includes classes that wrap SQL data types such as timestamps. (For more information about JDBC, consult JDBC Database Access with Java, Graham Hamilton, et al., Addison Wesley, 1997.) When Rick Cattell, Graham Hamilton, and Mark Hapner of JavaSoft specified JDBC 1.x, they were consistent with ODBC and SQL-92. However, SQL DBMSs are evolving into object-relational DBMSs, and JDBC 2.0 introduces SQL3 features that support the new object-relational databases.

Why Java?  
Here are a number of reasons why Java is an important tool for extending SQL databases:

Click here.



JDBC Drivers and Connections
A server-side JDBC program appears similar to client-side code, but there are differences. For client programming, JDBC developers use several types of drivers: the JDBC-ODBC bridge (type 1), a driver that maps JDBC to a native API (type 2), a pure Java driver for a middleware server that supports JDBC clients (type 3), and a pure Java driver that connects to a database server (type 4). Type 1 and 2 drivers use native libraries on the client. Type 3 and 4 drivers download to the client and do not use native libraries on the client. Middleware vendors offer type 3 drivers that support connections to disparate DBMSs. The major DBMS vendors offer type 4 drivers, and some offer more than one type. To connect to DB2, for example, an applet can load IBM's type 4 driver (COM.ibm.db2.jdbc.net .DB2Driver). An application can load COM.ibm.db2.jdbc.app .DB2Driver, a type 2 driver that uses the Client Application Enabler. The client uses a different connection string for applets and applications.

For a Java-enabled database, the JDBC driver architecture is different. A server-side program connects to a JDBC driver installed with the Java infrastructure on the database server. If you write a class for Sybase Adaptive Server, your code uses an internal JDBC driver instead of Sybase jConnect. Clients and server-side programs use different connection strings with JDBC's getConnection() method. An applet connects using the database URL and passes arguments such as userid, password, host name, and port number. This applet code connects to a DB2 database, where the url, userid, and password strings have been preset:

Connection con = getConnection(url, 
	userid, password);

A JDBC program that installs in the database uses a null connection string:

Connection con = getConnection();

User-Defined Functions
A SQL DBMS provides built-in or intrinsic functions for performing operations such as calculations, conversions, aggregation, and string processing. Many SQL DBMSs also support user-defined functions (UDF) in SQL statements. In some instances you create the UDF with SQL statements, and in others you install a library or Java class that implements the UDF. If you have an employee table that stores annual salary for some employees and hourly pay for others, and you want to execute a query that uses weekly salary as a filter, you can create a UDF that returns weekly salary as a function of classification (salaried or hourly). The query might look like this code, where weekly is the name of the UDF:

SELECT surname, first_name FROM employee 
WHERE ((weekly(salary, classification)) > 
	400.);

For a DBMS, such as DB2 or Adaptive Server, to recognize the UDF you must register it using a SQL CREATE FUNCTION statement. When you execute a CREATE FUNCTION the database manager includes the function in the database catalog (metadata). IBM DB2 enables you to use different programming languages, including Java, to create UDFs. When you register a function with CREATE FUNCTION you can specify execution options such as parameter-calling conventions and whether the function executes in the server address space.

Stored Procedures
A stored procedure is part of the logic of your application that executes on a database server. Depending on the DBMS, you code procedures in a SQL dialect or a programming language. Although the International Standards Organization (ISO) released a standard for procedures (or persistent stored modules) in 1996, database companies coalesced around Java as a stored procedure language.

Stored procedures use distributed processing. When a SQL client invokes a stored procedure, control is transferred to the server. The server executes the procedure and optionally returns data to the client. Procedures can return output parameters, and sometimes query result sets. Stored procedures minimize network round-trips between the client and server and they improve performance by decreasing network latency. Database administrators also use procedures to provide security. For example, you can give a user the right to execute a procedure that processes an order without giving the user rights to view or change order tables.

When you program procedures using Java, you must register them with the DBMS using the CREATE PROCEDURE statement. The syntax varies from one DBMS to another, but CREATE PROCEDURE typically identifies the type and size of input (IN) parameters, input/output (INOUT) parameters, and output (OUT) parameters. The syntax for executing a procedure varies, but there is a standard syntax you can use with JDBC drivers, ODBC drivers, and OLE DB providers. For example, to return a list of tennis players with a rating of 50, you could use this statement, where 50 is an input parameter:

{CALL bycatg(50)}

Sybase Adaptive Server Anywhere (ASA) lets you use stored procedure calls to the main method of classes installed in Java-enabled database. Therefore, you could execute this Sybase query to invoke the main method of a ClearTotals class:

CALL ClearTotals.main()

Example Class
Listing 1 is an example Java class (SrchCont) that installs in a Sybase ASA database. SrchCont searches the contact table of the example database (asademo.db) that installs with ASA 6.0. After you compile SrchCont and install it in asademo, you can use it to do a phonetic search of the city column by passing SrchCont a "sounds-like" value for a city name. It does a SOUNDEX search of the cities in the contact table, and returns the last name and first name of the person residing in that city. To install SrchCont.class in the asademo database, use a command such as this:

 
Figure 2. Sounds Like "Houston" Click here.

INSTALL JAVA NEW FROM FILE 
	'D:\\Examples\\Java\\SrchCont.class'
	;

Be sure to change the path information to match your path. After installing SrchCont, you can run SELECT queries. For example, use these statements to find someone living in a city that sounds like Houston:

CREATE VARIABLE jcity SrchCont;
SET jcity = NEW SrchCont('Houston');
SELECT jcity >>getName();

When you execute these statements, your results should look like Figure 2.

The Crystal Ball
Industry pundits seem to think that Java is good for nothing more than creating applets. However, problems such as download delays are irrelevant when it comes to server-side Java and Java in the database. Major database vendors are committed to Java as a solution for server extensibility. Even VM performance issues will not cause them to change course, as shown by IBM's native compilation of Java stored procedures on its OS/390 mainframes. If a peek into the crystal ball shows database projects in your future, then it's time to add Java to your skill set.



Ken North consults, writes, and teaches Expert Series seminars. He is the author of Windows Multi-DBMS Programming, John Wiley & Sons, 1995 and Database Magic with Ken North, Prentice Hall PTR, 1998. Reach him at or visit his Web site at .





What did you think of this article? Send your e-mail to the editors at .

© 1999 Fawcette Technical Publications.   Contact us at


| | | | |
| | | | | |