Conquer Class Loader Confusion
The ClassLoader solution is a one-time cost that provides a way out of class versioning conflicts
by Daniel F. Savarese
February 24, 2005
Of late, I've been hearing many complaints from colleagues and acquaintances about running into software versioning conflicts with J2EE application servers. The basic problem has existed for quite some time, but it would seem that there are now enough Java libraries shared between applications and application servers to exacerbate the problem. Versioning conflicts occur when an application server uses version A of a Java package, an application to be hosted by the server uses version B of the same package, and versions A and B are incompatible. When the application tries to use the package, the classes from version A are loaded instead of the ones from version B. If the classes behave differently, problems ensue.
This situation is common, in part, because so many application servers rely on open source software to some extent. The release cycle of commercial software is typically not as fast as that of open source software. Therefore, newly released application servers often do not include the latest versions of libraries. In addition, enterprise software upgrade cycles lag behind vendor release cycles, sometimes skipping releases for the sake of maintaining stability. Consequently, developers using the latest and greatest Java libraries run headlong into J2EE servers that incorporate older versions of the libraries.
The problem also emerges in the other direction. An enterprise application that has been around for several years may encounter an incompatibility when the application server is upgraded. If the program depends on an older version of a library packaged with the application server, the new library may cause run-time failures such as NoSuchMethodError when the program tries to access an API element that no longer exists. (Code that looks up the element reflectively sees NoSuchMethodException instead.)
Much of the time, developers don't notice that a different version of a library—here I consider a library to be a collection of Java packages—is being used because they aren't using parts of the API that have changed or aren't triggering bugs that may have been fixed in a later version. When the problem arises, developers have to figure out a way around it. Replacing the application server libraries can break the application server or other applications it hosts. If the developer doesn't have administrative control over the application server, that solution is not even possible.
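Before choosing a workaround, it helps to confirm which copy of a class the JVM actually loaded. The following is a minimal diagnostic sketch, not part of the original listings. The class name WhichVersion and its helper methods are hypothetical, and the sketch assumes no security manager blocks access to the protection domain.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; the names here are illustrative only.
class WhichVersion {

    /** Returns the location (JAR or directory) a class was loaded from. */
    static String origin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes supplied by the JVM itself typically report no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the loaded class exposes the named public method. */
    static boolean hasMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            // The reflective lookup fails cleanly instead of raising
            // NoSuchMethodError later, at the call site.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("WhichVersion loaded from: " + origin(WhichVersion.class));
        System.out.println("String has length(): " + hasMethod(String.class, "length"));
    }
}
```

Calling origin on a library class from inside a Web application shows whether the server's copy or the application's copy won, and hasMethod can guard a call to an API element that exists in only one of the versions.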
Share and Share Alike
The crux of the problem is that most J2EE vendors have designed their products with a class-loading hierarchy that ultimately delegates to the application server's class loader. Even though separate class loaders are designated for each Web application—keeping Web applications from interfering with each other—the libraries used by the server itself are shared by all Web applications. I would speculate that there must be some products that don't have this problem, but I've heard reports of it surfacing on most of the major closed and open source products.
One approach programmers use to work around versioning conflicts is to change the names of the packages in the library source code to unique names used only by their code, recompile the library, and update all import statements and other references to package names in their source code. This solution works only if you have access to the library source code, which is not always the case. Although source code modification and recompilation is a reliable, short-term solution, it is time consuming in the long run. Writing shell scripts or using an IDE plug-in can automate package renaming.
Searching for and replacing every occurrence of a package name within source code will catch uses outside of package declarations and import statements, such as fully qualified class names and names embedded in strings used for reflection. Still, configuration and other support files may contain names, and some names used in reflection may be generated dynamically. You will always have to examine the code carefully to rename all instances of a name. More importantly, when you upgrade to a new version, you have to go through the whole process all over again. Patch files based on the previous set of modifications won't rename all package name occurrences in the new code, and shell scripts may require updating to account for code changes.
In the end, source code modification becomes a maintenance problem. You are now committing man-hours to maintain what is in effect a separate branch of a source code tree you would not normally maintain.
The two principal drawbacks of source code modification are that you need access to the source code, and maintaining the source code can require considerable effort. Maintaining a separate branch of one library may not seem overly difficult, but consider that some people working through this problem have to resolve conflicts with multiple libraries.
An alternative to source code package renaming is to rewrite binary class files. Rewriting class files has the advantages of not requiring you to maintain a separate source code branch and not requiring the source code. All you need are JAR files. There are few tools dedicated to renaming packages in JAR files, but most code obfuscation tools have the ability to rename packages and classes contained in a JAR file. Taking this approach makes upgrading libraries easy. All you do is process the JAR files with the rewriting tool, and you're done.
Just Do It
As effective as class file rewriting may seem, it is in fact an imperfect solution. The technique won't work when the library uses reflection and package names are embedded in strings or configuration files. Also, it does not relieve you from putting in the up-front effort of changing the use of package names in your application source code.
A more comprehensive solution is to do what the application server vendor should have done in the first place, and segregate your copy of the library from the server's copy of the library by using a custom class loader. To do this, you have to write some extra code, but you do not have to change the way your existing source code uses package names. Library upgrades are easy because you just replace the old JAR files with the new ones. How does this work?
The source of versioning conflicts is rooted in the application server's class-loading design. The Web application class loader delegates class loading to its parent class loader before trying to locate a class by itself. Therefore, if the application server's class loader can find the class in a system location, it will load that version instead of the one you packaged with your Web application. If you bootstrap your application with your own parentless class loader, you can bypass the libraries used by the application server.
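The difference between a delegating loader and a parentless one can be observed without an application server. The sketch below is an illustration under stated assumptions rather than one of the article's listings: the class name DelegationDemo is hypothetical, and both loaders are given an empty search path so that only the delegation behavior is visible.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; it is not part of the article's listings.
class DelegationDemo {

    /** True if a child loader with a parent hands back the parent's copy of a class. */
    static boolean parentWins() {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            // The child asks its parent first, so it returns the class object
            // that the parent has already defined.
            return child.loadClass(DelegationDemo.class.getName()) == DelegationDemo.class;
        } catch (ClassNotFoundException | IOException e) {
            return false;
        }
    }

    /** True if a loader with a null parent cannot see application classes at all. */
    static boolean parentlessIsIsolated() {
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            // Only the bootstrap loader and the (empty) URL list are searched.
            isolated.loadClass(DelegationDemo.class.getName());
            return false;
        } catch (ClassNotFoundException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("parent wins: " + parentWins());
        System.out.println("parentless loader is isolated: " + parentlessIsIsolated());
    }
}
```

In a real deployment the parentless loader would be given the URLs of your own JAR files, so the classes it cannot inherit from the server are found in your copy of the library instead.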
As an example of this technique I have defined an interface called Printer and an implementing class called VersionPrinter that represent an application (see Listing 1). VersionPrinter depends on the Version class, but needs the specific 5.0.0 version. However, the application server uses version 1.0.2. Therefore, when VersionPrinter.print is called (see Listing 2), the string "version: 1.0.2" is printed.
You can bypass the application server libraries by defining a class loader that looks in your library directory first, before looking in the server library directory. I've simulated this arrangement by placing the two Version classes in two different build directories and putting the directory that contains version 5.0.0 first in the class-loading path (see Listing 2). Then, I created a URLClassLoader instance initialized with the custom path and a null parent. The null parent ensures that class loading is not delegated to the application server's class loaders; only the JVM's bootstrap loader, which supplies the core java.* classes, is still consulted. Next, I load the class and use a dynamic proxy to map it to a known interface. When you run the sample program, the direct call to VersionPrinter.print will print "version: 1.0.2" and the dynamic proxy call will print "version: 5.0.0", showing that the desired version of the class was used instead of the default version.
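The dynamic proxy is needed because a class loaded by the parentless loader implements that loader's own copy of Printer, which the calling code cannot cast to its Printer. The following sketch shows one way such a bridge might look; it is not Listing 2. To stay runnable without separate build directories, it stands in for the isolated class with an unrelated type, IsolatedVersionPrinter, that merely has a matching method. The names ProxyBridgeDemo, IsolatedVersionPrinter, and bridge are hypothetical, and print returns its string here (rather than printing it) purely to make the result easy to check.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch of a proxy bridge; not the article's Listing 2.
class ProxyBridgeDemo {

    /** The interface known to the calling code (assumed shape). */
    public interface Printer {
        String print();
    }

    /** Stand-in for a class defined by another loader: same method, unrelated type. */
    public static class IsolatedVersionPrinter {
        public String print() {
            return "version: 5.0.0";
        }
    }

    /** Wraps target in a proxy that implements iface by forwarding calls by name. */
    static <T> T bridge(Class<T> iface, Object target) {
        InvocationHandler handler = (self, method, args) -> {
            // Look up the method with the same name and parameter types on the target.
            Method match = target.getClass()
                    .getMethod(method.getName(), method.getParameterTypes());
            try {
                return match.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the target's own exception
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        Object loaded = new IsolatedVersionPrinter();
        System.out.println("directly castable: " + (loaded instanceof Printer));
        Printer printer = bridge(Printer.class, loaded);
        System.out.println(printer.print());
    }
}
```

With a real parentless loader, the target would come from something like loader.loadClass("VersionPrinter") followed by reflective instantiation, and the isolated class would need to be public so the reflective call is permitted across loaders. Parameter types that exist in both loaders would also need the same treatment, which is one reason to keep the bridged interface small.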
Using the technique from the example, you do not have to change your application code at all. Sometimes programmers will custom load specific classes like Version, but you don't want to do that. If you were to do that, you would have to change VersionPrinter. Every conflicting class would have to be accessed with reflection. That makes for messy code. Instead, you want to establish an application entry point defined by an interface (for example, Printer) and custom load that application. Then the custom class loader will load all further classes used by the application.
One-Time Purchase
It is possible to implement a wrapper servlet that can be configured with a custom class path and a delegate servlet. The wrapper servlet will use the custom class path to load the delegate servlet and delegate all calls to the delegate servlet. Unfortunately, some servlet methods in some application servers require access to resources loaded by the application server class loader. Therefore, the wrapper servlet technique may not work in all cases. Still, you can use the technique by implementing a class loader that selectively delegates class loading to a parent. The class loader can be configured to not delegate to the parent when loading classes from specific packages.
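A selectively delegating class loader can be sketched as a URLClassLoader subclass that skips the parent for configured package prefixes. This is an illustrative outline under assumptions, not code from the article: the class name SelectiveClassLoader, the prefix-matching rule, and the canLoad helper are all hypothetical, and the package name used in main is a placeholder.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: parent-first for most classes, self-only for chosen packages.
class SelectiveClassLoader extends URLClassLoader {
    private final List<String> isolatedPrefixes;

    SelectiveClassLoader(URL[] urls, ClassLoader parent, String... isolatedPrefixes) {
        super(urls, parent);
        this.isolatedPrefixes = Arrays.asList(isolatedPrefixes.clone());
    }

    /** True if the class name falls under a prefix that must not be delegated. */
    boolean isIsolated(String className) {
        for (String prefix : isolatedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!isIsolated(name)) {
            return super.loadClass(name, resolve); // usual parent-first delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                loaded = findClass(name); // search our own URLs only, never the parent
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Convenience check used for demonstration. */
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = SelectiveClassLoader.class.getClassLoader();
        // "com.example.lib." is a placeholder for the conflicting library's package.
        try (SelectiveClassLoader loader =
                new SelectiveClassLoader(new URL[0], parent, "com.example.lib.")) {
            System.out.println("servlet-style classes still delegate: "
                    + canLoad(loader, "java.lang.String"));
            System.out.println("isolated package found: "
                    + canLoad(loader, "com.example.lib.Version"));
        }
    }
}
```

Because everything outside the listed prefixes still goes to the parent, the servlet API and other server-provided resources remain visible, while the conflicting packages are resolved only from the URLs you supply.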
The extra effort required by the class loader solution is a disadvantage, but it is a one-time cost. The use of a dynamic proxy should not degrade performance as long as you use it as close to the main application entry point as possible, which will minimize the number of reflective method calls. The loaded classes will consume extra memory, but that will always be the price for using two different versions of the same classes inside of the same JVM. Recent versions of some J2EE servers may provide their own solutions to versioning conflicts. I recall at least one server that renames the packages it uses so that you will not have to do so. However, should you be faced with class versioning conflicts, you now have a way out of the quandary.
About the Author
Daniel F. Savarese is an independent software developer and technology advisor. He has been the founder of ORO Inc., a senior scientist at Caltech's Center for Advanced Computing Research, and vice president of software development at WebOS. Daniel is the original author of the Jakarta ORO text processing packages and the Jakarta Commons Net network protocol library. He is also a coauthor of How to Build a Beowulf (MIT Press, 1999). Contact Daniel at www.savarese.org/contact.html.