
A ZipClassLoader for automated application distribution
Learn how to use ClassLoaders and exploit the zip library

By Kevin Pauli, JavaWorld.com, 04/21/00

Some developers consider application maintenance an afterthought. Naturally, you tend
to focus on the task at hand, which is, first and foremost, to make your program work!
Sure, there might be a few minor bugs here and there, but you can always fix those in the
next maintenance release, right? And no, you didn't have time to include every bell and
whistle your users requested, but you can include those features in a subsequent release.
When you start thinking along these lines, your project needs a maintenance plan, and the
sooner the better.

For a recent Java project at IBM, my team was forced to consider application
maintenance from the start. The situation had an interesting set of characteristics: First,
the users were mobile and needed disconnected access to the application. Second, the
users were dispersed worldwide. Third, the application was expected to evolve quickly.
And fourth, the department had limited support resources to help users install upgrades.

Because the users were mobile, we needed to install the application locally on their
laptops, rather than having them retrieve the latest bytecodes from a server. Because the
users were worldwide and frequent updates were expected (not to mention that we had a
small support infrastructure with a limited travel budget), we needed a reliable means of
remotely updating the application. We wanted something more seamless than an email
with an attached jar file and installation instructions, or an executable packaged with
InstallShield. The solution we implemented, which solved our problem nicely, was to
store the application bytecodes in the same database the users replicate for the application
data. We installed on everyone's workstation one jar we termed the bootstrapper. This
unchanging piece of code locates the local database, loads the application-specific
bytecodes, and launches the application. When the developers add a new report or fix a
bug, we recompile the bytecodes and replace them in the database on the server. The next
time users replicate their local database with the one on the server, they receive the new
bytecodes along with the other new application data.

What makes this dynamic approach possible is the Java ClassLoader. A ClassLoader is
the part of the JVM responsible for finding classes as your application instantiates them.
Thanks to the foresight of the Java creators, you can replace the default class loader with
one of your own, which loads classes in just about any way you can imagine. Any array
of bytes can be interpreted as a class that your program may instantiate at runtime. The
bytes may come from files in the local machine or from a server halfway around the
world, delivered via a network connection. In our case, we put the bytes into a Lotus
Notes database. For convenience, and to minimize the amount of replicated data, we
packaged all application-specific files into a compressed zip file. The java.util.zip
package lets users read and write files in the zip format.

Meet the players

In this article, I show you how we designed and developed the infrastructure for our
application deployment. The components should be reusable for your own applications.
Here is a brief preview of each component, which I then describe in more detail:

• The BootStrapper: The class com.paulitech.bootstrap.BootStrapper is the
main class we installed on the users' workstations to let them receive the dynamic
application updates. It takes, as command-line arguments, the location of the
database and the class to instantiate.

• The Bridge: Our application at IBM happens to use Lotus Notes as the database,
but yours may use something else. To decouple the choice of database from the
rest of the application, I've used a common design pattern known as a bridge (as
documented in Design Patterns, by Erich Gamma, et al., see Resources). All
access to the database is via the abstract interfaces defined in
com.paulitech.bridge. The Lotus Notes-specific implementation of the bridge
interface is in com.paulitech.bridge.notes. If your database is something
other than Lotus Notes, you will need to create your own bridge implementation.
The interface is rather small and simple, and this task should not take long for an
experienced JDBC programmer.

• The ZipClassLoader: This is our custom ClassLoader, located in
com.paulitech.classloader.ZipClassLoader, which extends
java.lang.ClassLoader, the abstract base class for class loaders. Its constructor is passed an
array of bytes, retrieved from the database, that represent the zip file. When you
need a class, you ask the ZipClassLoader for the class by name. The
ZipClassLoader handles all the ugly details of retrieving the proper array of
bytes out of its zip file. To the outside world, the ZipClassLoader acts just like
the ClassLoader that it extends. This essential component is delivered along with
the BootStrapper to the users.

• The FileInstaller: Of course, you need a tool to actually put the zip file into
the database so that it may be replicated.
com.paulitech.classloader.BridgeFileInstaller is an abstract class that
talks to the Bridge database interface described above and writes records to it.
The com.paulitech.classloader.notes.NotesFileInstaller is a concrete
implementation of the BridgeFileInstaller, which, naturally, uses a
NotesBridge. If you are using something other than Lotus Notes for the database,
you should create your own installer tool implementation, subclassing
BridgeFileInstaller, as I have done for the Notes case.

Grab on to your bootstraps

Below is a class diagram depicting what happens on the user's workstation. There's quite
a bit of code, so rather than go through it line by line I will describe the algorithms with
prose. The full source is available in the sample file (see Resources for a download). At
this point, it would be a good idea to follow along with the source code.

Figure 1. Class diagram depicting the BootStrapper and related classes

A batch file (bootstrap.bat, in the example) kicks off the main() routine in the
BootStrapper class, passing it the database location, the key that identifies the particular
record that contains the bytes of interest, the name of the class you wish to instantiate as
the main class, and the prefixes of any application-specific classes (separated by
commas). For example, if you work for Widgets USA, all your application-specific
classes start with com.widgetsusa, and you've included some custom libraries written by
a business partner named Gunkle Media, whose classes are all under com.gunklemedia,
this fourth parameter to the BootStrapper should be
com.widgetsusa,com.gunklemedia. The reason this last parameter is necessary is
somewhat obscure, and it is necessitated by something that took me a great deal of hair-
pulling to figure out. I'll explain more when I discuss custom class loaders.
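
To make the argument handling concrete, here is a minimal sketch of how the four parameters might be unpacked. The class BootArgs, its field names, and the usage message are hypothetical and do not come from the sample source; only the order and meaning of the arguments follow the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the four bootstrap arguments (not from the sample source).
final class BootArgs {
    final String databaseLocation;
    final String recordKey;
    final String mainClassName;
    final List<String> appPrefixes;

    private BootArgs(String databaseLocation, String recordKey,
                     String mainClassName, List<String> appPrefixes) {
        this.databaseLocation = databaseLocation;
        this.recordKey = recordKey;
        this.mainClassName = mainClassName;
        this.appPrefixes = appPrefixes;
    }

    // Expects: database location, record key, main class, comma-separated prefixes.
    static BootArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "usage: <database> <key> <mainClass> <prefix[,prefix...]>");
        }
        List<String> prefixes = new ArrayList<>();
        for (String part : args[3].split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                prefixes.add(trimmed);
            }
        }
        return new BootArgs(args[0], args[1], args[2], prefixes);
    }

    // True when the class name starts with one of the application-specific prefixes.
    boolean isApplicationClass(String className) {
        for (String prefix : appPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With the Widgets USA example, parsing the fourth argument com.widgetsusa,com.gunklemedia yields two prefixes, and a name such as java.lang.String matches neither.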

Bridge over troubled data

Armed with the parameters, the BootStrapper can go to work. First, it creates a
NotesBridgeFactory, which is a concrete implementation of an abstract
BridgeFactory class. The BridgeFactory, as you might guess from the name, is
responsible for generating a Unified Field Theorem. Just kidding! As the name suggests,
its sole purpose is to create Bridge objects. What is a Bridge object? A Bridge provides
an abstract mechanism for getting data in and out of your system. The core system does
not actually care how the data is transferred, just that it does. The data could be going in
and out of a database, could be sent and retrieved via a message queuing system, or could
be transcribed by human operators onto scraps of pigeon-delivered paper. The point is,
your core application talks to something that adheres to the Bridge interface and doesn't
worry about the details. In this particular case, the data is ultimately stored in a local
Lotus Notes database that the users replicate periodically with a server. The vagaries and
peculiarities of the Notes API (and there are many, trust me!) are hidden behind the
abstract interface. This allows you to change the data transport at a later time with
minimal impact on your existing code.
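
A rough sketch of what such an abstraction could look like follows. The method names getPayload(), setPayload(), complete(), and dispose() are the ones this article mentions; the exact signatures, the getBridge(String) lookup method, and the InMemoryBridgeFactory stand-in are my assumptions for illustration and may differ from the sample source.

```java
import java.util.HashMap;
import java.util.Map;

// One record in the underlying store. Signatures are assumed, not copied from the sample.
interface Bridge {
    String getPayload();
    void setPayload(String payload);
    void complete(); // commit pending changes
}

// Creates Bridge objects for a given key; getBridge is a hypothetical name.
abstract class BridgeFactory {
    abstract Bridge getBridge(String key);
    abstract void dispose(); // release connections and native resources
}

// Hypothetical stand-in that keeps records in a map instead of a Notes database.
final class InMemoryBridgeFactory extends BridgeFactory {
    private final Map<String, String> records = new HashMap<>();

    @Override
    Bridge getBridge(final String key) {
        return new Bridge() {
            private String pending = records.get(key);

            public String getPayload() { return pending; }
            public void setPayload(String payload) { pending = payload; }
            public void complete() { records.put(key, pending); }
        };
    }

    @Override
    void dispose() {
        records.clear();
    }
}
```

Because the core code only sees Bridge and BridgeFactory, swapping the in-memory stand-in for a Notes-backed or JDBC-backed factory would not touch the callers.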

The BootStrapper can ask the BridgeFactory for an instance of a Bridge that
corresponds to the key passed in. In this example, the file you're interested in is
exampleapp.zip. The BridgeFactory (actually implemented as a
NotesBridgeFactory) goes off to the Notes database and attempts to find a document
(NotesSpeak for record) that matches that key. When it finds that record, it creates a
Bridge (actually implemented as a NotesBridge at runtime) corresponding to that record
and returns it. The BootStrapper, using the getPayload() method of the Bridge
interface, grabs the string representing the contents of the file. Notice I said "string," not
"array of bytes." In a perfect world, Notes would handle raw streams of bytes better, but
since Notes was designed as an unstructured document database, it does not support pure
binary data. You can turn arbitrary arrays of bytes into strings and back again via old-
fashioned base-64 encoding. I was able to steal -- ahem, reuse -- some nice base-64
encoding classes from org.w3c.tools.codec.
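
The round trip between bytes and text can be sketched as follows. Note the substitution: this sketch uses java.util.Base64, which ships with Java 8 and later, rather than the org.w3c.tools.codec classes the sample relies on. The PayloadCodec class name is hypothetical.

```java
import java.util.Base64;

// Hypothetical helper: converts between raw zip bytes and the string stored in the record.
final class PayloadCodec {
    private PayloadCodec() {}

    // Bytes to text, suitable for a store that cannot hold binary data.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Text back to the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

Base-64 encoding grows the data by roughly a third, which is one more reason to compress the class files into a zip before storing them.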

Zipping right along ... with a load of class!

So, after the BootStrapper gets the string from the Bridge and decodes it, you have an
array of bytes that represent a zip file. Now what? How do you retrieve the classes and
instantiate them? The Java runtime environment (JRE) loads new classes and instantiates
them via a ClassLoader. When you start the JRE and load your initial class with the
main() method, it creates a ClassLoader known as the primordial class loader, which
loads your class. Any subsequent classes referenced by that class will use the same
ClassLoader that loaded it. This primordial ClassLoader searches for class files in
directories and archives that are specified in your classpath. If you want to load classes in
another way, you must subclass ClassLoader and do it yourself. This is not as hard as it
sounds, as you'll see. The only requirement is that your subclass implement the
findClass() method. This method must acquire the bytes that represent the class, then
call defineClass() in the superclass (ClassLoader), which will parse the bytes to make
sure they represent a valid class and return the newly created class.
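
As a minimal illustration of that contract, the sketch below serves class bytes from an in-memory map. It is not the ZipClassLoader from the sample; ByteMapClassLoader is a hypothetical name, and the map of class names to bytes is assumed to be filled by the caller. The findClass() hook used here exists in Java 1.2 and later.

```java
import java.util.Map;

// Hypothetical loader: looks up class-file bytes by fully qualified class name.
class ByteMapClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes;

    ByteMapClassLoader(Map<String, byte[]> classBytes, ClassLoader parent) {
        super(parent);
        this.classBytes = classBytes;
    }

    // Called by loadClass() only after the parent loader has failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // defineClass() validates the bytes and turns them into a Class object.
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Because the inherited loadClass() asks the parent first, a loader written this way never overrides a class that the parent can already find.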

In this case, you don't want to get the class files from the file system, but rather from a
zip file that happens to exist in memory as an array of bytes. Therefore, your custom class
loader will be called ZipClassLoader.

The custom ZipClassLoader I have written maintains a cache of the classes it has
loaded, so it can quickly return classes that have been used previously. If the class you
want is not in the cache, you start to look for it, as I'll explain. First, you must make sure
the class is an application-specific class (com.paulitech.examples.ExampleApp, for
instance) and not a system class (such as java.lang.String). If the class is application-
specific, you need to get it directly from the zip file, and not the primordial class loader.
You must get it from the zip file because the JRE can cache, on disk, certain nonsystem
classes indefinitely (presumably for performance reasons). This means, even after a
reboot, the primordial class loader, if asked, will return the old version of the class, as if it
were a system class! I discovered this the hard way (luckily, during testing), when I
attempted to deploy changes and found that sometimes the application continued to
exhibit its old behavior no matter what I did (including a reboot). Then, suddenly, the
cache would clear out (to this day I'm not sure what triggers it) and the changes would
appear! I suspect it has something to do with hotspot compiling classes and caching them
somewhere on disk. By always using my custom ClassLoader and never relying on the
primordial one, I can ensure that the latest bytecodes from the database are always used.
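
The decision logic described above (check the cache, then load application-specific classes yourself and hand everything else to the parent) might be sketched like this. It is an approximation under stated assumptions, not the sample's code: PrefixFirstClassLoader is a hypothetical name, and the byteSource function stands in for whatever pulls bytes out of the zip.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical loader that bypasses the parent for application-specific prefixes.
class PrefixFirstClassLoader extends ClassLoader {
    private final String[] appPrefixes;
    private final Function<String, byte[]> byteSource; // returns null when the class is absent
    private final Map<String, Class<?>> cache = new HashMap<>();

    PrefixFirstClassLoader(String[] appPrefixes, Function<String, byte[]> byteSource,
                           ClassLoader parent) {
        super(parent);
        this.appPrefixes = appPrefixes.clone();
        this.byteSource = byteSource;
    }

    boolean isApplicationClass(String name) {
        for (String prefix : appPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class<?> loaded = cache.get(name);
        if (loaded == null) {
            if (isApplicationClass(name)) {
                // Application class: never ask the parent, always use our own bytes.
                byte[] bytes = byteSource.apply(name);
                if (bytes == null) {
                    throw new ClassNotFoundException(name);
                }
                loaded = defineClass(name, bytes, 0, bytes.length);
            } else {
                // System or library class: normal parent-first delegation.
                loaded = super.loadClass(name, false);
            }
            cache.put(name, loaded);
        }
        if (resolve) {
            resolveClass(loaded);
        }
        return loaded;
    }
}
```

This is why the prefixes must be passed in: without them the loader cannot tell which names it is supposed to keep away from the parent.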

Don't byte off more than you can chew

Java includes utility classes for dealing with zip files and, in Java 1.2, jar files. Despite
the built-in library support, working with zip files is very tricky. You'll find yourself
manipulating the contents byte by byte. The Java creators could have spent a little more
time making easy-to-use zips, but fortunately, I've done the dirty work for you. The
algorithm is basically this: First, wrap a ZipInputStream around a
ByteArrayInputStream created with the original array of bytes. Next, loop through the
entries in the ZipInputStream until you find an entry that matches the class you are
looking for. If you don't find a match, throw a ClassNotFoundException back to the
BootStrapper. If you do find a matching entry, start reading bytes from the stream into a
temporary buffer and keep reading until no more exist. Create a new array of bytes of the
proper length and copy the bytes from the temporary buffer into it. Once you have the
array of bytes representing the class, you can call defineClass() on the superclass
ClassLoader. The superclass parses the bytes to make sure they really represent a valid
Java class, and returns an instance of Class. This Class is what you return to the
BootStrapper. Whew!
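
The lookup half of that algorithm can be sketched as follows. This is my own condensed version, not the code from the sample: the ZipEntryReader class name is hypothetical, and it assumes entries are stored under the usual package/path/ClassName.class names. It uses a ByteArrayOutputStream as the temporary buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: extracts one class file from a zip held in memory.
final class ZipEntryReader {
    private ZipEntryReader() {}

    static byte[] findClassBytes(byte[] zipFile, String className)
            throws IOException, ClassNotFoundException {
        // com.widgetsusa.Main is assumed to be stored as com/widgetsusa/Main.class
        String entryName = className.replace('.', '/') + ".class";
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue;
                }
                // Read until the stream reports the end of this entry.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int count;
                while ((count = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, count);
                }
                return out.toByteArray();
            }
        }
        throw new ClassNotFoundException(className);
    }
}
```

The returned array is what a class loader would pass to defineClass(). Scanning the stream from the start on every lookup is simple but linear in the size of the archive, which is one reason a cache of loaded classes pays off.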

Once the BootStrapper has the reference to the Class, it can call the newInstance()
method on it to instantiate it, and the BootStrapper's work is complete. Any classes
referenced by the new instance will be loaded by the same class loader that loaded the
new instance, which is, of course, the ZipClassLoader.
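
That final step might look like the sketch below. The Launcher class and its launch() method are hypothetical names. One deliberate difference from the text: Class.newInstance() has been deprecated since Java 9, so this sketch goes through getDeclaredConstructor().newInstance(), which requires the same public no-argument constructor.

```java
// Hypothetical helper: loads the main class through the given loader and instantiates it.
final class Launcher {
    private Launcher() {}

    static Object launch(ClassLoader loader, String mainClassName)
            throws ReflectiveOperationException {
        Class<?> mainClass = loader.loadClass(mainClassName);
        // Invokes the no-argument constructor; the constructor itself starts the application.
        return mainClass.getDeclaredConstructor().newInstance();
    }
}
```

In the bootstrap scenario the loader argument would be the custom zip-backed class loader, so every class the new instance touches is resolved through it as well.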

All the BootStrapper can do is start a new instance of the main class; the constructor of
the instantiated class must do the rest. The best way to do this is to make your class
implement Runnable and have the constructor create a new thread to start it. Then put
your main application logic in run(). For example:

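public class ExampleApp implements Runnable
{
    // The BootStrapper only instantiates this class, so the constructor starts the work.
    public ExampleApp()
    {
        Thread thread = new Thread(this);
        thread.start();
    }

    // The main application logic runs here, on the new thread.
    public void run()
    {
        System.out.println("Mahir kisses you!");
    }
}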

This class only needs to be instantiated in order to perform its task.

Install a file, any file

Now that you've seen how the user can load the classes on the fly, you may wonder how
the classes got into the database in the first place. Figure 2 shows a diagram depicting the
various components of the file installer.

Figure 2. Class diagram depicting the NotesFileInstaller and related classes

Since the installer is just another program that needs to access the database, it will reuse
the Bridge interface designed for the BootStrapper. Only this time, instead of reading
from the Bridge, you'll write to it (using the setPayload() method). The code that
actually talks to the Bridge and installs the file is placed in an abstract base class called
BridgeFileInstaller, in a method called install(), which takes the path of the file
as a parameter. The install() method reads the bytes from the file, encodes the bytes
into a base-64 string, asks the BridgeFactory for a Bridge corresponding to the file
name, calls setPayload() to send the string to the bridge, and finally calls complete()
on the bridge to commit the transaction. This is a generic implementation that uses the
Bridge interface, not the more specific NotesBridge interface.
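
The sequence inside install() could be sketched as follows. The interfaces here are trimmed, hypothetical stand-ins (WritableBridge and WritableBridgeFactory) rather than the real com.paulitech.bridge types, the getBridge(String) lookup name is assumed, and java.util.Base64 and java.nio.file.Files replace the older classes the sample would have used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Trimmed, hypothetical stand-ins for the write side of the bridge.
interface WritableBridge {
    void setPayload(String payload);
    void complete();
}

interface WritableBridgeFactory {
    WritableBridge getBridge(String key);
}

// Hypothetical installer: read the file, encode it, store it, commit.
final class FileInstallSketch {
    private FileInstallSketch() {}

    static void install(Path file, WritableBridgeFactory factory) throws IOException {
        byte[] raw = Files.readAllBytes(file);                     // 1. read the bytes
        String encoded = Base64.getEncoder().encodeToString(raw);  // 2. base-64 encode
        WritableBridge bridge =
            factory.getBridge(file.getFileName().toString());      // 3. look up by file name
        bridge.setPayload(encoded);                                // 4. hand over the string
        bridge.complete();                                         // 5. commit
    }
}
```

Using the file name as the key means the BootStrapper can later ask for the record with the same name, such as exampleapp.zip.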

You can then subclass BridgeFileInstaller to provide your own specific
implementation; I call this one NotesFileInstaller. The main() method of
NotesFileInstaller takes two arguments: the location of the Notes database and the
name of the file to install. NotesFileInstaller creates an instance of a
NotesBridgeFactory to initialize it as the BridgeFactory singleton. Then,
NotesFileInstaller calls the install() method of the superclass (described above),
passing it the name of the file. Finally, you call dispose() on the BridgeFactory to tell
it you are finished, allowing it to free up any native resources, close database
connections, and so forth.

It is a good idea to create a batch file to automate the file installation, especially if you
must update your application frequently. I have provided one for the sample application,
installapp.bat.

Future possibilities

What about command-line arguments for your main class? In our case, we were able to
pass runtime information to the main application via a properties file contained in the
zip file, along with all the class files, so we had no need for command-line arguments. If
a properties file is not good enough and you absolutely must have command-line
arguments, here is an approach you might take:

1. Modify the BootStrapper to take the command-line arguments.
2. Place all the arguments destined for the main application into a global Singleton
instance of a class designed for holding arguments, such as a Vector (see Design
Patterns for a complete description of the Singleton pattern).
3. Have your main application reference the singleton to retrieve the arguments.
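
One possible shape for such a holder is sketched below. AppArguments is a hypothetical class that does not exist in the sample, and it uses a List rather than a Vector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical singleton that carries command-line arguments to the main application.
final class AppArguments {
    private static final AppArguments INSTANCE = new AppArguments();

    private List<String> values = Collections.emptyList();

    private AppArguments() {}

    static AppArguments getInstance() {
        return INSTANCE;
    }

    // Called once by the bootstrapper before it instantiates the main class.
    synchronized void set(String[] args) {
        values = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(args)));
    }

    // Called by the main application to read the arguments.
    synchronized List<String> get() {
        return values;
    }
}
```

One caveat with this approach: a class loaded by two different class loaders yields two distinct classes, each with its own static state. The holder therefore has to live outside the application-specific prefixes, so that the custom loader delegates it to the parent and the application sees the same instance the bootstrapper filled in.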

One thing that might be nice is a JDBCBridge implementation, in case you're using a
database that supplies a JDBC driver. Or, you can create a JMSBridge that could use the
Java Message Service to retrieve the classes, assuming your users are always connected.

My team chose zip instead of jar as the archive format because support for manipulating
jars was not included until Java 1.2. As of this writing, IBM supports Java 1.1.8 only for
its internal users, so we went with the zip format. The classes for manipulating zips and
jars are virtually identical, so you could create a JarClassLoader with fairly little effort.
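
To illustrate how close the two APIs are, here is a small sketch that walks a jar held in memory. JarInputStream is a subclass of ZipInputStream, so the loop has the same shape as the zip version. The JarListing class is hypothetical and is not part of the sample.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

// Hypothetical helper: lists the entry names of a jar held in memory.
final class JarListing {
    private JarListing() {}

    static List<String> entryNames(byte[] jarFile) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarFile))) {
            JarEntry entry;
            // A manifest at the start of the jar is consumed by the constructor
            // and is available through jar.getManifest() instead.
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

A JarClassLoader would read the bytes of the matching entry inside this loop, exactly as the zip-based loader does.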