Programming XML in Java, Part 1
By Mark Johnson
JavaWorld | Mar 13, 2000 12:00 AM
So, you understand (more or less) how you would represent your
data in XML, and you're interested in using XML to solve many of
your data-management problems. Yet you're not sure how to use
XML with your Java programs.
Programming XML in Java: Read the whole series!
Part 1. Use the Simple API for XML (SAX) to process XML in Java easily
Part 2. Learn about SAX and XML validation through illustrative examples
Part 3. DOMination: Take control of structured documents with the Document Object Model
This article is a follow-up to my introductory article, "XML for the
absolute beginner", in the April 1999 issue of JavaWorld (see
the Resources section below for the URL). That article described
XML; I will now build on that description and show in detail how
to create an application that uses the Simple API for XML (SAX),
a lightweight and powerful standard Java API for processing
XML.
The example code used here uses the SAX API to read an XML
file and create a useful structure of objects. By the time you've
finished this article, you'll be ready to create your own XML-
based applications.
The virtue of laziness
Larry Wall, mad genius creator of Perl (the second-greatest
programming language in existence), has stated that laziness is
one of the "three great virtues" of a programmer (the other two
being impatience and hubris). Laziness is a virtue because a lazy
programmer will go to almost any length to avoid work, even
going so far as creating general, reusable programming
frameworks that can be used repeatedly. Creating such
frameworks entails a great deal of work, but the time saved on
future assignments more than makes up for the initial effort
invested. The best frameworks let programmers do amazing
things with little or no work -- and that's why laziness is virtuous.
XML is an enabling technology for the virtuous (lazy)
programmer. A basic XML parser does a great deal of work for
the programmer, recognizing tokens, translating encoded
characters, enforcing rules on XML file structure, checking the
validity of some data values, and making calls to application-
specific code, where appropriate. In fact, early standardization,
combined with a fiercely competitive marketplace, has produced
scores of freely available implementations of standard XML
parsers in many languages, including C, C++, Tcl, Perl, Python,
and, of course, Java.
The SAX API is one of the simplest and most lightweight
interfaces for handling XML. In this article, I'll use IBM's XML4J
implementation of SAX, but since the API is standardized, your
application could substitute any package that implements SAX.
SAX is an event-based API, operating on the callback principle.
An application programmer will typically create a
SAX Parser object, and pass it both input XML and a document
handler, which receives callbacks for SAX events. The
SAX Parser converts its input into a stream
of events corresponding to structural features of the input, such
as XML tags or blocks of text. As each event occurs, it is passed
to the appropriate method of a programmer-defined document
handler, which implements the callback
interface org.xml.sax.DocumentHandler. The methods in this
handler class perform the application-specific functionality during
the parse.
For example, imagine that a SAX parser receives the tiny XML
document shown in Listing 1 below. (See Resources for the XML
file.)
<POEM>
<AUTHOR>Ogden Nash</AUTHOR>
<TITLE>Fleas</TITLE>
<LINE>Adam</LINE>
<LINE>Had 'em.</LINE>
</POEM>
Listing 1. XML representing a short poem
When the SAX parser encounters the <POEM> tag, it calls the
user-defined DocumentHandler.startElement() with the
string POEM as an argument. You implement
the startElement() method to do whatever the application is
meant to do when a POEM begins. The stream of events and
resulting calls for the piece of XML above appears in Table 1
below.
Item encountered           Parser callback
{Beginning of document}    startDocument()
<POEM>                     startElement("POEM", {AttributeList})
"\n"                       characters("<POEM>\n...", 6, 1)
<AUTHOR>                   startElement("AUTHOR", {AttributeList})
"Ogden Nash"               characters("<POEM>\n...", 15, 10)
</AUTHOR>                  endElement("AUTHOR")
"\n"                       characters("<POEM>\n...", 34, 1)
<TITLE>                    startElement("TITLE", {AttributeList})
"Fleas"                    characters("<POEM>\n...", 42, 5)
</TITLE>                   endElement("TITLE")
"\n"                       characters("<POEM>\n...", 55, 1)
<LINE>                     startElement("LINE", {AttributeList})
"Adam"                     characters("<POEM>\n...", 62, 4)
</LINE>                    endElement("LINE")
<LINE>                     startElement("LINE", {AttributeList})
"Had 'em."                 characters("<POEM>\n...", 67, 8)
</LINE>                    endElement("LINE")
"\n"                       characters("<POEM>\n...", 82, 1)
</POEM>                    endElement("POEM")
{End of document}          endDocument()
Table 1. The sequence of callbacks SAX produces while parsing Listing 1
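The callback sequence in Table 1 can be observed directly with a handler that records each call. What follows is a minimal sketch rather than code from this article: the class name EventLogger is hypothetical, and it assumes the JDK's bundled JAXP parser (obtained through SAXParserFactory) in place of XML4J. The SAX 1 classes it relies on are deprecated in current JDKs, hence the warning suppression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

/** Records every SAX callback as a line of text, in the order received. */
@SuppressWarnings("deprecation")
class EventLogger extends HandlerBase {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument()"); }

    @Override public void endDocument() { events.add("endDocument()"); }

    @Override public void startElement(String name, AttributeList attrs) {
        events.add("startElement(\"" + name + "\")");
    }

    @Override public void endElement(String name) {
        events.add("endElement(\"" + name + "\")");
    }

    @Override public void characters(char[] ch, int start, int length) {
        // Show newlines as \n so whitespace-only events stay visible.
        String text = new String(ch, start, length).replace("\n", "\\n");
        events.add("characters(\"" + text + "\")");
    }

    /** Parses the given XML text and returns the recorded callbacks. */
    static List<String> log(String xml) {
        EventLogger logger = new EventLogger();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), logger);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return logger.events;
    }

    public static void main(String[] args) {
        String poem = "<POEM>\n<AUTHOR>Ogden Nash</AUTHOR>\n<TITLE>Fleas</TITLE>\n"
            + "<LINE>Adam</LINE>\n<LINE>Had 'em.</LINE>\n</POEM>";
        log(poem).forEach(System.out::println);
    }
}
```

Running main() prints one line per callback for the poem in Listing 1. The sketch prints the text itself rather than the buffer offsets shown in Table 1, since offsets depend on how a particular parser manages its buffer.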
You create a class that implements DocumentHandler to respond
to events that occur in the SAX parser. These events aren't Java
events as you may know them from the Abstract Windowing
Toolkit (AWT). They are conditions the SAX parser detects as it
parses, such as the start of a document or the occurrence of a
closing tag in the input stream. As each of these conditions (or
events) occurs, SAX calls the method corresponding to the
condition in its DocumentHandler.
So, the key to writing programs that process XML with SAX is to
figure out what the DocumentHandler should do in response to a
stream of method callbacks from SAX. The SAX parser takes
care of all the mechanics of identifying tags, substituting entity
values, and so on, leaving you free to concentrate on the
application-specific functionality that uses the data encoded in
the XML.
Table 1 shows only events associated with elements and
characters. SAX also includes facilities for handling other
structural features of XML files, such as entities and processing
instructions, but these are beyond the scope of this article.
The astute reader will notice that an XML document can be
represented as a tree of typed objects, and that the order of the
stream of events presented to the DocumentHandler corresponds
to a depth-first traversal of the document tree, in document order. (It isn't
essential to understand this point, but the concept of an XML
document as a tree data structure is useful in more sophisticated
types of document processing, which will be covered in later
articles in this series.)
The key to understanding how to use SAX is understanding
the DocumentHandler interface, which I will discuss next.
Customize the parser with org.xml.sax.DocumentHandler
Since the DocumentHandler interface is so central to processing
XML with SAX, it's worthwhile to understand what the methods in
the interface do. I'll cover the essential methods in this section,
and skip those that deal with more advanced topics.
Remember, DocumentHandler is an interface, so the methods I'm
describing are methods that you will implement to handle
application-specific functionality whenever the corresponding
event occurs.
Document initialization and cleanup
For each document parsed, the SAX XML parser calls
the DocumentHandler interface methods startDocument() (called
before processing begins) and endDocument() (called after
processing is complete). You can use these methods to initialize
your DocumentHandler to prepare it for receiving events and to
clean up or produce output after parsing is
complete. endDocument() is particularly interesting, since it's only
called if an input document has been successfully parsed. If
the Parser generates a fatal error, it simply aborts the event
stream and stops parsing, and endDocument() is never called.
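To see the difference between a successful parse and an aborted one, a handler can simply note which of the two methods were called. This is a sketch under stated assumptions: LifecycleProbe is a hypothetical name, and the JDK's JAXP parser stands in for the parser used in this article. HandlerBase's default fatalError() rethrows the parser's exception, which is what ends the parse here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Notes whether startDocument() and endDocument() were called. */
@SuppressWarnings("deprecation")
class LifecycleProbe extends HandlerBase {
    boolean started;
    boolean ended;

    @Override public void startDocument() { started = true; }

    @Override public void endDocument() { ended = true; }

    /** Parses xml and returns the probe so its flags can be inspected. */
    static LifecycleProbe run(String xml) {
        LifecycleProbe probe = new LifecycleProbe();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), probe);
        } catch (SAXException ex) {
            // A fatal error (for example, mismatched tags) ends the event stream here.
            System.err.println("Parse aborted: " + ex.getMessage());
        } catch (Exception ex) {
            // I/O or parser configuration problems are not part of this demonstration.
            throw new IllegalStateException("could not run parser", ex);
        }
        return probe;
    }

    public static void main(String[] args) {
        LifecycleProbe ok = run("<POEM><TITLE>Fleas</TITLE></POEM>");
        LifecycleProbe bad = run("<POEM><TITLE>Fleas</POEM>");   // </TITLE> is missing
        System.out.println("well-formed: ended=" + ok.ended);
        System.out.println("malformed:   ended=" + bad.ended);
    }
}
```

Any output that must only be produced for a complete document therefore belongs in endDocument(), not after the last endElement() you expect.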
Processing tags
The SAX parser calls startElement() whenever it encounters an
open tag, and endElement() whenever it encounters a close tag.
These methods often contain the code that does the majority of
the work while parsing an XML file. startElement()'s first
argument is a string, which is the tag name of the element
encountered. The second argument is an object of
type AttributeList, an interface defined in
package org.xml.sax that provides sequential or random access
to element attributes by name. (You've undoubtedly seen
attributes before in HTML; in the line <TABLE
BORDER="1">, BORDER is an attribute whose value is "1"). Since
Listing 1 includes no attributes, they don't appear in Table 1.
You'll see examples of attributes in the sample application later in
this article.
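Both styles of attribute access can be shown in a few lines. The sketch below is illustrative only: AttributeDump is a hypothetical name, and the JDK's JAXP parser is assumed in place of XML4J. The AttributeList methods used (getLength(), getName(int), getValue(int), and getValue(String)) are part of the SAX 1 interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

/** Collects attributes by position and looks one up by name. */
@SuppressWarnings("deprecation")
class AttributeDump extends HandlerBase {
    final List<String> seen = new ArrayList<>();
    String border;   // value of TABLE's BORDER attribute, or null if absent

    @Override public void startElement(String name, AttributeList attrs) {
        // Sequential access: walk the attributes by index.
        for (int i = 0; i < attrs.getLength(); i++) {
            seen.add(attrs.getName(i) + "=" + attrs.getValue(i));
        }
        // Random access: ask for one attribute by name.
        if (name.equals("TABLE")) {
            border = attrs.getValue("BORDER");
        }
    }

    /** Parses xml and returns the populated handler. */
    static AttributeDump dump(String xml) {
        AttributeDump handler = new AttributeDump();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return handler;
    }

    public static void main(String[] args) {
        AttributeDump d = dump("<TABLE BORDER=\"1\" WIDTH=\"80%\"><TR/></TABLE>");
        System.out.println(d.seen + ", border=" + d.border);
    }
}
```

Note that getValue(String) returns null when the attribute is missing, so callers that treat an attribute as optional need a null check.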
Since SAX doesn't provide any information about the context of
the elements it encounters (that <AUTHOR> appears
inside <POEM> in Listing 1 above, for example), it is up to you to
supply that information. Application programmers often use
stacks in startElement() and endElement(), pushing objects onto
a stack when an element starts, and popping them off of the
stack when the element ends.
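The stack technique can be sketched with nothing more than element names. In this hypothetical PathTracker (not part of the article's sample code, and assuming the JDK's JAXP parser), each start tag pushes its name and each end tag pops it, so the handler always knows the full path to the element it is looking at.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

/** Tracks the chain of open elements and records the path to each one. */
@SuppressWarnings("deprecation")
class PathTracker extends HandlerBase {
    private final Deque<String> open = new ArrayDeque<>();
    final List<String> paths = new ArrayList<>();

    @Override public void startElement(String name, AttributeList attrs) {
        open.addLast(name);                    // push on entry
        paths.add(String.join("/", open));     // root first, current element last
    }

    @Override public void endElement(String name) {
        open.removeLast();                     // pop on exit
    }

    /** Parses xml and returns the path of every element, in document order. */
    static List<String> track(String xml) {
        PathTracker tracker = new PathTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), tracker);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return tracker.paths;
    }

    public static void main(String[] args) {
        String poem = "<POEM>\n<AUTHOR>Ogden Nash</AUTHOR>\n<TITLE>Fleas</TITLE>\n"
            + "<LINE>Adam</LINE>\n<LINE>Had 'em.</LINE>\n</POEM>";
        track(poem).forEach(System.out::println);
    }
}
```

For Listing 1 this yields POEM, POEM/AUTHOR, POEM/TITLE, and POEM/LINE twice. A real application would usually push richer objects than strings, but the push-on-start, pop-on-end discipline is the same.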
Process blocks of text
The characters() method indicates character content in the XML
document -- characters that don't appear inside an XML tag, in
other words. This method's signature is a bit odd. The first
argument is an array of characters (a char[]), the second is an index into that
array indicating the first character of the range to be processed,
and the third argument is the length of the character range.
It might seem that an easier API would have simply passed
a String object containing the data, but characters() was
defined in this way for efficiency reasons. The parser has no way
of knowing whether or not you're going to use the characters, so
as the parser parses its input buffer, it passes a reference to the
buffer and the indices of the string it is viewing, trusting that you
will construct your own String if you want one. It's a bit more
work, but it lets you decide whether or not to incur the overhead
of String construction for content pieces in an XML file.
The characters() method handles both regular text content and
content inside CDATA sections, which are used to prevent blocks
of literal text from being parsed by an XML parser.
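A common way to use characters() is to append the indicated slice to a buffer, because SAX permits a parser to deliver one run of text in several calls. The sketch below is a minimal illustration with a hypothetical class name, TextCollector, and it assumes the JDK's JAXP parser. It also shows entity references and CDATA content arriving through the same method.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

/** Accumulates all character content of a document into one string. */
@SuppressWarnings("deprecation")
class TextCollector extends HandlerBase {
    private final StringBuilder text = new StringBuilder();

    @Override public void characters(char[] ch, int start, int length) {
        // Copy only the slice [start, start + length); the rest of the
        // array is the parser's own buffer and is not ours to read.
        text.append(ch, start, length);
    }

    /** Parses xml and returns its concatenated character content. */
    static String collect(String xml) {
        TextCollector collector = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return collector.text.toString();
    }

    public static void main(String[] args) {
        // The entity &amp; and the CDATA section both reach characters().
        System.out.println(collect("<TITLE>Fleas &amp; <![CDATA[<ticks>]]></TITLE>"));
    }
}
```

Because the buffer is appended to rather than replaced, the result is the same however the parser chooses to split the text.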
Other methods
There are three other methods in
the DocumentHandler interface: ignorableWhitespace(),
processingInstruction(), and setDocumentLocator().
ignorableWhitespace() reports
occurrences of white space, and is usually unused in
nonvalidating SAX parsers (such as the one we're using for this
article); processingInstruction() handles most things
within <? and ?> delimiters; and setDocumentLocator() is
optionally implemented by SAX parsers to give you access to the
locations of SAX events in the original input stream. You can
read up on these methods by following the links on the SAX
interfaces in Resources.
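As a brief illustration of two of these methods, the following sketch records each processing instruction together with the line number reported by the Locator, when the parser supplies one. InstructionFinder is a hypothetical name, and the JDK's JAXP parser is assumed; since setDocumentLocator() is optional for parsers, the code tolerates a missing locator.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;

/** Records processing instructions and where the parser reported them. */
@SuppressWarnings("deprecation")
class InstructionFinder extends HandlerBase {
    private Locator locator;   // stays null if the parser never supplies one
    final List<String> found = new ArrayList<>();

    @Override public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override public void processingInstruction(String target, String data) {
        String line = (locator == null) ? "?" : String.valueOf(locator.getLineNumber());
        found.add(target + " [" + data + "] at line " + line);
    }

    /** Parses xml and returns one entry per processing instruction. */
    static List<String> find(String xml) {
        InstructionFinder finder = new InstructionFinder();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), finder);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return finder.found;
    }

    public static void main(String[] args) {
        find("<doc>\n<?render mode=\"draft\"?>\n</doc>").forEach(System.out::println);
    }
}
```

The target here is render and the data is everything after it; the instruction name and its content are made up for the example.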
Implementing all of the methods in an interface can be tedious if
you're only interested in the behavior of one or two of them. The
SAX package includes a class called HandlerBase that basically
does nothing, but can help you take advantage of just one or two
of these methods. Let's examine this class in more detail.
HandlerBase: A do-nothing class
Often, you're only interested in implementing one or two methods
in an interface, and want the other methods to simply do nothing.
The class org.xml.sax.HandlerBase simplifies the implementation
of the DocumentHandler interface by implementing all of the
interface's methods with do-nothing bodies. Then, instead of
implementing DocumentHandler, you can subclass HandlerBase,
and only override the methods that interest you.
For example, say you wanted to write a program that just printed
the title of any XML-formatted poem (like the one in Listing
1). You could define a new DocumentHandler, like TitleFinder in
Listing 2 below, that subclasses HandlerBase, and only overrides
the methods you need. (See Resources for an HTML file
of TitleFinder.)
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;
import org.xml.sax.helpers.ParserFactory;

/**
 * SAX DocumentHandler class that prints the contents of "TITLE" element
 * of an input document.
 */
public class TitleFinder extends HandlerBase {
    boolean _isTitle = false;

    public TitleFinder() {
        super();
    }

    /**
     * Print any text found inside a <TITLE> element.
     */
    public void characters(char[] chars, int iStart, int iLen) {
        if (_isTitle) {
            String sTitle = new String(chars, iStart, iLen);
            System.out.println("Title: " + sTitle);
        }
    }

    /**
     * Mark title element end.
     */
    public void endElement(String element) {
        if (element.equals("TITLE")) {
            _isTitle = false;
        }
    }

    /**
     * Find contents of titles
     */
    public static void main(String args[]) {
        TitleFinder titleFinder = new TitleFinder();
        try {
            Parser parser = ParserFactory.makeParser("com.ibm.xml.parsers.SAXParser");
            parser.setDocumentHandler(titleFinder);
            parser.parse(new InputSource(args[0]));
        } catch (Exception ex) {
            ; // OK, so sometimes laziness *isn't* a virtue.
        }
    }

    /**
     * Mark title element start
     */
    public void startElement(String element, AttributeList attrlist) {
        if (element.equals("TITLE")) {
            _isTitle = true;
        }
    }
}

Listing 2. TitleFinder: A DocumentHandler derived from
HandlerBase that prints TITLEs
This class's operation is very simple. The characters() method
prints character content if it's inside a <TITLE>. The
boolean field _isTitle keeps track of whether the parser is in the
process of parsing a <TITLE>. The startElement() method
sets _isTitle to true when a <TITLE> is encountered,
and endElement() sets it to false when </TITLE> is encountered.
To extract <TITLE> content from <POEM> XML, simply create
a Parser (I'll show you how to do this in the sample code
below), call the Parser's setDocumentHandler() method with an
instance of TitleFinder, and tell the Parser to parse XML. The
parser will print anything it finds inside a <TITLE> tag.
The TitleFinder class only overrides three
methods: characters(), startElement(), and endElement(). The
other methods of the DocumentHandler are implemented by
the HandlerBase superclass, and those methods do precisely
nothing -- just what you would have done if you'd implemented
the interface yourself. A convenience class like HandlerBase isn't
necessary, but it simplifies the writing of handlers because you
don't need to spend a lot of time writing idle methods.
As an aside, sometimes in Sun documentation you'll see
javadocs with method descriptions like "deny knowledge of child
nodes." Such a description has nothing to do with paternity suits
or Mission: Impossible; instead, it is a dead giveaway that you're
looking at a do-nothing convenience class. Such classes often
have the words Base, Support, or Adapter in their names.
TitleFinder, built on a convenience class like HandlerBase, does the
job, but it still isn't quite smart enough. It doesn't limit you to a
<TITLE> element inside a <POEM>; it would print the titles of HTML
files, too, for example. And any tags inside a <TITLE>, such as <B> tags for
bolding, would be lost. Since SAX is a simplified interface, it's left
up to the application developer to handle things like tag context.
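One way an application might supply that context is to keep a stack of open elements and collect text only while directly inside a <TITLE> whose parent is <POEM>. The following is a sketch, not the article's code: PoemTitleFinder is a hypothetical name, the JDK's JAXP parser is assumed, and the choice to keep the text of nested tags such as <B> while dropping the tags themselves is one of several reasonable policies.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

/** Collects the text of TITLE elements whose parent element is POEM. */
@SuppressWarnings("deprecation")
class PoemTitleFinder extends HandlerBase {
    private final Deque<String> open = new ArrayDeque<>();   // stack of open elements
    private StringBuilder title;   // non-null only while inside POEM/TITLE
    private int titleDepth;        // stack depth at which the TITLE was opened
    final List<String> titles = new ArrayList<>();

    @Override public void startElement(String name, AttributeList attrs) {
        boolean poemTitle = name.equals("TITLE") && "POEM".equals(open.peek());
        open.push(name);
        if (title == null && poemTitle) {
            title = new StringBuilder();
            titleDepth = open.size();
        }
    }

    @Override public void characters(char[] ch, int start, int length) {
        if (title != null) {
            title.append(ch, start, length);   // includes text inside nested tags
        }
    }

    @Override public void endElement(String name) {
        if (title != null && open.size() == titleDepth) {
            titles.add(title.toString());      // the TITLE itself is closing
            title = null;
        }
        open.pop();
    }

    /** Parses xml and returns the poem titles found, in document order. */
    static List<String> find(String xml) {
        PoemTitleFinder finder = new PoemTitleFinder();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), finder);
        } catch (Exception ex) {
            throw new IllegalStateException("parse failed", ex);
        }
        return finder.titles;
    }
}
```

With this handler, an HTML-style document whose <TITLE> sits inside <HEAD> produces no titles, while a poem title containing <B> markup is returned as plain text.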
Now you've seen a useless, simple example of SAX. Let's get
into something more functional and interesting: an XML language
for specifying AWT menus.
An applied example: AWT menus as XML
Recently I needed to write a menu system for a Java program I
was developing. Writing menus in Java 1.1 is really quite easy.
The top-level object in a menu structure is either a MenuBar or
a PopupMenu object. A MenuBar contains sub-Menu objects,
while PopupMenu and Menu objects can contain Menus, MenuItems,
and CheckboxMenuItems. Typically, objects of this type are
constructed manually in Java code, and built into a menu tree via
calls to the add() methods of the parent object.
Listing 3 shows the Java code that creates the menu shown in
Figure 1.
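import java.awt.Frame;
import java.awt.Menu;
import java.awt.MenuBar;
import java.awt.MenuItem;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;

public class ManualMenuDemo {
    public static void main(String[] args) {
        // Build the menu bar and its two top-level menus
        MenuBar menubarTop = new MenuBar();
        Menu menuFile = new Menu("File");
        Menu menuEdit = new Menu("Edit");

        menubarTop.add(menuFile);
        menubarTop.add(menuEdit);

        // Populate each menu with its items
        menuFile.add(new MenuItem("Open"));
        menuFile.add(new MenuItem("Close"));
        menuFile.add(new MenuItem("And so on..."));
        menuEdit.add(new MenuItem("Cut"));
        menuEdit.add(new MenuItem("Paste"));
        menuEdit.add(new MenuItem("Delete"));

        // Show the menu bar in a frame that exits when closed
        Frame frame = new Frame("ManualMenuDemo");
        frame.addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                System.exit(0);
            }
        });
        frame.setMenuBar(menubarTop);
        frame.pack();
        frame.show();
    }
}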
Listing 3. Creating a simple menu
Figure 1 below shows the simple menu that was handcoded in
Java from Listing 3.
Figure 1. The resulting menu of Listing 3 (below)
Simple enough, right? Well, not for me. Remember, I'm a lazy
programmer, and I don't like having to write all of this code to
create these menus. And I haven't even begun to write all of
the ActionListener and ItemListener classes I need to actually
make these menus operate. No, I want something easier.
I'd much rather have a menu specification language that lets me
specify the menu structurally, and notifies my program through a
single interface when user events occur. I also want to be able to
reconfigure my menus without having to rewrite any code. I want
to create menu structures for naive or expert users simply by
changing the menu specification, and possibly rename the menu
items without changing any code. I want lots of functionality, and
I don't want to have to work for it.
Since I'm lazy, I'll choose an off-the-shelf SAX XML parser to do
my work for me. I'll specify the file format as an XML file. Then I'll
create a class called SaxMenuLoader that uses a SAX XML parser
to create menu structures defined by XML, stores the menus in
a Hashtable, and then returns the menus when I ask for them by
name.
This SaxMenuLoader will also listen for ActionEvents
and ItemEvents from the menu items it creates, and will call
appropriate handler methods to handle the actions. Once I've
written this SaxMenuLoader, all I need to do in the future is create
a SaxMenuLoader instance and tell it to load my XML menu
specification; then I can ask it by name for the MenuBars
and PopupMenus defined in the XML. (Well, I'll also have to write
and name the handlers, but that's application functionality. This
system can't do everything for me. Yet.)
Menu XML
For this example, I've created a little language I'll call Menu
XML. Depending on your application, you may want to implement
a standard XML dialect, defined in a document type definition
(DTD) by a standards organization or some other group. In this
case, I'm just using XML for controlling the configuration of my
application, so I don't care if the XML is standardized.
I'll introduce Menu XML with an example, which appears in
Listing 4. (See Resources for an HTML file for Menu XML.)
001 <?xml version="1.0"?>
002
003 <Menus>
004
005
006   <MenuBar NAME="TopMenu">
007
008     <Menu NAME="File" HANDLER="FileHandler">
009       <MenuItem NAME="FileOpen" LABEL="Open..."/>
010       <MenuItem NAME="FileSave" LABEL="Save"/>
011       <MenuItem NAME="FileSaveAs" LABEL="Save As..."/>
012       <MenuItem NAME="FileExit" LABEL="Exit"/>
013     </Menu>
014
015     <Menu NAME="Edit" HANDLER="EditHandler">
016       <MenuItem NAME="EditUndo" LABEL="Undo"/>
017       <MenuItem NAME="EditCut" LABEL="Cut"/>
018       <MenuItem NAME="EditPaste" LABEL="Paste"/>
019       <MenuItem NAME="EditDelete" LABEL="Delete"/>
020       <CheckboxMenuItem NAME="EditReadOnly" LABEL="Disable Button 1"
021                         HANDLER="Button1Enabler"/>
022     </Menu>
023
024     <Menu NAME="Help" HANDLER="HelpHandler">
025       <MenuItem NAME="HelpAbout" LABEL="About"/>
026       <MenuItem NAME="HelpTutorial" LABEL="Tutorial"/>
027     </Menu>
028
029   </MenuBar>
030
031   <PopupMenu NAME="Pop1" HANDLER="PopupHandler">
032     <Menu NAME="Sub Menu 1" HANDLER="SubMenu1Handler">
033       <MenuItem NAME="Item 1" COMMAND="Item One"/>
034       [...] COMMAND="Item Two"/>
035     </Menu>
036     [...] COMMAND="Item Three"/>
037     [...] COMMAND="Item Four"/>
038     [...] COMMAND="Item Five"
039         HANDLER="com.javaworld.feb2000.sax.DynamicMenuItemHandler"/>
040   </PopupMenu>
041
042 </Menus>

Listing 4. Sample Menu XML to be processed by sample
code
This language has just a few tags and attributes:
<Menus>: This is the document element for this language.
The <Menus> tag simply groups all of the menus below it.

<MenuBar NAME="name">: The <MenuBar> tag defines a
new java.awt.MenuBar object. When parsing is completed,
the menu bar will be accessible by the given name.

<PopupMenu NAME="name">: The <PopupMenu> tag defines a
new java.awt.PopupMenu object. When parsing is
completed, the popup menu will be accessible by the given
name.

<MenuItem NAME="name" [LABEL="label"]
[COMMAND="command"]>: This tag defines
a java.awt.MenuItem. The item's label defaults to its name,
but can be set with the LABEL attribute. The
default actionCommand for the item is also the item's name,
but may be set with the COMMAND attribute.

<CheckboxMenuItem NAME="name" [LABEL="label"]
[COMMAND="command"]>: This tag defines
a java.awt.CheckboxMenuItem. It's just like a MenuItem,
except that the menu item checks and unchecks when
selected, instead of executing an action.

Any of these tags may optionally take an
attribute HANDLER="handlerName", which indicates the name of the
handler for that object and all of its children (unless one
of its children overrides the current handler by defining its own
handler). The handler name indicates what object and method
are to be called when the menu item is activated. The
mechanism for associating handler names with their handler
objects is explained in the implementation discussion below.
The containment relationship among the tags directly reflects the
containment relationship of the resulting objects. So, for example,
the PopupMenu called Pop1 defined in Listing 4, line 31, contains a
single Menu and three MenuItems. As the SaxMenuLoader class
parses the XML file, it creates appropriate Java menu objects
and connects them to reflect the XML structure. Let's look at the
code for SaxMenuLoader.
Load Menu XML with SAX: The SaxMenuLoader class
The following is a list of SaxMenuLoader's responsibilities:
Parses the Menu XML file using a SAX parser.
Builds the menu tree.
Acts as a repository for the MenuBar and PopupMenu items
defined in the Menu XML.
Maintains a repository of event handler objects that are
called when the user selects menu items. An event handler
object is any object that implements interface
MenuItemHandler, defined by this package to unify action
and item events from menu items. Any object that
implements this interface can receive events
from MenuItems defined in Menu XML. (I'll cover
the MenuItemHandler in more detail shortly.)
Acts as an ActionListener and ItemListener for all menu
items.
Dispatches ActionEvents and ItemEvents to the
appropriate handlers for the menu items.
Use SaxMenuLoader
The MenuDemo class takes two arguments: the name of the Menu
XML file to parse, and the name of the MenuBar to place in the
application. MenuDemo.main() simply creates a MenuDemo instance,
and calls that instance's runDemo() method. The
method MenuDemo.runDemo(), shown in Listing 5, demonstrates
how to use the SaxMenuLoader. (See Resources for an
HTML file of SaxMenuLoader and MenuDemo.)
094 public void runDemo(String[] args) {
095     SaxMenuLoader sml = new SaxMenuLoader();
096
097     // Bind names of handlers to the MenuItemHandlers they represent
098     sml.registerMenuItemHandler("FileHandler", this);
099     sml.registerMenuItemHandler("EditHandler", this);
100     sml.registerMenuItemHandler("HelpHandler", this);
101     sml.registerMenuItemHandler("PopupHandler", this);
102     sml.registerMenuItemHandler("SubMenu1Handler", this);
103     sml.registerMenuItemHandler("Button1Enabler", this);
104
105     // Parse the file
106     sml.loadMenus(args[0]);
107
108     // If menu load succeeded, show the menu in a frame
109     MenuBar menubarTop = sml.menubarFind(args[1]);
110     if (menubarTop != null) {
111         Frame frame = new Frame("Menu demo 1");
112         frame.addWindowListener(new WindowAdapter() {
113             public void windowClosing(WindowEvent e) {
114                 System.exit(0);
115             }
116         });
117         frame.setMenuBar(menubarTop);
118         _b1 = new Button("Button");
119         _b1.addMouseListener(new MenuPopper(_b1, sml, "Pop1"));
120         frame.add(_b1);
121         frame.pack();
122         frame.show();
123     } else {
124         System.out.println(args[1] + ": no such menu");
125     }
126 }

Listing 5. Using the SaxMenuLoader in the MenuDemo class
In Listing 5, line 95 creates the SaxMenuLoader. Then, lines 98
through 103 register the MenuDemo instance (this) as
the MenuItemHandler for all of the handler names referenced in
the Menu XML file. Since MenuDemo implements MenuItemHandler,
it can receive callbacks from the menu items created in the Menu
XML. These registrations are what associate the symbolic menu
item handler names with the application objects that actually do
the work. Line 106 tells the SaxMenuLoader to load the file, and
line 109 gets the menu bar named by the second command-line
argument (TopMenu in Listing 4) from the SaxMenuLoader.
The rest of the code is straightforward AWT, except for line 119,
which uses a MenuPopper object to associate a Button object with
a pop-up menu. MenuPopper is a convenience class I wrote that
looks up a named pop-up menu from a given SaxMenuLoader, and
associates the pop-up menu with the given AWT component.
A MenuPopper is also a MouseListener, so that when the user
clicks the center or left mouse button on the MenuPopper's
component, the MenuPopper shows the pop-up menu on top of
that component.
This is all the code necessary to get menus from a Menu XML
file. You might have noticed that this is about as many lines of
code as it took to create a small menu manually. But this
technique provides much more power. You can reconfigure the
menus without recompiling or redistributing any class files.
What's more, you can extend the application with new menu
items and handlers for those items without recompiling. (I'll
discuss how to dynamically extend a running application with
dynamic menu item handlers in the "Dynamic Menu Item
Handlers" section later in the article.) From now on, creating
extensible application menus is a lazy person's job!
So far, I've shown you how to use the SaxMenuLoader. Now let's
take a look at how it works.

Parse the XML with SAX
You'll remember that an object that
implements DocumentHandler can receive events from a SAX
parser. Well, the SaxMenuLoader has a SAX parser and
it also implements DocumentHandler, so it can receive events from
that parser. SaxMenuLoader's loadMenus() method is overloaded
for multiple types of inputs (File, InputStream, and so forth), but
all eventually call the method shown in Listing 6.
public void loadMenus(Reader reader_) {
    if (_parser == null)
        return;
    _parser.setDocumentHandler(this);
    try {
        _parser.parse(new InputSource(reader_));
    } catch (SAXException ex) {
        System.out.println("Parse error: " + ex.getMessage());
    } catch (Exception ex) {
        System.err.println("SaxMenuFactory.loadMenus(): " + ex.getClass().getName() +
            ": " + ex.getMessage());
        ex.printStackTrace();
    }
}

Listing 6. loadMenus() uses a SAX parser to parse menus
There's not much to this method -- it simply sets the
parser's DocumentHandler to this, calls the
parser's parse() method, and handles any exceptions. How
could this possibly build a menu?
The answer is in the implementation of DocumentHandler.
Since SaxMenuLoader implements DocumentHandler, all of the
menu-building functionality (which is specific to this application)
occurs in the DocumentHandler implementation methods --
primarily in startElement().
SaxMenuLoader.startElement()
Listing 7 shows the implementation of startElement() that
creates the MenuBar, PopupMenu, Menu, MenuItem,
and CheckboxMenuItem objects and associates them with one
another. As the parser parses the XML, it
calls SaxMenuLoader.startElement() each time it encounters an
opening XML tag, passing the tag name and the list of attributes
for the tag. startElement() simply calls an
appropriate protected method within SaxMenuLoader based on
the tag name.
445 public void startElement(String sName_, AttributeList attrs_) {
446
447     // Anything may override handler for its context
448     String sHandler = attrs_.getValue("HANDLER");
449     pushMenuItemHandler(sHandler);
450
451     // If "MenuBar", we're building a MenuBar
452     if (sName_.equals("MenuBar")) {
453         defineMenuBar(attrs_);
454     }
455
456     // If "PopupMenu", we're building a PopupMenu
457     else if (sName_.equals("PopupMenu")) {
458         definePopupMenu(attrs_);
459     }
460
461     // If "Menu", then create a menu.
462     else if (sName_.equals("Menu")) {
463         defineMenu(attrs_);
464     }
465
466     else if (sName_.equals("MenuItem")) {
467         defineMenuItem(attrs_);
468     }
469
470     else if (sName_.equals("CheckboxMenuItem")) {
471         defineCheckboxMenuItem(attrs_);
472     }
473 }

Listing 7. SaxMenuLoader.startElement()
This method does one additional thing: as noted above, any tag
in Menu XML can include an optional HANDLER name, which
defines the handler for all items that element contains. For
example, line 8 of Listing 4 defines FileHandler as the name of
the handler to call when any item in the File menu is
selected. startElement() implements this functionality in lines
448 and 449 by detecting the HANDLER attribute on any tag and
calling pushMenuItemHandler, which pushes the
named MenuItemHandler onto a stack maintained by
the SaxMenuLoader. Therefore, whatever is on top of
the MenuItemHandler stack is always the appropriate handler for
any item to be created. startElement() always pushes a handler
(unless none has ever been specified); when an element doesn't
specify a HANDLER, pushMenuItemHandler pushes another copy of
whatever is on top of the stack. Later, endElement() always pops
a handler off of the stack if it can, so the balance between stack
pushes and pops is always maintained.
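The push-a-copy rule is easier to follow in isolation. The sketch below models only the stack discipline described above, using handler names as plain strings; HandlerNameStack and its method names are hypothetical and are not taken from SaxMenuLoader, which presumably stores handler objects rather than names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Models handler inheritance: the top of the stack is the handler in effect. */
class HandlerNameStack {
    private final Deque<String> stack = new ArrayDeque<>();

    /**
     * Call at every start tag. Pass the element's HANDLER attribute value,
     * or null when the element does not specify one.
     */
    void push(String explicitName) {
        if (explicitName != null) {
            stack.push(explicitName);      // this element overrides its context
        } else if (!stack.isEmpty()) {
            stack.push(stack.peek());      // inherit: push another copy of the top
        }
        // Otherwise no handler has been named yet, so nothing is pushed.
    }

    /** Call at every end tag; pops only if there is something to pop. */
    void pop() {
        if (!stack.isEmpty()) {
            stack.pop();
        }
    }

    /** The handler name in effect for items created now, or null if none. */
    String current() {
        return stack.peek();
    }

    public static void main(String[] args) {
        HandlerNameStack handlers = new HandlerNameStack();
        handlers.push(null);               // <Menus>
        handlers.push(null);               // <MenuBar NAME="TopMenu">
        handlers.push("FileHandler");      // <Menu NAME="File" HANDLER="FileHandler">
        handlers.push(null);               // <MenuItem NAME="FileOpen" .../>
        System.out.println(handlers.current());   // the item inherits FileHandler
    }
}
```

Because an element without a HANDLER pushes nothing only when the stack is already empty, the matching pop at its end tag also finds the stack empty, and the pushes and pops stay paired.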
The methods called by startElement do the actual work of
creating the menu tree. I'll cover those next.
Create the Menu tree
Listing 8 shows defineMenuBar, which is called
when startElement receives a MenuBar element.
protected void defineMenuBar(AttributeList attrs_) {
    String sMenuName = attrs_.getValue("NAME");
    _menubarCurrent = new MenuBar();
    if (sMenuName != null) {
        _menubarCurrent.setName(sMenuName);
    }
    register(_menubarCurrent);
}
...
protected void definePopupMenu(AttributeList attrs_) {
    String sMenuName = attrs_.getValue("NAME");
    _popupmenuCurrent = new PopupMenu();
    if (sMenuName != null) {
        _popupmenuCurrent.setName(sMenuName);
    }
    register(_popupmenuCurrent);
}
Listing 8. defineMenuBar() and definePopupMenu()
As you can see, defineMenuBar() does very little: it simply
creates a new MenuBar, assigns it a name if one is provided, and
then registers it. The register() method simply stores
the MenuBar in a protected hash table, so that you can retrieve it
by name using the method menubarFind() (as in Listing 5, line
109). definePopupMenu() works just like defineMenuBar(), except
it creates a PopupMenu object and registers it so
that popupmenuFind() can return the new PopupMenu by name.
The private static
fields _menubarCurrent and _popupmenuCurrent contain a
reference to the current MenuBar or PopupMenu being built, to
which subsequent menus or menu items are added. Listing 9
shows the definition of a new Menu object.
protected void add(Menu menu_) {
    Menu menuCurrent = menuCurrent();
    if (menuCurrent != null) {
        menuCurrent.add(menu_);
    } else {
        if (_menubarCurrent != null) {
            _menubarCurrent.add(menu_);
        }
        if (_popupmenuCurrent != null) {
            _popupmenuCurrent.add(menu_);
        }
    }
}
...
protected void defineMenu(AttributeList attrs_) {
    String sMenuName = attrs_.getValue("NAME");

    Menu menuNew = new Menu(sMenuName);
    if (sMenuName != null) {
        menuNew.setName(sMenuName);
    } else {
        sMenuName = menuNew.getName();
    }
    System.out.print("Created menu " + sMenuName);

    // Add to current context and make new menu the current menu to build
    add(menuNew);
    pushMenu(menuNew);
}
Listing 9. add(Menu) and defineMenu()
defineMenu() is only slightly more complicated than
defineMenuBar(), because the menu being created is added to
whatever is currently being built, whether that is a MenuBar, a
PopupMenu, or another Menu. The defineMenu() method creates a
new Menu object, sets its name, and then calls add(Menu), which
adds the given Menu either to the top menu of the menu stack (a
private field), the current MenuBar, or the PopupMenu under
construction. After adding the new Menu to the appropriate
parent, defineMenu() pushes the new menu onto the menu stack.
Anything contained inside the current menu in the XML file will be
added to the current menu at the top of the stack, so the resulting
menu structure reflects the XML structure. endElement() always
calls popMenu() when it receives the end tag of a <Menu> element,
so the top of the stack always refers to the menu currently under
construction.
These stacks are necessary because, as stated before, SAX
doesn't keep track of tag context; that's part of the application-
specific functionality that SAX leaves up to you.
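To see why the stack is needed, here is a small sketch that recovers each element's position in the tree while parsing. It is not the article's code: it uses the later SAX2 DefaultHandler and the JAXP SAXParserFactory rather than the SAX1 interfaces shown in the listings, and the class name NestingSketch and the slash-separated "path" strings are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: SAX reports only "element started" and
// "element ended", so the handler keeps its own stack of open elements.
class NestingSketch extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();
    final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        String name = attrs.getValue("NAME");
        String parent = open.isEmpty() ? "" : open.peek();
        // The new element hangs off whatever is on top of the stack.
        String path = parent + "/" + (name != null ? name : qName);
        paths.add(path);
        open.push(path);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop(); // the enclosing element becomes current again
    }

    // Parses the given XML text and returns one path per element.
    static List<String> pathsOf(String xml) throws Exception {
        NestingSketch handler = new NestingSketch();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paths;
    }
}
```

Without the stack, the handler would see the Quit item below arrive right after the nested Recent menu closed and would have no way of knowing that it belongs to File.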
MenuItems and CheckboxMenuItems are created by the code shown
in Listing 10, which works in a fashion very similar to defineMenu().
protected void defineCheckboxMenuItem(AttributeList attrs_) {

    // Get attributes
    String sItemName = attrs_.getValue("NAME");
    String sItemLabel = attrs_.getValue("LABEL");

    // Create new item
    CheckboxMenuItem miNew = new CheckboxMenuItem(sItemName);

    if (sItemName != null) {
        miNew.setName(sItemName);
    } else {
        sItemName = miNew.getName();
    }

    // Set menu attributes
    if (sItemLabel != null) {
        miNew.setLabel(sItemLabel);
    } else {
        miNew.setLabel(sItemName);
    }

    // Add menu item to whatever's currently being built
    add(miNew);
    miNew.addItemListener(this);
}
...
protected void defineMenuItem(AttributeList attrs_) {

    // Get attributes
    String sItemName = attrs_.getValue("NAME");
    String sItemLabel = attrs_.getValue("LABEL");

    // Create new item
    MenuItem miNew = new MenuItem(sItemName);
    if (sItemName != null) {
        miNew.setName(sItemName);
    } else {
        sItemName = miNew.getName();
    }

    // Set menu attributes
    if (sItemLabel != null) {
        miNew.setLabel(sItemLabel);
    } else {
        miNew.setLabel(sItemName);
    }

    // Add menu item to whatever's currently being built
    add(miNew);
    miNew.addActionListener(this);
}
Listing 10. defineMenuItem() and defineCheckboxMenuItem()
Both of these methods create an object of the appropriate type
(MenuItem or CheckboxMenuItem), set the new object's name and
label, and add() the object to whatever is on top of the menu
stack (or to the PopupMenu under construction -- MenuItems can't
be added to MenuBars). The only real difference between the two
is that a MenuItem notifies ActionListeners of user actions, and
a CheckboxMenuItem notifies ItemListeners. In either case, the
SaxMenuLoader instance itself is listening for the item events,
so that it can dispatch them to the appropriate MenuItemHandler
when the ActionEvent or ItemEvent occurs.
When the parser successfully completes parsing, the
MenuItemHandler stack and the Menu stack will both be empty,
and the hash tables will hold all of the MenuBar and PopupMenu
objects, indexed by their names. You can ask for MenuBars and
PopupMenus by name, since they're built and waiting to be
requested.
Let's now turn our attention to the runtime behavior of these
menus.
SaxMenuLoader menus at runtime
I've described how menus are created when the Menu XML file is
parsed, but how do menus actually appear in an application?
You'll recall that the top-level menu bar comes from the
SaxMenuLoader: you fetch it by name (if you don't recall, see
Listing 5 and the following discussion).
When the user selects a menu item, all of that menu item's
listeners are notified of the selection, right? Well, Listing 10
above shows that it is the SaxMenuLoader itself that is listening for
these events. A menu item selected by a user notifies the
SaxMenuLoader by calling its actionPerformed() or
itemStateChanged() method (depending on whether the item was a
regular or checkbox menu item). Listing 11 shows
actionPerformed() and itemStateChanged() of SaxMenuLoader.
public void actionPerformed(ActionEvent e) {
    Object oSource = e.getSource();
    if (oSource instanceof MenuItem) {
        MenuItem mi = (MenuItem) oSource;
        // Look up the handler registered for this item
        MenuItemHandler mih = menuitemhandlerFind(mi);
        if (mih != null) {
            mih.itemActivated(mi, e, mi.getActionCommand());
        }
    }
}
...
public void itemStateChanged(ItemEvent e) {
    Object oSource = e.getSource();
    if (oSource instanceof MenuItem) {
        MenuItem mi = (MenuItem) oSource;
        MenuItemHandler mih = menuitemhandlerFind(mi);
        if (mih != null) {
            if (e.getStateChange() == ItemEvent.SELECTED) {
                mih.itemSelected(mi, e, mi.getActionCommand());
            } else {
                mih.itemDeselected(mi, e, mi.getActionCommand());
            }
        }
    }
}

Listing 11. actionPerformed() and itemStateChanged()
receive notification from menu items
actionPerformed() gets the source object that caused the action;
if that source is a MenuItem, it looks up that item's handler and
calls the handler's itemActivated() method. itemStateChanged()
is similar, except that it calls the handler's itemSelected() or
itemDeselected() method, depending on the state change indicated
by the ItemEvent passed in.
Notice that in both cases, menuitemhandlerFind() is used to find
a handler for the menu item. Remember that you register the
menu item handlers with the SaxMenuLoader (Listing 5, lines 098
through 103). But reexamine for a moment Listing 4, lines 038
through 039:
038 COMMAND="Item Five"
039 HANDLER="com.javaworld.feb2000.sax.DynamicMenuItemHandler"/>

Instead of a registered handler name, the value of the
HANDLER attribute is a class name. This is how I implemented
menu item handlers that are loaded at runtime, so that
menus can be extended without recompiling the application.
Dynamic menu item handlers
Just a few lines of code allow a Menu XML file to specify any
Java class name as a MenuItemHandler (assuming that the
class is accessible and indeed implements that interface).
Listing 12 shows how to do this.
340 protected MenuItemHandler menuitemhandlerFind(String sName_) {
341     if (sName_ == null)
342         return null;
343     MenuItemHandler mih = (MenuItemHandler) _htMenuItemHandlers.get(sName_);
344
345     // Not registered. See if it's a class name, and if it is, create an
346     // instance of that class and register it.
347     if (mih == null) {
348         try {
349             Class classOfHandler = Class.forName(sName_);
350             MenuItemHandler newHandler = (MenuItemHandler)classOfHandler.newInstance();
351             registerMenuItemHandler(sName_, newHandler);
352             mih = newHandler;
353         } catch (Exception ex) {
354             System.err.println("Couldn't find menu item handler '" + sName_ +
355                 "': no such registered handler, and couldn't create");
356             System.err.println(sName_ + ": " + ex.getClass().getName() + ": " + ex.getMessage());
357         }
358     }
359     return mih;
360 }
...
406 protected void pushMenuItemHandler(String sName_) {
407     MenuItemHandler l = menuitemhandlerFind(sName_);
408     if (l == null)
409         l = menuitemhandlerCurrent();
410     pushMenuItemHandler(l);
411 }

Listing 12. Implementation of dynamic item handlers
Remember that each time SaxMenuLoader encounters a HANDLER
attribute, it calls pushMenuItemHandler (see Listing 7).
Listing 12 (lines 406 through 411) shows that
pushMenuItemHandler(String) uses menuitemhandlerFind(String)
to look up the handler by name. menuitemhandlerFind(String)
tries to find the item handler in the protected
_htMenuItemHandlers hash table. If no such handler is
registered, it assumes the name of the handler is a class
name. It tries to load the class whose name is the handler
name; if it succeeds, it creates an instance of that class.
menuitemhandlerFind(String) returns the resulting handler,
which was either found in the hash table or loaded on the fly.
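The same registry-then-reflection lookup can be written as a standalone sketch. Everything named here (DynamicLookupSketch, Handler, EchoHandler) is hypothetical and merely stands in for SaxMenuLoader and MenuItemHandler. It also assumes a current JDK, so it uses getDeclaredConstructor().newInstance() where Listing 12 calls Class.newInstance().

```java
import java.util.HashMap;
import java.util.Map;

class DynamicLookupSketch {
    // Hypothetical stand-in for the MenuItemHandler interface.
    public interface Handler {
        String handle(String command);
    }

    // Never registered by name; reachable only through its class name.
    public static class EchoHandler implements Handler {
        public String handle(String command) {
            return "echo:" + command;
        }
    }

    private final Map<String, Handler> registry = new HashMap<>();

    void register(String name, Handler handler) {
        registry.put(name, handler);
    }

    // First try the registry; failing that, treat the name as a class name.
    Handler find(String name) {
        if (name == null) {
            return null;
        }
        Handler handler = registry.get(name);
        if (handler == null) {
            try {
                Class<?> type = Class.forName(name);
                handler = (Handler) type.getDeclaredConstructor().newInstance();
                registry.put(name, handler); // reuse the instance next time
            } catch (ReflectiveOperationException | ClassCastException ex) {
                System.err.println("No handler '" + name + "': " + ex);
            }
        }
        return handler;
    }
}
```

A caller would use find("save") for a registered handler and find(DynamicLookupSketch.EchoHandler.class.getName()) for one loaded on the fly. A name that is neither registered nor a loadable Handler class yields null, which corresponds to the case in which Listing 12's pushMenuItemHandler falls back to the current handler.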
The Menu XML package now provides a flexible, easy facility
for defining the menus of an application, and extending
them without recompiling. I can add items to the menu at
will, and define handlers for those new menu items that are
dynamically loaded at runtime. Menus are now easy!
Conclusion
SAX is a powerful tool for simple XML processing. With a
little headwork, it's easy to create applications that take
advantage of XML's extensibility, flexibility, and
standardization. In this article, you've seen how SAX works,
and have been introduced to a useful example of XML in
action.
In the next article in this series, I'll show how to use a
validating SAX parser, which detects errors in the input XML
by checking its structure against a grammar called
a document type definition (DTD). I'll also present a
special DocumentHandler class called LAX (the Lazy API for
XML), which makes writing document handler classes a
piece of cake. Tune in next month!
Mark Johnson works in Ft. Collins, Colo., as a designer and developer for Velocity by day, and
as a JavaWorld columnist by night -- very late at night.
Learn more about this topic
"XML for the Absolute Beginner," Mark Johnson (JavaWorld, April 1999)
http://www.javaworld.com/javaworld/jw-04-1999/jw-04-xml.html
David Megginson, creator of SAX, has an excellent SAX site
http://www.megginson.com/SAX/index.html
"Portable Data/Portable Code: XML & Java Technologies," JP Morgenthal -- Sun whitepaper on
the combination of XML and Java
http://java.sun.com/xml/ncfocus.html
"XML and Java: A Potent Partnership, Part 1," Todd Sundsted (JavaWorld, June 1999) gives an
example of how XML and SAX can be useful for enterprise application integration
http://www.javaworld.com/javaworld/jw-06-1999/jw-06-howto.html
"Why XML is Meant for Java," Matt Fuchs (WebTechniques, June 1999) is an excellent article
on XML and Java
http://www.webtechniques.com/archives/1999/06/fuchs/
Download the source files for this article in one of the following formats:
In jar format (with class and java files)
http://www.javaworld.com/javaworld/jw-03-2000/xmlsax/SAXMar2000.jar
In tgz format (gzipped tar)
http://www.javaworld.com/javaworld/jw-03-2000/xmlsax/SAXMar2000.tgz
In zip format
http://images.techhive.com/downloads/idge/imported/article/jvw/2000/03/saxmar2000.zip

Interconnect JavaBeans to process XML
By Mark Johnson
JavaWorld | Nov 20, 1999 12:00 AM
The expression "eating your own dog food" has gained currency
over the last few years. It means taking the product you're selling
in your daily business and using it yourself, so that you
understand it from the consumer's point of view. I've been
churning out columns on JavaBeans (my particular brand of dog
food) for the last couple of years, teaching readers how to create
new JavaBeans and use them in novel ways. But
I haven't focused on using JavaBeans in applications -- I haven't
been eating my own dog food. And it's about time I sat down to a
big chunky bowl of it.
Process XML with JavaBeans: Read the whole series!
Part 1. Interconnect JavaBeans to process XML
Part 2. How IDEs interconnect components
Part 3. Simplify XML processing with XMLConvenience Beans
With that in mind, this month I am going to cover not how to
create JavaBeans, but how to use them. Using JavaBeans to
create applications in an integrated development environment
(IDE) is a great way to learn how to think in components.
Component designers and implementers who are forced to chow
down on what they've been dishing up quickly learn what makes
a component useful -- or useless. Many developers who
assemble JavaBeans into running applications have experience
primarily with GUI components; thus, in this series, I'll particularly
focus on components that do data processing and
have no runtime user interface.
The software package I'll be using for this discussion is IBM's
XML Bean Suite, available for free from IBM's alphaBeans site
(see Resources for a link). This package is very different from the
XML JavaBeans and BML I've covered in the past. Those
discussions dealt with converting JavaBean components to XML,
or creating JavaBeans from XML. The XML Bean Suite, on the
other hand, is a set of JavaBean components designed for
processing XML data. The suite contains JavaBeans that a
developer interconnects visually in an IDE in order to read, write,
display, search, and filter XML data. Many of these JavaBeans
have no user interface at runtime; they do most of the
application's work internally. They're also excellent design
examples of how to encapsulate functionality into a component.
This article assumes that you're familiar with the basics of
JavaBeans and XML. Links to background material for this article
appear in the Resources section.
This month's column is mostly an overview of the XML Bean
Suite, which contains a large number of classes for processing
XML. I'll also discuss how IDEs interconnect JavaBeans in
response to your input, and I'll point out useful design principles
as we go along. Columns to follow will use the XML bean classes
to create applications (such as an XML file editor) that process
XML data.
Contents of the XML Bean Suite
The alphaBeans site (see Resources) is the JavaBeans section of
IBM's alphaWorks site, which provides "early adopter developers
direct access to IBM's emerging 'alpha-code' technologies." This
means that the code is freely downloadable from the site. Some of
the code is even available for free commercial use, but the
licensing restrictions vary by package. The designation alpha
also means that the software is not ready for prime time. APIs
are not guaranteed to be stable, the software may be updated
erratically, and IBM makes no guarantees about ever turning the
material on alphaWorks (and alphaBeans) into commercial
products. Still, several projects that began on alphaWorks have
graduated to full commercial status. Most, if not all, alphaWorks
technologies have online discussion forums where users can get
advice from the developers creating the software, and can make
suggestions for improving the products.
The XML Bean Suite is a set of 39 Java classes available for free
from the alphaBeans site. Since it's alpha software, it doesn't yet
work with the latest version of Swing (it requires Swing 1.0.2),
and doesn't even work with the newest version of IBM's XML
processor xml4j (it requires version 1.1.4). The license
agreement that appears at download time grants redistribution
rights to the code (though you shouldn't take my word for it --
read the license yourself).
The 39 classes in the suite are divided into five sets of related
JavaBeans. Many of these beans are nonvisual; that is, they may
have a design-time user interface (such as a property sheet), but
have no user interface at runtime. The five sets of XML beans
appear in Table 1.
Bean set         # of beans   Description
XMLCoreBean      4            Nonvisual beans that convert XML between text and DOM representations and manage DOM Nodes
XMLViewer        5            Visual beans that display XML documents or DTDs in various ways
XMLEditor        12           Nonvisual operator beans that allow construction of DTD-directed XML editors
XMLProcessing    5            Nonvisual beans that provide filtering, tokenizing, searching, and other operations on XML data
XMLConvenience   13           Beans that implement common XML editing subfunctions by combining XMLEditor beans and java.awt GUI objects
Table 1. Five sets of XML beans in the XML Bean Suite
Each of these bean sets provides a domain of XML processing.
You can wire instances of the beans from these sets together to
create XML applications. Let's look at the XMLCoreBean set first.
XMLCoreBean set
The most basic set of XML beans is XMLCoreBean. This set of
beans lets you convert XML text to a Document Object Model (DOM)
representation of the XML and convert the DOM to XML text. These
XML beans all operate on DOM documents or parts of documents, so
they act as the gateway between the DOM and XML. The most
central of these is the XML parser DOMGenerator.
Talkin' 'bout DOM generation
The DOMGenerator bean is a JavaBean encapsulation of an XML
parser, as shown in Figure 1. The bean has three properties:
inputXmlFileLocation (a string), inputXmlText (also a string),
and inputXmlURLLocation (a URL, which may specify the source of
the XML data to be parsed). When any of these properties is set,
DOMGenerator immediately reads the text from the XML source and
produces a result of type org.w3c.dom.Document, which is the
root of a DOM-object tree that represents the input XML. This
Document can then be passed for processing to any object that
receives a Document as input.
Figure 1. DOMGenerator produces a DOM result from XML input
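For readers without the XML Bean Suite at hand, the text-to-DOM step that DOMGenerator wraps can be approximated with the standard JAXP API. This is only a stand-in, not the bean's implementation; the class name DomParseSketch and its parse() method are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: turns XML text into an org.w3c.dom.Document,
// roughly what setting the bean's inputXmlText property is described
// as doing.
class DomParseSketch {
    static Document parse(String xmlText) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xmlText)));
    }
}
```

Calling DomParseSketch.parse("<beer><name>Stout</name></beer>") returns a Document whose root element is beer. DocumentBuilder also has parse() overloads that take a File or a URI string, which loosely correspond to the bean's other two input properties.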
DOMGenerator also fires several events to let any interested
listener know how the parsing is coming
along. DOMGenerator fires a DOMGenerationEvent before it starts
parsing, as well as after it completes parsing, or after an error
occurs. The event contains a code that indicates which type it is.
An object that needs to know what a DOMGenerator is up to
implements the interface DOMGenerationListener, which has
methods generationStarted(), generationError(),
and generationOver(). The object registers itself as a listener
with the DOMGenerator in which it's interested, and
the DOMGenerator then fires events at the listening object(s) by
calling the appropriate DOMGenerationListener interface
methods.
What does it mean for the DOMGenerator to fire an event? And
what does its firing an event provide to the application
developer? The answers to these questions lie in an explanation
of how IDEs hook objects together.
Event listener interfaces
This is a good opportunity to discuss how IDEs interconnect
objects. When you're writing a class, it's common to want an
instance of your class to be notified when an event source fires
an event. A good example is a label that changes when a
DOMGenerator object begins parsing, which we'll set up shortly.
To catch an event (because you're interested in the object firing
the event), you write the class to implement the listener
interface for that event, and then register yourself with the
event source by calling addEventtypeListener, the event source's
method.
So, some object (which object depends on your IDE) in the
system implements DOMGenerationListener, and then calls
DOMGenerator.addDOMGenerationListener(this). The DOMGenerator
calls that object's generationStarted() method before it starts
parsing, and that object sets the label's text value to reflect
the fact that parsing has started.
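The listener pattern can be sketched without the IBM classes. The types below (GenerationListener, Source, StatusLabel) are hypothetical stand-ins for DOMGenerationListener, DOMGenerator, and the label. Only the three callback names come from the article; everything else, including the status strings, is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerSketch {
    // Mirrors the three callbacks the article names.
    interface GenerationListener {
        void generationStarted();
        void generationOver();
        void generationError(Exception cause);
    }

    // Plays the role of the event source (the DOMGenerator).
    static class Source {
        private final List<GenerationListener> listeners = new ArrayList<>();

        void addGenerationListener(GenerationListener listener) {
            listeners.add(listener);
        }

        // "Firing an event" is just calling a method on each listener.
        void run(Runnable work) {
            for (GenerationListener l : listeners) l.generationStarted();
            try {
                work.run();
            } catch (RuntimeException ex) {
                for (GenerationListener l : listeners) l.generationError(ex);
                return;
            }
            for (GenerationListener l : listeners) l.generationOver();
        }
    }

    // Plays the role of the label; records every text it was given.
    static class StatusLabel implements GenerationListener {
        final List<String> history = new ArrayList<>();

        public void generationStarted() { history.add("Started parsing..."); }
        public void generationOver() { history.add("Parsing complete."); }
        public void generationError(Exception cause) { history.add("Parse Error!"); }
    }
}
```

The source never needs to know what a label is; it only knows that its listeners implement the interface. That decoupling is what lets an IDE connect arbitrary beans.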
A visual IDE sets up such a listener interface by letting you
visually indicate the source and target objects, usually by drawing
a line between them. Once the information about the event type,
the event source, and the event target are indicated in the IDE,
the IDE automatically generates the calls
to addEventtypeListener and implementations for the listener
interface.
Let's look at a quick example using DOMGenerator in IBM's
VisualAge for Java. Note that these XML beans will work in any
JavaBeans-compliant IDE, not just VisualAge. Any IDE worthy of
the name will let users create event connections graphically. The
diagrams may look a bit different, but the basic idea is the same
across the different IDEs.
Figure 2 shows a tiny application that uses
the DOMGenerator class to parse an XML file. This figure is a
screen shot taken directly from the IDE (except that I added the
letters A through D to aid discussion below).
Figure 2. Wiring up a DOMGenerator in an IDE to create an application
I created four event connections in this application, identified by
four capital letters, corresponding to the letters in the list below:
A: When the text box produces an actionEvent (calling
actionPerformed()), the DOMGenerator's property
inputXmlFileLocation is set to the string in the text box.
B: When the DOMGenerator fires a DOMGenerationEvent to
indicate that it has started parsing, the text label's value is set
to Started parsing...
C: When the DOMGenerator fires a DOMGenerationEvent to
indicate that it has finished parsing, the text label's value is set
to Parsing complete.
D: If the DOMGenerator fires a DOMGenerationEvent indicating that
an error occurred, the text label's value is set to Parse Error!
This means that, when the user types a filename into the text box
and hits Return, the DOMGenerator gets a filename as its input,
automatically starting a parse. The DOMGenerator fires an event to
indicate that it is beginning to parse, parses the file, and fires
an event to indicate either successful completion of the parse or
an error condition. In either case, you can see the label changing
as the parse proceeds. If the DOMGenerator succeeds in parsing,
it makes the Document object it created available via
its result property.
But how did just drawing a few lines cause the events to hook
together? The lines I drew indicate an event listener relationship
between the source (the object the line comes from) and
the target (the object the line goes to). Let's look at case B for a
simple example. In my IDE, I connected the DOMGenerator to
the Label by selecting
the DOMGenerator's generationStarted event (on a popup menu)
and linking it to setting the Label's text property. I entered
the Started Parsing... string as the value for the label text by
editing the properties on the relationship (the line between
elements) itself.
When I drew this relationship, the IDE did two things. First, it
chose an object to be the DOMGenerationListener. VisualAge for
Java makes the application class (called AWTDOMGeneratorDemo)
be the listener, and then implements the listener methods. So,
the IDE added a new implements clause to the class definition,
like this:
public class AWTDOMGeneratorDemo
    extends Frame
    implements com.ibm.xml.generator.event.DOMGenerationListener, WindowListener, ...
Then, it implemented the methods for that listener interface, one
of which was generationStarted():
/**
 * Method to handle events for the DOMGenerationListener interface.
 * @param arg1 com.ibm.xml.generator.event.DOMGenerationEvent
 */
/* WARNING: THIS METHOD WILL BE REGENERATED. */
public void generationStarted(com.ibm.xml.generator.event.DOMGenerationEvent arg1) {
    // user code begin {1}
    // user code end
    if ((arg1.getSource() == getDOMGenerator1()) ) {
        connEtoM2(arg1);
    }
    // user code end
}
In plain English, this method translates to, "When the
DOMGenerator calls generationStarted(), call the method
connEtoM2()." The latter method actually sets the label, as shown
in the listing below. Now, the application class sets the label's
text property (the string that the label displays) to
Started Parsing... whenever the DOMGenerator fires a
generationStarted event. This is the most common method IDEs use
to create relationships between JavaBeans without modifying their
source code.
/**
 * connEtoM2: (DOMGenerator1.domGeneration.generationStarted() --> Label2.text)
 * @param arg1
 */
/* WARNING: THIS METHOD WILL BE REGENERATED. */
private void connEtoM2(com.ibm.xml.generator.event.DOMGenerationEvent arg1) {
    try {
        // user code end
        getLabel2().setText("Started parsing...");
        // user code end
    } catch (java.lang.Throwable ivjExc) {
        // user code end
        handleException(ivjExc);
    }
}
The event listener interface was set up in an init function
called when the object was constructed. Your IDE may do things
differently. Remember, I said above that some object must be able
to respond to the incoming events; in VisualAge's case, that
object is the application class itself. Other IDEs may generate
tiny adapter classes whose sole purpose is to listen for a message
on one object (the source) and call some method (or set a
property, or whatever) on another object (the target) when a
message arrives. Still other IDEs create a single multiplexing
adapter that listens for all events in the application and
dispatches all of the resulting calls. The adapter approach to
handling event listeners is a much more object-oriented solution
to connecting objects together than the "if-then-else" approach
used by VisualAge.
You can play with this little application yourself; it's
available for download from the Resources section at the end of
this article. Included in the source archive is a small XML file,
example.xml, that I'll use throughout the article as test input.
What's fascinating about this application is that I created it
without writing a single line of code. All I did was drop
instances of some JavaBean classes and wire them together by
hooking up event relationships. I selected properties and drew
lines, and the IDE wrote all the code that instantiated the
JavaBeans and glued them together. It's possible to create more
complex (and actually useful) applications in the same way.
Now that you've got some background on how beans talk to one
another, let's return to the XMLCoreBean set and see what other
tools it offers.
XMLFileGenerator and XMLStringGenerator
The XMLFileGenerator and XMLStringGenerator beans perform the
inverse operation of the DOMGenerator. Where the DOMGenerator
parses XML data and produces a DOM tree, XMLFileGenerator encodes
a DOM tree into an XML file, and XMLStringGenerator encodes a DOM
tree as a String. Figure 3 shows a schematic diagram of what these
two classes do.
Figure 3. XMLFileGenerator and XMLStringGenerator
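The DOM-to-text direction can likewise be approximated with the standard javax.xml.transform API. This sketch is a stand-in for what XMLStringGenerator is described as doing, not its implementation; the class name DomToStringSketch and both method names are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomToStringSketch {
    // Serializes a DOM tree back to XML text using an identity transform.
    static String toXml(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Helper so the example is self-contained: XML text to DOM.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }
}
```

Passing a StreamResult built from a java.io.File instead of a StringWriter would write the XML to disk, which is the closest analogue to XMLFileGenerator.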
Input can be given to an XMLFileGenerator or XMLStringGenerator
by setting the object's inputDocument property, which is of type
org.w3c.dom.Document, and which is the root of the DOM tree to
convert to XML. Setting the property causes the object to
immediately convert the DOM tree into XML and write the XML to its
output. XMLFileGenerator writes its output to a file whose name is
the value of the XMLFileGenerator's xmlSaveLocation property.
XMLStringGenerator has a string property called result, which
contains the XML representation of the DOM document most recently
converted.
Figure 4 shows a quick sample application, XMLStringGeneratorDemo,
that demonstrates XMLStringGenerator in action. Again, the capital
letters in the diagram don't appear in the IDE; I added them to
facilitate discussion.
Figure 4. XMLStringGeneratorDemo demonstrates XMLStringGenerator
The XMLStringGeneratorDemo demonstration application extends our
previous demo application. I've added a (nonvisual)
XMLStringGenerator to the application, and added a TextArea to the
window to receive the string result that the XMLStringGenerator
produces. The application event flow is typically something like
the following:
A: The user enters a filename and hits Return, triggering an
actionEvent. This event sets the DOMGenerator's
inputXmlFileLocation property. Setting this property causes the
DOMGenerator to parse the XML file and make the resulting DOM
available as its result property.
B: The DOMGenerator, after completing its parse and setting its
result property, sets the inputDocument property of the
XMLStringGenerator. The XMLStringGenerator then translates the DOM
to a string, which the XMLStringGenerator makes available as its
result property.
C: The XMLStringGenerator's result property is bound to the
TextArea's text property, causing the TextArea to fill with the
XML from the XMLStringGenerator.
This may seem like an awful lot of work just to get a string into
a TextArea, but this tiny sample application is intended to prove
a concept more than anything else.
All three of the beans described so far have some common
properties and methods you should know about. The autoAction
property of each of these beans indicates whether the bean should
begin processing as soon as its input property is set. autoAction
is true by default; if it's false, an external object can set the
bean's input property, and the bean won't produce any output until
its triggerAction() method is called. This would allow you to,
say, enter a filename for a DOMGenerator bean and delay parsing
until the user clicks a button. The button, when clicked, would
call the DOMGenerator's triggerAction() method, starting the
parsing process.
The final class in the XMLCoreBean set, NodeArray, is a nonvisual
container class that holds an array of org.w3c.dom.Node objects
(Node being the interface that every node in a DOM subtree
implements). This class is a simple container to which messages
may be sent, directing the container to add or delete nodes from
itself. I'll go over NodeArray when the time comes to use it in a
sample application.
Meanwhile, let's get a head start on next month's column by
taking an introductory peek at the set of XML beans that process
the DOM structure you've just learned to load.
XMLProcessing set
The XMLProcessing set of beans handles the processing of an XML
document once it has been parsed into a DOM tree. These five beans
are extremely flexible and powerful. Their names and short
descriptions appear in Table 2.
Bean                Description
XMLSearch           Given a DOM structure representing an XML document, and another DOM structure representing a query, this bean searches the input structure for object structures matching the query, firing an event for each match.
XMLFilter           Given a DOM structure representing an XML document, and another DOM structure representing a query, this bean passes through only those substructures that match the query.
XMLTokenizer        This bean performs an in-order traversal of the input DOM tree, firing an event for each node. Event type is based on the type of document node encountered.
ElementSelector     This bean filters events from XMLTokenizer, only passing through those events corresponding to particular elements.
AttributeSelector   This bean filters events from XMLTokenizer, only passing through those events corresponding to a particular attribute.
Table 2. The five beans of the XMLProcessing bean set
This month's final example demonstrates how to traverse the
DOM structure you've just loaded into memory with a DOMGenerator.
XMLTokenizer
One of the more lightweight (and therefore popular) methods of
traversing an XML file is by using the SAX interface, first created
by David Megginson, principal of Megginson Technologies Inc.
(A link to the Megginson Web site appears in the Resources
section at the bottom of this article.) SAX, the Simple API for
XML, is an event-based document traversal mechanism, and
many XML parsers, including many parsers that produce DOM
structures, are based internally on SAX. SAX provides a
framework for parsing XML, making callbacks to programmer-defined
handler methods when particular tokens are encountered during a parse.
For example, when the SAX parser finds an XML element at its
input (like, say, <beer>), the parser calls the programmer-defined
handler startElement(), passing the tag name (beer) as an argument.
The result is an in-order traversal of the document
structure, which the programmer may use in clever ways to
process the entire document without building a huge tree of DOM
objects in memory. (For more on SAX, see Resources.)
The XMLTokenizer bean traverses its input DOM structure in much
the same way that SAX traverses an XML document. For each Node
in the tree, it fires an event based on the type of Node
encountered. The tokenizer fires events for elements, attributes,
XML processing instructions, text items, comments, and special
symbols. These events can be sent to other beans to guide
processing of the input DOM. The events are guaranteed to be
delivered in the same order as the tokens in the original XML
data.
The name XMLTokenizer is a bit misleading: it should really be called
DOMTraverser. XMLTokenizer doesn't really tokenize XML -- it traverses
a tree of objects. But, since the DOM represents an XML document, and
since the result looks very much like a SAX tokenization of an XML
document, the name makes sense -- sort of. It's a pity that the
XMLTokenizer won't accept a string or file name as input, since that
would essentially make SAX XML document processing available to
XML beans.
The example I wrote for XMLTokenizer is again based on our first
example with the DOMGenerator. This time, though, when the
DOMGenerator finishes parsing the document, it passes the DOM to an
XMLTokenizer, which fills a List widget with the names of the tags in
the document. The wiring diagram of the XMLTokenizer demo appears in
Figure 5, with the now familiar (I hope) capital letters labeling the
wires.
Figure 5. XMLTokenizer pushes all of its tokens into a list
This simple application works very much like previous examples.
A: The DOMGenerator parses the file whose name appears at its
input, and sets its result property. B: The DOM structure created
by the DOMGenerator passes to the XMLTokenizer, which
immediately begins traversing the DOM tree. C: When
the XMLTokenizer first begins processing the DOM tree, it fires
a startOfDocument event, which here is hooked to
the List's removeAll() method. This means that the list is cleared
whenever the XMLTokenizer starts to traverse a document. D: For
each org.w3c.dom.Element encountered in the input,
the XMLTokenizer fires an elementStartTagFound event, which
here is hooked to the List's addString() method.
The XMLTokenizer's currentXMLToken property always reflects the
tag name of the element that caused the most
recent elementStartTagFound event. The event relationship
passes the currentXMLToken string to
the List's addString() method. As a result, each time
an Element is traversed by the tokenizer, its tag name is
appended to the list.
The result of running this application against the sample XML file
in the sample code for this article, example.xml, appears in
Figure 6. The first few lines of that file look like this:
<?xml version="1.0"?>
<Recipe>
  <Name>Lime Jello Marshmallow Cottage Cheese Surprise</Name>
  <Description>My grandma's favorite (may she rest in peace.)</Description>
  <Ingredients>
    <Ingredient>
      <Qty unit="box">1</Qty>
...
Note that the order of the strings in the List in Figure 6 is the
same as the order of the tag names in the sample file.
Figure 6. XMLTokenizer at work
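The traversal that the demo performs can be approximated in a few lines of plain Java. What follows is only a sketch, not the XMLTokenizer bean's implementation: it assumes the standard JAXP DocumentBuilderFactory for parsing, and the class name TagNameLister and its tagNames() method are hypothetical names chosen for illustration. It walks the DOM in document order and collects each element's tag name, much as the demo fills its List.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the XML Beans suite.
class TagNameLister {

    /** Parses the XML text into a DOM tree and returns the element tag names in document order. */
    static List<String> tagNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> names = new ArrayList<>();
            collect(doc.getDocumentElement(), names);
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Records the node if it is an element, then visits its children from first to last. */
    private static void collect(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, names);
        }
    }
}
```

Because each element is recorded before its children are visited, the names come back in the order in which the start tags appear in the file, which is the property the List in Figure 6 illustrates.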
More XML beans
This month, you've learned how to start building XML processing
applications using the JavaBeans in alphaBeans' XML Beans
suite. You've learned how IDEs connect JavaBeans together with
event listeners, and how to parse, write, and process XML
documents without ever writing a single line of code. Though the
applications shown here are pretty basic, next month's column
will have examples that gradually grow in complexity and power.
I'll go over the rest of the XMLProcessing bean set next month, and
start showing you how to use the XMLEditor bean set to edit XML
documents.
Mark Johnson has a BS in computer and electrical engineering from Purdue University (1986),
and has been writing for JavaWorld since August 1997. By day, he works as a designer and
developer for OrganicNet in Fort Collins, CO.
Download the source code for this article
In jar format (with class files)
http://www.javaworld.com/jw-11-1999/beans/JWBeansNov99.jar
In gzipped tar format
http://www.javaworld.com/jw-11-1999/beans/JWBeansNov99.tar.gz
In zip format
http://images.techhive.com/downloads/idge/imported/article/jvw/1999/11/jwbeansnov99.zip
Instructions on how to use the code
http://www.javaworld.com/jw-11-1999/beans/Addendum.html
XML and XML JavaBeans Suite resources
For a readable quick-start to XML, try reading my April 1999 JavaWorld feature article, "XML
for the absolute beginner"
http://www.javaworld.com/javaworld/jw-04-1999/jw-04-xml.html
My October 1997 JavaWorld article, "Keep listening for upcoming events," provides a tutorial
introduction to the event listener interface concept
http://www.javaworld.com/jw-10-1997/jw-10-beans.html
There's also an example of event listener interfaces in my September 1997 article on
customization, entitled "'Double Shot, Half Decaf, Skinny Latte' -- Customize your Java"
http://www.javaworld.com/jw-09-1997/jw-09-beans.html
To download IBM's XML JavaBeans Suite from the alphaWorks alphaBeans site, go to this site
and click on the XML Beans link at the bottom of the list in the leftmost frame
http://www.alphaworks.ibm.com/alphabeans
IBM's alphaBeans site has a large number of high-quality JavaBeans you can play with
http://www.alphaWorks.ibm.com/alphaBeans
The parser from IBM's xml4j package is available free for noncommercial use. It's even free for
commercial use, but be sure to read the license agreement first
http://www.alphaWorks.ibm.com/formula/XML
The World Wide Web Consortium (W3C) maintains a page covering ongoing efforts in the XML
community
http://www.w3.org/XML
The undisputed mother of all XML news sites is Robin Cover's SGML/XML Web page
http://www.oasis-open.org/cover/
A good site for XML news, tutorials, and information
http://www.xml.com
David Megginson, creator of the SAX parser, has a Web site
http://www.megginson.com
IBM's developerWorks site includes excellent XML resources
http://www.ibm.com/developer/xml
Microsoft's data management is increasingly based on XML. Read about Microsoft's XML
strategies
http://msdn.microsoft.com/xml/default.asp
Experience the joy of SAX, LAX, and DTDs
By Mark Johnson
JavaWorld | Apr 7, 2000 1:00 AM
If you read last month's article, you already understand how you
can use SAX (the Simple API for XML) to process XML
documents. (If you haven't read it yet, you may want to start
there; see "Read the Whole Series!" below). In that article, I
explained how application writers implement the
SAX DocumentHandler interface, which takes a specific action
when a particular condition (such as the start of a tag) occurs
during the parsing of an XML document. But what good is that
function? Read on.
Programming XML in Java: Read the whole series!
Part 1. Use the Simple API for XML (SAX) to process XML in Java easily
Part 2. Learn about SAX and XML validation through illustrative examples
Part 3. DOMination: Take control of structured documents with the Document Object Model
You'll also remember that an XML parser checks that the
document is well formed (meaning, roughly, that all of the open
and close tags match and don't overlap in nonsensical ways). But
even well-formed documents can contain meaningless data or
have a senseless structure. How can such conditions be
detected and reported?
This article answers both questions through an illustrative
example. I'll start first with the latter question: once the document
is parsed, how do you ensure that the XML your program is
processing actually makes sense? Then I'll demonstrate an
extension to XML that I call LAX (the Lazy API for XML), which
makes writing handlers for SAX events even easier. Finally, I'll tie
all of the themes together and demonstrate the technology's
usefulness with a small example that produces both formatted
recipes and shopping lists from the same XML document.
Garbage in, garbage out
One thing you may have heard about XML is that it lets the
system developer define custom tags. With
a nonvalidating parser (discussed in Part 1 of this series), you
certainly have that ability. You can make up any tag you want
and, as long as you balance your open and close tags and don't
overlap them in absurd ways, the nonvalidating SAX parser will
parse the document without any problems. For example, a
nonvalidating SAX parser would correctly parse and fire events
for the document in Listing 1.
Listing 1. A well-formed, meaningless document
001 <?xml version="1.0"?>
002 <Art CENTURY="20">
003   <Dada>
004     <Author CENTURY="18" NOMDEPLUME="Voltaire">
005       François-Marie Arouet
006     </Author>
007     <Tree SPECIES="Maple">
008       <Yes/>
009       <Book AUTHOR="Musashi, Miyamoto">
010         <Title LANG="English">The Book of Five Rings</Title>
011         <Title LANG="Nihongo">Go Rin No Sho</Title>
012         <Filter POLY="Chebyshev" POLES="2"/>
013         <Title LANG="Espanol">El Libro de Cinco Anillos</Title>
014         <Title LANG="Francais">Le Livre de Cinq Bagues</Title>
015       </Book>
016       <Bahrain FORMAT="MP3">
017         <Cathedral CITTA="Firenze">
018           <Nome>Santa Maria del Fiore</Nome>
019           <Architetto>Brunelleschi, Filippo (1377-1466)</Architetto>
020           <Ora FORMAT="DMY24">22032000134591</Ora>
021         </Cathedral>
022       </Bahrain>
023       <Phobias>
024         <Herbs NAME="Ma Huang"/>
025         <Appliance COLOR="Harvest Gold">Yuck</Appliance>
026       </Phobias>
027     </Tree>
028   </Dada>
029 </Art>

A nonvalidating SAX parser would produce a valid event stream
for the document in Listing 1 because the input document is well
formed. It's really stupid input, but it is well formed. Every opening
tag has a corresponding close tag, and the tags don't overlap
(meaning there are no combinations of tags like <A><B></A></B>).
So a nonvalidating SAX parser will have no problem with Listing
1.
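To see the well-formedness check in isolation, here is a minimal sketch that runs a nonvalidating parser over a string and reports whether the parse succeeded. It is an illustration under stated assumptions rather than code from the article's archive: it uses the JAXP SAXParserFactory with the SAX2 DefaultHandler class (the successor to the HandlerBase class used in this series), and the class name WellFormedCheck and its isWellFormed() method are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class WellFormedCheck {

    /** Returns true if a nonvalidating SAX parser accepts the text, false if it hits a fatal error. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // well-formedness only; no DTD checks
            SAXParser parser = factory.newSAXParser();
            // DefaultHandler ignores every event but rethrows fatal errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // mismatched or overlapping tags end up here
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }
}
```

Under these assumptions, a call such as isWellFormed("<A><B></B></A>") returns true, while the overlapping tags in "<A><B></A></B>" make it return false. Silly tag names make no difference to the result, which is exactly the point of Listing 1.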
Unfortunately, if you write a program that, for example,
summarizes museum collections, formats architectural
information, or prints multilingual card catalogs for libraries, your
program could read this really stupid XML and produce really
stupid output, because it might pull out tags it recognizes
(like <Dada>, <Cathedral>, or <Book>). As the saying goes,
"Garbage in, garbage out."
To minimize the chance that your program produces garbage, you
should devise a way to detect and reject garbage in the input.
Then, given meaningful input, you can focus on creating
reasonable output.
Think of a document as having three levels of correctness:
lexical, syntactic, and semantic. Lexical correctness is what I
mean when I say "well formed": the basic structure of the
document is reasonable and correct, but nothing about the
content of the tags is checked. Any tag can occur inside any
other tag any number of times, any tag can take any attribute,
and attributes can take on any value. So, Listing 1 is well formed,
but it makes no sense, because there is no control over what
tags and attributes appear in the structure, and where.
Syntactic correctness means that the document is not only well
formed, but that it also contains certain tags, in certain
combinations. An XML document can include a section, called
a document type definition (DTD), that specifies the rules for
syntactic correctness.
A DTD lets a system designer create a custom markup language,
a dialect of XML. A DTD indicates which tags may (or must)
occur inside other specified tags, what attributes a tag may have,
the required order of the tags, and so on. A validating parser uses
a DTD to check the document it is parsing for syntactic
correctness. The parser prints error and warning messages for
any problems it finds, and then rejects any document that doesn't
conform to the DTD. The application programmer can then write
code assuming that the structure of the document is correct,
because the parser already checked it.
So, for example, in Listing 1 a designer might write a DTD that
defines a <Book> tag as containing only one or
more <Title> tags. The parser would report the presence of
the <Filter> tag in line 12 as an error, because the DTD doesn't
allow it.
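The <Book> example can be sketched in code. The DTD below is my own two-line guess at what such a designer might write, not something taken from the article's archive, and the class name DtdValidationSketch and its validationErrors() method are hypothetical. The sketch assumes the JAXP SAXParserFactory and the SAX2 DefaultHandler class, and it collects the validity errors instead of printing them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch for illustration only.
class DtdValidationSketch {

    // Assumed DTD: a Book holds one or more Titles and nothing else.
    static final String BOOK_DTD =
          "<!DOCTYPE Book [\n"
        + "  <!ELEMENT Book (Title)+>\n"
        + "  <!ELEMENT Title (#PCDATA)>\n"
        + "]>\n";

    /** Parses with validation switched on and returns the validity errors the parser reported. */
    static List<String> validationErrors(String xmlWithDoctype) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xmlWithDoctype)),
                    new DefaultHandler() {
                        @Override
                        public void error(SAXParseException e) {
                            // Validity errors are recoverable: record them and keep parsing.
                            errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
                        }
                    });
        } catch (Exception e) {
            // A document that is not even well formed ends up here.
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return errors;
    }
}
```

A Book containing only Title elements yields an empty list; adding a <Filter/> element yields at least one error, since Filter is neither declared nor allowed by the content model of Book.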
A DTD is also an excellent way to specify the input to your
program. An XML input document either corresponds to a
particular DTD or it doesn't. Your program can correctly process
any input that conforms to a given DTD. A DTD also lets you test
your application for correctness or completeness; if an input
document conforms to the DTD, but your program doesn't
process it properly, then you have a bug or a missing feature.
XML parsers don't provide much in the way of checking
for semantic correctness. Semantic correctness means that the
actual instance data is true for the purposes of the application. A
validating parser could report an error when it finds
a FORMAT attribute on a <Bahrain> tag (as occurs in line 16, Listing
1). But it's a lot to ask any parser to check whether the Cathedral
of Santa Maria del Fiore is in Bahrain or in Italy. Semantic
correctness remains the domain of your application: it's up to you
to add meaning to the XML document you've defined. A
validating XML parser and a DTD help to automate the detection
of gross lexical and syntactic errors in the input to your program,
allowing you to focus on the data's meaning.
As a side note, the HTML used to create Web pages is specified
in an SGML DTD, which is considerably more complex and
powerful than an XML DTD. XML DTDs are essentially a subset
of these SGML DTDs, with some minor notational differences.
The HTML DTD clearly specifies what kind of input an HTML-
processing program can accept. XHTML, an XML-compatible
version of HTML, specifies an XML DTD for HTML. It has just
been released by the World Wide Web Consortium (W3C).
In the next section, I'll create a DTD for a small XML dialect for
describing recipes.
Parlez-vous DTD?
Two people generally can't talk to one another unless they speak
a mutually understood language. Likewise, two programs can't
communicate via XML unless the programs agree on the XML
language they use. A DTD defines a set of rules for the allowable
tags and attributes in an XML document, and the order and
cardinality of the tags. Programs using the DTD must still agree
on what the tags mean (semantics again), but a DTD defines the
words (or, the tags) and the grammatical rules for a particular
XML dialect.
Listing 2 shows a simple DTD for a tiny XML language I
call Recipe XML.
Listing 2. The DTD for Recipe XML
001 <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
002
003 <!ELEMENT Name (#PCDATA)>
004
005 <!ELEMENT Description (#PCDATA)>
006
007 <!ELEMENT Ingredients (Ingredient)*>
008
009 <!ELEMENT Ingredient (Qty, Item)>
010 <!ATTLIST Ingredient
011 vegetarian CDATA "true">
012
013 <!ELEMENT Qty (#PCDATA)>
014 <!ATTLIST Qty
015 unit CDATA #IMPLIED>
016
017 <!ELEMENT Item (#PCDATA)>
018 <!ATTLIST Item
019 optional CDATA "0">
020
021 <!ELEMENT Instructions (Step)+>
022 <!ELEMENT Step (#PCDATA)>
The DTD in Listing 2 defines a complete, tiny language for
transmitting recipes. Programs that use this DTD can count on
the structure of conforming files to match the rules in the DTD.
I'll go over this file, line by line:
001 <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
This line defines a tag using <!ELEMENT. The entire line from the
opening <!ELEMENT to the closing > is called an element type
declaration. The declaration says that a Recipe is composed of
a Name, followed by the optional occurrence of
a Description, Ingredients, and Instructions. The comma
operator (,) separates the tags that the defined tag may contain,
and fixes the order in which those tags must appear. The question
mark operator (?) indicates that the item to its left is optional.
Since Name has only a comma operator after it, a Recipe must
have precisely one Name. The parentheses are for grouping, and
don't appear in the input document.
Therefore, the sequence:
<Recipe><Name>Zabaglione</Name></Recipe>
is a valid Recipe, because it matches the DTD (that is, it consists
of a <Name> followed optionally by a <Description>). However:
<Recipe>
<Description>Italian dessert</Description>
<Name>Zabaglione</Name>
</Recipe>
is not a valid Recipe, because the Description comes before
the Name.
003 <!ELEMENT Name (#PCDATA)>
This line states that a Name tag (or element) contains no other tag
types, and may contain text between its open and close tags. A
validating parser will mark any tag within a Name tag as an error.
007 <!ELEMENT Ingredients (Ingredient)*>
This line states that an Ingredients tag may contain zero or
more Ingredient tags. The asterisk or star operator (*) indicates
the tag's zero-or-more cardinality.
010 <!ATTLIST Ingredient
011 vegetarian CDATA "true">
An attribute list declaration, which uses <!ATTLIST, defines the
attributes for a tag. Only attributes within the attribute list
declaration for a tag are allowed. These lines say that
the Ingredient tag previously defined has a single
attribute, vegetarian, which is character data (CDATA), and whose
default value is "true". Attribute list declarations all follow this
pattern; one may define multiple attributes, each with a type and
default value, following the tag name.
014 <!ATTLIST Qty
015 unit CDATA #IMPLIED>
This attribute list declaration defines the default for
the unit attribute as #IMPLIED. That means that the attribute may
or may not appear with the tag, and the DTD supplies no default
value; if the attribute doesn't appear, the application decides
what value to use. This is how you create an optional attribute.
021 <!ELEMENT Instructions (Step)+>
This line states that an Instructions tag, if present, must contain
at least one Step. The plus-sign operator (+) indicates one or
more occurrences of the item to its left.
DTDs have more operators and conventions, but this example
covers the basics. (You can find out the whole scoop on DTDs in
XML in the XML recommendation; see Resources.)
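The difference between a declared default and #IMPLIED shows up in what the handler receives. The sketch below is an illustration only: it copies the Ingredient, Qty, and Item declarations from Listing 2 into an internal DTD, parses an ingredient that specifies no attributes at all, and records what arrives. It assumes the JAXP SAXParserFactory and the SAX2 DefaultHandler and Attributes types; the class name AttributeDefaultsSketch and the fallback unit "each" are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch for illustration only.
class AttributeDefaultsSketch {

    // Declarations copied from Listing 2, followed by an ingredient with no attributes.
    static final String SAMPLE =
          "<!DOCTYPE Ingredient [\n"
        + "  <!ELEMENT Ingredient (Qty, Item)>\n"
        + "  <!ATTLIST Ingredient vegetarian CDATA \"true\">\n"
        + "  <!ELEMENT Qty (#PCDATA)>\n"
        + "  <!ATTLIST Qty unit CDATA #IMPLIED>\n"
        + "  <!ELEMENT Item (#PCDATA)>\n"
        + "  <!ATTLIST Item optional CDATA \"0\">\n"
        + "]>\n"
        + "<Ingredient><Qty>1</Qty><Item>lime gelatin</Item></Ingredient>";

    // Assumed application-level fallback for a missing unit attribute.
    static final String FALLBACK_UNIT = "each";

    /** Returns the attribute values the handler ends up with, keyed by attribute name. */
    static Map<String, String> attributeValues(String xml) {
        Map<String, String> values = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals("Ingredient")) {
                    // Declared default: the parser fills in the value from the DTD.
                    values.put("vegetarian", attrs.getValue("vegetarian"));
                } else if (qName.equals("Qty")) {
                    // #IMPLIED: getValue() returns null, so the application chooses.
                    String unit = attrs.getValue("unit");
                    values.put("unit", unit != null ? unit : FALLBACK_UNIT);
                } else if (qName.equals("Item")) {
                    values.put("optional", attrs.getValue("optional"));
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return values;
    }
}
```

For the sample document, vegetarian and optional arrive with their DTD defaults ("true" and "0") even though the document never mentions them, while unit is absent and the handler substitutes its own fallback.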
DTDs are meta-information; that is, they are information about
information. You may already be familiar with this concept. A
table in a relational database has a schema describing such
things as the column names, data types, sizes, and default
values for its data. But the table description doesn't contain data
values, it contains a description of the values. Likewise, a DTD is
a simple sort of schema that defines what may be in a particular
document type. (There is currently an effort underway to create
an XML schema that is much more like a database schema;
see Resources.)
DTDs are also a bit like BNF, or Backus-Naur Form
(see Resources for a discussion), which describes transformation
rules for grammars; however, BNF can express structures that
XML DTDs cannot.
An XML document declares its DTD with
a <!DOCTYPE declaration, as shown in Listing 3. The document
type specifies the external DTD used to validate the document.
The top-level tag of the document must match the name given in
the <!DOCTYPE declaration (in this case, Recipe).
Listing 3. Using an external DOCTYPE declaration
002
003 <!DOCTYPE Recipe SYSTEM "example.dtd">
004
005 <Recipe>
006 <Name>Lime Jell-O Marshmallow Cottage Cheese Surprise</Name>
007 <Description>My grandma's favorite (may she rest in peace.)</Description>
008 <Ingredients>
...
Line 3 in Listing 3 states that the document that follows must
conform to the DTD contained in the given file; a document that
does not conform is syntactically invalid.
A DTD may also be specified internally, as shown in Listing 4.
Note that in Listing 4, the DTD is terminated by the ]> in line 19.
Listing 4. Internal document type declaration
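002
003 <!DOCTYPE Recipe [
004 <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
005 <!ELEMENT Name (#PCDATA)>
006 <!ELEMENT Description (#PCDATA)>
007 <!ELEMENT Ingredients (Ingredient)*>
008 <!ELEMENT Ingredient (Qty, Item)>
009 <!ATTLIST Ingredient
010 vegetarian CDATA "true">
011 <!ELEMENT Qty (#PCDATA)>
012 <!ATTLIST Qty
013 unit CDATA #IMPLIED>
014 <!ELEMENT Item (#PCDATA)>
015 <!ATTLIST Item
016 optional CDATA "0">
017 <!ELEMENT Instructions (Step)+>
018 <!ELEMENT Step (#PCDATA)>
019 ]>
020 <Recipe>
021 <Name>Lime Jell-O Marshmallow Cottage Cheese Surprise</Name>
022 <Description>My grandma's favorite (may she rest in peace.)</Description>
...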
The full-text versions of these sample files, which I'll use later,
are included in the source code archives for this article.
Download the source files from Resources and experiment with
the SimpleValidatingSaxReporter class, which creates a
validating SAX parser and then parses and validates the
document against the DTD. The main program for this class
appears in Listing 5.
Listing 5. Using a validating SAX parser
084     SimpleValidatingSaxReporter ssr = new SimpleValidatingSaxReporter();
085     try {
086         ssr.parseDocument(true, ssr, args[0]);
087     } catch (Exception ex) {
088         System.err.println(ex);
089     }
090 }
...
099 protected void parseDocument(boolean isValidating, HandlerBase handler, String sFilename) {
100     try {
101         // Get a "parser factory", an object that creates parsers
102         SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
103
104         // Set up the factory to create the appropriate type of parser
105         saxParserFactory.setValidating(isValidating);
106         saxParserFactory.setNamespaceAware(false); // Not this month...
107
108         SAXParser parser = saxParserFactory.newSAXParser();
109
110         parser.parse(new File(sFilename), handler);
111     } catch (Exception ex) {
112         System.err.println("Exception: " + ex);
113         System.exit(2);
114     }
115 }

Line 102 in Listing 5 creates a SAXParserFactory, an object that
creates parsers. Lines 105 to 106 tell the parser factory what kind
of parser to create, and line 108 creates the parser. Line 110
then tells the parser to parse the file, using the handler passed
from main() to handle the events. The handler is
the SimpleValidatingSaxReporter itself, since that class
implements HandlerBase. The result is a stream of SAX events,
as long as the input is valid with respect to its DTD. Experiment
with the code by adding and deleting items from the sample XML
files in the source archive.
You'll notice that there are errors in example2.xml, as the parser
reports:
e: file:C:/mj-java/XMLSAX2/example2.xml: line
30: org.xml.sax.SAXParseException: Element
"Ingredient" does not allow "Item" here.
The parser recognizes that the order of the Qty and Item tags is
reversed. If you remove the <!DOCTYPE declaration
from example2.xml, you get the following error message, printed
by the error handler:
w: file:C:/mj-java/XMLSAX2/example2.xml: line
6: org.xml.sax.SAXParseException: Valid
documents must have a <!DOCTYPE declaration.
e: file:C:/mj-java/XMLSAX2/example2.xml: line
6: org.xml.sax.SAXParseException: Element type
"Recipe" is not declared.
It prints this error message because it has no DTD to check
against, so it can't find the definition of the Recipe tag. Play
around with this class to get a feel for the kind of errors the
parser can catch.
Notice that since the ErrorHandler instance you're using merely
reports errors, and doesn't exit when it receives them, the parser
continues to try parsing the file. Whoever writes the error handler
(that's you) is responsible for deciding what to do when errors
occur.
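The choice between reporting and stopping comes down to whether the error handler throws. Here is a minimal sketch of both policies, assuming the JAXP SAXParserFactory and the SAX2 DefaultHandler class rather than the HandlerBase class used by SimpleValidatingSaxReporter. The class name ErrorPolicySketch, its method, and the tiny DTD are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch for illustration only.
class ErrorPolicySketch {

    // Item and Qty appear in the wrong order, so this document is well formed but invalid.
    static final String INVALID_INGREDIENT =
          "<!DOCTYPE Ingredient [\n"
        + "  <!ELEMENT Ingredient (Qty, Item)>\n"
        + "  <!ELEMENT Qty (#PCDATA)>\n"
        + "  <!ELEMENT Item (#PCDATA)>\n"
        + "]>\n"
        + "<Ingredient><Item>cottage cheese</Item><Qty>1</Qty></Ingredient>";

    /** Returns true if parse() ran to completion, false if the handler made it stop. */
    static boolean parseCompletes(String xml, boolean stopOnError) {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                if (stopOnError) {
                    throw e; // rethrowing tells the parser to abandon the document
                }
                // Lenient policy: swallow the error (a real handler would log it) and carry on.
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false; // the strict policy stopped the parse
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }
}
```

With stopOnError set to false the invalid document is parsed to the end, which mirrors the behavior described above; with it set to true the first validity error ends the parse.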
Now you finally know enough about using SAX and a validating
parser to create an XML application. I decided to make the
process easy on myself, so I created LAX, which I'll explain next.
LAX: The Lazy API for XML
Writing a document handler for a SAX parser is pretty easy: just
subclass HandlerBase, override the appropriate methods, and do
whatever you like in response to the events coming from the
parser. Being a lazy, and therefore virtuous, programmer (see
Part 1 for an explanation), I decided to do some extra work only
once in order to simplify programming in SAX for subsequent
projects. Writing SAX handlers is just too much work.
You see, when you override startElement(), endElement(),
or characters(), you always have to check the tag name to
decide what to do. So, these methods typically become large if-
then-else blocks. It requires a lot of typing, which always opens
the door for errors, plus I simply can't be bothered to do it. So, I
created LAX, the Lazy API for XML.
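For comparison, this is roughly what the if-then-else style looks like. The sketch is not taken from the article's archive: it uses the SAX2 DefaultHandler class instead of HandlerBase, and the class name IfElseRecipeHandler, its summarize() method, and the bracketed output format are all hypothetical. Even with only five tags handled, the branching already dominates the code.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler showing the branching that LAX is meant to remove.
class IfElseRecipeHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private String currentTag = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        currentTag = qName;
        if (qName.equals("Recipe")) {
            out.append("[recipe]");
        } else if (qName.equals("Ingredients")) {
            out.append("[ingredients]");
        } else if (qName.equals("Instructions")) {
            out.append("[instructions]");
        }
        // ...and so on, one branch for every tag the program cares about
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        currentTag = ""; // text after a close tag belongs to no tag we track
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        if (text.isEmpty()) {
            return;
        }
        // The same tag test again, this time for text content.
        if (currentTag.equals("Name")) {
            out.append("name=").append(text).append(';');
        } else if (currentTag.equals("Item")) {
            out.append("item=").append(text).append(';');
        }
    }

    /** Parses the XML text and returns the summary string built by the handler. */
    static String summarize(String xml) {
        IfElseRecipeHandler handler = new IfElseRecipeHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return handler.out.toString();
    }
}
```

Every new tag means another branch in up to three methods, and a misspelled tag name in any of them fails silently.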
LAX lets any class use only naming conventions to handle SAX
events, in much the same way JavaBeans introspection identifies
properties and event sets by examining a class's method
signatures. A class can become an event handler simply by
defining methods with the appropriate name and signature.
There's no need for a class to extend HandlerBase, since LAX
does that for you. To use LAX, simply create a LAX object,
register it as a document handler for the parser, then
register your handler objects with LAX. LAX translates the stream
of XML events into method calls on your objects.
LAX uses Java reflection to find methods in your classes to
handle tags in the XML being parsed. When LAX encounters a
tag called, say, <Tag>, it searches through all of its handlers
(instances of classes that you've written), looking for objects that
have either method void startTag() or void
startTag(AttributeList list), and calls that method on any
such object it finds. When it encounters the end tag </Tag>, it
searches all of its handlers for a method called void endTag(),
and calls any such methods it finds. When LAX encounters
characters (in its own characters() method), it remembers the
current tag, and searches for and calls all methods with the
signature void textOfTag(String string).
As a result, you don't need to write huge if-then-else statements,
implement DocumentHandler, or extend HandlerBase. Simply write
methods with the appropriate signatures, register an instance of
your class with LAX, and parse the input document with the LAX
object as a document handler. What could be easier?
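The naming-convention dispatch described above can be sketched with a few reflection calls. To be clear, this is not the Lax class from the source archive, only a minimal approximation under stated assumptions: it extends the SAX2 DefaultHandler class, it supports only the no-argument start and end methods plus textOf methods (not the AttributeList variant), and it forwards each non-blank chunk of text as it arrives. The names MiniLax, RecipeTitles, and ItemCounter are hypothetical.

```java
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical LAX-like dispatcher; a simplification, not the author's implementation.
class MiniLax extends DefaultHandler {
    private final List<Object> handlers = new ArrayList<>();
    private final Deque<String> openTags = new ArrayDeque<>();

    void addHandler(Object handler) {
        handlers.add(handler);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        openTags.push(qName);
        invokeAll("start" + qName, new Class<?>[0]);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        invokeAll("end" + qName, new Class<?>[0]);
        openTags.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty() && !openTags.isEmpty()) {
            invokeAll("textOf" + openTags.peek(), new Class<?>[] {String.class}, text);
        }
    }

    /** Calls the named public method on every registered handler that declares it. */
    private void invokeAll(String methodName, Class<?>[] paramTypes, Object... args) {
        for (Object handler : handlers) {
            try {
                Method method = handler.getClass().getMethod(methodName, paramTypes);
                method.invoke(handler, args);
            } catch (NoSuchMethodException e) {
                // This handler has no method for the event, so it is skipped.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Handler method failed: " + methodName, e);
            }
        }
    }

    void parse(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), this);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}

// Two small handler classes: plain objects with conventionally named public methods.
class RecipeTitles {
    final List<String> names = new ArrayList<>();
    public void textOfName(String text) { names.add(text); }
}

class ItemCounter {
    int items;
    public void startItem() { items++; }
}
```

Registering a RecipeTitles and an ItemCounter with one MiniLax and calling parse() once feeds both objects from the same event stream; neither class extends or implements anything from SAX.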
The source code for LAX is in the source archive for this article,
available in Resources below. Now I'll develop a sample program
using LAX.
Managing recipes with LAX
I often get email from readers who want to know how to use Java
to process XML into HTML (or other formats) for display. The
following example shows one way to use the same XML to create
HTML files for different purposes, with different formatting, in a
single processing step. The popularity of CGI, ASP (Active
Server Pages), and JSP (JavaServer Pages) notwithstanding, I
oppose writing any code that has hard-coded print statements
spitting out HTML. Style languages such as CSS (Cascading
Style Sheets), DSSSL (Document Style Semantics and
Specification Language, SGML's style language), and XSL
(Extensible Stylesheet Language) are more appropriate for the
task of transforming data into something presentable. (Why that's
true is material for another time -- it's a big topic.) Nevertheless, I
understand that generating HTML from program code is common
practice, and it makes for an enlightening example of
using SAX to turn XML into some other useful form.
For this example, I have two valid Recipe XML
files: example4.xml, my standard heinous lime Jell-O creation; and
a new recipe for Nanner Pah, example3.xml, a big hit at all the
Lutheran church dinners I went to as a kid.
I decided I wanted to use LAX to write a program that produces
two files: a well-formatted recipe page for a cookbook, and a
shopping list for the recipe, also attractively formatted. To
accomplish that, I created two
classes: RecipeWriter and ShoppingListWriter. I'll go over each
class in turn, and then show how you can use them both with
LAX.
Formatting a recipe
The RecipeWriter class has start, end, and textOf methods for
each tag type in the Recipe DTD. I'll discuss how a couple of
them work so you can get a feel for what the class does. You can
follow along in the source code for RecipeWriter.java.
The RecipeWriter constructor, which takes a filename as an
argument, creates the named file and opens it for writing.
Subsequent method calls cause HTML to be written to the output
file, and endRecipe() eventually closes it.
The top-level tag of the Recipe XML is <Recipe>,
but RecipeWriter doesn't have a startRecipe() method, so that
event is skipped. When LAX encounters characters inside
a <Name> tag, though, it finds
the RecipeWriter's textOfName() method, which it calls with the
text of the recipe name. textOfName() calls titlePrint(), which
sets up the HTML page, sets the body background image, and
opens up a TABLE (which will be closed
by endRecipe()). startDescription(), startIngredients(),
and startInstructions() all produce rows in the table with
attractive background colors and large header text.
This cookbook is designed to be used by both vegetarians and
nonvegetarians, so notice that RecipeWriter has a boolean
variable called _isVegetarian, which is set to false if any
nonvegetarian ingredient is encountered by startIngredient().
After parsing is completed, endRecipe() checks this flag, and
places an indication after the recipe of whether the recipe is
vegetarian. Likewise, startItem() checks for the optional attribute,
and prints "(optional)" after each optional ingredient.
You can see the results of running
the RecipeWriter on example4.xml in example4-recipe.html and
on example3.xml in example3-recipe.html.
Formatting a shopping list
At the same time that the recipe is being formatted
by RecipeWriter, LAX also maintains an instance
of ShoppingListWriter, which is creating a different file. You can
follow along in the source code for ShoppingListWriter.java.
Like RecipeWriter, ShoppingListWriter creates and opens its
output file in its constructor. Since a shopping list is primarily
concerned with <Ingredients>, it doesn't print anything until LAX
calls startIngredients() (startName() saves the name in an
instance field for use in startIngredients()). The program builds
an HTML table on top of a spiral-notebook background, and
prints all optional ingredients in red (so if you don't bring enough
money to the grocery store, you'll know what you can do without).
You can see the results of the ShoppingListWriter in
example4-list.html and example3-list.html. Currently, you can't merge,
sort, or add the contents of the two lists -- n recipes gives
you n lists. But there's no reason you couldn't write a class that
does any or all of those things.
The main LAX program
Listing 6 shows the main() method for LAX. You can read the full
source code in Lax.java.
Listing 6. LAX main() method
public static void main(String[] args) {
    if (args.length < 1) {
        System.err.println("Usage: lax inputFile.xml [parserClass]");
        System.exit(1);
    }

    String sInputFile = args[0];
    String sRecipeFile;
    String sShoppingListFile;
    String sBase = sInputFile;

    // Derive the output file names from the input name, given with or without ".xml"
    if (sBase.length() > 4 && sBase.toLowerCase().endsWith(".xml")) {
        sBase = sBase.substring(0, sBase.length() - 4);
    } else {
        sInputFile = sBase + ".xml";
    }
    sRecipeFile = sBase + "-recipe.html";
    sShoppingListFile = sBase + "-list.html";

    Lax lax = new Lax();

    // Register both handlers so that a single parse feeds both output files
    ShoppingListWriter slw = new ShoppingListWriter(sShoppingListFile);
    lax.addHandler(slw);

    RecipeWriter rw = new RecipeWriter(sRecipeFile);
    lax.addHandler(rw);

    // Parse with validation on, using the LAX object itself as the document handler
    lax.parseDocument(true, lax, sInputFile);
}
As LAX receives each event, it searches both of the handler
objects for appropriate methods, and calls the methods when
they are found. As a result, both output files are written in one
pass of the parser. That's all there is to it! All of the application
logic (to create the HTML) occurs in the handler objects, and LAX
handles dispatching the tag names in the events to the methods
in the handlers.
Using XML and SAX in this way opens a lot of doors. With a little
imagination, it's easy to envision a servlet that reads a directory
of XML files and creates pages of links to the formatted recipes
and shopping lists. The formatted recipes and shopping lists
could even be created on the fly by the servlet from the XML.
Updating an XML file with new information would then
automatically update both the recipe and the shopping list -- they
would never get out of sync. This data consistency is one of the
benefits of using XML to represent information and then styling
the XML for various presentations.
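As a rough sketch of the link page such a servlet might produce, the following plain Java class builds an HTML list from a set of XML filenames. It reuses the "-recipe.html" and "-list.html" naming from Listing 6; the RecipeIndex class and its method names are hypothetical, and the servlet plumbing and directory scan are left out so the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns XML filenames into a page of links to the
// formatted recipe and shopping list for each one.
class RecipeIndex {

    // Strip a trailing ".xml" (case-insensitive), as Listing 6 does.
    static String baseName(String fileName) {
        if (fileName.length() > 4 && fileName.toLowerCase().endsWith(".xml")) {
            return fileName.substring(0, fileName.length() - 4);
        }
        return fileName;
    }

    // Builds an unordered list with two links per recipe.
    // Note: names are not HTML-escaped; a real page would need that.
    static String buildIndex(List<String> xmlFileNames) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (String name : xmlFileNames) {
            String base = baseName(name);
            html.append("<li>").append(base)
                .append(": <a href=\"").append(base).append("-recipe.html\">recipe</a>")
                .append(", <a href=\"").append(base).append("-list.html\">shopping list</a>")
                .append("</li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(buildIndex(Arrays.asList("example3.xml", "example4.xml")));
    }
}
```

Because the links are derived from the XML filenames rather than stored separately, adding a recipe file is enough to make it appear in the index.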
Final notes about XML and DTDs
Note that, in this Recipe example, the recipe's content was
separate from its presentation; that is, the XML represented
the information in the recipe, while the LAX handler classes
formatted and displayed that information. A different LAX class
could display the document in an entirely different way, or even
read it aloud, yet the underlying XML document would not have
changed at all. This separation of content and presentation is one
of the key themes in the architecture of modern document-
processing systems.
In addition, while creating a DTD might seem like a simple
proposition, it's actually one of the most difficult parts of creating
large integrated document-management systems. The syntax of
DTDs isn't too difficult once you get accustomed to it, but getting
the DTD right requires a great deal of analysis, and the
consequences of poor DTD design can haunt a project forever.
DTD design has quite a lot in common with database design,
especially in terms of normalization and denormalization of the
information being represented.
DTDs are particularly useful for describing standard document
formats for information interchange between open systems.
Many groups and consortia are currently working out XML DTDs
for everything from vector graphics to chemical formulas and
molecules to shoe inventories.
Conclusion
This article has covered a lot of ground: you've learned what
DTDs are and how they work, you've discovered LAX (which I
think will make SAX even easier for you to use), and you've seen
how you can use a single XML document in different contexts. I
hope these small examples get you thinking about how you can
use the technology.
SAX is an excellent way to process XML for many applications,
but for complex transformations of XML data, it's sometimes
necessary to get at nodes all over the document "tree." You can
also use XML as a serialization mechanism; that is, for creating
documents that represent arbitrary structures of objects, and
recreating those object structures from documents. For those
purposes, the Document Object Model (DOM) can be useful. In
the next article in this series, you'll learn how to use the DOM to
do more sophisticated processing of XML documents.
Mark Johnson works as a designer and developer for OrganicNet in Fort Collins, Colo., by day,
and as a JavaWorld columnist by night -- very late at night.
Download the source code and class files for this article
In jar format (with class and java files)
http://www.javaworld.com/jw-04-2000/advsax/advsax.jar
In .tgz format (gzipped tar)
http://www.javaworld.com/jw-04-2000/advsax/advsax.tgz
In zip format
http://images.techhive.com/downloads/idge/imported/article/jvw/2000/04/advsax.zip
Additional resources
This month's sample code uses Sun's Java extension for XML; download it at
http://java.sun.com/products/xml/index.html
Find out more about SAX at the SAX Website
http://www.megginson.com/SAX/index.html
The latest version of the W3C XML Recommendation (currently 1.0) can always be found on the
W3C Website at
http://www.w3.org/TR/REC-xml
Read about XML Schema at http://www.w3.org/TR/xmlschema-1/ and
http://www.w3.org/TR/xmlschema-2/
An article on Extended Backus-Naur form as it relates to the parsing of XML appears on
XML.com at
http://www.xml.com/pub/98/10/guide5.html
For more on XML and Java together, see "Why XML is Meant for Java," Matthew Fuchs (Web
Techniques, June 1999)
http://www.webtechniques.com/archives/1999/06/fuchs/
The spiral-notebook graphic included in the recipe example was used by permission of its
creator, Stephanie Baker-Thomas. See her Ulead PhotoImpact Tutorial site at
http://www.eastofthesun.com/pi/index.htm
The cow and alligator graphics in the recipe example are courtesy of Tony Martin of Noetic
Clipart at
http://www.noeticart.com/na_clipart.html
In Listing 6, lines 139 through 150 build the output filenames,
which are used to create the ShoppingListWriter (line 154) and the
RecipeWriter (line 157) for LAX. Line 152 creates an instance of
LAX, and in lines 155 and 158, LAX gets the ShoppingListWriter and
RecipeWriter objects. Line 160 then parses the file with a copy of
parseDocument() taken directly from SimpleValidatingSaxParser.