The following APIs provide a Java application with access to a parsed XML document:
DOM API, which parses XML documents and builds a tree representation of the documents
in memory. Use either a DOMParser object to parse with DOM or the
XMLDOMImplementation interface factory methods to create a pluggable, scalable DOM.
SAX API, which processes an XML document as a stream of events, which means that a
program cannot access random locations in a document. Use a SAXParser object to parse
with SAX.
JAXP, which is a Java-specific API that supports DOM, SAX, and XSL. Use a
DocumentBuilder or SAXParser object to parse with JAXP.
The sample XML document in Example 4-1 helps illustrate the differences among DOM, SAX, and JAXP.
Example 4-1 Sample XML Document
<?xml version="1.0"?>
<EMPLIST>
  <EMP>
    <ENAME>MARY</ENAME>
  </EMP>
  <EMP>
    <ENAME>SCOTT</ENAME>
  </EMP>
</EMPLIST>
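As a rough illustration of the tree-based approach, the sketch below parses the sample document with the JAXP DocumentBuilder and collects each ENAME value. It relies only on the parser bundled with the JDK rather than the XDK DOMParser, and the class name EnameLister and the inlined XML string are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XDK or JAXP.
class EnameLister {

    // The sample document from Example 4-1, inlined so the sketch is self-contained.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<EMPLIST>"
        + "<EMP><ENAME>MARY</ENAME></EMP>"
        + "<EMP><ENAME>SCOTT</ENAME></EMP>"
        + "</EMPLIST>";

    // Builds the whole tree in memory, then reads every ENAME element from it.
    static List<String> listNames(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList names = doc.getElementsByTagName("ENAME");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            result.add(names.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(listNames(SAMPLE)); // prints [MARY, SCOTT]
    }
}
```

Because the complete tree is available after parse() returns, the ENAME elements can be looked up in any order, which is the random access that an event stream does not offer.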
DOM Creation
In Java XDK, there are three ways to create a DOM:
Parse a document using DOMParser. This has been the traditional XDK approach.
Create a scalable DOM using XMLDOMImplementation factory methods.
Use an XMLDocument constructor. This is not a common solution in XDK.
Scalable DOM
With Oracle 11g Release 1 (11.1), XDK provides scalable, pluggable support for DOM. This relieves
problems of memory inefficiency, limited scalability, and lack of control over the DOM configuration.
For the scalable DOM, the configuration and creation are mainly supported using the
XMLDOMImplementation class.
These are important aspects of scalable DOM:
Plug-in Data allows external XML representation to be directly used by Scalable DOM
without replicating XML in internal representation.
Scalable DOM is created on top of plug-in XML data through the InfosetReader and
InfosetWriter abstract interfaces. XML data can be in different forms, such as Binary
XML, XMLType, and third-party DOM.
Transient nodes. DOM nodes are created lazily and may be freed if not in use.
Binary XML
The scalable DOM can use binary XML as both input and output format. Scalable DOM can
interact with the data in two ways:
Through the abstract InfosetReader and InfosetWriter interfaces.
Users can (1) use the BinXML implementation of InfosetReader and
InfosetWriter to read and write BinXML data, and (2) use other
implementations supplied by the user to read and write in other forms of XML
infoset.
Through an implementation of the InfosetReader and InfosetWriter
adaptor for BinXMLStream.
Description of "Figure 4-2 Comparing DOM (Tree-Based) and SAX (Event-Based) APIs"
<employees>
  <employee id="111">
    <firstName>Rakesh</firstName>
    <lastName>Mishra</lastName>
    <location>Bangalore</location>
  </employee>
  <employee id="112">
    <firstName>John</firstName>
    <lastName>Davis</lastName>
    <location>Chennai</location>
  </employee>
  <employee id="113">
    <firstName>Rajesh</firstName>
    <lastName>Sharma</lastName>
    <location>Pune</location>
  </employee>
</employees>
And the object into which the XML content is to be extracted is defined as below:
class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
There are 3 main parsers for which I have given sample code:
DOM Parser
SAX Parser
StAX Parser
I am using the DOM parser implementation that ships with the JDK; the example was written against
JDK 7. The DOM parser loads the complete XML content into a tree structure, and we iterate through the
Node and NodeList objects to read the content of the XML. The code for parsing the XML with the DOM
parser is given below.
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DOMParserDemo {

    public static void main(String[] args) throws Exception {
        // Get the DOM builder factory.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // Get the DOM builder.
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Load and parse the XML document.
        // document holds the complete XML as a tree.
        Document document = builder.parse(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"));

        List<Employee> empList = new ArrayList<>();

        // Iterate through the children of the root element and extract the data.
        NodeList nodeList = document.getDocumentElement().getChildNodes();

        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);

            // Whitespace between tags shows up as text nodes;
            // an Element here means we have encountered an <employee> tag.
            if (node instanceof Element) {
                Employee emp = new Employee();
                emp.id = node.getAttributes()
                        .getNamedItem("id").getNodeValue();

                NodeList childNodes = node.getChildNodes();
                for (int j = 0; j < childNodes.getLength(); j++) {
                    Node cNode = childNodes.item(j);

                    // Identify which child tag of <employee> was encountered.
                    if (cNode instanceof Element) {
                        // getTextContent() on the element itself also works
                        // when the element is empty (no child text node).
                        String content = cNode.getTextContent().trim();
                        switch (cNode.getNodeName()) {
                            case "firstName":
                                emp.firstName = content;
                                break;
                            case "lastName":
                                emp.lastName = content;
                                break;
                            case "location":
                                emp.location = content;
                                break;
                        }
                    }
                }
                empList.add(emp);
            }
        }

        // Print the populated Employee list.
        for (Employee emp : empList) {
            System.out.println(emp);
        }
    }
}

class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
The output for the above will be:
Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune
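One detail in the DOM listing is easy to miss: the instanceof Element checks are needed because the line breaks and indentation between tags are kept in the tree as text nodes. The sketch below counts both kinds of children under the root of a small indented document. The class name ChildNodeKinds and the shortened XML are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class showing what getChildNodes() returns.
class ChildNodeKinds {

    // A shortened, indented variant of the employee document.
    static final String XML =
        "<employees>\n"
        + "  <employee id=\"111\"/>\n"
        + "  <employee id=\"112\"/>\n"
        + "</employees>";

    // Returns {elementCount, otherCount} for the direct children of the root.
    static int[] countChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        int elements = 0;
        int others = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                elements++;
            } else {
                others++; // whitespace-only text nodes end up here
            }
        }
        return new int[] {elements, others};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countChildren(XML);
        System.out.println(counts[0] + " elements, " + counts[1] + " other nodes");
    }
}
```

With a default, non-validating DocumentBuilderFactory the parser has no DTD telling it which whitespace is ignorable, so the filtering has to happen in application code as the listing does.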
The SAX parser differs from the DOM parser in that it does not load the complete XML into memory.
Instead, it reads the XML sequentially and triggers an event whenever it encounters a construct such as
an opening tag, a closing tag, or character data. This is why the SAX parser is called an event-based
parser.
Along with the XML source file, we also register a handler that extends the DefaultHandler class.
DefaultHandler provides several callbacks, of which we are interested in:
startElement(): called when the start of a tag is encountered.
endElement(): called when the end of a tag is encountered.
characters(): called when text data is encountered.
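To see the order in which these three callbacks fire, here is a minimal sketch that records them for a one-employee document. The class name SaxEventTrace, the "start:", "end:" and "text:" labels, and the inline XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class that logs SAX callbacks in arrival order.
class SaxEventTrace {

    static final String XML =
        "<employee id=\"111\"><firstName>Rakesh</firstName></employee>";

    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace(XML)) {
            System.out.println(event);
        }
    }
}
```

The start and end events nest exactly as the tags do. A parser is allowed to report the text of one element through more than one characters() call, so a handler should not assume a single call per element.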
The code for parsing the XML using SAX Parser is given below:
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserDemo {

    public static void main(String[] args) throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        SAXParser parser = parserFactory.newSAXParser();
        SAXHandler handler = new SAXHandler();
        parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"),
                handler);

        // Print the list of employees obtained from the XML.
        for (Employee emp : handler.empList) {
            System.out.println(emp);
        }
    }
}

/**
 * The handler for SAX events.
 */
class SAXHandler extends DefaultHandler {

    List<Employee> empList = new ArrayList<>();
    Employee emp = null;
    // Text of the current element. characters() may deliver it in
    // several chunks, so it is accumulated rather than overwritten.
    StringBuilder content = new StringBuilder();

    // Triggered when the start of a tag is found.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes)
            throws SAXException {
        content.setLength(0);
        switch (qName) {
            // Create a new Employee object when the start tag is found.
            case "employee":
                emp = new Employee();
                emp.id = attributes.getValue("id");
                break;
        }
    }

    // Triggered when the end of a tag is found.
    @Override
    public void endElement(String uri, String localName,
                           String qName) throws SAXException {
        String text = content.toString().trim();
        switch (qName) {
            // Add the employee to the list once its end tag is found.
            case "employee":
                empList.add(emp);
                break;
            // For all other end tags the current employee is updated.
            case "firstName":
                emp.firstName = text;
                break;
            case "lastName":
                emp.lastName = text;
                break;
            case "location":
                emp.location = text;
                break;
        }
        content.setLength(0);
    }

    // Triggered when text data is found.
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        content.append(ch, start, length);
    }
}

class Employee {

    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
The output for the above would be:
Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune
StAX stands for Streaming API for XML. The StAX parser differs from the DOM parser in the same way
the SAX parser does, and it also differs from the SAX parser in a subtle way:
The SAX parser pushes data to the application, whereas the StAX parser lets the application pull the
data it needs from the XML.
The StAX parser maintains a cursor at the current position in the document and lets the application
extract the content available at that cursor, whereas the SAX parser issues events as and when certain
data is encountered.
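One practical consequence of the pull model is that the application decides when to stop asking for events. The sketch below reads only as far as the first employee element and then returns. The class name FirstEmployeeId and the inline XML are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: pull events only until the needed data is found.
class FirstEmployeeId {

    static final String XML =
        "<employees><employee id=\"111\"/><employee id=\"112\"/></employees>";

    // Returns the id of the first <employee>, or null if there is none.
    static String firstId(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // The application asks for the next event; nothing is pushed to it.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "employee".equals(reader.getLocalName())) {
                    // Look the attribute up by name instead of by position.
                    return reader.getAttributeValue(null, "id");
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(firstId(XML)); // prints 111
    }
}
```

With SAX the parser drives the loop and keeps invoking the handler until the document ends, unless the handler aborts parsing by throwing an exception.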
XMLInputFactory and XMLStreamReader are the two types used to read an XML file. As we read through
the file with XMLStreamReader, each event is reported as an integer value, which is then compared with
the constants in XMLStreamConstants. The code below shows how to parse XML using the StAX parser:
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserDemo {
    public static void main(String[] args) throws XMLStreamException {
        // Created up front so the list exists before the first <employee> is read.
        List<Employee> empList = new ArrayList<>();
        Employee currEmp = null;
        String tagContent = null;
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Report adjacent character data as a single CHARACTERS event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"));

        while (reader.hasNext()) {
            int event = reader.next();

            switch (event) {
                case XMLStreamConstants.START_ELEMENT:
                    if ("employee".equals(reader.getLocalName())) {
                        currEmp = new Employee();
                        // Look the attribute up by name rather than by position.
                        currEmp.id = reader.getAttributeValue(null, "id");
                    }
                    break;

                case XMLStreamConstants.CHARACTERS:
                    tagContent = reader.getText().trim();
                    break;

                case XMLStreamConstants.END_ELEMENT:
                    switch (reader.getLocalName()) {
                        case "employee":
                            empList.add(currEmp);
                            break;
                        case "firstName":
                            currEmp.firstName = tagContent;
                            break;
                        case "lastName":
                            currEmp.lastName = tagContent;
                            break;
                        case "location":
                            currEmp.location = tagContent;
                            break;
                    }
                    break;
            }
        }

        // Print the employee list populated from the XML.
        for (Employee emp : empList) {
            System.out.println(emp);
        }
    }
}

class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ") " + location;
    }
}
The output for the above is:
Rakesh Mishra(111) Bangalore
John Davis(112) Chennai
Rajesh Sharma(113) Pune