XML (Day1) V1.2

IBM Global Business Services
Welcome!
XML Day 1
Copyright IBM Corporation 2009
XML Introduction Day 1
Day 1: Objectives
After completing this course, you should be able to:
Define what XML is
Identify document type definitions and validity
Describe attribute declarations in DTDs
Explain entities and external DTD subsets
Define embedding of non-XML data
Describe XML namespaces and parsers
Identify the SAX parser
Explain XML schemas
Housekeeping
Breaks
Washrooms
Transportation / parking
No pagers or cell phones
Participation
Parking lot issues
Questions
Course map
Module 1: Document type definitions and validity
Module 2: Attribute declarations in DTDs
Module 3: Entities and external DTD subsets
Module 4: Embedding non-XML data
Module 5: XML namespaces and parsers
Module 6: XML schema
Module 7: XPath
Module 8: XSL
Document type definition (DTD)

Data that is sent along with a DTD, and that conforms to it, is known as valid XML.
In this case, an XML parser can check incoming data against the rules defined in the DTD to make sure the data is structured correctly.
Data sent without a DTD is known as well-formed XML.

Here an XML-based document instance, such as hierarchically structured weather data, can be used to implicitly describe itself.
With both valid and well-formed XML, XML-encoded data is self-describing, since descriptive tags are intermixed with the data. DTDs help ensure that different people and programs can read each other's files. The DTD defines exactly what is and is not allowed to appear inside a document.
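The difference between well-formed and valid XML can be seen with a parser. The sketch below is only an illustration, assuming Python and its standard xml.etree.ElementTree module (neither is part of this course). It checks well-formedness only; this module does not validate against a DTD. The helper name is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML. No DTD validation is attempted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents
good = "<Simple>This is a simple document</Simple>"
bad = "<Simple>This tag is never closed"

print(is_well_formed(good), is_well_formed(bad))
```

Checking validity against a DTD would need a validating parser, which this sketch does not cover.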
DTD for our simple XML

Inside the document type declaration, the DTD consists of a left square bracket character ([), followed by a series of markup declarations, followed by a right square bracket character (]).
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE Simple [
<!ELEMENT Simple ANY>
]>
<Simple>
This is the simplest XML document I have ever seen
</Simple>
DTD declarations
Element type declarations
Attribute-list declarations
Entity declarations
Notation declarations
Processing instructions
Comments
Parameter entity references
Element type: <!ELEMENT Name Contentspec>

<!ELEMENT Title (#PCDATA)>
Title is permitted to have only character data.
<!ELEMENT General ANY>
General may contain any declared element as well as character data.
<!ELEMENT Image EMPTY>
The element must be empty. It cannot contain anything.
<!ELEMENT Book (Title, Author, Publisher)>
The Book element must contain these 3 elements, in this order.
<!ELEMENT Prerequisite (BE | ME | MS)>
The Prerequisite element must contain exactly one of the listed elements.
<!ELEMENT Candidate (Qualification+, XMLExposure?, OtherSkills*)>
Qualification must occur one or more times, XMLExposure is optional, and OtherSkills may occur zero or more times.
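The occurrence indicators +, ? and * in a content specification behave much like the same symbols in a regular expression. The sketch below is an analogy only, assuming Python; a real validating parser does not work this way internally. It checks a list of child element names against the Candidate content model. The helper matches_candidate is hypothetical.

```python
import re

# Content model (Qualification+, XMLExposure?, OtherSkills*) written as a regex.
# Each child name is followed by a space so that names cannot run together.
CANDIDATE_MODEL = re.compile(r"(Qualification )+(XMLExposure )?(OtherSkills )*")


def matches_candidate(children):
    """Return True if the sequence of child element names fits the model."""
    sequence = "".join(name + " " for name in children)
    return CANDIDATE_MODEL.fullmatch(sequence) is not None


print(matches_candidate(["Qualification", "OtherSkills", "OtherSkills"]))
print(matches_candidate(["XMLExposure"]))
```

The first call fits the model; the second does not, because at least one Qualification is required.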
Example
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE Collection [
<!ELEMENT Collection (CD)+>
<!ELEMENT CD (#PCDATA)>
]>
<Collection>
<CD>Devotional Songs by Pankaj</CD>
<CD>Kajal by Pankaj</CD>
<CD>Classical Songs by Pankaj</CD>
</Collection>
Hello XML with DTD

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE GREETING [
<!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
Hello XML!
</GREETING>
Validating against a DTD

A valid document must meet the constraints specified by the DTD. Furthermore, its root element must be the one specified in the document type declaration.
Valid document:
<GREETING>
Good Morning
</GREETING>
Validating against a DTD (continued)

Documents that are not valid:
<GREETING>
<GREETING> Some Text </GREETING>
</GREETING>

<GREETING>
<sometag> some text </sometag>
<someEmptyTag/>
</GREETING>
Listing the elements

The first step in creating a DTD appropriate for a particular document is to understand the structure of the information that will be encoded using the elements defined in the DTD.
<?xml version="1.0" standalone="yes" ?>
<Root>
<Element1>
<Element11> </Element11>
</Element1>
</Root>
Element declarations
Each tag used in a valid XML document must be declared with an element declaration in the DTD. This specifies the name and possible contents of an element. This list of contents is also called the content specification.
* may occur any number of times, or not at all (zero or more children)
? may or may not occur (zero or one child)
+ must occur at least once (one or more children)
Element declarations (continued)

<!ELEMENT SEASON ANY>
All element type declarations begin with <!ELEMENT. They include the name of the element being declared followed by the content specification. The ANY keyword says that all possible elements as well as parsed character data can be children of the SEASON element.
<!ELEMENT YEAR (#PCDATA)>
This declaration says that a YEAR may contain only parsed character data, i.e., text that's not markup. It may not contain children of its own.
CDATA sections
May contain text, reserved characters and whitespace
Reserved characters need not be replaced by entity references
Contents are not processed as markup by the XML parser
Commonly used for scripting code (e.g., JavaScript)
Begin with <![CDATA[
Terminate with ]]>
CDATA section: Example

<?xml version = "1.0"?>
<book title = "C++ How to Program" edition = "3">

   <!-- Entity references are required if not in a CDATA section -->
   <sample>
      // C++ comment
      if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
         cerr &lt;&lt; this-&gt;displayError();
   </sample>

   <!-- The XML parser does not process the CDATA section -->
   <sample>
      <![CDATA[
      // C++ comment
      if ( this->getX() < 5 && value[ 0 ] != 3 )
         cerr << this->displayError();
      ]]>
   </sample>

   C++ How to Program by Deitel &amp; Deitel
</book>

Note the simplicity offered by the CDATA section.
Using a CDATA section
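To an application, text written with entity references and the same text written in a CDATA section look identical once parsed. The sketch below is a minimal illustration, assuming Python's standard xml.etree.ElementTree module; the shortened sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same C++ condition, written once with entity references
# and once inside a CDATA section.
escaped = "<sample>if ( x &lt; 5 &amp;&amp; y != 3 )</sample>"
cdata = "<sample><![CDATA[if ( x < 5 && y != 3 )]]></sample>"

escaped_text = ET.fromstring(escaped).text
cdata_text = ET.fromstring(cdata).text

print(escaped_text == cdata_text)
```

Both forms yield the same character data; the CDATA form is simply easier to write and read.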
Sharing common DTDs among documents

The real power of XML comes from common DTDs that can be shared among many documents written by different people. If the DTD is not directly included in the document but is linked in from an external source, changes made to the DTD automatically propagate to all documents using that DTD. On the other hand, backward compatibility is not guaranteed when a DTD is modified.
<!DOCTYPE root_element_name SYSTEM "DTD_URL">
Example:
<!DOCTYPE SEASON SYSTEM "http://ibm/xml/dtds/sample.dtd">
Public DTDs
The SYSTEM keyword is intended for private DTDs used by a single author or group. DTDs designed for writers outside the creating organization use the PUBLIC keyword instead of the SYSTEM keyword.
<!DOCTYPE root_element_name PUBLIC "DTD_name" "DTD_URL">
Example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML //EN">
What is an Attribute?
Attributes are intended for extra information associated with an element (like an ID number) used only by programs that read and write the file, and not for the content of the element that's read and written by humans. The attribute contains information about the content of the element, rather than the content itself.
Example:
<GREETING LANGUAGE="English">
Hello XML!
<MOVIE SOURCE="WavingHand.mov" />
</GREETING>
Attribute (key = value)
Examples
<RECTANGLE WIDTH="30" HEIGHT="45" />
<SCRIPT LANGUAGE="javascript" ENCODING="8859_1">
.
</SCRIPT>
Note: End tags cannot possess attributes.
<SCRIPT>
</SCRIPT LANGUAGE="javascript" ENCODING="8859_1">
The syntax shown above is illegal.
Declaring Attributes in DTDs

In a valid XML document you must also explicitly declare all attributes that you intend to use with the document's elements. You do this with a type of DTD markup known as an attribute-list declaration. This declaration does the following:
Defines the names of the attributes associated with that element
Specifies the data type of each attribute
Specifies for each attribute whether that attribute is required
Attribute list Declaration

An attribute-list declaration has the following form:
<!ATTLIST Element_name Attribute_name Type Default_value>
Element_name is the name of the element associated with this attribute.
Attribute_name is the name of the attribute.
Type is the kind of attribute.
Default_value is the value the attribute takes on if no value is specified for the attribute.
Attribute types
Type: Meaning
CDATA: Character data; text that is not markup
Enumerated: A list of possible values from which exactly one will be chosen
ID: A unique name not shared by any other ID type attribute in the document
IDREF: The value of an ID type attribute of an element in the document
IDREFS: Multiple IDs of elements, separated by whitespace
ENTITY: The name of an entity declared in the DTD
ENTITIES: The names of multiple entities declared in the DTD, separated by whitespace
Specifying default values for Attributes

Instead of specifying an explicit default attribute value, an attribute declaration can require the author to provide a value, allow the value to be omitted completely, or even always use the default value. These requirements are specified with three keywords:
#REQUIRED
#IMPLIED
#FIXED
#REQUIRED
Use #REQUIRED when, instead of providing a default value, you want to force the author to supply one. For example, you may want to force anyone posting a document on the intranet to identify themselves.
Example:
<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #REQUIRED>
#IMPLIED
Sometimes you may not have a good option for a default value, but you do not want to require the author of the document to include a value, either. For example, some of the people posting documents to your intranet are offsite freelancers who have email addresses but lack phone extensions. Therefore, you don't want to require them to include an extension attribute in their <AUTHOR/> tags.
<AUTHOR NAME="Harish" EMAIL="harish.modadugu@in.ibm.com"/>
<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
#FIXED
Used to provide a default value for the attribute without allowing the author to change it.
For example:
<AUTHOR NAME="Harish" COMPANY="IBM" EMAIL="hmodadug@in.ibm.com" EXTENSION="57536" />
<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
<!ATTLIST AUTHOR COMPANY CDATA #FIXED "IBM">
Some examples
<!ATTLIST Film Class CDATA >
Simplest form of an attribute declaration: the attribute contains character data. A complete declaration also needs a default value or one of the keywords shown below.
<!ATTLIST Film Year CDATA #REQUIRED>
You must specify an attribute value.
<!ATTLIST Film Color CDATA #IMPLIED>
You can either include or omit the attribute; no default value is supplied.
<!ATTLIST Film Language CDATA #FIXED "Hindi">
You can either include or omit the attribute. If you omit it, the processor uses the fixed default value. If you include it, you must specify exactly that value.
Example
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE VideoLibrary [
<!ELEMENT VideoLibrary (Film, Class, (Hero | Director | Heroine)+)>
<!ATTLIST Film Color CDATA #IMPLIED
               Language CDATA #FIXED "Hindi"
               Year CDATA #REQUIRED>
<!ELEMENT Film (#PCDATA)>
<!ELEMENT Class (#PCDATA)>
<!ELEMENT Hero (#PCDATA)>
<!ELEMENT Heroine (#PCDATA)>
<!ELEMENT Director (#PCDATA)>
]>
<VideoLibrary>
<Film Year = "1994"> Hum Aapke Hain Kaun </Film>
<Class>Love Story </Class>
<Heroine>Madhuri Dixit</Heroine>
</VideoLibrary>
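The effect of #IMPLIED, #FIXED and #REQUIRED declarations can be observed from a program. The sketch below is an illustration under assumptions: it uses Python's standard xml.etree.ElementTree module, whose underlying parser reads attribute defaults from the internal DTD subset but does not validate, so a missing #REQUIRED attribute would not be reported. The document is a shortened, hypothetical variant of the VideoLibrary example.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE Film [
<!ELEMENT Film (#PCDATA)>
<!ATTLIST Film Color CDATA #IMPLIED
               Language CDATA #FIXED "Hindi"
               Year CDATA #REQUIRED>
]>
<Film Year="1994">Hum Aapke Hain Kaun</Film>"""

film = ET.fromstring(DOC)

# Year was written in the document, Language comes from the #FIXED default,
# Color is #IMPLIED and was omitted, so no value is reported for it.
print(film.get("Year"), film.get("Language"), film.get("Color"))
```

Other parsers may treat defaults differently, so this behavior should be checked for the parser actually in use.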
Predefined Attributes
XML has two predefined Attributes. They are identified by a name that begins with xml:.
xml:space describes how whitespace is treated in the element.
xml:lang describes the language in which the element is written.
What is an Entity?
The storage units that contain particular parts of an XML document are called entities. An entity may consist of a file, a database record, or any other item that contains data. The primary purpose of an entity is to hold content: well-formed XML, other forms of text, or binary data. A CSS style sheet is not an entity. Every XML document has at least one entity.
Kinds of Entities
There are two kinds of entities.
Internal Entity
They are defined completely within the document entity. Since the document itself is one such entity, all XML documents have at least one internal entity.
External Entity
They draw their content from another source located via a URL. In HTML, an IMG element represents an external entity, while the document itself, contained between the <HTML> and </HTML> tags, is an internal entity.
Internal general Entities

An <!ENTITY> tag in the DTD defines the abbreviation and the text the abbreviation stands for. Instead of typing the same footer at the bottom of each page, we can define that text as the footer entity in the DTD and then type &footer; at the bottom of each page. If you decide to change the footer block, you only need to make the change once in the DTD instead of on every page that shares the footer. General entity references begin with an ampersand (&) and end with a semicolon (;), with the entity's name between these two characters. For instance, &lt; is a general entity reference for the less-than sign (<). The name of the entity is lt.
Example
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY ELTP "ENTRY LEVEL TRAINING PROGRAM">
<!ELEMENT DOCUMENT (TITLE, COURSE)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT COURSE (COURSE_CODE, DATE)>
<!ELEMENT COURSE_CODE (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
]>
<DOCUMENT>
<TITLE> &ELTP; </TITLE>
<COURSE>
<COURSE_CODE> HYD_010 </COURSE_CODE>
<DATE> JULY 23, 2007 </DATE>
</COURSE>
</DOCUMENT>
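A parser replaces an internal general entity reference with its replacement text before the application sees the data. The sketch below is a minimal illustration, assuming Python's standard xml.etree.ElementTree module; the document is a shortened, hypothetical variant of the ELTP example.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY ELTP "ENTRY LEVEL TRAINING PROGRAM">
]>
<DOCUMENT><TITLE>&ELTP;</TITLE></DOCUMENT>"""

# The reference &ELTP; is expanded while parsing.
title = ET.fromstring(DOC).find("TITLE").text
print(title)
```

Changing the entity declaration once changes the text everywhere the reference is used.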
External general Entities

External entities are data outside the main file containing the root element / document entity. With XML, we can use an external general entity reference to embed one document in another.
<!ENTITY name SYSTEM "URI">
Example
An XML signature file (signature.xml)
<?xml version="1.0"?>
<SIGNATURE>
<NAME> HARISH </NAME>
<EMPNO> 034518 </EMPNO>
</SIGNATURE>

External general entity reference
<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (TITLE, SIGNATURE)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT SIGNATURE (NAME, EMPNO)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMPNO (#PCDATA)>
<!ENTITY SIG SYSTEM "signature.xml">
]>
<DOCUMENT>
<TITLE> ELTP </TITLE>
&SIG;
</DOCUMENT>
Parameter Entities
General entities become part of the document, not the DTD. They can be used in the DTD, but only in places where they become part of the document body. Parameter entity references differ from general entity references in the following ways:
Parameter entity references begin with a percent sign (%) rather than an ampersand (&).
Parameter entity references can only appear in the DTD, not the document content.
Syntax
<!ENTITY % name "replacement text">
Example:
<!ENTITY % IBM "International Business Machines">
<!ENTITY ACRON "IBM stands for %IBM;">
Note: Parameter entities must be declared before they're referenced. References placed inside other declarations, as in this example, work only in external DTDs.
External parameter Entities

External parameter entities enable us to build large DTDs from smaller ones. Cycles are prohibited, however: DTD 1 may not refer to DTD 2 if DTD 2 refers to DTD 1. DTDs can become large and complex. Breaking a DTD into smaller, more manageable chunks makes the DTD easier to analyze. Both the document and its DTD become much easier to understand when split into separate files.
Example
Sign.dtd
<!ELEMENT EMP (EMPNO, NAME)>
<!ELEMENT EMPNO (#PCDATA)>
<!ELEMENT NAME (#PCDATA)>

Emp.xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE EMP SYSTEM "Sign.dtd">
<EMP>
<EMPNO> 12121 </EMPNO>
<NAME> Akash </NAME>
</EMP>
Entities and DTDs in well-formed documents

Internal Entities:
The primary advantage of using a DTD in well-formed XML documents that are not valid is that we may use internal general entity references other than the 5 predefined references &gt;, &lt;, &quot;, &apos; and &amp;.
We simply declare the entities we want as normal, then use them in our documents.
DTD yielding a well-formed yet invalid document

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY XML "eXtensible Markup Language">
]>
<DOCUMENT>
<TITLE> &XML; </TITLE>
<EMP>
<EMPNO> 112233 </EMPNO>
<NAME> Abhishek </NAME>
</EMP>
</DOCUMENT>
External Entities
Sign.dtd
<?xml version="1.0"?>
<EMP>
<EMPNO> 112233 </EMPNO>
<NAME> HARISH </NAME>
</EMP>

A file that uses Sign.dtd
<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ENTITY EMPS SYSTEM "Sign.dtd">
]>
<DOCUMENT>
<TITLE> XML </TITLE>
&EMPS;
</DOCUMENT>
Notations
The first problem that we encounter when working with non-XML data in an XML document is identifying the format of the data and telling the XML application how to read and display the non-XML data. For example, it would be inappropriate to try to draw an MP3 sound file on the screen. Furthermore, no application understands all possible file formats.
Ideally, we want documents to tell the application the format of the external entity, so you don't have to rely on the application recognizing the file type by a magic number or a potentially unreliable file name extension.
Notations (continued)

A NOTATION attribute restricts the value of an attribute to the name of a declared notation. The notation itself is declared with a path using SYSTEM or a string using PUBLIC.
Notations (continued)
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT IMAGES (IMAGE+) >
<!ELEMENT IMAGE (#PCDATA) >
<!NOTATION iPATH SYSTEM "C:\windows\a.bmp" >
<!ATTLIST IMAGE SRC NOTATION (iPATH) #REQUIRED>
Using notations
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE IMAGES SYSTEM "C:\N.dtd">
<IMAGES>
<IMAGE SRC="iPATH">abc</IMAGE>
</IMAGES>
Note: Because an XML processor cannot parse bmp files, we need to use an external program for displaying or editing them. When the parser encounters a use of the notation name, it simply provides the path declared for that notation to the application.
Conditional sections
Include declarations
   Keyword INCLUDE
Exclude declarations
   Keyword IGNORE
Often used with entities
   Parameter entities
      Preceded by percent character (%)
      Create entities specific to a DTD
      Can be used only inside the DTD in which they are declared
Entities and strings

conditional.dtd

<!-- Entities accept and reject represent strings INCLUDE and IGNORE, respectively -->
<!ENTITY % reject "IGNORE">
<!ENTITY % accept "INCLUDE">

<!-- Include this element message declaration -->
<![ %accept; [
   <!ELEMENT message ( approved, signature )>
]]>

<!-- Exclude this element message declaration -->
<![ %reject; [
   <!ELEMENT message ( approved, reason, signature )>
]]>

<!ELEMENT approved EMPTY>
<!ATTLIST approved flag ( true | false ) "false">

<!ELEMENT reason ( #PCDATA )>
<!ELEMENT signature ( #PCDATA )>
Example conditional section (continued)

<?xml version = "1.0" standalone = "no"?>

<!DOCTYPE message SYSTEM "conditional.dtd">

<message>
   <approved flag = "true"/>
   <signature>Chairman</signature>
</message>

XML document that conforms to conditional.dtd.
Processing instructions
A processing instruction is a string of text between <? and ?> marks. The only required syntax for the text inside the processing instruction is that it must begin with an XML name followed by white space followed by data.
Note: Processing instructions may be placed almost anywhere in an XML document except inside a tag or a CDATA section.
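The two parts of a processing instruction, the name (target) and the data, are handed to the application separately. The sketch below is an illustration only, assuming Python's standard xml.dom.minidom module; the target printer and its data are hypothetical.

```python
from xml.dom import minidom

# Hypothetical processing instruction placed before the root element
doc = minidom.parseString('<?xml version="1.0"?><?printer color="yes"?><report/>')

# The processing instruction is the first child node of the document.
pi = doc.firstChild
print(pi.target)
print(pi.data)
```

The parser does not interpret the data; it is up to the application that recognizes the target to decide what the data means.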
Processing instructions (continued)

Special instructions to the XML consumer application
Example:
<?xml version="version" [standalone="DTDflag"] ?>
version
   A string in the form n.n specifying the XML level of the file. Use the value 1.0.
DTDflag
   Optional. A value ("yes" or "no") indicating whether the XML file stands alone, that is, whether it does without a reference to an external Document Type Definition (DTD). Script component XML files do not include such a reference, so the value for this attribute is always "yes."
Conflicting issues
Namespaces ensure that element names do not conflict, and clarify who defined which term. Namespaces do not give instructions on how to process the elements. Readers still need to know what the elements mean and decide how to process them.
Namespaces simply keep the names straight.
XML Namespaces
Naming collisions
   Two different elements have the same name
   <subject>Math</subject>
   <subject>Thrombosis</subject>
Namespaces
   Differentiate elements that have the same name
   <school:subject>Math</school:subject>
   <medical:subject>Thrombosis</medical:subject>
   school and medical are namespace prefixes
      Prepended to element and attribute names
      Tied to a uniform resource identifier (URI)
         Series of characters for differentiating names
XML Namespaces (continued)
Creating namespaces
   Use the xmlns keyword
   xmlns:text = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo"
   Creates two namespace prefixes, text and image
   urn:deitel:textInfo is the URI for prefix text
   urn:deitel:imageInfo is the URI for prefix image
Default namespaces
   Child elements of this namespace do not need a prefix
   xmlns = "urn:deitel:textInfo"
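A namespace-aware parser identifies an element by its URI plus local name, not by the prefix. The sketch below is an illustration, assuming Python's standard xml.etree.ElementTree module; the root element directory and the two urn:example URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<directory xmlns:school="urn:example:school"
                    xmlns:medical="urn:example:medical">
  <school:subject>Math</school:subject>
  <medical:subject>Thrombosis</medical:subject>
</directory>"""

root = ET.fromstring(DOC)

# ElementTree reports each tag as {URI}local-name.
tags = [child.tag for child in root]
print(tags)

# A prefix-to-URI mapping lets us search using prefixes of our own choosing.
prefixes = {"school": "urn:example:school"}
school_subject = root.find("school:subject", prefixes).text
print(school_subject)
```

The two subject elements no longer collide, because their full names differ.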
Introduction to XML parsers

XML tags are custom-defined thus enabling the generation of domain specific markup languages for diverse fields such as vector graphics, mathematics, music and technical documentation. A parser is a piece of software that makes sure the XML document is valid or at least well-formed. We use an XML parser to dissect XML documents and gain access to the data in them.
XML parsers are software packages that come as part of an application or as part of our own programs.
There are two types of XML parsers:
DOM parser
SAX parser
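Unlike a DOM parser, a SAX parser does not build a tree; it reports events such as "element started" to handler code as it reads the document. The sketch below is a minimal illustration, assuming Python's standard xml.sax module; the class name ElementLister and the sample message are hypothetical.

```python
import xml.sax


class ElementLister(xml.sax.ContentHandler):
    """Collects element names in the order their start tags are read."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called by the parser once for every start tag.
        self.names.append(name)


handler = ElementLister()
xml.sax.parseString(
    b'<message from="Paul" to="Tem"><body>Hi, Tim!</body></message>',
    handler,
)
print(handler.names)
```

Because nothing is kept in memory except what the handler stores, this style suits large documents that are read once from start to end.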
XML Document Object Model (DOM)

W3C standard recommendation
Builds a tree structure in memory for XML documents
DOM-based parsers parse these structures
Exists in several languages (Java, C, C++, Python, Perl, etc.)
DOM (continued)
DOM tree
Each node represents an element, attribute, etc.

<?xml version = "1.0"?>
<message from = "Paul" to = "Tem">
   <body>Hi, Tim!</body>
</message>

Node created for element message
Element message has child node for body element
Element body has child node for text "Hi, Tim!"
Attributes from and to also have nodes in tree
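The node relationships described for the message document can be walked in code. The sketch below is an illustration, assuming Python's standard xml.dom.minidom module rather than the msxml parser used later in this course.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<message from = "Paul" to = "Tem"><body>Hi, Tim!</body></message>'
)

message = doc.documentElement   # node for element message
body = message.firstChild       # child node for element body
text = body.firstChild          # child node for the text inside body

print(message.nodeName, body.nodeName, text.nodeValue)
print(message.getAttribute("from"), message.getAttribute("to"))
```

The same properties (documentElement, firstChild, nodeName, nodeValue) appear in the DOM API of other languages.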
DOM implementations
DOM-based parsers
Microsoft's msxml
Sun Microsystems' JAXP
Some DOM-based parsers

Parser: Description
JAXP: Sun Microsystems' Java API for XML Parsing (JAXP) is available at no charge from java.sun.com/xml.
XML4J: IBM's XML Parser for Java (XML4J) is available at no charge from www.alphaworks.ibm.com/tech/xml4j.
Xerces: Apache's Xerces Java Parser is available at no charge from xml.apache.org/xerces.
msxml: Microsoft's XML parser (msxml) version 2.0 is built into Internet Explorer 5.5. Version 3.0 is also available at no charge from msdn.microsoft.com/xml.
4DOM: 4DOM is a parser for the Python programming language and is available at no charge from fourthought.com/4Suite/4DOM.
XML::DOM: XML::DOM is a Perl module for manipulating XML documents using Perl. For additional information, visit www4.ibm.com/software/developer/library/xml-perl2.
DOM and JavaScript
We use JavaScript and the MSXML parser
XML document marks up an article
Use the DOM API to display the document's element names/values
Example
article.xml: an article marked up with XML tags

<?xml version = "1.0"?>

<article>

   <title>Simple XML</title>

   <date>December 6, 2000</date>

   <author>
      <fname>Tem</fname>
      <lname>Nieto</lname>
   </author>

   <summary>XML is pretty easy.</summary>

   <content>Once you have mastered HTML, XML is easily
      learned. You must remember that XML is not for
      displaying information but for managing information.
   </content>

</article>
Example - DOM
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html>
<body>

<!-- Element script allows for including scripting code -->
<script type = "text/javascript" language = "JavaScript">
   // Instantiate Microsoft XML DOM object
   var xmlDocument = new ActiveXObject( "Microsoft.XMLDOM" );

   // Load article.xml into memory; msxml parses article.xml
   // and stores it as a tree structure
   xmlDocument.load( "article.xml" );

   // get the root element (article)
   var element = xmlDocument.documentElement;

   // Place the root element's name in element strong and write it to the browser
   document.writeln(
      "<p>Here is the root node of the document:" );
   document.writeln( "<strong>" + element.nodeName
      + "</strong>" );

   document.writeln(
      "<br>The following are its child elements:" );
   document.writeln( "</p><ul>" );

   // traverse all child nodes of root element
   for ( i = 0; i < element.childNodes.length; i++ ) {
      // Assign index to each child node of root node
      var curNode = element.childNodes.item( i );

      // print node name of each child element
      document.writeln( "<li><strong>" + curNode.nodeName
         + "</strong></li>" );
   }

   document.writeln( "</ul>" );

   // get the first child node of root element (title)
   var currentNode = element.firstChild;

   document.writeln( "<p>The first child of root node is:" );
   document.writeln( "<strong>" + currentNode.nodeName
      + "</strong>" );
   document.writeln( "<br>whose next sibling is:" );

   // get the next sibling of first child (date); siblings are nodes at the
   // same level in the document (e.g., title, date, author, summary and content)
   var nextSib = currentNode.nextSibling;

   document.writeln( "<strong>" + nextSib.nodeName
      + "</strong>." );
   document.writeln( "<br>Value of <strong>" + nextSib.nodeName
      + "</strong> element is:" );

   // get first child of date (December 6, 2000)
   var value = nextSib.firstChild;

   // print the text value of the sibling
   document.writeln( "<em>" + value.nodeValue + "</em>" );
   document.writeln( "<br>Parent node of " );
   document.writeln( "<strong>" + nextSib.nodeName
      + "</strong> is:" );

   // get parent of date (article)
   document.writeln( "<strong>" + nextSib.parentNode.nodeName
      + "</strong>.</p>" );
</script>

</body>
</html>

Traversing article.xml with JavaScript
XML schema
An XML schema defines the structure of an XML document.
It defines the list of elements and attributes that can be used in an XML document.
It also specifies the order in which these elements appear in the XML document and their datatypes.
Microsoft was among the early promoters of XML schema languages; the XML Schema Definition (XSD) language itself is a W3C Recommendation for creating valid XML documents.
Advantages of XML schemas over DTDs

Both serve a similar purpose: they are used to define the structure of an XML document.
The syntax for defining an XSD is the same as the syntax of an XML document.
The syntax is easier to learn.
An XSD gives more control over the type of the data.
It enables the user to create their own data types.
It allows the user to specify restrictions on data.
XML schema - Datatypes

Primitive: string, decimal, float, boolean
Derived: integer, long, positiveInteger
Atomic: These are datatypes that cannot be broken down into smaller units. These can be primitive or derived.
List: These are derived datatypes that contain a set of values of an atomic data type.
- Example: pointlist 5,25,75
XML schema - Custom defined datatypes

Simple data type
A data type that contains only a value
Complex data type
A data type that contains child elements, attributes and also mixed content
XML schema custom data types

The elements PRODUCTNAME, DESCRIPTION, PRICE and QUANTITY are simple type elements, which do not contain any child elements or attributes. The elements only contain textual value. The elements PRODUCTDATA and PRODUCT are complex type elements that contain child elements, attributes and mixed content.
Declaring a simple type element

Syntax
<xs:element name="element-name" type="data type" minOccurs="nonNegativeInteger" maxOccurs="nonNegativeInteger"/>
Example:
<xs:element name="PRODUCTNAME" type="xs:string"/>
<xs:element name="PRICE" type="xs:positiveInteger"/>
Declaring a simple type based on existing simple datatype

<xs:simpleType name="phoneno">
<xs:restriction base="xs:string">
<xs:length value="12"/>
<xs:pattern value="\d{3}-\d{3}-\d{4}"/>
</xs:restriction>
</xs:simpleType>
It defines a simple datatype called phoneno. The string value must be 12 characters long (ten digits and two hyphens) and must match the pattern ddd-ddd-dddd.
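The length and pattern facets of phoneno can be mimicked in ordinary code to see which values they accept. The sketch below is an approximation, assuming Python and its standard re module; it is not a schema validator. Note that a value matching ddd-ddd-dddd is 12 characters long (ten digits and two hyphens), so the sketch checks for a length of 12. The helper is_valid_phoneno is hypothetical.

```python
import re

PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")
PHONE_LENGTH = 12  # ten digits plus two hyphens


def is_valid_phoneno(value):
    """Apply both facets: exact length and full pattern match."""
    return (
        len(value) == PHONE_LENGTH
        and PHONE_PATTERN.fullmatch(value) is not None
    )


print(is_valid_phoneno("555-123-4567"))
print(is_valid_phoneno("5551234567"))
```

A value must satisfy every facet in the restriction, which is why the helper combines the two checks with "and".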
Various options of string data type

length
minLength
maxLength
pattern
enumeration
Creating user-defined simple datatype
num is the user-defined simple datatype:
<xs:simpleType name="num">
<xs:restriction base="xs:positiveInteger">
<xs:maxInclusive value="400"/>
<xs:minInclusive value="10" />
</xs:restriction>
</xs:simpleType>
Associating an element with a simple data type

<xs:element name="EMPNAME" type="xs:string"/>
<xs:element name="EMPPHONE" type="phoneno"/>
Declaring a complex type element

Syntax
<xs:complexType name="data type name">
. .
</xs:complexType>
Example:
<xs:complexType name="prddata">
. .
</xs:complexType>

XML Day 2
simple.xml

<!-- Fig.: simple.xml -->
<!-- Simple XML document -->

<book title = "C++ How to Program" edition = "3">

   <sample>
      <![CDATA[
      // C++ comment
      if ( this->getX() < 5 && value[ 0 ] != 3 )
         cerr << this->displayError();
      ]]>
   </sample>

   C++ How to Program by Deitel &amp; Deitel
</book>

Node types shown: comment nodes (the two comments), attribute nodes (title and edition), element nodes (book and sample), text nodes (the CDATA text and the text after sample).
XPath tree for simple.xml

Root
   Comment: Fig.: simple.xml
   Comment: Simple XML document
   Element: book
      Attribute: title = C++ How to Program
      Attribute: edition = 3
      Element: sample
         Text: // C++ comment if ( this->getX() < 5 && value[ 0 ] != 3 ) cerr << this->displayError();
      Text: C++ How to Program by Deitel & Deitel
Example - Nodes
<?xml version = "1.0"?>

<!-- Fig.: simple2.xml -->
<!-- Processing instructions and namespaces -->

<html xmlns = "http://www.w3.org/TR/REC-html40">

   <head>
      <title>Processing Instruction and Namespace Nodes</title>
   </head>

   <?deitelprocessor example = "fig11_03.xml"?>

   <body>

      <deitel:book deitel:edition = "1"
         xmlns:deitel = "http://www.deitel.com/xmlhtp1">
         <deitel:title>XML How to Program</deitel:title>
      </deitel:book>

   </body>

</html>

Node types shown: root node, comment nodes, namespace nodes, processing instruction node, element nodes, text nodes, attribute nodes.
Tree diagram of an XML document with a processing-instruction node

Root
   Comment: Fig.: simple2.xml
   Comment: Processing instructions and namespaces
   Element: html
      Namespace: http://www.w3.org/TR/REC-html40
      Element: head
         Element: title
            Text: Processing instructions and Namespace Nodes
      Processing Instruction: deitelprocessor example = "fig.xml"
      Element: body
         Element: book
            Attribute: edition = 1
            Namespace: http://www.deitel.com/xmlhtp1
            Element: title
               Text: XML How to Program
XPath node types

Node type: root
  string-value: Determined by concatenating the string-values of all text-node descendants in document order.
  expanded-name: None.
  Description: Represents the root of an XML document. This node exists only at the top of the tree and may contain element, comment or processing-instruction children.

Node type: element
  string-value: Determined by concatenating the string-values of all text-node descendants in document order.
  expanded-name: The element tag, including the namespace prefix (if applicable).
  Description: Represents an XML element and may contain element, text, comment or processing-instruction children.

Node type: attribute
  string-value: The normalized value of the attribute.
  expanded-name: The name of the attribute, including the namespace prefix (if applicable).
  Description: Represents an attribute of an element.
XPath node types (continued)

Node type: text
  string-value: The character data contained in the text node.
  expanded-name: None.
  Description: Represents the character data content of an element.

Node type: comment
  string-value: The content of the comment (not including <!-- and -->).
  expanded-name: None.
  Description: Represents an XML comment.

Node type: processing instruction
  string-value: The part of the processing instruction that follows the target and any whitespace.
  expanded-name: The target of the processing instruction.
  Description: Represents an XML processing instruction.

Node type: namespace
  string-value: The URI of the namespace.
  expanded-name: The namespace prefix.
  Description: Represents an XML namespace.
XPath example
XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="USA">
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <price>9.90</price>
  </cd>
</catalog>
XPath expressions
To select the ROOT element catalog:
/catalog
To select all the cd elements of the catalog element:

/catalog/cd
To select all the price elements of all the cd elements of the catalog element:
/catalog/cd/price
Note: If the path starts with a slash (/) it represents an absolute path to an element
To select all the cd elements that have a price element with a value larger than 10.80:
/catalog/cd[price>10.80]
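The path expressions above can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited XPath subset. The sketch below rests on two assumptions that are not part of the course material: ElementTree evaluates paths relative to an element, so /catalog/cd is written as cd relative to the catalog root, and it has no numeric comparison predicates, so the price>10.80 test is done in plain Python.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)        # root is the <catalog> element (/catalog)

cds = root.findall("cd")             # /catalog/cd
prices = root.findall("cd/price")    # /catalog/cd/price

# /catalog/cd[price>10.80], with the numeric comparison done in Python
expensive = [cd for cd in cds if float(cd.findtext("price")) > 10.80]

print(root.tag)
print([p.text for p in prices])
print([cd.findtext("title") for cd in expensive])
```

Only the first cd has a price above 10.80, so the last line lists a single title.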
Locating nodes
XML documents can be represented as a tree view of nodes. XPath uses a pattern expression to identify nodes in an XML document. An XPath pattern is a slash-separated list of child element names that describe a path through the XML document. The pattern "selects" elements that match the path.
The following XPath expression selects all the price elements of all the cd elements of the catalog element:
/catalog/cd/price
If the path starts with a slash ( / ) it represents an absolute path to an element.

Locating nodes (continued)

If the path starts with two slashes ( // ), then all elements in the document that fulfill the criteria will be selected, even if they are at different levels in the XML tree.
To select all the cd elements in the document:
//cd
Selecting unknown elements

Wildcards ( * ) can be used to select unknown XML elements. To select all the child elements of all the cd elements of the catalog element:
/catalog/cd/*
To select all the price elements that are grandchild elements of the catalog element:
/catalog/*/price
Selecting unknown elements (continued)

The following XPath expression selects all price elements that have exactly two ancestor elements:
/*/*/price
The following XPath expression selects all elements in the document:

//*
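As a rough illustration of the double-slash and wildcard expressions, here is a sketch using Python's xml.etree.ElementTree. It assumes a shortened two-cd version of the catalog, and it relies on ElementTree paths being relative to the catalog root element, so //cd becomes .//cd. One difference to keep in mind: ElementTree's .//* does not include the starting element itself, whereas XPath's //* also selects catalog.

```python
import xml.etree.ElementTree as ET

# Shortened sample catalog (an assumption for this sketch)
root = ET.fromstring(
    "<catalog>"
    "<cd><title>Empire Burlesque</title><price>10.90</price></cd>"
    "<cd><title>Hide your heart</title><price>9.90</price></cd>"
    "</catalog>"
)

all_cds = root.findall(".//cd")               # //cd
cd_children = root.findall("cd/*")            # /catalog/cd/*
grandchild_prices = root.findall("*/price")   # /catalog/*/price
descendants = root.findall(".//*")            # //* without the catalog element itself

print([e.tag for e in cd_children])
print([p.text for p in grandchild_prices])
print(len(all_cds), len(descendants))
```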
Selecting branches
Square brackets in an XPath expression hold a predicate that narrows the selection further. To select the first cd child element of the catalog element:
/catalog/cd[1]
To select the last cd child element of the catalog element (note: there is no function named first()):
/catalog/cd[last()]
Selecting branches (continued)

To select all the cd elements of the catalog element that have a price element:
/catalog/cd[price]
To select all the cd elements of the catalog element that have a price element with a value of 10.90:
/catalog/cd[price=10.90]
To select all the price elements of all the cd elements of the catalog element that have a price element with a value > 10.90:
/catalog/cd[price>10.90]/price
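The predicates from the "Selecting branches" slides can be sketched with Python's xml.etree.ElementTree, which understands [1], [last()], [price] and [price='text']. This is an approximation rather than a full XPath engine: the helper name titles is made up for this example, and ElementTree compares the price as text, while the XPath expression price=10.90 compares numbers.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog>"
    "<cd><title>Empire Burlesque</title><price>10.90</price></cd>"
    "<cd><title>Hide your heart</title><price>9.90</price></cd>"
    "<cd><title>Greatest Hits</title><price>9.90</price></cd>"
    "</catalog>"
)

def titles(path):
    """Titles of the cd elements selected by an ElementTree path (hypothetical helper)."""
    return [cd.findtext("title") for cd in root.findall(path)]

first = titles("cd[1]")                # /catalog/cd[1]
last = titles("cd[last()]")            # /catalog/cd[last()]
with_price = titles("cd[price]")       # /catalog/cd[price]
# Text comparison: the literal must match the document exactly ('10.90', not '10.9')
exact = titles("cd[price='10.90']")    # /catalog/cd[price=10.90]

print(first, last, len(with_price), exact)
```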
Selecting Attributes
In XPath all attributes are specified by the @ prefix. To select all attributes named country:
//@country
To select all cd elements which have an attribute named country:

//cd[@country]
Selecting Attributes (continued)

To select all cd elements which have any attribute:
//cd[@*]
To select all cd elements which have an attribute named country with a value of 'UK':
//cd[@country='UK']
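The attribute selections can be approximated with Python's xml.etree.ElementTree. The module supports [@country] and [@country='UK'], but it cannot return attribute nodes or evaluate [@*], so this sketch reads attribute values through the element API instead. The sample data is a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog>"
    '<cd country="USA"><title>Empire Burlesque</title></cd>'
    '<cd country="UK"><title>Hide your heart</title></cd>'
    '<cd country="USA"><title>Greatest Hits</title></cd>'
    "</catalog>"
)

# //@country: collect the attribute values from every element that carries one
countries = [el.get("country") for el in root.iter() if "country" in el.attrib]

with_country = root.findall(".//cd[@country]")                # //cd[@country]
with_any_attr = [cd for cd in root.iter("cd") if cd.attrib]   # //cd[@*]
uk_titles = [cd.findtext("title")
             for cd in root.findall(".//cd[@country='UK']")]  # //cd[@country='UK']

print(countries, len(with_country), len(with_any_attr), uk_titles)
```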
Need for XSL

XSL stands for EXtensible Stylesheet Language.
The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based style sheet language.
CSS = HTML Style Sheets
XSL = XML Style Sheets
What is XSLT?
XSLT stands for XSL Transformations.
XSLT is the most important part of XSL.
XSLT transforms an XML document into another XML document.
XSLT uses XPath to navigate in XML documents.
XSLT is a W3C Recommendation.
XSL - More than a style sheet language
XSL consists of three parts:
XSLT: a language for transforming XML documents
XPath: a language for navigating in XML documents
XSL-FO: a language for formatting XML documents
Presenting XML
There are two style sheet languages available for use with XML in Internet Explorer:
Cascading Style Sheets (CSS)
Extensible Stylesheet Language (XSL)
An important point to consider in choosing a style sheet language for a particular document is whether the structure of the XML document is suitable for display. With CSS, the structure of the XML content must be virtually identical to the structure of the presentation. Since one of the goals of XML is a complete separation of content from display, many XML documents are difficult to display as you might wish using CSS.
Difference between CSS and XSL?

XML does not use predefined tags (we can use any tag names we like), so the meaning of these tags is not well understood. A <table> element could mean an HTML table, a piece of furniture, or something else, and a browser does not know how to display it. XSL describes how the XML document should be displayed.
Books.XML
<?xml version="1.0"?>
<?xml-stylesheet href="books.css" type="text/css"?>
<books>
  <book>
    <title>Professional Active Server Pages 3.0</title>
    <authors>
      <author>Richard Anderson</author>
      <author>Chris Blexrud</author>
      <author>Andrea Chiarelli</author>
      <author>Dan Denault</author>
    </authors>
    <price>us="$59.99"</price>
  </book>
</books>
Books.CSS
authors
{
  display:block;
  font-family:Arial,Helvetica;
  font-style:italic;
  font-size:10pt;
  color:#990099;
  text-align:left;
}
price
{
  display:block;
  border:2px solid black;
  padding:1em;
  background-color:#888833;
  color:#FFFFDD;
  font-weight:bold;
  margin-bottom:.4em;
}
XSL advantages
More sophisticated layout using HTML tables
Data items appearing more than once in the style sheet
Access to information stored in attribute values
Reordering of items
Dynamic display behaviors not easily possible through CSS or by modifying the source
XSL elements
There are various XSL elements that can be used to apply styles to an XML document. Here are a few of them:
xsl:for-each
xsl:value-of
xsl:if
xsl:sort
xsl:choose
Interview.XML
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="Interview.xsl"?>
<Interview xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <Candidate>
    <Name>Mahesh</Name>
    <Project>Procter &amp; Gamble</Project>
    <Score dt:dt="number">88</Score>
  </Candidate>
  <Candidate>
    <Name>Vishnu</Name>
    <Project>Banking</Project>
    <Score dt:dt="number">99</Score>
  </Candidate>
  <Candidate>
    <Name>Sridhar</Name>
    <Project>Telecom</Project>
    <Score dt:dt="number">100</Score>
  </Candidate>
</Interview>
Interview.XSL
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <TR>
            <TD>Name</TD>
            <TD>Project</TD>
            <TD>Score</TD>
          </TR>
          <xsl:for-each select="Interview/Candidate">
            <TR>
              <TD><xsl:value-of select="Name"/></TD>
              <TD><xsl:value-of select="Project"/></TD>
              <TD><xsl:value-of select="Score"/></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
Xsl:for-each
<xsl:for-each order-by="sort-criteria-list" select="pattern">
order-by
Sort criteria in a semicolon-separated list. When the first sort results in two equal items, the second sort criterion is checked, and so on. The first non-white-space character in each sort criterion indicates whether the sort is ascending (optional +) or descending (-). The sort criterion is expressed as an XSL pattern, relative to the pattern described in the select attribute.
select
XSL pattern query evaluated against the current context to determine the set of nodes to iterate over. The default value "node()" indicates selection of all children of the current node.
Xsl:for-each (continued)
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <TABLE>
          <xsl:for-each select="customers/customer" order-by="name; -address/state">
            <TR>
              <TD><xsl:value-of select="name"/></TD>
              <TD><xsl:value-of select="address"/></TD>
              <TD><xsl:value-of select="phone"/></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
Xsl:value-of
Inserts the value of the selected node as text.
<xsl:value-of select="pattern"/>
select
XSL pattern to be matched against the current context. The default value is ".", which inserts the value of the current node.
Accessing the Attribute (example)

Open Interview.XML and add an attribute called Skill to each Candidate element. The following style sheet then lists only the candidates whose Skill is COM:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <TR>
            <TD>Name</TD>
            <TD>Project</TD>
            <TD>Score</TD>
          </TR>
          <xsl:for-each select="Interview/Candidate[@Skill='COM']">
            <TR>
              <TD><xsl:value-of select="Name"/></TD>
              <TD><xsl:value-of select="Project"/></TD>
              <TD><xsl:value-of select="Score"/></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
Accessing Attribute of child Node

Now add a DOB attribute to the Name element and try this code.
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <TR>
            <TD>Name</TD>
            <TD>Date of Birth</TD>
            <TD>Project</TD>
            <TD>Score</TD>
          </TR>
          <xsl:for-each select="Interview/Candidate[@Skill='COM']">
            <TR>
              <TD><xsl:value-of select="Name"/></TD>
              <TD><xsl:value-of select="Name/@DOB"/></TD>
              <TD><xsl:value-of select="Project"/></TD>
              <TD><xsl:value-of select="Score"/></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
XSLT = XSL transformations

XSLT is the most important part of XSL. XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element. With XSLT we can add/remove elements and attributes to or from the output file. We can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more. A common way to describe the transformation process is to say that XSLT transforms an XML source-tree into an XML result-tree.
XSLT uses XPath

XSLT uses XPath to find information in an XML document.
XPath is used to navigate through elements and attributes in XML documents.

In the transformation process, XSLT uses XPath to define parts of the source document that should match one or more predefined templates. When a match is found, XSLT will transform the matching part of the source document into the result document.
Browsers supporting XML and XSLT

Mozilla Firefox
As of version 1.0.2, Firefox has support for XML and XSLT (and CSS).
Mozilla
Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation.
Netscape
As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla.
Opera
As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS.
Internet Explorer
As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath. Version 5 is NOT compatible with the official W3C XSL Recommendation.
Style sheet declaration

The root element that declares the document to be an XSL style sheet is <xsl:stylesheet> or <xsl:transform>. Note:
<xsl:stylesheet> and <xsl:transform> are completely synonymous and either can be used!
The correct way to declare an XSL style sheet according to the W3C XSLT Recommendation is:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
or:
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
Overview of XML transformations : Tree

Every well-formed XML document is a tree. A tree is a data structure composed of connected nodes beginning with a top node called the root. The root is connected to its child nodes, each of which is connected to zero or more children of its own, and so forth. Nodes that have no children of their own are called leaves.
The most useful property of a tree is that each node and its children also form a tree. Thus, a tree is a hierarchical structure of trees in which each tree is built out of smaller trees.
Overview of XML transformations:

View PeriodicTable.xml
The PERIODIC_TABLE element contains two child nodes, both ATOM elements. Each ATOM element has an attribute node for its STATE attribute, and a variety of child element nodes.
Each child element contains a node for its contents, as well as nodes for any attributes, comments and processing instructions it possesses.
Notice in particular that many nodes are something other than elements. There are nodes for text, attributes, comments, namespaces and processing instructions.
XSLT transformation
The input must be an XML document.
XSLT can work with HTML and SGML documents only when they satisfy XML's well-formedness rules.
XSLT is not a general-purpose regular expression language for transforming arbitrary data.
The XSL transformation language contains operators for selecting nodes from the tree, reordering the nodes, and outputting nodes.
Most of the time the output of an XSLT transformation is also an XML document.
XSLT processors also support output as HTML and/or raw text, although the standard does not require them to do so.
XSLT transformation (continued)

For the purposes of XSLT, elements, text, attributes, namespaces, processing instructions, and comments are counted as nodes. Furthermore, the root of the document must be distinguished from the root element. Thus, XSLT processors model an XML document as a tree that contains seven kinds of nodes:
The root
Elements
Text
Attributes
Namespaces
Processing instructions
Comments
XSLT transformation (continued)

The root PERIODIC_TABLE element contains ATOM child elements. Each ATOM element contains several child elements providing the atomic number, atomic weight, symbol, boiling point, and so forth. A UNITS attribute specifies the units for those elements that have units.
Style sheet declaration (continued)

To get access to the XSLT elements, attributes and features we must declare the XSLT namespace at the top of the document. The declaration xmlns:xsl="http://www.w3.org/1999/XSL/Transform" points to the official W3C XSLT namespace. If we use this namespace, we must also include the attribute version="1.0".
Creating a raw XML document

We want to transform the following XML document ("cdcatalog.xml") into XHTML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
  ...
</catalog>
Create an XSL style sheet

Then you create an XSL style sheet ("cdcatalog.xsl") with a transformation template:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>My CD Collection</h2>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th align="left">Title</th>
            <th align="left">Artist</th>
          </tr>
          <xsl:for-each select="catalog/cd">
            <tr>
              <td><xsl:value-of select="title"/></td>
              <td><xsl:value-of select="artist"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
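To make the effect of this template concrete, the following Python sketch imitates what it produces for the one cd shown in cdcatalog.xml. It is only an analogue written with the standard library, which has no XSLT processor; an actual XSLT engine would run the style sheet itself, and the variable names here are illustrative.

```python
import xml.etree.ElementTree as ET
from html import escape

catalog = ET.fromstring(
    "<catalog><cd><title>Empire Burlesque</title><artist>Bob Dylan</artist>"
    "<country>USA</country><company>Columbia</company>"
    "<price>10.90</price><year>1985</year></cd></catalog>"
)

rows = []
for cd in catalog.findall("cd"):             # xsl:for-each select="catalog/cd"
    title = escape(cd.findtext("title"))     # xsl:value-of select="title"
    artist = escape(cd.findtext("artist"))   # xsl:value-of select="artist"
    rows.append(f"<tr><td>{title}</td><td>{artist}</td></tr>")

# The literal markup in the template is copied to the output around the rows
html = (
    "<html><body><h2>My CD Collection</h2>"
    '<table border="1"><tr bgcolor="#9acd32">'
    '<th align="left">Title</th><th align="left">Artist</th></tr>'
    + "".join(rows)
    + "</table></body></html>"
)
print(html)
```

The elements country, company, price and year never reach the output because the template asks only for title and artist.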
Link the XSL style sheet to the XML document
Add the XSL style sheet reference to your XML document ("cdcatalog.xml"):
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
View cdcatalog.xml View cdcatalog_with_xsl.xml
XSL templates
An XSL style sheet consists of one or more sets of rules called templates. Each template contains rules to apply when a specified node is matched.
The <xsl:template> element

The <xsl:template> element is used to build templates. The match attribute is used to associate a template with an XML element. The match attribute can also be used to define a template for the entire XML document. The value of the match attribute is an XPath expression (for example, match="/" matches the whole document).
An example
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>My CD Collection</h2>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th align="left">Title</th>
            <th align="left">Artist</th>
          </tr>
          <tr>
            <td>.</td>
            <td>.</td>
          </tr>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Explanation
Since an XSL style sheet is an XML document itself, it always begins with the XML declaration: <?xml version="1.0" encoding="ISO-8859-1"?>. The next element, <xsl:stylesheet>, defines that this document is an XSLT style sheet document (along with the version number and XSLT namespace attributes). The <xsl:template> element defines a template. The match="/" attribute associates the template with the root of the XML source document.
The content inside the <xsl:template> element defines some HTML to write to the output.
The last two lines define the end of the template and the end of the style sheet.
The result of the transformation above will look like this:
Where does the XML transformation happen?

There are three primary ways to transform XML documents into other formats, such as HTML, with an XSLT style sheet:
The XML document and associated style sheet are both served to the client (Web browser), which then transforms the document as specified by the style sheet and presents it to the user.
The server applies an XSLT style sheet to an XML document to transform it to some other format (generally HTML) and sends the transformed document to the client (Web browser).
A third program transforms the original XML document into some other format (often HTML) before the document is placed on the server. Both server and client only deal with the transformed document.
The <xsl:value-of>
The <xsl:value-of> element is used to extract the value of a selected node. It can be used to extract the value of an XML element and add it to the output stream of the transformation.
View cdcatalog_valueof.xsl
View cdcatalog_valueof.xml
The <xsl:for-each>
The <xsl:for-each> element allows you to do looping in XSLT. The element can be used to select every XML element of a specified node set:
<xsl:for-each select="catalog/cd">
  <tr>
    <td><xsl:value-of select="title"/></td>
    <td><xsl:value-of select="artist"/></td>
  </tr>
</xsl:for-each>
Note: The value of the select attribute is an XPath expression. An XPath expression works like navigating a file system, where a forward slash (/) selects subdirectories.
Filtering the output

We can also filter the output from the XML file by adding a criterion to the select attribute in the <xsl:for-each> element.
<xsl:for-each select="catalog/cd[artist='Bob Dylan']">
Legal filter operators are:
= (equal)
!= (not equal)
&lt; (less than)
&gt; (greater than)
Filtering the output

Take a look at the adjusted XSL style sheet:
<xsl:for-each select="catalog/cd[artist='Bob Dylan']">
  <tr>
    <td><xsl:value-of select="title"/></td>
    <td><xsl:value-of select="artist"/></td>
  </tr>
</xsl:for-each>
The <xsl:sort>
To sort the output, simply add an <xsl:sort> element inside the <xsl:for-each> element in the XSL file:
<xsl:for-each select="catalog/cd">
  <xsl:sort select="artist"/>
  <tr>
    <td><xsl:value-of select="title"/></td>
    <td><xsl:value-of select="artist"/></td>
  </tr>
</xsl:for-each>
Note: The select attribute indicates what XML element to sort on.
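The effect of <xsl:sort select="artist"/> can be pictured with a short Python analogue. This is a sketch rather than XSLT: the input below is a hypothetical reordering of the three example CDs so that the sort has something to do, and Python compares strings by code point, whereas an XSLT processor may apply language-specific ordering to text.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the three example CDs, deliberately out of order
catalog = ET.fromstring(
    "<catalog>"
    "<cd><title>Greatest Hits</title><artist>Dolly Parton</artist></cd>"
    "<cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>"
    "<cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>"
    "</catalog>"
)

# xsl:for-each select="catalog/cd" combined with xsl:sort select="artist"
sorted_cds = sorted(catalog.findall("cd"), key=lambda cd: cd.findtext("artist"))

artists = [cd.findtext("artist") for cd in sorted_cds]
print(artists)
```

The source document itself is left untouched; only the order in which the rows are emitted changes.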
The <xsl:if> element

To put a conditional if test against the content of the XML file, add an <xsl:if> element to the XSL document.
Syntax
<xsl:if test="expression">
  ... some output if the expression is true ...
</xsl:if>
Where to Put the <xsl:if> element

<xsl:for-each select="catalog/cd">
  <xsl:if test="price > 10">
    <tr>
      <td><xsl:value-of select="title"/></td>
      <td><xsl:value-of select="artist"/></td>
    </tr>
  </xsl:if>
</xsl:for-each>
Note: The value of the required test attribute contains the expression to be evaluated. The code above outputs the title and artist elements only for the CDs that have a price higher than 10.
The <xsl:choose> element

The <xsl:choose> element is used in conjunction with <xsl:when> and <xsl:otherwise> to express multiple conditional tests.
Syntax
<xsl:choose>
  <xsl:when test="expression">
    ... some output ...
  </xsl:when>
  <xsl:otherwise>
    ... some output ...
  </xsl:otherwise>
</xsl:choose>
Where to put the choose condition
To insert a multiple conditional test against the XML file, add the <xsl:choose>, <xsl:when>, and <xsl:otherwise> elements to the XSL file:
<xsl:for-each select="catalog/cd">
  <tr>
    <td><xsl:value-of select="title"/></td>
    <xsl:choose>
      <xsl:when test="price > 10">
        <td bgcolor="#ff00ff"><xsl:value-of select="artist"/></td>
      </xsl:when>
      <xsl:otherwise>
        <td><xsl:value-of select="artist"/></td>
      </xsl:otherwise>
    </xsl:choose>
  </tr>
</xsl:for-each>

The <xsl:choose> with <xsl:when> element

<xsl:for-each select="catalog/cd">
  <tr>
    <td><xsl:value-of select="title"/></td>
    <xsl:choose>
      <xsl:when test="price > 10">
        <td bgcolor="#ff00ff"><xsl:value-of select="artist"/></td>
      </xsl:when>
      <xsl:when test="price > 9">
        <td bgcolor="#cccccc"><xsl:value-of select="artist"/></td>
      </xsl:when>
      <xsl:otherwise>
        <td><xsl:value-of select="artist"/></td>
      </xsl:otherwise>
    </xsl:choose>
  </tr>
</xsl:for-each>
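In an <xsl:choose> block the first <xsl:when> whose test is true wins, and later branches are not considered. The Python sketch below mirrors that ordering with if / elif / else. The function name artist_cell is made up for this illustration, and the price 8.50 in the usage lines is a hypothetical value chosen to reach the otherwise branch.

```python
def artist_cell(price, artist):
    """Return the table cell that the xsl:choose block would emit for one cd."""
    if price > 10:        # first xsl:when
        return f'<td bgcolor="#ff00ff">{artist}</td>'
    elif price > 9:       # second xsl:when, reached only if the first test failed
        return f'<td bgcolor="#cccccc">{artist}</td>'
    else:                 # xsl:otherwise
        return f"<td>{artist}</td>"

print(artist_cell(10.90, "Bob Dylan"))
print(artist_cell(9.90, "Bonnie Tyler"))
print(artist_cell(8.50, "Example Artist"))
```

A price of 10.90 also satisfies price > 9, but it never gets the grey background because the first branch has already matched.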
Processing the child elements : <xsl:apply-templates>

The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes. If we add a select attribute to the <xsl:apply-templates> element it will process only the child element that matches the value of the attribute. We can use the select attribute to specify the order in which the child nodes are processed.
Processing the child elements : <xsl:apply-templates>

View cdcatalog_applytemplate.xsl View cdcatalog_applytemplate.xml
Wild cards
Sometimes you want a single template to apply to more than one element. You can indicate that a template matches all elements by using the asterisk wildcard (*) in place of an element name in the match attribute. For example, this template says that all elements should be wrapped in a P element:
<xsl:template match="*">
  <P>
    <xsl:value-of select="."/>
  </P>
</xsl:template>
Of course this is probably more than you want.
We'd like to use the template rules already defined for PERIODIC_TABLE and ATOM elements as well as the root node, and only use this rule for the other elements.
Matching by ID
We may want to apply a particular style to a particular single element without changing all other elements of that type. The simplest way to do that in XSLT is to attach a style to the element's ID type attribute. This is done with the id() selector, which contains the ID value in single quotes. For example, this rule makes the element with the ID e47 bold:
<xsl:template match="id('e47')">
  <b><xsl:value-of select="."/></b>
</xsl:template>
Matching attributes with @

The @ sign matches against attributes and selects nodes according to attribute names. Simply prefix the name of the attribute that you want to select with the @ sign. For example, this template rule matches UNITS attributes, and wraps them in an I element.
<xsl:template match="@UNITS">
<I><xsl:value-of select="."/></I> </xsl:template>
Expression types
Every expression evaluates to a single value. There are five types of expressions in XSLT:
Node sets
Booleans
Numbers
Strings
Result tree fragments
Node sets
A node set is an unordered group of nodes from the input document. The axes return a node set containing the nodes they match. Which nodes are in the node set depends on the context node, the node test, and the axis. For example, when the context node is the PERIODIC_TABLE element, the XPath expression
select="child::ATOM" returns a node set that contains both ATOM elements in that document.
select="child::ATOM/child::NAME" returns a node set containing the two element nodes <NAME>Hydrogen</NAME> and <NAME>Helium</NAME> when the context node is the PERIODIC_TABLE element.
Context node
The context node is a member of the context node list. The context node list is that group of elements that all match the same rule at the same time, generally as a result of one xsl:apply-templates or xsl:for-each call.
Functions that operate on or return node sets

Function: position()
  Return type: number
  Returns: The position of the context node in the context node list; the first node in the list has position 1.

Function: last()
  Return type: number
  Returns: The number of nodes in the context node list; this is the same as the position of the last node in the list.

Function: count(node-set)
  Return type: number
  Returns: The number of nodes in node-set.

Function: id(string1 string2 string3)
  Return type: node set
  Returns: A node set containing all the elements anywhere in the same document that have an ID named in the argument list; the empty set if no element has the specified ID.

Function: key(string name, Object value)
  Return type: node set
  Returns: A node set containing all nodes in this document that have a key with the specified value. Keys are set with the top-level xsl:key element.

Function: document(string URI, string base)
  Return type: node set
  Returns: A node set in the document referred to by the URI; the nodes are chosen from the named anchor or XPointer used by the URI. If there is no named anchor or XPointer, then the root element of the named document is the node set. Relative URIs are relative to the base URI given in the second argument. If the second argument is omitted, then relative URIs are relative to the URI of the style sheet (not the source document!).
Functions that operate on or return node sets

Function: local-name(node set)
  Return type: String
  Returns: The local name (everything after the namespace prefix) of the first node in the node set argument; can be used without any arguments to get the local name of the context node.

Function: namespace-uri(node set)
  Return type: String
  Returns: The URI of the namespace of the first node in the node set; can be used without any arguments to get the URI of the namespace of the context node; returns an empty string if the node is not in a namespace.

Function: name(node set)
  Return type: String
  Returns: The qualified name (both prefix and local part) of the first node in the node set argument; can be used without an argument to get the qualified name of the context node.

Function: generate-id(node set)
  Return type: String
  Returns: A unique identifier for the first node in the argument node set; can be used without any argument to generate an ID for the context node.
Position(): Example
The position() function can be used to determine an element's position within a node set. The example prefixes each atom's name with its position in the document using
<xsl:value-of select="position()"/>.
View periodictable_position.xsl
View periodictable_position.xml
Booleans
A Boolean has one of two values: true or false. XSLT allows any kind of data to be converted into a Boolean. This is often done implicitly when a string or a number or a node set is used where a Boolean is expected, as in the test attribute of an xsl:if element. These conversions can also be performed by the boolean() function, which converts an argument of any type to a Boolean according to these rules:
A number is false if it is zero or NaN (a special symbol meaning Not a Number, used for the result of undefined operations such as dividing zero by zero); true otherwise.
An empty node set is false. All other node sets are true.
An empty result tree fragment is false. All other result tree fragments are true.
A zero-length string is false. All other strings are true.
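The conversion rules can be written down as a small Python function. This is a sketch of the rules as stated on the slide, not an XPath implementation: the name xpath_boolean is made up, and Python lists stand in for node sets.

```python
import math

def xpath_boolean(value):
    """Apply the boolean() conversion rules to a Python stand-in for an XPath value."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # A number is false only if it is zero or NaN
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        # Only the zero-length string is false
        return len(value) > 0
    if isinstance(value, list):
        # A list stands in for a node set: empty is false
        return len(value) > 0
    raise TypeError(f"unsupported value: {value!r}")

print(xpath_boolean(0), xpath_boolean(float("nan")), xpath_boolean(42))
print(xpath_boolean(""), xpath_boolean("false"))
print(xpath_boolean([]), xpath_boolean(["<ATOM>"]))
```

Note that the string "false" converts to true, because any string with at least one character is true.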
Booleans (continued)
Booleans are also produced as the result of expressions involving these operators:
= equal to
!= not equal to
< less than (really &lt;)
> greater than
<= less than or equal to (really &lt;=)
>= greater than or equal to
Note: The < sign is illegal in attribute values. Consequently, it must be replaced by &lt; even when used as the less-than operator.
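The rule about the < sign comes from XML itself, so it can be checked with any XML parser. The sketch below uses Python's xml.etree.ElementTree on two hypothetical one-line documents: the escaped form parses and yields the plain < character, while the raw form is rejected as not well-formed.

```python
import xml.etree.ElementTree as ET

escaped = '<if test="price &lt; 10"/>'   # legal: < written as &lt;
raw = '<if test="price < 10"/>'          # illegal: raw < inside an attribute value

# The parser turns &lt; back into < before the value is handed to the application
test_value = ET.fromstring(escaped).get("test")

try:
    ET.fromstring(raw)
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

print(test_value)
print(raw_is_well_formed)
```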
Booleans (continued)
child::ATOM selects all the ATOM children of the context node. child::ATOM[position()=1] selects only the first ATOM child of the context node. [position()=1] is a predicate on the node test ATOM that returns a boolean result:
true if the position of the ATOM is equal to one; false otherwise.
Each node test can have any number of predicates. However, more than one is unusual.
Example of boolean operators

For example, this template rule applies to the first ATOM element in the periodic table, but not to subsequent ones, by testing whether or not the position of the element equals 1.
<xsl:template match="PERIODIC_TABLE/ATOM[position()=1]"> <xsl:value-of select="."/> </xsl:template>
This template rule applies to all ATOM elements that are not the first child element of the PERIODIC_TABLE by testing whether the position is greater than 1:
<xsl:template match="PERIODIC_TABLE/ATOM[position()>1]"> <xsl:value-of select="."/>
</xsl:template>
Example of boolean operators (continued)

The and operator is true only when both of its conditions are true. This template matches an ATOMIC_NUMBER element that is both the first and the last ATOMIC_NUMBER child of its parent, that is, the only one:
<xsl:template match="ATOMIC_NUMBER[position()=1 and position()=last()]">
  <xsl:value-of select="."/>
</xsl:template>
If the first condition is false, then the complete and expression is guaranteed to be false. Consequently, the second condition won't be checked.
The or operator is true when either condition is true. This template matches both the first and last ATOM elements in their parent by matching when the position is 1 or when the position is equal to the number of elements in the set:
<xsl:template match="ATOM[position()=1 or position()=last()]">
  <xsl:value-of select="."/>
</xsl:template>
Example of boolean operators (continued)

The not() function reverses the result of an operation. For example, this template rule matches all ATOM elements that are not the first child of their parents:
<xsl:template match="ATOM[not(position()=1)]">
  <xsl:value-of select="."/>
</xsl:template>
The same template rule could be written using the not-equal operator != instead:
<xsl:template match="ATOM[position()!=1]">
  <xsl:value-of select="."/>
</xsl:template>
Number functions
XPath numbers are 64-bit IEEE 754 floating-point doubles. Even numbers like 42 or -7000 that look like integers are stored as doubles.
Non-number values such as strings and booleans are converted to numbers automatically as necessary, or at user request through the number() function, using these rules:
Booleans are 1 if true; 0 if false.
A string is trimmed of leading and trailing white space, then converted to a number in the fashion you would expect. For example, the string "12" is converted to the number 12.
If the string cannot be interpreted as a number, then it is converted to the special symbol NaN, which stands for Not a Number.
Node sets and result tree fragments are converted to strings; the string is then converted to a number.
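The conversion rules above can be sketched in Python. This is a rough model rather than a real XPath engine: the name xpath_number is an assumption, and the accepted string syntax (an optional minus sign, digits, an optional fraction, no exponent) follows the XPath 1.0 Recommendation rather than anything stated on these slides.

```python
import math
import re

# XPath 1.0 number syntax: optional minus sign, then digits with an optional
# fraction, or a bare fraction. No leading plus sign and no exponent.
_XPATH_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")

def xpath_number(value):
    """Convert a boolean, number, or string the way number() is described above."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    text = value.strip(" \t\r\n")  # trim XML white space only
    if _XPATH_NUMBER.fullmatch(text):
        return float(text)
    return math.nan  # the string cannot be interpreted as a number

print(xpath_number(" 12 "))    # 12.0
print(xpath_number(True))      # 1.0
print(xpath_number("twelve"))  # nan
```

A node set would first be reduced to the string value of its first node and then passed through the same string branch.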
Number functions (continued)

For example: This template only outputs the non-naturally occurring transuranium elements; that is, those elements with atomic numbers greater than 92 (the atomic number of uranium).
The node set produced by ATOMIC_NUMBER is implicitly converted to the string value of the current ATOMIC_NUMBER node. This string is then converted into a number.
<xsl:template match="/PERIODIC_TABLE"> <HTML> <HEAD> <TITLE>The Transuranium Elements</TITLE> </HEAD> <BODY> <xsl:apply-templates select="ATOM[ATOMIC_NUMBER>92]"/> </BODY>
</HTML> </xsl:template>
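The same selection can be reproduced with Python's standard xml.etree.ElementTree module, which makes the node to string to number chain explicit. The sample document and the function name are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-atom extract of a periodic table document.
SAMPLE = """<PERIODIC_TABLE>
  <ATOM><NAME>Uranium</NAME><ATOMIC_NUMBER>92</ATOMIC_NUMBER></ATOM>
  <ATOM><NAME>Neptunium</NAME><ATOMIC_NUMBER> 93 </ATOMIC_NUMBER></ATOM>
  <ATOM><NAME>Plutonium</NAME><ATOMIC_NUMBER>94</ATOMIC_NUMBER></ATOM>
</PERIODIC_TABLE>"""

def transuranium_names(xml_text):
    root = ET.fromstring(xml_text)
    names = []
    for atom in root.findall("ATOM"):
        # Node -> string value -> number, as in ATOM[ATOMIC_NUMBER>92].
        atomic_number = float(atom.findtext("ATOMIC_NUMBER").strip())
        if atomic_number > 92:
            names.append(atom.findtext("NAME"))
    return names

print(transuranium_names(SAMPLE))  # ['Neptunium', 'Plutonium']
```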
Number functions (continued)

XPath provides the standard four arithmetic operators:
+ for addition
- for subtraction
* for multiplication
div for division (the more common / is already used for other purposes in XPath)
A fifth operator, mod, returns the remainder of a division.
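Because every XPath number is a double, div is always floating-point division, and dividing by zero yields Infinity or NaN instead of an error. The Python sketch below models that behaviour; the name xpath_div is an assumption, and the division-by-zero results come from IEEE 754 arithmetic as adopted by XPath 1.0, not from the slides.

```python
import math

def xpath_div(a, b):
    """Floating-point division in the IEEE 754 style used by XPath's div."""
    a, b = float(a), float(b)
    if b != 0.0:
        return a / b  # also covers a NaN divisor, which gives NaN
    if a == 0.0 or math.isnan(a):
        return math.nan  # 0 div 0 is NaN
    # Non-zero dividend over zero: infinity carrying the combined sign.
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return math.copysign(math.inf, sign)

print(xpath_div(7, 2))   # 3.5
print(xpath_div(1, 0))   # inf
print(xpath_div(-1, 0))  # -inf
```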
Number functions: Example

For example, this rule selects those elements whose atomic weight is more than twice their atomic number:
<xsl:template match="/PERIODIC_TABLE">
<HTML> <BODY> <H1>High Atomic Weight to Atomic Number Ratios</H1>
<xsl:apply-templates select="ATOM[ATOMIC_WEIGHT > 2 * ATOMIC_NUMBER]"/>
</BODY> </HTML>
</xsl:template>
Number functions example (continued)

This template actually prints the ratio of atomic weight to atomic number:
<xsl:template match="ATOM">
<p> <xsl:value-of select="NAME"/>
<xsl:value-of select="ATOMIC_WEIGHT div ATOMIC_NUMBER"/>
</p>
</xsl:template>

XPath includes four functions that operate on numbers:
floor() returns the greatest integer less than or equal to the number
ceiling() returns the smallest integer greater than or equal to the number
round() rounds the number to the nearest integer
sum() returns the sum of the numeric values of the nodes in its node-set argument
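One detail worth checking is how round() treats values exactly halfway between two integers. The XPath 1.0 Recommendation picks the integer closer to positive infinity, which differs from Python's built-in round(). The sketch below is an assumption-laden model (the name xpath_round is hypothetical, and negative zero is not reproduced).

```python
import math

def xpath_round(x):
    """Round to the nearest integer; ties go toward positive infinity."""
    x = float(x)
    if math.isnan(x) or math.isinf(x):
        return x  # NaN and the infinities are returned unchanged
    lower = math.floor(x)
    # Move up when the fractional part is at least one half.
    return float(lower + 1) if x - lower >= 0.5 else float(lower)

print(xpath_round(2.5))   # 3.0
print(xpath_round(-2.5))  # -2.0
print(round(2.5))         # 2, because Python rounds ties to the even integer
```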

For example: This template rule estimates the number of neutrons in an atom by subtracting the atomic number (the number of protons) from the atomic weight (the weighted average over the natural distribution of isotopes of the number of neutrons plus the number of protons) and rounding to the nearest integer:
<xsl:template match="ATOM">
<p> <xsl:value-of select="NAME"/>
<xsl:value-of select="round(ATOMIC_WEIGHT - ATOMIC_NUMBER)"/>
</p>
</xsl:template>

This rule calculates the average atomic weight of all the atoms in the table by adding all the atomic weights, and then dividing by the number of atoms:
<xsl:template match="/PERIODIC_TABLE">
<HTML> <BODY> <H1>Average Atomic Weight</H1>
<xsl:value-of select="sum(descendant::ATOMIC_WEIGHT) div count(descendant::ATOMIC_WEIGHT)"/>
</BODY> </HTML>
</xsl:template>
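The sum-then-divide calculation can be checked with a few lines of Python using the standard xml.etree.ElementTree module. The sample data uses rounded weights chosen for easy arithmetic, and the function name is hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical extract; the weights are rounded for illustration only.
SAMPLE = """<PERIODIC_TABLE>
  <ATOM><NAME>Hydrogen</NAME><ATOMIC_WEIGHT>1.0</ATOMIC_WEIGHT></ATOM>
  <ATOM><NAME>Helium</NAME><ATOMIC_WEIGHT>4.0</ATOMIC_WEIGHT></ATOM>
  <ATOM><NAME>Lithium</NAME><ATOMIC_WEIGHT>7.0</ATOMIC_WEIGHT></ATOM>
</PERIODIC_TABLE>"""

def average_atomic_weight(xml_text):
    root = ET.fromstring(xml_text)
    # root.iter() walks all descendants, like descendant::ATOMIC_WEIGHT.
    weights = [float(node.text) for node in root.iter("ATOMIC_WEIGHT")]
    if not weights:
        return math.nan  # XPath would evaluate 0 div 0, which is NaN
    return sum(weights) / len(weights)

print(average_atomic_weight(SAMPLE))  # 4.0
```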
String functions
A string is a sequence of Unicode characters. Other data types can be converted to strings using the string() function according to these rules:
Node sets are converted to strings by using the value of the first node in the set as calculated by the xsl:value-of element.
A number is converted to a plain decimal number string like -12 or 3.1415292.
Boolean false is converted to the English word false.
Boolean true is converted to the English word true.
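A rough Python model of these rules follows. It is a sketch under assumptions: the name xpath_string is hypothetical, a node set is represented as a list of string values, and the handling of NaN and Infinity follows the XPath 1.0 Recommendation rather than the slide text.

```python
import math
from decimal import Decimal

def xpath_string(value):
    """Convert a boolean, number, or node set (list of strings) to a string."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        number = float(value)
        if math.isnan(number):
            return "NaN"
        if math.isinf(number):
            return "Infinity" if number > 0 else "-Infinity"
        if number == int(number):
            return str(int(number))  # integers print without a decimal point
        return format(Decimal(repr(number)), "f")  # no exponent notation
    if isinstance(value, list):
        return value[0] if value else ""  # first node only; empty set gives ""
    return str(value)

print(xpath_string(-12.0))                   # -12
print(xpath_string(3.1415292))               # 3.1415292
print(xpath_string(["Hydrogen", "Helium"]))  # Hydrogen
```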
String functions
Function: starts-with(main_string, prefix_string)
Return Type: Boolean
Returns: True if main_string starts with prefix_string; false otherwise

Function: contains(containing_string, contained_string)
Return Type: Boolean
Returns: True if the contained_string is part of the containing_string; false otherwise

Function: substring(string, offset, length)
Return Type: String
Returns: length characters from the specified offset in string; or all characters from the offset to the end of the string if length is omitted; length and offset are rounded to the nearest integer if necessary; the first character in the string is at offset 1

Function: substring-before(string, marker-string)
Return Type: String
Returns: The part of the string from the first character up to (but not including) the first occurrence of marker-string

Function: substring-after(string, marker-string)
Return Type: String
Returns: The part of the string from the end of the first occurrence of marker-string to the end of string

Function: string-length(string)
Return Type: Number
Returns: The number of characters in string

Function: normalize-space(string)
Return Type: String
Returns: The string after leading and trailing white space is stripped and runs of white space are replaced with a single space; if the argument is omitted the string value of the context node is normalized
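Several of these functions differ from their usual programming-language counterparts, mainly because offsets start at 1. The Python sketch below models four of them. The function names are assumptions, the arguments are assumed to be finite numbers, and the behaviour when the marker is missing (an empty string) is taken from the XPath 1.0 Recommendation.

```python
import math
import re

def xpath_substring(text, offset, length=None):
    # Positions are 1-based; offset and length are rounded (halves round up).
    start = math.floor(offset + 0.5)
    end = math.inf if length is None else start + math.floor(length + 0.5)
    return "".join(ch for pos, ch in enumerate(text, start=1) if start <= pos < end)

def substring_before(text, marker):
    index = text.find(marker)
    return text[:index] if index >= 0 else ""

def substring_after(text, marker):
    index = text.find(marker)
    return text[index + len(marker):] if index >= 0 else ""

def normalize_space(text):
    # Only XML white space (space, tab, CR, LF) separates the words.
    return " ".join(re.findall(r"[^ \t\r\n]+", text))

print(xpath_substring("12345", 2, 3))       # 234
print(substring_before("1999/04/01", "/"))  # 1999
print(substring_after("1999/04/01", "/"))   # 04/01
print(normalize_space("  a \n b  "))        # a b
```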
String functions (continued)

Function: translate(string, replaced_text, replacement_text)
Return Type: String
Returns: Returns string with occurrences of characters in replaced_text replaced by the corresponding characters from replacement_text

Function: concat(string1, string2, . . . )
Return Type: String
Returns: Returns the concatenation of as many strings as are passed as arguments in the order they were passed

Function: format-number(number, format-string, locale-string)
Return Type: String
Returns: Returns the string form of number formatted according to the specified format-string as if by Java 1.1's java.text.DecimalFormat class (see http://java.sun.com/products/jdk/1.1/docs/api/java.text.DecimalFormat.html); the locale-string is an optional argument that provides the name of the xsl:decimal-format element used to interpret the format-string
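The character-by-character mapping performed by translate() is easy to misread as substring replacement, so a small Python model may help. The name xpath_translate is an assumption; the rule that characters without a counterpart in replacement_text are removed comes from the XPath 1.0 Recommendation, not from the table above.

```python
def xpath_translate(text, replaced, replacement):
    """Map each character in `replaced` to the character at the same index
    in `replacement`; characters with no counterpart are dropped."""
    mapping = {}
    for index, ch in enumerate(replaced):
        if ch not in mapping:  # the first occurrence of a character wins
            mapping[ch] = replacement[index] if index < len(replacement) else ""
    return "".join(mapping.get(ch, ch) for ch in text)

print(xpath_translate("bar", "abc", "ABC"))       # BAr
print(xpath_translate("--aaa--", "abc-", "ABC"))  # AAA
```

In the second call the hyphen has no counterpart in the replacement text, so every hyphen is removed.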
Numbering using xsl:number

Counting nodes with xsl:number
The xsl:number element inserts a formatted integer into the output document. The value of the integer is given by the value attribute.
This attribute contains a number, which is rounded to the nearest integer, then formatted according to the value of the format attribute.
An XSLT style sheet that counts atoms

<?xml version="1.0"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="PERIODIC_TABLE">
<html> <head><title>The Elements</title></head> <body> <table> <tr><xsl:apply-templates select="ATOM"/></tr> </table></body></html> </xsl:template> <xsl:template match="ATOM">
<td><xsl:number value="ATOMIC_NUMBER"/></td>
<td><xsl:value-of select="NAME"/></td> </xsl:template></xsl:stylesheet>
Default numbers
If you use the value attribute to calculate the number, that's all you need. However, if the value attribute is omitted, then the position of the current node in the source tree is used as the number.
Default numbers (continued)

For example: Produce a table of atoms that have boiling points less than or equal to the boiling point of nitrogen. View periodictable_number.xsl View periodictable_number.xml
xsl:number
We can change what xsl:number counts using these three attributes:
level
count
from
xsl:number- count
<xsl:template match="ATOM/*"> <td> <xsl:number count="*"/> </td>
<td>
<xsl:value-of select="."/> </td> </xsl:template>
xsl:number- level
By default, with no value attribute, xsl:number counts siblings of the source node with the same type. For instance, if the ATOMIC_NUMBER elements were numbered instead of ATOM elements, none would have a number higher than 1 because an ATOM never has more than one ATOMIC_NUMBER child. Although the document contains more than one ATOMIC_NUMBER element, these are not siblings.
Setting the level attribute of xsl:number to any counts all of the elements of the same kind as the current node in the document. This includes not just the ones in the current node list, but all nodes of the same type. Even if you select only the atomic numbers of the gases, for example, the solids and liquids would still count, even if they weren't output.
Consider these rules:
xsl:number- level (continued)

<xsl:template match="ATOM"> <tr><xsl:apply-templates select="NAME"/></tr> </xsl:template> <xsl:template match="NAME">
<td><xsl:number level="any"/></td>
<td><xsl:value-of select="."/></td> </xsl:template>
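The difference between the default sibling count and level="any" can be seen by counting NAME elements both ways in Python. This is a sketch with a hypothetical three-atom document and a hypothetical function name, not a reimplementation of xsl:number.

```python
import xml.etree.ElementTree as ET

# Hypothetical extract of the periodic table document.
SAMPLE = """<PERIODIC_TABLE>
  <ATOM><NAME>Hydrogen</NAME></ATOM>
  <ATOM><NAME>Helium</NAME></ATOM>
  <ATOM><NAME>Lithium</NAME></ATOM>
</PERIODIC_TABLE>"""

def number_names(xml_text):
    """Return (name, sibling_count, any_count) for every NAME element."""
    root = ET.fromstring(xml_text)
    results = []
    any_count = 0  # level="any": every NAME so far in document order
    for atom in root.iter("ATOM"):
        sibling_count = 0  # default: only NAME siblings inside this ATOM
        for child in atom:
            if child.tag == "NAME":
                sibling_count += 1
                any_count += 1
                results.append((child.text, sibling_count, any_count))
    return results

for row in number_names(SAMPLE):
    print(row)
# ('Hydrogen', 1, 1)
# ('Helium', 1, 2)
# ('Lithium', 1, 3)
```

The sibling count stays at 1 for every NAME, while the document-wide count keeps increasing.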
The from attribute

The from attribute contains a pattern that specifies which element the counting begins with in the input tree. However, the counting still begins from 1, not 2 or 10 or some other number. The from attribute only changes which element is considered to be the first element. It matters most when level="any"; with the other levels it only limits which ancestors are searched.
Questions
Testing your understanding
1. Which of the following is used to describe the XML document?
a. Document Type Definition
b. Data Type Definition
c. Data Type Document
d. Document Type Decision
2. XML's goal is to replace HTML.
a. False
b. True
3. The wild card character used to describe one or many in a DTD is _______.
a. #
b. +
c. *
d. ?
4. The storage unit that contains particular parts of an XML document is called _____.
a. ELEMENT
b. ENTITY
c. ATTRIBUTE
d. NOTATION

5. Processing instructions are indicated as ________.
a. <!xml >
b. <?xml ?>
c. <!xml . !>
d. <?xml ... >
6. XML elements must always be in lower case.
a. True
b. False

7. DTDs are more widely used than schemas.
a. False
b. True
8. An external DTD has no <!DOCTYPE> declaration within it.
a. False
b. True
Testing your understanding (continued)

9. The syntax for XML commenting is __________.
a. /* This is a comment in XML */
b. // This is a comment in XML
c. <-- This is a comment in XML -->
d. <!-- This is a comment in XML -->
10. Attribute values of XML elements must always be in single quotes.
a. True
b. False
Summary
Having completed this course, you should now be able to:
Give an introduction to XML in your own words
Define what XML is
Identify the document type definitions and validity
Describe attribute declarations in DTDs
Explain entities and external DTD subsets
Define embedding non-XML data
Describe XML namespaces and parsers
Explain XML schemas
THANK YOU