Sie sind auf Seite 1von 64

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

What is XML?
XML XML XML XML XML XML stands for EXtensible Markup Language is a markup language much like HTML was designed to carry data, not to display data tags are not predefined. You must define your own tags is designed to be self-descriptive is a W3C Recommendation

XML was designed to transport and store data. HTML was designed to display data.

The Difference between XML and HTML


XML is not a replacement for HTML. XML and HTML were designed with different goals: XML was designed to transport and store data, with focus on what data is. HTML was designed to display data, with focus on how data looks.

HTML is about displaying information, while XML is about carrying information.

XML Does not DO Anything


Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store, and transport information. The following example is a note to Tove from Jani, stored as XML:

<note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>


The note above is quite self descriptive. It has sender and receiver information, it also has a heading and a message body. But still, this XML document does not DO anything. It is just pure information wrapped in tags. Someone must write a piece of software to send, receive or display it.

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML is Just Plain Text


XML is nothing special. It is just plain text. Software that can handle plain text can also handle XML. However, XML-aware applications can handle the XML tags specially. The functional meaning of the tags depends on the nature of the application.

With XML You Invent Your Own Tags


The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document. That is because the XML language has no predefined tags. The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define his own tags and his own document structure.

XML is Not a Replacement for HTML


XML is a complement to HTML. It is important to understand that XML is not a replacement for HTML. In most web applications, XML is used to transport data, while HTML is used to format and display the data. My best description of XML is this: XML is a software- and hardware-independent tool for carrying information.

XML is a W3C Recommendation


XML became a W3C Recommendation 10. February 1998. To read more about the XML activities at W3C, please read our W3C Tutorial.

XML is Everywhere
We have been participating in XML development since its creation. It has been amazing to see how quickly the XML standard has developed, and how quickly a large number of software vendors have adopted the standard.

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML is now as important for the Web as HTML was to the foundation of the Web. XML is everywhere. It is the most common tool for data transmissions between all sorts of applications, and is becoming more and more popular in the area of storing and describing information.

How Can XML be used?


XML is used in many aspects of web development, often to simplify data storage and sharing.

XML Separates Data from HTML


If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the data changes. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for layout and display, and be sure that changes in the underlying data will not require any changes to the HTML. With a few lines of JavaScript, you can read an external XML file and update the data content of your HTML. You will learn more about this in a later chapter of this tutorial.

XML Simplifies Data Sharing


In the real world, computer systems and databases contain data in incompatible formats. XML data is stored in plain text format. This provides a software- and hardwareindependent way of storing data. This makes it much easier to create data that different applications can share.

XML Simplifies Data Transport


With XML, data can easily be exchanged between incompatible systems. One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet.

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.

XML Simplifies Platform Changes


Upgrading to new systems (hardware or software platforms), is always very time consuming. Large amounts of data must be converted and incompatible data is often lost. XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.

XML Makes Your Data More Available


Since XML is independent of hardware, software and application, XML can make your data more available and useful. Different applications can access your data, not only in HTML pages, but also from XML data sources. With XML, your data can be available to all kinds of "reading machines" (Handheld computers, voice machines, news feeds, etc), and make it more available for blind people, or people with other disabilities.

XML is Used to Create New Internet Languages


A lot of new Internet languages are created with XML. Here are some examples: XHTML the latest version of HTML WSDL for describing available web services WAP and WML as markup languages for handheld devices RSS languages for news feeds RDF and OWL for describing resources and ontology SMIL for describing multimedia for the web

If Developers Have Sense


If they DO have sense, future applications will exchange their data in XML.

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

The future might give us word processors, spreadsheet applications and databases that can read each other's data in a pure text format, without any conversion utilities in between. We can only pray that all the software vendors will agree. XML documents form a tree structure that starts at "the root" and branches to "the leaves".

An Example XML Document


XML documents use a self-describing and simple syntax:

<?xml version="1.0" encoding="ISO-8859-1"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set). The next line describes the root element of the document (like saying: "this document is a note"):

<note>
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):

<to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body>


And finally the last line defines the end of the root element:

</note>
You can assume, from this example, that the XML document contains a note to Tove from Jani. Don't you agree that XML is pretty self-descriptive?

XML Documents Form a Tree Structure


5

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML documents must contain a root element. This element is "the parent" of all other elements. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub elements (child elements):

<root> <child> <subchild>.....</subchild> </child> </root>


The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters). All elements can have text content and attributes (just like in HTML).

Example:

The image above represents one book in the XML below:

<bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN">
6

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>
The root element in the example is <bookstore>. All <book> elements in the document are contained within <bookstore>. The <book> element has 4 children: <title>,< author>, <year>, <price>.

XML Syntax Rules


The syntax rules of XML are very simple and logical. The rules are easy to learn, and easy to use.

All XML Elements Must Have a Closing Tag


In HTML, you will often see elements that don't have a closing tag:

<p>This is a paragraph <p>This is another paragraph

In XML, it is illegal to omit the closing tag. All elements must have a closing tag:

<p>This is a paragraph</p> <p>This is another paragraph</p>

Note: You might have noticed from the previous example that the XML declaration did not have a closing tag. This is not an error. The declaration is not a part of the XML document itself, and it has no closing tag.

XML Tags are Case Sensitive


7

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML elements are defined using XML tags. XML tags are case sensitive. With XML, the tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case:

<Message>This is incorrect</message> <message>This is correct</message>

Note: "Opening and closing tags" are often referred to as "Start and end tags". Use whatever you prefer. It is exactly the same thing.

XML Elements must be properly nested


In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

In the example above, "Properly nested" simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.

XML Documents Must Have a Root Element


XML documents must contain one element that is the parent of all other elements. This element is called the root element.

<root> <child> <subchild>.....</subchild> </child> </root>

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML Attribute Values must be quoted


XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct:

<note date=12/11/2007> <to>Tove</to> <from>Jani</from> </note>

<note date="12/11/2007"> <to>Tove</to> <from>Jani</from> </note>

The error in the first document is that the date attribute in the note element is not quoted.

Entity References
Some characters have a special meaning in XML. If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element. This will generate an XML error:

<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an entity reference:

<message>if salary &lt; 1000 then</message>

There are 5 predefined entity references in XML: &lt; &gt; < > less than greater than

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

&amp; &apos; &quot;

& ' "

ampersand apostrophe quotation mark

Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than character is legal, but it is a good habit to replace it.

Comments in XML
The syntax for writing comments in XML is similar to that of HTML. <!-- This is a comment -->

White-space is preserved in XML


HTML truncates multiple white-space characters to one single white-space: HTML: Output: Hello my name is Tove

Hello my name is Tove.

With XML, the white-space in a document is not truncated.

XML Stores New Line as LF


In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). The character pair bears some resemblance to the typewriter actions of setting a new line. In Unix applications, a new line is normally stored as a LF character. Macintosh applications use only a CR character to store a new line. An XML document contains XML Elements.

What is an XML Element?


An XML element is everything from (including) the element's start tag to (including) the element's end tag.

10

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

An element can contain other elements, simple text or a mixture of both. Elements can also have attributes.

<bookstore> <book category="CHILDREN"> <title>Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title>Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>
In the example above, <bookstore> and <book> have element contents, because they contain other elements. <author> has text content because it contains text. In the example above only <book> has an attribute (category="CHILDREN").

XML Naming Rules


XML elements must follow these naming rules: Names Names Names Names can contain letters, numbers, and other characters cannot start with a number or punctuation character cannot start with the letters xml (or XML, or Xml, etc) cannot contain spaces

Any name can be used, no words are reserved.

Best Naming Practices


Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>. Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>. Avoid "-" characters. If you name something "first-name," some software may think you want to subtract name from first. Avoid "." characters. If you name something "first.name," some software may think that "name" is a property of the object "first."

11

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Avoid ":" characters. Colons are reserved to be used for something called namespaces (more later). XML documents often have a corresponding database. A good practice is to use the naming rules of your database for the elements in the XML documents. Non-English letters like are perfectly legal in XML, but watch out for problems if your software vendor doesn't support them.

XML Elements are Extensible


XML elements can be extended to carry more information. Look at the following XML example:

<note> <to>Tove</to> <from>Jani</from> <body>Don't forget me this weekend!</body> </note>


Let's imagine that we created an application that extracted the <to>, <from>, and <body> elements from the XML document to produce this output: MESSAGE To: Tove From: Jani Don't forget me this weekend! Imagine that the author of the XML document added some extra information to it:

<note> <date>2008-01-10</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
Should the application break or crash? No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output. One of the beauties of XML, is that it can often be extended without breaking applications.

12

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML Attributes
XML elements can have attributes in the start tag, just like HTML. Attributes provide additional information about elements.

XML Attributes
From HTML you will remember this: <img src="computer.gif">. The "src" attribute provides additional information about the <img> element. In HTML (and in XML) attributes provide additional information about elements:

<img src="computer.gif"> <a href="demo.asp">

Attributes often provide information that is not a part of the data. In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element:

<file type="gif">computer.gif</file>

XML Attributes Must be Quoted


Attribute values must always be enclosed in quotes, but either single or double quotes can be used. For a person's sex, the person tag can be written like this:

<person sex="female">

or like this:

<person sex='female'>

If the attribute value itself contains double quotes you can use single quotes, like in this example:

<gangster name='George "Shotgun" Ziegler'>


13

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

or you can use character entities:

<gangster name="George &quot;Shotgun&quot; Ziegler">

XML Elements vs. Attributes


Take a look at these examples:

<person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>

<person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>

In the first example sex is an attribute. In the last, sex is an element. Both examples provide the same information. There are no rules about when to use attributes and when to use elements. Attributes are handy in HTML. In XML my advice is to avoid them. Use elements instead.

My Favorite Way
The following three XML documents contain exactly the same information: A date attribute is used in the first example:

<note date="10/01/2008"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

14

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

A date element is used in the second example:

<note> <date>10/01/2008</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

An expanded date element is used in the third: (THIS IS MY FAVORITE):

<note> <date> <day>10</day> <month>01</month> <year>2008</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Avoid XML Attributes?


Some of the problems with using attributes are: attributes cannot contain multiple values (elements can) attributes cannot contain tree structures (elements can) attributes are not easily expandable (for future changes)

Attributes are difficult to read and maintain. Use elements for data. Use attributes for information that is not relevant to the data. Don't end up like this:

<note day="10" month="01" year="2008" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"> </note>
15

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML Attributes for Metadata


Sometimes ID references are assigned to elements. These IDs can be used to identify XML elements in much the same way as the ID attribute in HTML. This example demonstrates this:

<messages> <note id="501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note id="502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not</body> </note> </messages>

The ID above is just an identifier, to identify the different notes. It is not a part of the note itself. What I'm trying to say here is that metadata (data about data) should be stored as attributes, and that data itself should be stored as elements.

XML Validation
XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid" XML.

Well Formed XML Documents


A "Well Formed" XML document has correct XML syntax. The syntax rules were described in the previous chapters: XML documents must have a root element

16

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

XML XML XML XML

elements must have a closing tag tags are case sensitive elements must be properly nested attribute values must be quoted

<?xml version="1.0" encoding="ISO-8859-1"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Valid XML Documents


A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD):

<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE note SYSTEM "Note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

The DOCTYPE declaration in the example above, is a reference to an external DTD file. The content of the file is shown in the paragraph below.

XML DTD
The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:

<!DOCTYPE [ <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT

note note (to,from,heading,body)> to (#PCDATA)> from (#PCDATA)> heading (#PCDATA)>


17

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT body (#PCDATA)> ]>

If you want to study DTD, you will find our DTD tutorial on our homepage.

XML Schema
W3C supports an XML-based alternative to DTD, called XML Schema:

<xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element>

If you want to study XML Schema, you will find our Schema tutorial on our homepage.

A General XML Validator


To help you check the syntax of your XML files, we have created an XML validator to syntax-check your XML. Please see the next chapter.

XML Validator
Use our XML validator to syntax-check your XML.

XML Errors Will Stop You


Errors in XML documents will stop your XML applications.

18

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

The W3C XML specification states that a program should stop processing an XML document if it finds an error. The reason is that XML software should be small, fast, and compatible. HTML browsers will display documents with errors (like missing end tags). HTML browsers are big and incompatible because they have a lot of unnecessary code to deal with (and display) HTML errors. With XML, errors are not allowed.

Syntax-Check Your XML


To help you syntax-check your XML, we have created an XML validator. Paste your XML into the text area below, and syntax-check it by clicking the "Validate" button.

Note: This only checks if your XML is "Well formed". If you want to validate your XML against a DTD, see the last paragraph on this page.

Syntax-Check an XML File


You can syntax-check an XML file by typing the URL of the file into the input field below, and then click the "Validate" button: Filename:
http://w w w .w 3schools.com/xml/note_error.xml

Validate

Note: If you get an "Access denied" error, it's because your browser security does not allow file access across domains.

19

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

The file "note_error.xml" demonstrates your browsers error handling. If you want see an error free message, substitute the "note_error.xml" with "cd_catalog.xml".

Validate Your XML Against a DTD


If you know DTD, you can validate your XML in the text area below. Just add the DOCTYPE declaration to your XML and click the "Validate" button:

Note: Only Internet Explorer will actually check your XML against the DTD. Firefox, Mozilla, Netscape, and Opera will not.

Viewing XML Files


Raw XML files can be viewed in all major browsers. Don't expect XML files to be displayed as HTML pages.

Viewing XML Files


<?xml version="1.0" encoding="ISO-8859-1"?> - <note> <to>Tove</to>

20

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Look at this XML file: note.xml The XML document will be displayed with color-coded root and child elements. A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu. Note: In Chrome, Opera, and Safari, only the element text will be displayed. To view the raw XML, you must right click the page and select "View Source"

Viewing an Invalid XML File


If an erroneous XML file is opened, the browser will report the error. Look at this XML file: note_error.xml

Other XML Examples


Viewing some XML documents will help you get the XML feeling. An XML CD catalog This is a CD collection, stored as XML data. An XML plant catalog This is a plant catalog from a plant shop, stored as XML data. A Simple Food Menu This is a breakfast food menu from a restaurant, stored as XML data.

Why Does XML Display Like This?


XML documents do not carry information about how to display the data. Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.

21

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Without any information about how to display the data, most browsers will just display the XML document as it is. In the next chapters, we will take a look at different solutions to the display problem, using CSS, XSLT and JavaScript.

Displaying XML with CSS


With CSS (Cascading Style Sheets) you can add display information to an XML document.

Displaying your XML Files with CSS?


It is possible to use CSS to format an XML document. Below is an example of how to use a CSS style sheet to format an XML document: Take a look at this XML file: The CD catalog Then look at this style sheet: The CSS file Finally, view: The CD catalog formatted with the CSS file Below is a fraction of the XML file. The second line links the XML file to the CSS file:

<?xml version="1.0" encoding="ISO-8859-1"?> <?xml-stylesheet type="text/css" href="cd_catalog.css"?> <CATALOG> <CD> <TITLE>Empire Burlesque</TITLE> <ARTIST>Bob Dylan</ARTIST> <COUNTRY>USA</COUNTRY> <COMPANY>Columbia</COMPANY> <PRICE>10.90</PRICE> <YEAR>1985</YEAR> </CD> <CD> <TITLE>Hide your heart</TITLE> <ARTIST>Bonnie Tyler</ARTIST> <COUNTRY>UK</COUNTRY> <COMPANY>CBS Records</COMPANY> <PRICE>9.90</PRICE> <YEAR>1988</YEAR> </CD>
22

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

. . . </CATALOG>

Formatting XML with CSS is not the most common method.

Displaying XML with XSLT


With XSLT you can transform an XML document into HTML.

Displaying XML with XSLT


XSLT is the recommended style sheet language of XML. XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS. One way to use XSLT is to transform XML into HTML before it is displayed by the browser as demonstrated in these examples: View the XML file, the XSLT style sheet, and View the result. Below is a fraction of the XML file. The second line links the XML file to the XSLT file:

<?xml version="1.0" encoding="ISO-8859-1"?> <?xml-stylesheet type="text/xsl" href="simple.xsl"?> <breakfast_menu> <food> <name>Belgian Waffles</name> <price>$5.95</price> <description>two of our famous Belgian Waffles</description> <calories>650</calories> </food> </breakfast_menu>

If you want to learn more about XSLT, find our XSLT tutorial on our homepage.

Transforming XML with XSLT on the Server


23

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

In the example above, the XSLT transformation is done by the browser, when the browser reads the XML file. Different browsers may produce different result when transforming XML with XSLT. To reduce this problem the XSLT transformation can be done on the server. View the result. Note that the result of the output is exactly the same, either the transformation is done by the web server or by the web browser.

XML Parser
Most browsers have a build-in XML parser to read and manipulate XML. The parser converts XML into a JavaScript accessible object.

Parsing XML
The XML DOM contains methods (functions) to traverse XML trees, access, insert, and delete nodes. However, before an XML document can be accessed and manipulated, it must be loaded into an XML DOM object. The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.

Load an XML Document


The following JavaScript fragment loads an XML document ("books.xml"):

var xmlDoc; xmlDoc=new window.XMLHttpRequest(); xmlDoc.open("GET","books.xml",false); xmlDoc.send("");

Code explained: Create an XMLHTTP object Open the XMLHTTP object Send an XML HTTP request to the server

24

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Load an XML String


The following code loads and parses an XML string:

try //Internet Explorer { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async="false"; xmlDoc.loadXML(txt); return xmlDoc; } catch(e) { parser=new DOMParser(); xmlDoc=parser.parseFromString(txt,"text/xml"); return xmlDoc; }

Note: Internet Explorer uses the loadXML() method to parse an XML string, while other browsers uses the DOMParser object.

Access Across Domains


For security reasons, modern browsers do not allow access across domains. This means, that both the web page and the XML file it tries to load, must be located on the same server. The examples on W3Schools all open XML files located on the W3Schools domain. If you want to use the example above on one of your web pages, the XML files you load must be located on your own server.

The XML DOM


In the next chapter of this tutorial, you will learn how to access and retrieve data from the XML document object (the XML DOM).

XML DOM
25

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

The DOM (Document Object Model) defines a standard way for accessing and manipulating documents.

The XML DOM


The XML DOM (XML Document Object Model) defines a standard way for accessing and manipulating XML documents. The DOM views XML documents as a tree-structure. All elements can be accessed through the DOM tree. Their content (text and attributes) can be modified or deleted, and new elements can be created. The elements, their text, and their attributes are all known as nodes. In the examples below we use the following DOM reference to get the text from the <to> element: xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue xmlDoc - the XML document created by the parser. getElementsByTagName("to")[0] - the first <to> element childNodes[0] - the first child of the <to> element (the text node) nodeValue - the value of the node (the text itself)

You can learn more about the XML DOM in our XML DOM tutorial.

The HTML DOM


The HTML DOM (HTML Document Object Model) defines a standard way for accessing and manipulating HTML documents. All HTML elements can be accessed through the HTML DOM. In the examples below we use the following DOM reference to change the text of the HTML element where id="to": document.getElementById("to").innerHTML= document - the HTML document getElementById("to") - the HTML element where id="to" innerHTML - the inner text of the HTML element

You can learn more about the HTML DOM in our HTML DOM tutorial.

Load an XML File - A Cross browser Example


26

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

The following code loads an XML document ("note.xml") into the XML parser:

Example
<html> <head> <script type="text/javascript"> function loadXMLDoc(dname) { var xmlDoc; if (window.XMLHttpRequest) { xmlDoc=new window.XMLHttpRequest(); xmlDoc.open("GET",dname,false); xmlDoc.send(""); return xmlDoc.responseXML; } // IE 5 and IE 6 else if (ActiveXObject("Microsoft.XMLDOM")) { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async=false; xmlDoc.load(dname); return xmlDoc; } alert("Error loading document"); return null; } </script> </head> <body> <h1>W3Schools Internal Note</h1> <p><b>To:</b> <span id="to"></span><br /> <b>From:</b> <span id="from"></span><br /> <b>Message:</b> <span id="message"></span> <script type="text/javascript"> xmlDoc=loadXMLDoc("note.xml"); document.getElementById("to").innerHTML=xmlDoc.getElementsByTagN ame("to")[0].childNodes[0].nodeValue; document.getElementById("from").innerHTML=xmlDoc.getElementsByTa gName("from")[0].childNodes[0].nodeValue; document.getElementById("message").innerHTML=xmlDoc.getElementsB
27

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

yTagName("body")[0].childNodes[0].nodeValue; </script> </body> </html>

Important Note
To extract the text "Jani" from the XML, the syntax is:

getElementsByTagName("from")[0].childNodes[0].nodeValue

In the XML example there is only one <from> tag, but you still have to specify the array index [0], because the XML parser method getElementsByTagName() returns an array of all <from> nodes.

Load an XML String - A Cross browser Example


The following code loads and parses an XML string:

Example
<html> <head> <script type="text/javascript"> function loadXMLString() { text="<note>"; text=text+"<to>Tove</to>"; text=text+"<from>Jani</from>"; text=text+"<heading>Reminder</heading>"; text=text+"<body>Don't forget me this weekend!</body>"; text=text+"</note>"; try //Internet Explorer { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async="false"; xmlDoc.loadXML(text);
28

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

} catch(e) { try // Firefox, Mozilla, Opera, etc. { parser=new DOMParser(); xmlDoc=parser.parseFromString(text,"text/xml"); } catch(e) { alert(e.message); return; } } document.getElementById("to").innerHTML=xmlDoc.getElementsByTagN ame("to")[0].childNodes[0].nodeValue; document.getElementById("from").innerHTML=xmlDoc.getElementsByTa gName("from")[0].childNodes[0].nodeValue; document.getElementById("message").innerHTML=xmlDoc.getElementsB yTagName("body")[0].childNodes[0].nodeValue; } </script> </head> <body onload="loadXMLString()"> <h1>W3Schools Internal Note</h1> <p><b>To:</b> <span id="to"></span><br /> <b>From:</b> <span id="from"></span><br /> <b>Message:</b> <span id="message"></span> </p> </body> </html>

Note: Internet Explorer uses the loadXML() method to parse an XML string, while other browsers use the DOMParser object.

XML to HTML
29

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

This chapter explains how to display XML data as HTML.

Display XML Data in HTML


In the example below, we loop through an XML file (cd_catalog.xml), and display each CD element as an HTML table row:

Example
<html> <body> <script type="text/javascript"> var xmlDoc; if (window.XMLHttpRequest) { xmlDoc=new window.XMLHttpRequest(); xmlDoc.open("GET","cd_catalog.xml",false); xmlDoc.send(""); xmlDoc=xmlDoc.responseXML; } // IE 5 and IE 6 else if (ActiveXObject("Microsoft.XMLDOM")) { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async=false; xmlDoc.load("cd_catalog.xml"); } document.write("<table border='1'>"); var x=xmlDoc.getElementsByTagName("CD"); for (i=0;i<x.length;i++) { document.write("<tr><td>"); document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes [0].nodeValue); document.write("</td><td>"); document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[ 0].nodeValue); document.write("</td></tr>");
30

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

} document.write("</table>"); </script> </body> </html>

Example explained
We the We We For We check the browser, and load the XML using the correct parser (explained in previous chapter) create an HTML table with <table border="1"> use getElementsByTagName() to get all XML CD nodes each CD node, we display data from ARTIST and TITLE as table data. end the table with </table>

For more information about using JavaScript and the XML DOM, visit our XML DOM tutorial.

The XMLHttpRequest Object


The XMLHttpRequest object provides a way to communicate with a server after a web page has loaded.

What is the XMLHttpRequest Object?


The XMLHttpRequest object is the developers dream, because you can: Update a web page with new data without reloading the page Request data from a server after the page has loaded Receive data from a server after the page has loaded Send data to a server in the background

The XMLHttpRequest object is supported in all modern browsers. Example: XML HTTP communication with a server while typing input

Creating an XMLHttpRequest Object


31

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Creating an XMLHttpRequest object is done with one single line of JavaScript. In all modern browsers (including IE7):

var xmlhttp=new XMLHttpRequest()

In Internet Explorer 5 and 6:

var xmlhttp=new ActiveXObject("Microsoft.XMLHTTP")

Example
<script type="text/javascript"> var xmlhttp; function loadXMLDoc(url) { xmlhttp=null; if (window.XMLHttpRequest) {// code for all new browsers xmlhttp=new XMLHttpRequest(); } else if (window.ActiveXObject) {// code for IE5 and IE6 xmlhttp=new ActiveXObject("Microsoft.XMLHTTP"); } if (xmlhttp!=null) { xmlhttp.onreadystatechange=state_Change; xmlhttp.open("GET",url,true); xmlhttp.send(null); } else { alert("Your browser does not support XMLHTTP."); } } function state_Change() { if (xmlhttp.readyState==4) {// 4 = "loaded"
32

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

if (xmlhttp.status==200) {// 200 = OK // ...our code here... } else { alert("Problem retrieving XML data"); } } } </script>

Note: onreadystatechange is an event handler. The value (state_Change) is the name of a function which is triggered when the state of the XMLHttpRequest object changes. States run from 0 (uninitialized) to 4 (complete). Only when the state = 4, we can execute our code.

Why Use Async=true?


Our examples use "true" in the third parameter of open(). This parameter specifies whether the request should be handled asynchronously. True means that the script continues to run after the send() method, without waiting for a response from the server. The onreadystatechange event complicates the code. But it is the safest way if you want to prevent the code from stopping if you don't get a response from the server. By setting the parameter to "false", you can avoid the extra onreadystatechange code. Use this if it's not important to execute the rest of the code if the request fails. Try it yourself using JavaScript

More Examples
Load a textfile into a div element with XML HTTP Make a HEAD request with XML HTTP

33

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Make a specified HEAD request with XML HTTP Display an XML file as an HTML table

Is the XMLHttpRequest Object a W3C Standard?


The XMLHttpRequest object is not specified in any W3C recommendation. However, the W3C DOM Level 3 "Load and Save" specification contains some similar functionality, but these are not implemented in any browsers yet.

XML Application
This chapter demonstrates a small XML application built with HTML and JavaScript.

The XML Example Document


Look at the following XML document ("cd_catalog.xml"), that represents a CD catalog:

<?xml version="1.0" encoding="ISO-8859-1"?> <CATALOG> <CD> <TITLE>Empire Burlesque</TITLE> <ARTIST>Bob Dylan</ARTIST> <COUNTRY>USA</COUNTRY> <COMPANY>Columbia</COMPANY> <PRICE>10.90</PRICE> <YEAR>1985</YEAR> </CD> . . .

View the full "cd_catalog.xml" file.

Load the XML Document


34

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

To load the XML document (cd_catalog.xml), we use the same code as we used in the XML Parser chapter:

if (window.XMLHttpRequest) { xmlDoc=new window.XMLHttpRequest(); xmlDoc.open("GET","cd_catalog.xml",false); xmlDoc.send(""); xmlDoc=xmlDoc.responseXML; } // IE 5 and IE 6 else if (ActiveXObject("Microsoft.XMLDOM")) { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async=false; xmlDoc.load("cd_catalog.xml"); }

After the execution of this code, xmlDoc is an XML DOM object, accessible by JavaScript.

Display XML Data as an HTML Table


The following code displays an HTML table filled with data from the XML DOM object:

Example
document.write("<table border='1'>"); var x=xmlDoc.getElementsByTagName("CD"); for (i=0;i<x.length;i++) { document.write("<tr><td>"); document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes [0].nodeValue); document.write("</td><td>"); document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[ 0].nodeValue); document.write("</td></tr>"); } document.write("</table>");

35

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

For each CD element in the XML document, a table row is created. Each table row contains two table data with ARTIST and TITLE from the current CD element.

Display XML Data in any HTML Element


XML data can be copied into any HTML element that can display text. The code below is part of the <head> section of the HTML file. It gets the XML data from the first <CD> element and displays it in the HTML element with the id="show":

Example
var x=xmlDoc.getElementsByTagName("CD"); i=0; function display() { artist=(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nod eValue); title=(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeV alue); year=(x[i].getElementsByTagName("YEAR")[0].childNodes[0].nodeVal ue); txt="Artist: " + artist + "<br />Title: " + title + "<br />Year: "+ year; document.getElementById("show").innerHTML=txt; }

The body of the HTML document contains an onload event attribute that calls the display() function when the page is loaded. It also contains a <div id='show'> element to receive the XML data.

<body onload="display()"> <div id='show'></div> </body>

36

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

In the example above, you will only see data from the first CD element in the XML document. To navigate to the next CD element, you have to add some more code.

Add a Navigation Script


To add navigation to the example above, create two functions called next() and previous():

Example
function next() { if (i<x.length-1) { i++; display(); } } function previous() { if (i>0) { i--; display(); } }

The next() function displays the next CD, unless you are on the last CD element. The previous() function displays the previous CD, unless you are at the first CD element. The next() and previous() functions are called by clicking next/previous buttons:

<input type="button" onclick="previous()" value="previous" /> <input type="button" onclick="next()" value="next" />

37

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

All Together Now


With a little creativity you can create a full application. If you use what you have learned on this page, and a little imagination, you can easily develop this into a full application. Try it yourself: See how you can add a little fancy to this application. For more information about using JavaScript and the XML DOM, visit our XML DOM tutorial. XML Examples

Viewing XML Files


Raw XML files can be viewed in all major browsers. Don't expect XML files to be displayed as HTML pages.

Viewing XML Files


<?xml version="1.0" encoding="ISO-8859-1"?> - <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Look at this XML file: note.xml The XML document will be displayed with color-coded root and child elements. A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

38

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Note: In Chrome, Opera, and Safari, only the element text will be displayed. To view the raw XML, you must right click the page and select "View Source"

Viewing an Invalid XML File


If an erroneous XML file is opened, the browser will report the error. Look at this XML file: note_error.xml

Other XML Examples


Viewing some XML documents will help you get the XML feeling. An XML CD catalog This is a CD collection, stored as XML data. An XML plant catalog This is a plant catalog from a plant shop, stored as XML data. A Simple Food Menu This is a breakfast food menu from a restaurant, stored as XML data.

Why Does XML Display Like This?


XML documents do not carry information about how to display the data. Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table. Without any information about how to display the data, most browsers will just display the XML document as it is. In the next chapters, we will take a look at different solutions to the display problem, using CSS, XSLT and JavaScript.

XML CDATA
All text in an XML document will be parsed by the parser. But text inside a CDATA section will be ignored by the parser.

39

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

PCDATA - Parsed Character Data


XML parsers normally parse all the text in an XML document. When an XML element is parsed, the text between the XML tags is also parsed:

<message>This text is also parsed</message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):

<name><first>Bill</first><last>Gates</last></name>

and the parser will break it up into sub-elements like this:

<name> <first>Bill</first> <last>Gates</last> </name>

Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML parser.

CDATA - (Unparsed) Character Data


The term CDATA is used about text data that should not be parsed by the XML parser. Characters like "<" and "&" are illegal in XML elements. "<" will generate an error because the parser interprets it as the start of a new element. "&" will generate an error because the parser interprets it as the start of an character entity. Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA. Everything inside a CDATA section is ignored by the parser. A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script> <![CDATA[
40

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

function matchwo(a,b) { if (a < b && a < 0) then { return 1; } else { return 0; } } ]]> </script>

In the example above, everything inside the CDATA section is ignored by the parser. Notes on CDATA sections: A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed. The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks.

DTD Tutorial

The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an XML document. A DTD defines the document structure with a list of legal elements and attributes. Start learning DTD now!

DTD Newspaper Example


<!DOCTYPE NEWSPAPER [ <!ELEMENT NEWSPAPER (ARTICLE+)> <!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)> <!ELEMENT HEADLINE (#PCDATA)>
41

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ATTLIST <!ATTLIST <!ATTLIST <!ATTLIST ]>

BYLINE (#PCDATA)> LEAD (#PCDATA)> BODY (#PCDATA)> NOTES (#PCDATA)> ARTICLE ARTICLE ARTICLE ARTICLE AUTHOR CDATA #REQUIRED> EDITOR CDATA #IMPLIED> DATE CDATA #IMPLIED> EDITION CDATA #IMPLIED>

Introduction to DTD
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference.

Internal DTD Declaration


If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>


Example XML document with an internal DTD:

<?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)>
42

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT body (#PCDATA)> ]> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend</body> </note>
Open the XML file above in your browser (select "view source" or "view page source" to view the DTD) The DTD above is interpreted like this:

!DOCTYPE note defines that the root element of this document is note !ELEMENT note defines that the note element contains four elements: "to,from,heading,body" !ELEMENT to defines the to element to be of type "#PCDATA" !ELEMENT from defines the from element to be of type "#PCDATA" !ELEMENT heading defines the heading element to be of type "#PCDATA" !ELEMENT body defines the body element to be of type "#PCDATA"

External DTD Declaration


If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">


This is the same XML document as above, but with an external DTD (Open it, and select view source):

<?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading>
43

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<body>Don't forget me this weekend!</body> </note>


And this is the file "note.dtd" which contains the DTD:

<!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT

note (to,from,heading,body)> to (#PCDATA)> from (#PCDATA)> heading (#PCDATA)> body (#PCDATA)>

Why Use a DTD?


With a DTD, each of your XML files can carry a description of its own format. With a DTD, independent groups of people can agree to use a standard DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.

DTD - XML Building Blocks


The main building blocks of both XML and HTML documents are elements.

The Building Blocks of XML Documents


Seen from a DTD point of view, all XML documents (and HTML documents) are made up by the following building blocks:

Elements Attributes 44

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Entities PCDATA CDATA

Elements
Elements are the main building blocks of both XML and HTML documents. Examples of HTML elements are "body" and "table". Examples of XML elements could be "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img". Examples:

<body>some text</body> <message>some text</message>

Attributes
Attributes provide extra information about elements. Attributes are always placed inside the opening tag of an element. Attributes always come in name/value pairs. The following "img" element has additional information about a source file:

<img src="computer.gif" />


The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".

Entities
45

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Some characters have a special meaning in XML, like the less than sign (<) that defines the start of an XML tag. Most of you know the HTML entity: "&nbsp;". This "no-breaking-space" entity is used in HTML to insert an extra space in a document. Entities are expanded when a document is parsed by an XML parser. The following entities are predefined in XML:
Entity References &lt; &gt; &amp; &quot; &apos; Character < > & " '

PCDATA
PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. Tags inside the text will be treated as markup and entities will be expanded. However, parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp; &lt; and &gt; entities, respectively.

CDATA
46

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

DTD - Elements
In a DTD, elements are declared with an ELEMENT declaration.

Declaring Elements
In a DTD, XML elements are declared with an element declaration with the following syntax:

<!ELEMENT element-name category> or <!ELEMENT element-name (element-content)>

Empty Elements
Empty elements are declared with the category keyword EMPTY:

<!ELEMENT element-name EMPTY> Example: <!ELEMENT br EMPTY> XML example: <br />

47

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Elements with Parsed Character Data


Elements with only parsed character data are declared with #PCDATA inside parentheses:

<!ELEMENT element-name (#PCDATA)> Example: <!ELEMENT from (#PCDATA)>

Elements with any Contents


Elements declared with the category keyword ANY, can contain any combination of parsable data:

<!ELEMENT element-name ANY> Example: <!ELEMENT note ANY>

Elements with Children (sequences)


Elements with one or more children are declared with the name of the children elements inside parentheses:

<!ELEMENT element-name (child1)> or <!ELEMENT element-name (child1,child2,...)> Example:

48

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT note (to,from,heading,body)>


When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children. The full declaration of the "note" element is:

<!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT

note (to,from,heading,body)> to (#PCDATA)> from (#PCDATA)> heading (#PCDATA)> body (#PCDATA)>

Declaring Only One Occurrence of an Element


<!ELEMENT element-name (child-name)> Example: <!ELEMENT note (message)>
The example above declares that the child element "message" must occur once, and only once inside the "note" element.

Declaring Minimum One Occurrence of an Element


<!ELEMENT element-name (child-name+)> Example: <!ELEMENT note (message+)>
The + sign in the example above declares that the child element "message" must occur one or more times inside the "note" element.
49

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Declaring Zero or More Occurrences of an Element


<!ELEMENT element-name (child-name*)> Example: <!ELEMENT note (message*)>
The * sign in the example above declares that the child element "message" can occur zero or more times inside the "note" element.

Declaring Zero or One Occurrences of an Element


<!ELEMENT element-name (child-name?)> Example: <!ELEMENT note (message?)>
The ? sign in the example above declares that the child element "message" can occur zero or one time inside the "note" element.

Declaring either/or Content


Example: <!ELEMENT note (to,from,header,(message|body))>
The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.

Declaring Mixed Content


50

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Example: <!ELEMENT note (#PCDATA|to|from|header|message)*>


The example above declares that the "note" element can contain zero or more occurrences of parsed character data, "to", "from", "header", or "message" elements.

DTD - Attributes
In a DTD, attributes are declared with an ATTLIST declaration.

Declaring Attributes
An attribute declaration has the following syntax:

<!ATTLIST element-name attribute-name attribute-type default-value> DTD example: <!ATTLIST payment type CDATA "check"> XML example: <payment type="check" />
The attribute-type can be one of the following:
Type CDATA (en1|en2|..) ID Description The value is character data The value must be one from an enumerated list The value is a unique id 51

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

IDREF IDREFS NMTOKEN NMTOKENS ENTITY ENTITIES NOTATION xml:

The value is the id of another element The value is a list of other ids The value is a valid XML name The value is a list of valid XML names The value is an entity The value is a list of entities The value is a name of a notation The value is a predefined xml value

The default-value can be one of the following:


Value value #REQUIRED #IMPLIED #FIXED value Explanation The default value of the attribute The attribute is required The attribute is not required The attribute value is fixed

A Default Attribute Value


DTD: <!ELEMENT square EMPTY> <!ATTLIST square width CDATA "0"> Valid XML: <square width="100" />

52

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.

#REQUIRED
Syntax

<!ATTLIST element-name attribute-name attribute-type #REQUIRED>


Example

DTD: <!ATTLIST person number CDATA #REQUIRED> Valid XML: <person number="5677" /> Invalid XML: <person />
Use the #REQUIRED keyword if you don't have an option for a default value, but still want to force the attribute to be present.

#IMPLIED
Syntax

<!ATTLIST element-name attribute-name attribute-type #IMPLIED>


Example

DTD: <!ATTLIST contact fax CDATA #IMPLIED>

53

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Valid XML: <contact fax="555-667788" /> Valid XML: <contact />


Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and you don't have an option for a default value.

#FIXED
Syntax

<!ATTLIST element-name attribute-name attribute-type #FIXED "value">


Example

DTD: <!ATTLIST sender company CDATA #FIXED "Microsoft"> Valid XML: <sender company="Microsoft" /> Invalid XML: <sender company="W3Schools" />
Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.

Enumerated Attribute Values


Syntax

<!ATTLIST element-name attribute-name (en1|en2|..)


54

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

default-value>
Example

DTD: <!ATTLIST payment type (check|cash) "cash"> XML example: <payment type="check" /> or <payment type="cash" />
Use enumerated attribute values when you want the attribute value to be one of a fixed set of legal values.

XML Elements vs. Attributes


In XML, there are no rules about when to use attributes, and when to use child elements.

Use of Elements vs. Attributes


Data can be stored in child elements or in attributes. Take a look at these examples:

<person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>

<person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname>


55

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

</person>
In the first example sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes, and when to use child elements. My experience is that attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if the information feels like data.

My Favorite Way
I like to store data in child elements. The following three XML documents contain exactly the same information: A date attribute is used in the first example:

<note date="12/11/2002"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
A date element is used in the second example:

<note> <date>12/11/2002</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
An expanded date element is used in the third: (THIS IS MY FAVORITE):

56

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<note> <date> <day>12</day> <month>11</month> <year>2002</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Avoid using attributes?


Should you avoid using attributes? Some of the problems with attributes are:

attributes cannot contain multiple values (child elements can) attributes are not easily expandable (for future changes) attributes cannot describe structures (child elements can) attributes are more difficult to manipulate by program code attribute values are not easy to test against a DTD

If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data. Don't end up like this (this is not how XML should be used):

<note day="12" month="11" year="2002" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"> </note>

57

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

An Exception to my Attribute Rule


Rules always have exceptions. My rule about attributes has one exception: Sometimes I assign ID references to elements. These ID references can be used to access XML elements in much the same way as the NAME or ID attributes in HTML. This example demonstrates this:

<messages> <note id="p501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note id="p502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not!</body> </note> </messages>
The ID in these examples is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data. What I am trying to say here is that metadata (data about data) should be stored as attributes, and that data itself should be stored as elements.

58

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

DTD - Entities
Entities are variables used to define shortcuts to standard text or special characters.

Entity references are references to entities Entities can be declared internal or external

An Internal Entity Declaration


Syntax

<!ENTITY entity-name "entity-value">


Example

DTD Example: <!ENTITY writer "Donald Duck."> <!ENTITY copyright "Copyright W3Schools."> XML example: <author>&writer;&copyright;</author>
Note: An entity has three parts: an ampersand (&), an entity name, and a semicolon (;).

An External Entity Declaration


Syntax

<!ENTITY entity-name SYSTEM "URI/URL">

59

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

Example

DTD Example: <!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd"> <!ENTITY copyright SYSTEM "http://www.w3schools.com/entities.dtd"> XML example: <author>&writer;&copyright;</author>

DTD Validation
With Internet Explorer 5+ you can validate your XML against a DTD.

Validating With the XML Parser


If you try to open an XML document, the XML Parser might generate an error. By accessing the parseError object, you can retrieve the error code, the error text, or even the line that caused the error. Note: The load( ) method is used for files, while the loadXML( ) method is used for strings.

Example
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async="false"; xmlDoc.validateOnParse="true"; xmlDoc.load("note_dtd_error.xml"); document.write("<br />Error Code: "); document.write(xmlDoc.parseError.errorCode);
60

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

document.write("<br />Error Reason: "); document.write(xmlDoc.parseError.reason); document.write("<br />Error Line: "); document.write(xmlDoc.parseError.line);

Look at the XML file

Turn Validation Off


Validation can be turned off by setting the XML parser's validateOnParse="false".

Example
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async="false"; xmlDoc.validateOnParse="false"; xmlDoc.load("note_dtd_error.xml"); document.write("<br />Error Code: "); document.write(xmlDoc.parseError.errorCode); document.write("<br />Error Reason: "); document.write(xmlDoc.parseError.reason); document.write("<br />Error Line: "); document.write(xmlDoc.parseError.line);

A General XML Validator


61

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

To help you check your xml files, you can syntax-check any XML file here.

The parseError Object


You can read more about the parseError object in our XML DOM tutorial.

TV Schedule DTD
By David Moisan. Copied from http://www.davidmoisan.org/

<!DOCTYPE TVSCHEDULE [ <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ATTLIST <!ATTLIST <!ATTLIST <!ATTLIST <!ATTLIST ]> TVSCHEDULE (CHANNEL+)> CHANNEL (BANNER,DAY+)> BANNER (#PCDATA)> DAY (DATE,(HOLIDAY|PROGRAMSLOT+)+)> HOLIDAY (#PCDATA)> DATE (#PCDATA)> PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)> TIME (#PCDATA)> TITLE (#PCDATA)> DESCRIPTION (#PCDATA)> TVSCHEDULE NAME CDATA #REQUIRED> CHANNEL CHAN CDATA #REQUIRED> PROGRAMSLOT VTR CDATA #IMPLIED> TITLE RATING CDATA #IMPLIED> TITLE LANGUAGE CDATA #IMPLIED>

Newspaper Article DTD


Copied from http://www.vervet.com/

<!DOCTYPE NEWSPAPER [ <!ELEMENT NEWSPAPER (ARTICLE+)> <!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)> <!ELEMENT HEADLINE (#PCDATA)>
62

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT <!ELEMENT <!ELEMENT <!ELEMENT <!ATTLIST <!ATTLIST <!ATTLIST <!ATTLIST

BYLINE (#PCDATA)> LEAD (#PCDATA)> BODY (#PCDATA)> NOTES (#PCDATA)> ARTICLE ARTICLE ARTICLE ARTICLE AUTHOR CDATA #REQUIRED> EDITOR CDATA #IMPLIED> DATE CDATA #IMPLIED> EDITION CDATA #IMPLIED>

<!ENTITY NEWSPAPER "Vervet Logic Times"> <!ENTITY PUBLISHER "Vervet Logic Press"> <!ENTITY COPYRIGHT "Copyright 1998 Vervet Logic Press"> ]>

Product Catalog DTD


Copied from http://www.vervet.com/

<!DOCTYPE CATALOG [ <!ENTITY AUTHOR "John Doe"> <!ENTITY COMPANY "JD Power Tools, Inc."> <!ENTITY EMAIL "jd@jd-tools.com"> <!ELEMENT CATALOG (PRODUCT+)> <!ELEMENT PRODUCT (SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)> <!ATTLIST PRODUCT NAME CDATA #IMPLIED CATEGORY (HandTool|Table|Shop-Professional) "HandTool" PARTNUM CDATA #IMPLIED PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago" INVENTORY (InStock|Backordered|Discontinued) "InStock"> <!ELEMENT SPECIFICATIONS (#PCDATA)> <!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED POWER CDATA #IMPLIED>
63

WEB TECHNOLOGIES

http://www.lectnote.blogspot.com

<!ELEMENT OPTIONS (#PCDATA)> <!ATTLIST OPTIONS FINISH (Metal|Polished|Matte) "Matte" ADAPTER (Included|Optional|NotApplicable) "Included" CASE (HardShell|Soft|NotApplicable) "HardShell"> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST PRICE MSRP CDATA #IMPLIED WHOLESALE CDATA #IMPLIED STREET CDATA #IMPLIED SHIPPING CDATA #IMPLIED> <!ELEMENT NOTES (#PCDATA)> ]>

64

Das könnte Ihnen auch gefallen