
DOCUMENT TYPE DEFINITION (DTD)

Compiled & prepared by: Norzilah Musa

Question?

An XML document is not required to have a corresponding DTD.

Is there any way of making sure that DATA is STRUCTURED in a particular way?

Yes, there is. It is necessary to ensure that data exchange (sending/receiving) conforms to a particular structure. DTDs are not mandatory, but they are often recommended to ensure document conformity.
Document Type Definition (DTD): defines the allowable structures in an XML document.

Question?
Q: What happens if I don't want the data validated?
A: If you don't want your XML data validated, don't link it to a DTD.

Parsers, Well-formed and Valid XML Documents

A validating parser is able to read the DTD and determine whether or not the XML document conforms to it.
- If the document conforms to the DTD, it is referred to as valid.
- If the document fails to conform to the DTD but is syntactically correct, it is well-formed but not valid.
- By definition, a valid document is well-formed.

DTD : Introduction

A DTD allows you to create rules for the elements within your XML documents. The rules defined in a DTD are specific to your own needs.
For an XML document to be well-formed, it must:
- Use correct XML syntax
For an XML document to be valid, it must also:
- Conform to its DTD or schema
The DTD is declared at the top of the XML document. The contents of the DTD can be included within the XML document (internal) or kept in a separate file (external).

DTD : Introduction (contd)

A DTD consists of a list of syntax definitions for each element in your XML document. Writing a DTD means creating syntax rules for any XML document that uses the DTD. It specifies which element names can be included in the document, the attributes that each element can have, whether these are required or optional, and more.

Why?

- Each XML file can carry a description of its own format.
- Independent groups of people can agree to use a standard DTD for interchanging data.
- An application can use a standard DTD to verify that the data it receives from the outside world is valid.
- You can also use a DTD to verify your own data.

Example

DTD : Internal

If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>

!DOCTYPE note defines that the root element of this document is note.
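
As a minimal sketch of an internal DTD, the note document used elsewhere in this deck could carry its declarations inside the DOCTYPE like this (the layout and indentation are illustrative choices, not a required form):

```xml
<?xml version="1.0"?>
<!-- Internal DTD: the declarations sit between [ and ] inside the DOCTYPE -->
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

The name after !DOCTYPE (note) must match the root element of the document.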

Example

DTD: External
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

note.dtd:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

DTD : XML Building Blocks

All XML documents (and HTML documents) are made up of the following building blocks:
1) Elements - the main building blocks of XML
2) Attributes - extra information about elements
3) Entities - references to characters that have a special meaning in XML
4) PCDATA - parsed character data

DTD : XML Building Blocks (contd)


Predefined XML entities

Entity Reference    Character
&lt;                <
&gt;                >
&amp;               &
&quot;              "
&apos;              '

DTD : XML Building Blocks (contd)

PCDATA: Parsed Character Data
Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. For that reason, parsed character data should not contain any &, <, or > characters; these are represented by the &amp;, &lt;, and &gt; entities, respectively (escaping & and < is mandatory, escaping > is recommended).
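
A short sketch of escaping inside parsed character data. The element name formula is a hypothetical one chosen only for this illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE formula [
  <!ELEMENT formula (#PCDATA)>
]>
<!-- After parsing, the element content is the text: a < b && b > c -->
<formula>a &lt; b &amp;&amp; b &gt; c</formula>
```

Writing the raw characters < or & in the element content instead would make the document not well-formed.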

DTD : XML Building Blocks (contd)


CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
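
One place where unparsed character data shows up in a document is a CDATA section. The sketch below assumes a hypothetical snippet element; everything between <![CDATA[ and ]]> is passed through as plain text:

```xml
<?xml version="1.0"?>
<!DOCTYPE snippet [
  <!ELEMENT snippet (#PCDATA)>
]>
<!-- Inside the CDATA section, < and & are ordinary characters, and the
     text <ok/> is NOT treated as an element -->
<snippet><![CDATA[if (a < b && b > c) { return "<ok/>"; }]]></snippet>
```

The only sequence that cannot appear inside a CDATA section is ]]>, because it ends the section.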

DTD : Elements

Elements are declared with an ELEMENT declaration.
Syntax:

<!ELEMENT element-name (element-content)>

Element content

Specific rules:
1) ANY
- can contain any combination of parsable data
- removes all syntax checking, NOT advisable
2) Specific data or another element
- the data type or element name needs to be surrounded by parentheses, e.g. (tutorial) or (#PCDATA)

Element content (contd)

Empty Element
Syntax: <!ELEMENT element_name EMPTY>
Example:
<!ELEMENT header EMPTY>
<header />
Used for elements that do not contain content. In this case, the header element contains nothing.

Element content (contd)

Child Elements: Specify that an element contains another element by providing the name of the element it must contain.
Syntax: <!ELEMENT element_name (child_element_name)>
Example:
<!ELEMENT tutorials (tutorial)>
<tutorials>
  <tutorial>XML tutorial</tutorial>
</tutorials>

Element content (contd)


Multiple Child Elements (Sequences)
- For more than one child element, use commas to separate the names.
- This is referred to as a "sequence".
- The XML document must contain the tags in the same order that they're specified in the sequence.
Syntax:

<!ELEMENT element_name (child_element_name, child_element_name,...)>
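
A small sketch of a sequence in use. The child names name and url, and the example.com address, are assumptions made up for this illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE tutorial [
  <!-- tutorial must contain name first, then url -->
  <!ELEMENT tutorial (name,url)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT url (#PCDATA)>
]>
<tutorial>
  <name>XML tutorial</name>
  <url>http://www.example.com/xml</url>
</tutorial>
```

Swapping the two children (url before name) would leave the document well-formed but not valid.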

Element content (contd)

Elements with only character data
Elements with only character data are declared with #PCDATA inside parentheses.
Syntax : <!ELEMENT element-name (#PCDATA)>

Example:
<!ELEMENT from (#PCDATA)>

Example

Element content (contd)


Declaring only one occurrence of the same element
Syntax:

<!ELEMENT element-name (child-name)>

Example:
<!ELEMENT note (message)>

The example declaration above declares that the child element message must occur exactly once inside the "note" element.

Element content (contd)


Declaring minimum one occurrence of the same element. Syntax:

<!ELEMENT element-name (child-name+)>

Example:
<!ELEMENT note (message+)>

The + sign in the example above declares that the child element message must occur one or more times inside the "note" element.

Element content (contd)

Example : Plus indicator


<!ELEMENT album (song+)>
An element's frequency (i.e., number of occurrences) is specified by using the plus (+), asterisk (*) or question mark (?) occurrence indicator. The example specifies that element album contains one or more song elements.

Element content (contd)

Declaring zero or more occurrences of the same element
Syntax:

<!ELEMENT element-name (child-name*)>

Example:
<!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or more times inside the "note" element.

Example : Asterisk indicator

<!ELEMENT library (book*)>
indicates that element library contains any number of book elements, including the possibility of none at all. Markup examples that conform to this:
<library>
  <book>The nations</book>
  <book>The Iliad</book>
  <book>The Jungle</book>
</library>
and
<library></library>

Element content (contd)

Declaring zero or one occurrences of the same element


Syntax : <!ELEMENT element-name (child-name?)>

Example:
<!ELEMENT note (message?)>

The ? sign in the example above declares that the child element message can occur zero times or one time inside the "note" element.

Example : ? indicator
<!ELEMENT seat (person?)>
indicates that element seat contains at most one person element. Examples of markup that conform to this are:
<seat>
  <person>Maya Karin</person>
</seat>
and
<seat></seat>

Element content (contd)


Declaring either/or content
Example:

<!ELEMENT note (to,from,header,(message|body))>

The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.

Example : Pipe character (|)


Example:
<!ELEMENT dessert (icecream|pastry)>

Choices are specified using the pipe character (|). The example specifies that element dessert must contain either one icecream element or one pastry element, but not both. The content specification may contain any number of choices separated by pipe characters.

Example : Pipe character (|)


<!ELEMENT dessert (ice-cream|coffee)>
A dessert element consists of either a single ice-cream element or a single coffee element. A valid example:
<dessert>
  <ice-cream>vanilla</ice-cream>
</dessert>

Example : Pipe character (|)


Invalid example (well-formed but not valid under this element type declaration):
<dessert>
  <pie>Apple</pie>
</dessert>

Declaring mixed content

Example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>

The example above declares that the "note" element can contain parsed character data and any number of "to", "from", "header", and/or "message" elements.

<!ELEMENT order (#PCDATA|menu_item)*>
This notation is used for elements with mixed content, that is, character data and further elements. In this case, an order element can consist of any mixture of character data and menu_item elements (zero or more of each), in any order. For example:
<order>The customer will have:
  <menu_item>Chicken Soup</menu_item>
  <menu_item>Apple Pie</menu_item>
</order>

<!ELEMENT order (appetizer,main_course,dessert)>
<!ELEMENT appetizer (#PCDATA)>

DTDExample.doc

<!ELEMENT album (title,(songTitle,duration)+)>

An example of markup that conforms to this is:
<album>
  <title>XML Classical Hits</title>
  <songTitle>XML Overture</songTitle>
  <duration>10</duration>
  <songTitle>XML Symphony 1.0</songTitle>
  <duration>54</duration>
</album>

<!ELEMENT class (number,(instructor|assistant+),(credit|noCredit))>
Markup examples:
<class>
  <number>123</number>
  <instructor>Dr. Ayip</instructor>
  <credit>4</credit>
</class>
and
<class>
  <number>456</number>
  <assistant>Puteh</assistant>
  <assistant>Muhammad</assistant>
  <credit>3</credit>
</class>

<!ELEMENT donutBox (jelly?,lemon*,((creme|sugar)+|glazed))>
Markup examples:
<donutBox>
  <jelly>grape</jelly>
  <lemon>half-sour</lemon>
  <lemon>sour</lemon>
  <lemon>half-sour</lemon>
  <glazed>chocolate</glazed>
</donutBox>
and
<donutBox>
  <sugar>semi-sweet</sugar>
  <creme>whipped</creme>
  <sugar>sweet</sugar>
</donutBox>

<!ELEMENT farm (farmer+,(dog*|cat?),bird*,(goat|cow)?,(chicken+|duck*))>
Markup examples:
<farm>
  <farmer>Ali Hassan</farmer>
  <farmer>Ali Hassan</farmer>
  <cat>Mastura</cat>
  <bird>Bobo</bird>
  <chicken>Dodo</chicken>
</farm>
and
<farm>
  <farmer>Ah Wong</farmer>
  <duck>Billy</duck>
  <duck>Donni</duck>
</farm>

Summary
Description of Element Type Model:

EMPTY - specifies an element with no content
ANY - specifies an element that can contain any elements and character data
#PCDATA - specifies an element that contains only character data
, - specifies a sequence of sub-elements that must appear in the given order
| - specifies a list of alternative sub-elements, of which exactly one is chosen
? - specifies a sub-element that occurs once or not at all
* - specifies a sub-element that occurs zero or more times
+ - specifies a sub-element that occurs at least once and possibly more


Element Structure

Create Element or Attribute?
- Particular part of the document -> Element
- Provides detail about an existing bit of data -> Attribute

Norzilah Musa/Jul2010/CSC570

DTD - Attributes

In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>

DTD example:
<!ATTLIST payment type CDATA "check">

XML example:
<payment type="check" />
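
To see the declaration and the markup together, here is a minimal sketch. The wrapper element payments is an assumption added only so that two payment elements fit in one document:

```xml
<?xml version="1.0"?>
<!DOCTYPE payments [
  <!ELEMENT payments (payment+)>
  <!ELEMENT payment EMPTY>
  <!-- type is a string attribute whose default value is "check" -->
  <!ATTLIST payment type CDATA "check">
]>
<payments>
  <payment type="cash" />
  <!-- No type given: a parser that reads the DTD supplies type="check" -->
  <payment />
</payments>
```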

Attribute list declarations



Attributes further describe a noun (an element). Attribute list declarations generally describe four key aspects:
1) The element with which the attribute is associated
2) The name of the attribute
3) The type of the attribute
4) What the parser should do in case an attribute value is not supplied (that is, the default value)

Attributes : Example

Quotes - Place quotation marks around the attribute's value.
Example:
<tutorials type="Web">
  <tutorial>
    <name>XML</name>
  </tutorial>
</tutorials>

Attributes : Example

The same information can be given as an attribute:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
or as a child element:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

Attribute Type

The attribute-type can have the following values:


Attribute Type : CDATA

Character data: numbers and text that are not parsed for markup.
Example

<!ATTLIST genre category CDATA #REQUIRED>

Attribute Type : CDATA



Strings in attribute types are like strings in any programming language: a series of characters. In XML, string lengths are not defined. The CDATA keyword implies a string. Strings can contain all text characters except special characters such as <, &, and the quotation mark that delimits the value; these must be written as entity references.
<book author="A.R Husin">Bawang Merah Bawang Putih</book>

Attribute Type : NMTOKEN


Text with some restrictions
- Allowed: numbers, letters, and the symbols _ - . :
- Not allowed: spaces
- Avoid values that begin with "xml"

<!ATTLIST genre category NMTOKEN #REQUIRED>
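
A sketch of an NMTOKEN attribute in a complete document. The content model (#PCDATA) for genre and the sample values are assumptions for illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE genre [
  <!ELEMENT genre (#PCDATA)>
  <!ATTLIST genre category NMTOKEN #REQUIRED>
]>
<!-- Valid: only letters, digits, hyphen and underscore, with no spaces.
     A value such as "science fiction" would be invalid because of the space. -->
<genre category="sci-fi_2010">Dune</genre>
```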

Attribute Type : (value-1 | value-2)


A value list: a set of acceptable options for the attribute to contain.
Example
<!ATTLIST genre category (drama|scifi|comedy|other) "other">

Attribute Type : ID

A unique id that identifies a particular element.
Works together with IDREF for cross-referencing of elements.
Example

<!ATTLIST genre category ID #IMPLIED>

Attribute Type : IDREF

A reference value that points to another element's ID value.
Example

<!ATTLIST genre category IDREF #IMPLIED>
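
ID and IDREF are easiest to see side by side. In this sketch the element names staff and person, and the attribute names id and manager, are assumptions chosen for illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (person+)>
  <!ELEMENT person (#PCDATA)>
  <!-- id must be unique in the document; manager must match some existing id -->
  <!ATTLIST person id      ID    #REQUIRED
                   manager IDREF #IMPLIED>
]>
<staff>
  <person id="p1">Anna Smith</person>
  <person id="p2" manager="p1">Maya Karin</person>
</staff>
```

A validating parser would reject a second person with id="p1", or a manager value that matches no id in the document. ID values must be XML names, so they cannot start with a digit.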

Attribute Type : ENTITY


The attribute value is the name of an entity. An entity is a value that has been defined elsewhere in the DTD to have a particular meaning.
Example
<!ATTLIST genre category ENTITY #IMPLIED>

Attribute Type : NOTATION


The attribute value is the name of a notation. A notation is a description of how information should be processed. It is used to define a helper application or plug-in, for example when an image data type requires a specific viewer, and to state its location and parameters.
Example
<!ATTLIST genre category NOTATION (png) #IMPLIED>
<!NOTATION png SYSTEM
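
The ENTITY and NOTATION attribute types usually appear together with an unparsed entity. The sketch below is only an illustration: the element picture, the attributes source and format, the file name logo.png, and the system identifier "image/png" are all assumptions, not taken from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE picture [
  <!-- Declares the notation png; the system identifier is a hypothetical value -->
  <!NOTATION png SYSTEM "image/png">
  <!-- Unparsed entity: an external file tied to the png notation -->
  <!ENTITY logo SYSTEM "logo.png" NDATA png>
  <!ELEMENT picture (#PCDATA)>
  <!ATTLIST picture source ENTITY         #REQUIRED
                    format NOTATION (png) #IMPLIED>
]>
<picture source="logo" format="png">Company logo</picture>
```

Note that a NOTATION attribute lists the allowed notation names in parentheses, and that the attribute value is the bare entity name (logo), not an entity reference (&logo;).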

Attribute Default

The default-value can have the following values:


Attribute Default : Default value


DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">

Valid XML:
<square width="100" />

In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.

Attribute Default : #IMPLIED


Syntax
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
- Optional; the attribute may be omitted, and no default value is supplied

Example
DTD: <!ATTLIST contact fax CDATA #IMPLIED>
Valid XML: <contact fax="555-667788" />
Valid XML: <contact />


Attribute Default : #REQUIRED


Syntax
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
- Must be present; the parser returns an error if the attribute is missing

Example
DTD: <!ATTLIST Matrix Number CDATA #REQUIRED>

Valid XML: <Matrix Number="36831" />
Invalid XML: <Matrix/>



Attribute Default : #FIXED


Syntax
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">

Example
DTD: <!ATTLIST sender company CDATA #FIXED "Microsoft">

Valid XML: <sender company="Microsoft" />
Invalid XML: <sender company="WeanSndBhd" />
Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.
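
The four kinds of attribute default can be compared in one declaration. This is a sketch under assumptions: the element message and the attribute names priority and id are invented for the comparison, while fax and company echo the earlier slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ATTLIST message
    priority CDATA "normal"
    id       CDATA #REQUIRED
    fax      CDATA #IMPLIED
    company  CDATA #FIXED "Microsoft">
]>
<!-- priority is omitted, so the default "normal" applies
     id is required, so it must be written
     fax is optional and has no default, so it is simply absent
     company may be written, but only with the fixed value -->
<message id="36831" company="Microsoft">Don't forget me this weekend!</message>
```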

Exercise

DTD Exercise.doc

DTD - Entities
ENTITY: the attribute value is an entity.
ENTITIES: the attribute value is a list of entities.
Entities are variables used to define shortcuts to common text. Entity references are references to entities. Entities can be declared internal or external.

DTD Entities (contd)



The ENTITY/ENTITIES keyword is used in an attribute declaration to tell the parser to find the unparsed entity and use it as the value of the attribute.
An entity reference has three parts:
1) ampersand (&)
2) entity name
3) semicolon (;)


Internal Entity Declaration


Syntax:
<!ENTITY entity-name "entity-value">

DTD Example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Walt Disney.">

XML example:
<author>&writer;&copyright;</author>
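
Put together as a complete document, a minimal sketch of internal entities looks like this (the #PCDATA content model for author is an assumption):

```xml
<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Walt Disney.">
  <!ELEMENT author (#PCDATA)>
]>
<!-- The parser replaces each reference with the entity's value, so the
     content of author becomes: Donald Duck.Walt Disney. -->
<author>&writer;&copyright;</author>
```

No space appears between the two expanded values because none is written between the two references.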


Exercise

CSC570 -> April 2008, Question 1, Part B.


DTD - EXAMPLES FROM THE NET

TV Schedule DTD

By David Moisan. Copied from his Web: http://www.davidmoisan.org/


Newspaper Article DTD



Copied from http://www.vervet.com/


DTD Limitations
- Non-XML syntax
- Not extensible
- Weak data typing
- No inheritance
Possible solution: XML Schema

Brain Teaser

How could all of your cousins have an aunt who is not your aunt?
