Question?
Is there any way of making sure that DATA is STRUCTURED in a particular way?
Yes, there is. It is necessary to ensure that the data being exchanged (sent/received) conforms to a particular structure. DTDs are often recommended to ensure document conformity.
Document Type Definition (DTD): defines the allowable structures in an XML document.
Question?
Q: What happens if I don't want the data validated?
A: If you don't want your XML data validated, don't link it to a DTD.
A validating parser is able to read the DTD and determine whether or not the XML document conforms to it. If the document conforms to the DTD, it is referred to as valid. If the document fails to conform to the DTD but is syntactically correct, it is well-formed but not valid. By definition, a valid document is well-formed.
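To make the distinction concrete, here is a small sketch. The memo, subject, and body element names are made up for illustration, and the DTD is embedded directly in the DOCTYPE. The document is well-formed, because every tag is properly nested and closed, but it is not valid, because the DTD requires a subject element before body.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (subject, body)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Well-formed, but NOT valid: the required subject element is missing -->
<memo>
  <body>Meeting moved to Friday.</body>
</memo>
```

A non-validating parser would accept this document as it stands; a validating parser would be expected to report the missing subject element.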
DTD : Introduction
A DTD allows you to create rules for the elements within your XML documents. The rules defined in a DTD are specific to your own needs.
For an XML document to be valid, it must:
- use correct XML syntax (that is, be well-formed)
- conform to its DTD or schema
The DTD is declared at the top of the XML document. The contents of the DTD can be included within the XML document or kept in an external file.
A DTD consists of a list of syntax definitions for each element in your XML document. Writing a DTD means creating syntax rules for any XML document that uses the DTD. It specifies which element names can be included in the document, the attributes that each element can have, whether these are required or optional, and more.
Why?
- Each XML file can carry a description of its own format.
- Independent groups of people can agree to use a standard DTD for interchanging data.
- An application can use a standard DTD to verify that the data it receives from the outside world is valid.
- You can also use a DTD to verify your own data.
Example
DTD : Internal
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example
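A sketch of an internal DTD, reusing the note vocabulary (to, from, heading, body) that appears elsewhere in these slides. The declarations sit between the square brackets of the DOCTYPE, so the document carries its own rules and needs no separate .dtd file.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```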
DTD: External
XML document with a reference to an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The file note.dtd:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
All XML documents (and HTML documents) are made up of the following building blocks:
1) Elements: the main building blocks of XML
2) Attributes: extra information about elements
3) Entities: references that stand for characters with a special meaning in XML
4) PCDATA: parsed character data
PCDATA: Parsed Character Data
Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. However, parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp;, &lt;, and &gt; entities, respectively.
CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
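The difference can be sketched with one piece of text written both ways. The snippets, escaped, and raw element names are hypothetical, and the second form assumes that the unparsed text is written as a CDATA section.

```xml
<?xml version="1.0"?>
<!DOCTYPE snippets [
  <!ELEMENT snippets (escaped, raw)>
  <!ELEMENT escaped (#PCDATA)>
  <!ELEMENT raw (#PCDATA)>
]>
<snippets>
  <!-- Parsed character data: special characters are written as entities -->
  <escaped>if (a &lt; b &amp;&amp; b &gt; 0)</escaped>
  <!-- CDATA section: the same text, taken literally by the parser -->
  <raw><![CDATA[if (a < b && b > 0)]]></raw>
</snippets>
```

Both elements end up holding the same characters; only the way they are written in the source file differs.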
DTD : Elements
Element content
Specific rules:
1) ANY: the element can contain any combination of parsable data. This removes all syntax checking and is NOT advisable.
2) Specific data or another element: the data type or element name needs to be surrounded by parentheses, for example (tutorial) or (#PCDATA).
Empty Element
Syntax: <!ELEMENT element_name EMPTY>
Example:
<!ELEMENT header EMPTY>
<header />
Used for elements that do not contain content. In this case, the header element contains nothing.
Child Elements: specify an element that contains another element by providing the name of the element it must contain.
Syntax: <!ELEMENT element_name (child_element_name)>
Example:
<!ELEMENT tutorials (tutorial)>
<tutorials>
  <tutorial>XML tutorial</tutorial>
</tutorials>
Elements with only character data
Elements with only character data are declared with #PCDATA inside parentheses:
Syntax : <!ELEMENT element-name (#PCDATA)>
Example:
<!ELEMENT from (#PCDATA)>
Example
Example:
<!ELEMENT note (message)>
The example declaration above declares that the child element message must occur exactly once inside the "note" element.
Example:
<!ELEMENT note (message+)>
The + sign in the example above declares that the child element message must occur one or more times inside the "note" element.
Example:
<!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or more times inside the "note" element.
Example:
<!ELEMENT note (message?)>
The ? sign in the example above declares that the child element message can occur zero or one times inside the "note" element.
Example : ? indicator
<!ELEMENT seat (person?)>
indicates that element seat contains at most one person element. Examples of markup that conform to this are:
<seat>
  <person>Maya Karin</person>
</seat>
and
<seat></seat>
Example:
<!ELEMENT note (to,from,header,(message|body))>
The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.
Choices are specified using the pipe character (|).
Example:
<!ELEMENT dessert (icecream | pastry)>
The example specifies that element dessert must contain either one icecream element or one pastry element, but not both. The content specification may contain any number of pipe character-separated choices.
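A minimal sketch of markup that conforms to such a choice, assuming the declaration <!ELEMENT dessert (icecream | pastry)> and made-up element contents:

```xml
<?xml version="1.0"?>
<!DOCTYPE dessert [
  <!ELEMENT dessert (icecream | pastry)>
  <!ELEMENT icecream (#PCDATA)>
  <!ELEMENT pastry (#PCDATA)>
]>
<!-- Valid: exactly one of the two alternatives is present -->
<dessert>
  <icecream>vanilla</icecream>
</dessert>
```

A dessert element containing both an icecream and a pastry, or neither of them, would not be valid against this declaration.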
Example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>
The example above declares that the "note" element can contain parsed character data and any number of "to", "from", "header", and/or "message" elements.
<!ELEMENT order (#PCDATA|menu_item)*>
This notation is used for elements with mixed content, that is, character data and further elements. In this case, an order element can contain any mixture of character data and menu_item elements, in any order. For example:
<order>The customer will have:
  <menu_item>Chicken Soup</menu_item>
  <menu_item>Apple Pie</menu_item>
</order>
DTDExample.doc
<!ELEMENT album (title,(songTitle,duration)+)>
An example of markup that conforms to this is:
<album>
  <title>XML Classical Hits</title>
  <songTitle>XML Overture</songTitle>
  <duration>10</duration>
  <songTitle>XML Symphony 1.0</songTitle>
  <duration>54</duration>
</album>
<!ELEMENT class (number,(instructor|assistant+),(credit|noCredit))>
Markup examples:
<class>
  <number>123</number>
  <instructor>Dr. Ayip</instructor>
  <credit>4</credit>
</class>
and
<class>
  <number>456</number>
  <assistant>Puteh</assistant>
  <assistant>Muhammad</assistant>
  <credit>3</credit>
</class>
<!ELEMENT donutBox (jelly?,lemon*,((creme|sugar)+|glazed))>
Markup examples:
<donutBox>
  <jelly>grape</jelly>
  <lemon>half-sour</lemon>
  <lemon>sour</lemon>
  <lemon>half-sour</lemon>
  <glazed>chocolate</glazed>
</donutBox>
and
<donutBox>
  <sugar>semi-sweet</sugar>
  <creme>whipped</creme>
  <sugar>sweet</sugar>
</donutBox>
<!ELEMENT farm (farmer+,(dog*|cat?),bird*,(goat|cow)?,(chicken+|duck*))>
Markup examples:
<farm>
  <farmer>Ali Hassan</farmer>
  <farmer>Ali Hassan</farmer>
  <cat>Mastura</cat>
  <bird>Bobo</bird>
  <chicken>Dodo</chicken>
</farm>
and
<farm>
  <farmer>Ah Wong</farmer>
  <duck>Billy</duck>
  <duck>Donni</duck>
</farm>
Summary
Description of Element Type Model:
EMPTY: the element has no content
ANY: the element can contain any elements and character data
#PCDATA: the element contains only character data
, (comma): the sub-elements must appear in the listed order, each exactly once unless an occurrence indicator says otherwise
| (pipe): exactly one of the listed alternatives must appear
? : the sub-element occurs once or not at all
+ : the sub-element occurs at least once and possibly more
* : the sub-element occurs zero or more times
K-Break
Element Structure
Create Element or Attribute?
- A particular part of the document -> Element
- Extra detail about an existing piece of data -> Attribute
Norzilah Musa/Jul2010/CSC570
DTD - Attributes
In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example:
<!ATTLIST payment type CDATA "check">
Attributes further describe a noun, that is, an element. Attribute list declarations generally describe four key aspects:
1) The element with which the attribute is associated
2) The name of the attribute
3) The type of the attribute
4) What the parser should do if an attribute value is not supplied (that is, the default value)
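The four aspects can be read off a single declaration. The sketch below assumes a payment element that holds plain text; the element content and the "cash" value are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE payment [
  <!ELEMENT payment (#PCDATA)>
  <!-- element: payment, attribute name: type, attribute type: CDATA, default value: "check" -->
  <!ATTLIST payment type CDATA "check">
]>
<!-- The author overrides the default here; leaving type out falls back to "check" -->
<payment type="cash">25.00</payment>
```

If the type attribute were omitted, a parser that reads the DTD would treat the element as if type="check" had been written.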
Attributes : Example
Quotes: place quotation marks around the attribute's value.
Example:
<tutorials type="Web">
  <tutorial>
    <name>XML</name>
  </tutorial>
</tutorials>
Attributes : Example
Using an attribute:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
Using a child element:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
Attribute Type
Strings in attribute types are like strings in all programming languages: a series of characters. In XML, string lengths are not defined. The CDATA keyword implies a string. Strings can contain all text characters except for special characters such as quotation marks and ampersands, which must be written as entities.
<book author="A.R Husin">Bawang Merah Bawang Putih</book>
Attribute Type : ID
A unique ID that identifies a particular element.
Works together with IDREF for cross-referencing elements.
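A sketch of how ID and IDREF might work together. The staff and employee elements and the empId and manager attributes are hypothetical names chosen for illustration; they do not come from the original slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!-- empId must be unique within the document; manager must match some existing ID -->
  <!ATTLIST employee
    empId   ID    #REQUIRED
    manager IDREF #IMPLIED>
]>
<staff>
  <employee empId="e1">Aminah</employee>
  <!-- manager="e1" cross-references the first employee -->
  <employee empId="e2" manager="e1">Farid</employee>
</staff>
```

A validating parser would reject the document if two employees shared the same empId, or if a manager value did not match any ID in the document. Note that ID values must be valid XML names, so they cannot start with a digit.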
Attribute Default
Example DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
Valid XML:
<square width="100" />
In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.
Example DTD:
<!ATTLIST contact fax CDATA #IMPLIED>
Valid XML:
<contact fax="555-667788" />
Valid XML:
<contact />
Example DTD:
<!ATTLIST sender company CDATA #FIXED "Microsoft">
Valid XML:
<sender company="Microsoft" />
Invalid XML:
<sender company="WeanSndBhd" />
Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.
Exercise
DTD Exercise.doc
DTD - Entities
ENTITY: the value is an entity.
ENTITIES: the value is a list of entities.
Entities are variables used to define shortcuts to common text. Entity references are references to entities. Entities can be declared internal or external.
The ENTITY/ENTITIES keyword is used in an attribute declaration to tell the parser to find the unparsed entity and use it as the value of the attribute.
An entity reference has three parts:
1) an ampersand (&)
2) the entity name
3) a semicolon (;)
DTD Example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Walt Disney.">
XML example:
<author>&writer;&copyright;</author>
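Put together as one document, the entity example might look like the sketch below. It assumes an internal DTD and an author element that holds plain text; the entity names and values are the ones from the DTD example, with the quotation marks around the copyright value assumed.

```xml
<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!-- Internal entities: shortcuts that the parser replaces with their text -->
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Walt Disney.">
]>
<!-- Each reference is an ampersand, the entity name, and a semicolon -->
<author>&writer;&copyright;</author>
```

After the parser expands both references, the content of author is the text "Donald Duck.Walt Disney.", with no space between the two parts because neither entity value contains one.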
Exercise
TV Schedule DTD
DTD Limitations
- Non-XML syntax
- Not extensible
- Weak data typing
- No inheritance
Possible solution: XML Schema
Brain Teaser
How could all of your cousins have an aunt who is not your aunt?