
Medical informatics

Chapter 2

XML


1. Introduction
What is a structured document?

A paper document (e.g., a letter) may contain:


the date
the sender's address
the recipient's address
a salutation
body text
and a signature

Dr. Ziad EL BALAA


A document like this is not well structured:
no rule defines
the order of items on the page
or the number of paragraphs
No structure rule is explicitly
associated with the document

It is therefore easy to
forget one of the elements
or place them in the wrong order

XML addresses this problem by making the structure explicit


Structuring the information is the first step in writing
an XML document
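
As an illustration, the letter described earlier can be given an explicit structure and then checked by a program. This is only a sketch: the element names (letter, date, sender, ...) and the sample values are hypothetical, chosen for this example, and the check uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names and sample values, chosen for this illustration.
LETTER_XML = """
<letter>
  <date>2003-07-15</date>
  <sender>12 Main Street</sender>
  <recipient>34 Oak Avenue</recipient>
  <salutation>Dear Sir,</salutation>
  <body>Please find the report enclosed.</body>
  <signature>J. Smith</signature>
</letter>
"""

EXPECTED_ORDER = ["date", "sender", "recipient", "salutation", "body", "signature"]


def check_letter(xml_text):
    """Return True if the letter has exactly the expected parts, in order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == EXPECTED_ORDER


print(check_letter(LETTER_XML))
```

Once the parts are tagged, a forgotten or misplaced element can be detected automatically instead of by eye.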

2. XML
XML (Extensible Markup Language)
has been developed by the XML Working Group
under the W3C since 1996

XML
is a generic markup language
It is primarily used to store and transfer data organized as a tree (hierarchical structure)

It is described as extensible because it allows
the user to define their own tags (element names)

All documents related to the XML standard are available at:
http://www.w3.org/XML/
Unlike HTML, which is
a predefined, fixed language
with a limited number of tags,

XML is considered a metalanguage that allows

to describe the structure of a text
to separate content from presentation
to define new tags
to structure documents
using tags

HTML: <p>Name: Peter</p>


XML: <Name>Peter</Name>
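
The difference between the two lines above can be seen when a program reads them. A minimal sketch using Python's standard xml.etree.ElementTree module (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

html_like = ET.fromstring("<p>Name: Peter</p>")
xml_like = ET.fromstring("<Name>Peter</Name>")

# With the generic <p> tag, the meaning is buried in the text
# and has to be recovered by splitting the string.
label, value = html_like.text.split(": ", 1)

# With the descriptive <Name> tag, the tag itself says what the data is.
print(xml_like.tag, "=", xml_like.text)
```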
XML is recognized by all current browsers

To write XML we can use

a text editor:
WordPad
Notepad++
Microsoft Word (saving the file as plain text)
specialized software, which offers specific features:
Test whether the XML document is well formed
Validate the XML against a DTD or XML Schema
Perform an XSLT transformation
Handle the Unicode character set

Examples:
XML Copy Editor (free)
EditiX XML (free)
Altova XMLSpy (commercial)
Oxygen XML (commercial)

3. Tag
XML documents contain
1. text that represents content (i.e., data), such as John,
2. elements that specify the document's structure, such as firstName.

Rules for naming elements

Names can contain letters, digits, and other characters such as hyphens, underscores, and periods:
correct: <firstname01>; a name can also start with an underscore: <_firstname>
Names start with a letter or an underscore; they cannot start with a digit or other punctuation:
error: <01firstname>; <?firstname>
Names should not start with the letters xml (or XML, or Xml, etc.), which are reserved:
avoid: <xmlfirstname>
Names cannot contain spaces:
error: <first name>
Unlike HTML, tag names are case-sensitive: <DOCUMENT> is not the
same tag as <document>, for example.
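
Several of these naming rules can be tried out with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper written for this example, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<firstname01/>",         # letters followed by digits
    "<_firstname/>",          # leading underscore
    "<01firstname/>",         # leading digit
    "<first name/>",          # space inside the name
    "<DOCUMENT></document>",  # start and end tags differ in case
]

for sample in samples:
    print(sample, is_well_formed(sample))
```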
An XML element is delimited by
a start tag and an end tag
A start tag consists of the element name in angle brackets
e.g., <player> and <firstName>
An end tag consists of the element name preceded by a
forward slash (/) in angle brackets
e.g., </firstName> and </player>
<firstName>John</firstName>
Every XML document must have exactly
one element that contains all the other elements:
the root element
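
Start and end tags do not have to be typed by hand; a library can generate them. A minimal sketch with Python's standard xml.etree.ElementTree module, reusing the player and firstName names from the examples above (nesting firstName inside player is an assumption made for this example):

```python
import xml.etree.ElementTree as ET

# player is the root element; firstName is nested inside it.
player = ET.Element("player")
first_name = ET.SubElement(player, "firstName")
first_name.text = "John"

# Serializing writes the matching start and end tags automatically.
serialized = ET.tostring(player, encoding="unicode")
print(serialized)
```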

The name of a tag should describe the data it delimits

Any name that follows the naming rules can be chosen

Example
<xyz>
is correct, but it gives no information about the content

It is better to choose a meaningful name

Examples
<invoice>
<Family>
<chapter>
<Course>
...

[Figure: structure in HTML compared with structure in XML]

4. The XML document


An XML document
is a text file with the .xml extension
structured in two parts:
1. The prolog
2. The data: tags and content

The prolog contains:

The XML header
The definition of the XML document structure:
a DTD (Document Type Definition) or an XML Schema
Processing instructions, which are instructions to the software
processing the XML;
they start with <? and end with ?>


Example of the structure of the prolog:

<?xml version="1.0" encoding="UTF-8"?>              <!-- header -->
<!DOCTYPE document SYSTEM "doc.dtd">                <!-- DTD -->
<?xml-stylesheet type="text/css" href="doc.css"?>   <!-- processing instruction -->
<document>
   <heading>
      Hello From XML
   </heading>
   <message>
      This is an XML document!
   </message>
</document>
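
The parts of the prolog can also be inspected programmatically. The sketch below assumes Python's standard xml.dom.minidom module; the file doc.dtd is not read here, the parser only records the declaration.

```python
from xml.dom import minidom

PROLOG_EXAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "doc.dtd">
<?xml-stylesheet type="text/css" href="doc.css"?>
<document>
  <heading>Hello From XML</heading>
  <message>This is an XML document!</message>
</document>
"""

doc = minidom.parseString(PROLOG_EXAMPLE)

# The DTD declaration: name of the root element and location of the DTD.
print(doc.doctype.name, doc.doctype.systemId)

# Processing instructions appear as nodes placed before the root element.
instructions = [
    node.target
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)

# The data part starts at the root element.
print(doc.documentElement.tagName)
```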



The XML header should
be on the first line of the document,
starting with the five characters <?xml (with no space between <? and xml)

XML header has the following form:


<?xml version="..." encoding="..." standalone="..."?>

Examples:
<?xml version="1.0"?>
<?xml version='1.0' encoding='iso-8859-1'?>
<?xml version="1.1" encoding="UTF-8" standalone="yes"?>


The version attribute
specifies the XML version to which the document conforms
The XML versions in use are 1.0 and 1.1
it is mandatory

The encoding attribute

specifies the character encoding used in the file
the main possible values are ISO-8859-1, UTF-8, UTF-16
it is required when the file's encoding is not UTF-8 or UTF-16
(plain English text in ASCII is already valid UTF-8)

The standalone attribute

specifies whether the file is standalone or relies on external
declarations
the value of this attribute can be yes or no
the default value is no
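
The role of the encoding attribute can be illustrated with a short experiment. This is a sketch using Python's standard xml.etree.ElementTree module; the sample name "Hélène" is hypothetical and was chosen only because it contains accented characters.

```python
import xml.etree.ElementTree as ET

text = "Hélène"  # hypothetical sample value with non-ASCII characters

# Bytes as they would be stored in an ISO-8859-1 file, with a matching header.
declared = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>%s</name>' % text
).encode("iso-8859-1")
decoded = ET.fromstring(declared).text
print(decoded)

# The same bytes without the encoding attribute: the parser assumes UTF-8
# and rejects the accented characters.
undeclared = ('<?xml version="1.0"?><name>%s</name>' % text).encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    accepted_without_encoding = True
except ET.ParseError:
    accepted_without_encoding = False
print(accepted_without_encoding)
```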


File article.xml: the header, two comments and the DTD declaration are followed by the root element:

<?xml version="1.0"?>
<!-- Fig. 20.1: article.xml -->
<!-- Article structured with XML -->
<!DOCTYPE article SYSTEM "article.dtd">
<article>                                <!-- root element -->

   <title>Simple XML</title>             <!-- tag containing text -->

   <date>July 15, 2003</date>

   <author>                              <!-- start tag -->
      <firstName>Carpenter</firstName>
      <lastName>Cal</lastName>
   </author>                             <!-- end tag -->

   <summary>XML is pretty easy.</summary>

   <content>Once you have mastered XHTML, XML is easily
      learned. You must remember that XML is not for
      displaying information but for managing information.
   </content>

</article>
2004 Prentice Hall, Inc. All rights reserved.


5. The elements syntax


XML uses the following terms:

A parent element: any element that contains other elements

Example: [Figure: example annotated "Parent of"]



A child element: an element nested inside a container (parent) element

Example: [Figure: example annotated "Child of"]



Sibling elements: child elements at the same nesting level
inside the same parent element

Example: [Figure: example annotated "Siblings"]
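
The three terms (parent, child, siblings) can be made concrete on a small tree. The sketch below assumes Python's standard xml.etree.ElementTree module and reuses a shortened version of the article example; since this module does not store a link from a child to its parent, the parent_of dictionary is built by hand for the illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<article>"
    "<title>Simple XML</title>"
    "<author><firstName>Carpenter</firstName><lastName>Cal</lastName></author>"
    "</article>"
)

# Map every child element to the element that directly contains it.
parent_of = {child: parent for parent in root.iter() for child in parent}

author = root.find("author")
first_name = author.find("firstName")

# Children of <author>: the elements nested directly inside it.
children = [child.tag for child in author]

# Siblings of <firstName>: the other children of the same parent.
siblings = [c.tag for c in parent_of[first_name] if c is not first_name]

print(parent_of[first_name].tag, children, siblings)
```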


6. The attributes
XML attributes allow
to store additional information about an element
without adding text to the content of the element
(which would make the document more confusing)
to select, display, or sort elements
based on their attribute values

Choosing between an attribute and a tag:

If the data format may change or become more complex: use a tag
If you want to select or sort elements by the value: use an attribute

An attribute is a key-value pair
written in the form key="value" inside the start tag
The syntax is:
<tag key="value">

Example:
<name language="French">Colosse</name>

Tag name: name
Attribute name: language
Attribute value: French

The name of an attribute follows
the same rules as the tag name

Example:
<message priority="low">
   Attendance is mandatory in the course.
</message>
<message priority="high">
   Study well for the examination.
</message>

Tag: message
Attribute: priority
Value: low or high
We can display only certain messages, based on their priority
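
Selecting messages by priority could look like the following sketch, which uses Python's standard xml.etree.ElementTree module. The messages root element is hypothetical: it is added here only because a document needs a single root for the two message elements.

```python
import xml.etree.ElementTree as ET

# <messages> is a hypothetical root element added so that the fragment parses.
MESSAGES = """
<messages>
  <message priority="low">Attendance is mandatory in the course.</message>
  <message priority="high">Study well for the examination.</message>
</messages>
"""

root = ET.fromstring(MESSAGES)

# Keep only the messages whose priority attribute equals "high".
high_priority = [m.text for m in root.findall("message[@priority='high']")]
print(high_priority)
```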
Same example without attributes:
<message>
   <priority>low
      <msg>
         Attendance is mandatory in the course.
      </msg>
   </priority>
   <priority>high
      <msg>
         Study well for the examination.
      </msg>
   </priority>
</message>

7. Special characters
The characters <, >, ", ', &
are reserved in XML markup
and should not be used directly in content or attribute values
(< and & must always be escaped)

They are written using their
predefined general entity reference

Character   Entity
<           &lt;
>           &gt;
"           &quot;
'           &apos;
&           &amp;
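
In practice the replacement is usually done by a library function. A minimal sketch, assuming Python's standard xml.sax.saxutils.escape function (which handles &, < and > by default; the two quote characters are passed as extra entities) and xml.etree.ElementTree to read the text back. The sample sentence is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & Jerry: \"x < y\", 'y > x'"  # hypothetical sample text

# Replace the five reserved characters by their entity references.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# A parser turns the entity references back into the original characters.
restored = ET.fromstring("<note>" + escaped + "</note>").text
print(restored == raw)
```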

8. The structure of XML


We distinguish:
Well-formed document:
respects the grammar (syntax rules) of XML

Valid document:
is a well-formed document
that also respects the structural constraints defined in a DTD or XML
Schema


8.1- Well formed document


The XML header should be present
(optional in XML 1.0 but recommended; required in XML 1.1)
The XML header must be written correctly,
with its attributes in the fixed order: version, encoding, standalone

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!-- correct header -->

<?xml encoding="ISO-8859-1" version="1.0"?>
<!-- incorrect header: version must come first -->
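
The effect of the attribute order and of the header position can be checked with a parser. A sketch assuming Python's standard xml.etree.ElementTree module; header_accepted is a hypothetical helper and the empty course element only serves as a minimal document body.

```python
import xml.etree.ElementTree as ET


def header_accepted(document_bytes):
    """Return True if the parser accepts the document, header included."""
    try:
        ET.fromstring(document_bytes)
        return True
    except ET.ParseError:
        return False


correct = b'<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?><course/>'
wrong_order = b'<?xml encoding="ISO-8859-1" version="1.0"?><course/>'
not_first = b'\n<?xml version="1.0"?><course/>'  # header preceded by a blank line

for document in (correct, wrong_order, not_first):
    print(header_accepted(document))
```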



The document must have a single root tag enclosing all the others

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Well-formed structure whose root tag is course -->
<course>
   <time> 2 hours </time>
   <name> Informatics </name>
</course>

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Not well formed: there is no single root tag -->
<time> 2 hours </time>
<name> Informatics </name>
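
A parser reports the missing root as an error. The sketch below assumes Python's standard xml.etree.ElementTree module; root_tag is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def root_tag(xml_text):
    """Return the name of the root tag, or None if the text is not well formed."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError as err:
        print("not well formed:", err)
        return None


with_root = "<course><time> 2 hours </time><name> Informatics </name></course>"
without_root = "<time> 2 hours </time><name> Informatics </name>"

print(root_tag(with_root))
print(root_tag(without_root))
```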



Every opened tag must be closed; an empty tag can be closed in the start tag itself with />

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<ListeCourse>
   <course>
      <time> 2 hours                    <!-- incorrect: tag never closed -->
      <name> Informatics </name>        <!-- correct -->
      <credit></credit>                 <!-- empty tag: correct -->
      <content file="content.ppt"/>     <!-- empty tag: correct -->
   </course>
</ListeCourse>

Empty tag: it has no content, but it may carry attributes
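
The two ways of writing an empty tag, and the effect of a tag that is never closed, can be observed with a parser. A sketch assuming Python's standard xml.etree.ElementTree module, on a shortened version of the course example:

```python
import xml.etree.ElementTree as ET

course = ET.fromstring(
    '<course><credit></credit><content file="content.ppt"/></course>'
)
credit = course.find("credit")
content = course.find("content")

# An empty element has no text content.
print(credit.text)

# An empty element can still carry attributes.
print(content.attrib)

# When written back, the empty element uses the self-closing form.
print(ET.tostring(credit, encoding="unicode"))

# A tag that is opened but never closed is rejected by the parser.
try:
    ET.fromstring("<course><time> 2 hours </course>")
    unclosed_accepted = True
except ET.ParseError:
    unclosed_accepted = False
print(unclosed_accepted)
```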



Tags must not overlap: an element opened inside another must be closed inside it

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<ListeCourse>
   <course>
      <time> 2 hours
         <name>Informatics
      </time>
      </name>          <!-- incorrect: name overlaps time -->
   </course>           <!-- correct -->
</ListeCourse>         <!-- correct -->
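
A parser can also indicate where the overlap occurs. The sketch below assumes Python's standard xml.etree.ElementTree module, whose ParseError carries a (line, column) position; first_error_line is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def first_error_line(xml_text):
    """Return the line number of the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        line, _column = err.position
        return line


# <name> is opened inside <time> but closed after it: the tags overlap.
overlapping = "<course>\n<time>2 hours\n<name>Informatics\n</time>\n</name>\n</course>"

# Each element is closed before the next one is opened.
nested = "<course>\n<time>2 hours</time>\n<name>Informatics</name>\n</course>"

print(first_error_line(overlapping))
print(first_error_line(nested))
```

The error is reported on the line of the first end tag that does not match the most recently opened element.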



Every attribute value must be enclosed in quotation marks:

<?xml version="1.0"?>
<ListeCourse>
   <course teacher=EL BALAA>      <!-- incorrect: value not quoted -->
      <name> XML</name>
   </course>
   <course teacher="EL BALAA">    <!-- correct -->
      <name> PHP</name>
   </course>
</ListeCourse>
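
The quoting rule can be verified in the same way. A sketch assuming Python's standard xml.etree.ElementTree module; teacher_of is a hypothetical helper that returns the attribute value, or None when the document is rejected.

```python
import xml.etree.ElementTree as ET


def teacher_of(course_xml):
    """Return the teacher attribute, or None if the text is not well formed."""
    try:
        return ET.fromstring(course_xml).get("teacher")
    except ET.ParseError:
        return None


quoted = '<course teacher="EL BALAA"><name>PHP</name></course>'
unquoted = "<course teacher=EL BALAA><name>XML</name></course>"

print(teacher_of(quoted))
print(teacher_of(unquoted))
```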
