Updated by
Dr Suthikshn Kumar
Suthikshn.kumar@pes.edu
Contents
Introduction
Syntax of XML
XML Document Structure
NameSpaces
XML Schemas
Displaying Raw XML documents
Displaying XML Documents with CSS
XSLT Style Sheets
XML Processors
Web Services
Summary
Intro to XML
The Extensible Markup Language (XML) is a general-purpose
markup language.
It is classified as an extensible language because it allows its users to
define their own tags.
Its primary purpose is to facilitate the sharing of structured data across
different information systems, particularly via the Internet.
It is used both to encode documents and serialize data.
In the latter context, it is comparable with other text-based serialization
languages such as JSON and YAML.
It started as a simplified subset of the Standard Generalized Markup
Language (SGML), and is designed to be relatively human-legible.
By adding semantic constraints, application languages can be
implemented in XML. These include XHTML, RSS, MathML, GraphML,
Scalable Vector Graphics, MusicXML, and thousands of others.
Moreover, XML is sometimes used as the specification language for
such application languages.
XML is recommended by the World Wide Web Consortium. It is a fee-
free open standard. The W3C recommendation specifies both the
lexical grammar, and the requirements for parsing.
Introduction
What is XML?
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data
• XML tags are not predefined. You must define your
own tags
• XML uses a Document Type Definition (DTD) or an
XML Schema to describe the data
• XML with a DTD or XML Schema is designed to be self-
descriptive
• XML is a W3C Recommendation
XML is a W3C Recommendation
• The Extensible Markup Language (XML) became a W3C
Recommendation on 10 February 1998.
The Main Difference Between
XML and HTML
XML was designed to carry data.
• XML is not a replacement for HTML.
XML and HTML were designed with different goals:
• XML was designed to describe data and to focus on what data is.
HTML was designed to display data and to focus on how data looks.
• HTML is about displaying information, while XML is about describing information.
XML Does not DO Anything
XML was not designed to DO anything.
• Maybe it is a little hard to understand, but XML does not DO anything. XML was
created to structure, store and to send information.
• The following example is a note to Tove from Jani, stored as XML:
• <note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
• The note has a header and a message body. It also has sender and receiver
information. But still, this XML document does not DO anything. It is just pure
information wrapped in XML tags. Someone must write a piece of software to
send, receive or display it.
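As a minimal sketch of the "piece of software" mentioned above, the following Python fragment reads the note and prints it as plain text. It assumes Python's standard xml.etree.ElementTree module; the function name format_note and the output layout are illustrative choices, not part of the original example.

```python
import xml.etree.ElementTree as ET

# The note from the slide, stored as a string.
NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def format_note(xml_text):
    """Return a plain-text rendering of a <note> document (hypothetical helper)."""
    note = ET.fromstring(xml_text)
    # findtext() returns the text of the first child element with that tag.
    return "To: {}\nFrom: {}\nSubject: {}\n\n{}".format(
        note.findtext("to"),
        note.findtext("from"),
        note.findtext("heading"),
        note.findtext("body"),
    )


print(format_note(NOTE_XML))
```

The XML itself stays passive; all behaviour (here, the display format) lives in the program that reads it.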
XML is Free and Extensible
XML tags are not predefined. You must "invent" your own tags.
• The tags used to mark up HTML documents and the structure of HTML
documents are predefined. The author of HTML documents can only use tags
that are defined in the HTML standard (like <p>, <h1>, etc.).
• XML allows the author to define his own tags and his own document structure.
• The tags in the example above (like <to> and <from>) are not defined in any
XML standard. These tags are "invented" by the author of the XML document.
XML is a Complement to
HTML
XML is not a replacement for HTML.
• It is important to understand that XML is not a replacement for HTML.
In future Web development it is most likely that XML will be used to
describe the data, while HTML will be used to format and display the
same data.
• My best description of XML is this: XML is a cross-platform, software
and hardware independent tool for transmitting information.
XML in Future Web Development
XML is going to be everywhere.
• We have been participating in XML development since its creation. It
has been amazing to see how quickly the XML standard has been
developed and how quickly a large number of software vendors have
adopted the standard.
• We strongly believe that XML will be as important to the future of the
Web as HTML has been to the foundation of the Web and that XML will
be the most common tool for all data manipulation and data transmission.
XML can Separate Data from
HTML
With XML, your data is stored outside your HTML.
• When HTML is used to display data, the data is stored inside your HTML. With
XML, data can be stored in separate XML files. This way you can concentrate on
using HTML for data layout and display, and be sure that changes in the
underlying data will not require any changes to your HTML.
• XML data can also be stored inside HTML pages as "Data Islands". You can still
concentrate on using HTML only for formatting and displaying the data.
XML is Used to Exchange Data
With XML, data can be exchanged between incompatible systems.
• In the real world, computer systems and databases contain data in incompatible
formats. One of the most time-consuming challenges for developers has been to
exchange data between such systems over the Internet.
• Converting the data to XML can greatly reduce this complexity and create data
that can be read by many different types of applications.
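To make the conversion step concrete, here is a small Python sketch that turns in-memory records into XML and reads them back. The record layout, the tag names customers and customer, and the helper records_to_xml are assumptions made for illustration; the sketch also assumes every dictionary key is a valid XML element name.

```python
import xml.etree.ElementTree as ET


def records_to_xml(root_tag, row_tag, rows):
    """Serialize a list of dicts to an XML string (hypothetical helper)."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for field, value in row.items():
            # Each dict key becomes a child element; values are stored as text.
            child = ET.SubElement(item, field)
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data standing in for rows from some internal system.
customers = [{"name": "Tove", "balance": 120.5}, {"name": "Jani", "balance": 80}]
xml_text = records_to_xml("customers", "customer", customers)
print(xml_text)

# A receiver only needs an XML parser, not knowledge of the sender's format.
names = [c.findtext("name") for c in ET.fromstring(xml_text).findall("customer")]
print(names)
```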
XML and B2B
With XML, financial information can be exchanged over the Internet.
• Expect to see a lot about XML and B2B (Business To Business) in the near
future.
• XML is going to be the main language for exchanging financial information
between businesses over the Internet. A lot of interesting B2B applications are
under development.
XML Can be Used to Share
Data
With XML, plain text files can be used to share data.
• Since XML data is stored in plain text format, XML provides a software- and hardware-
independent way of sharing data.
• This makes it much easier to create data that different applications can work with. It also
makes it easier to expand or upgrade a system to new operating systems, servers,
applications, and new browsers.
XML Can be Used to Store Data
With XML, plain text files can be used to store data.
• XML can also be used to store data in files or in databases. Applications can be written to
store and retrieve information from the store, and generic applications can be used to display
the data.
XML Can Make your Data More Useful
With XML, your data is available to more users.
• Since XML is independent of hardware, software and application, you can make your data
available to other than only standard HTML browsers.
• Other clients and applications can access your XML files as data sources, like they are
accessing databases. Your data can be made available to all kinds of "reading machines"
(agents), and it is easier to make your data available for blind people, or people with other
disabilities.
XML Can be Used to Create New Languages
XML is the mother of WAP and WML.
• The Wireless Markup Language (WML), used to markup Internet applications for handheld
devices like mobile phones, is written in XML.
If Developers Have Sense
If they DO have sense, all future applications will exchange their data in XML.
XML Syntax
As long as only well-formedness is required, XML is a generic framework for
storing any amount of text or any data whose structure can be represented as a
tree.
The only indispensable syntactical requirement is that the document has exactly
one root element (alternatively called the document element).
This means that the text must be enclosed between a root opening tag and a
corresponding closing tag. The following is a well-formed XML document:
<book>This is a book.... </book>
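The single-root rule can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree, which raises ParseError on input that is not well-formed; the helper name is_well_formed is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The document from the slide: one root element enclosing the text.
print(is_well_formed("<book>This is a book.... </book>"))
# Two top-level elements: violates the single-root requirement.
print(is_well_formed("<book>One</book><book>Two</book>"))
# Opening tag without its corresponding closing tag.
print(is_well_formed("<book>unclosed"))
```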
The root element can be preceded by an optional XML declaration. This
declaration states what version of XML is in use (normally 1.0); it may also contain
information about character encoding and external dependencies.
<?xml version="1.0" encoding="UTF-8"?>
The specification requires that XML processors support the pan-Unicode
character encodings UTF-8 and UTF-16 (UTF-32 is not mandatory). More
limited encodings, such as those based on ISO/IEC 8859, are
acknowledged by the specification and are widely used and supported.
Comments can be placed anywhere in the tree, including in the text if the
content of the element is text or #PCDATA:
<!-- This is a comment. -->
In any meaningful application, additional markup is used to structure the
contents of the XML document. The text enclosed by the root tags may contain
an arbitrary number of XML elements. The basic syntax for one element is:
<name attribute="value">content</name>
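The three parts of that basic syntax (element name, attribute, content) map directly onto what a parser reports. A short Python sketch, assuming the standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

# Parse the one-element example from the slide.
elem = ET.fromstring('<name attribute="value">content</name>')

element_name = elem.tag        # the name between the angle brackets
attributes = dict(elem.attrib)  # attribute names mapped to their values
content = elem.text            # the character data between the tags

print(element_name, attributes, content)
```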
Example: Recipe for making
bread
<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY c "Cessna">
<!ENTITY p "Piper">
<!ENTITY b "Beechcraft">
Planes.xml
<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.xml -->
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
<planes_for_sale>
  <ad>
    <year> 1977 </year>
    <make> &c; </make>
    <model> skyhawk </model>
    <color> Light Blue and White </color>
    <description> 685 hours, full IFR… </description>
    <price> 23,495 </price>
    <seller phone = "555-222-3333"> skyway Aircraft </seller>
    <location>
      <city> Rapid city </city>
      <state> South Dakota </state>
    </location>
  </ad>
</planes_for_sale>
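The entity reference &c; in the make element is replaced by the parser with the declared text. The Python sketch below illustrates this under one stated assumption: because the standard xml.etree.ElementTree parser does not fetch the external planes.dtd, the three entity declarations are moved into an internal subset of the DOCTYPE, and the ad is shortened to a few elements.

```python
import xml.etree.ElementTree as ET

# Shortened planes document; the entity declarations sit in an internal
# subset (an assumption for this sketch) instead of the external planes.dtd.
PLANES_XML = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE planes_for_sale [
  <!ENTITY c "Cessna">
  <!ENTITY p "Piper">
  <!ENTITY b "Beechcraft">
]>
<planes_for_sale>
  <ad>
    <year> 1977 </year>
    <make> &c; </make>
    <model> skyhawk </model>
    <seller phone="555-222-3333"> skyway Aircraft </seller>
  </ad>
</planes_for_sale>"""

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(PLANES_XML.encode("utf-8"))
ad = root.find("ad")

make = ad.findtext("make").strip()       # entity reference already expanded
phone = ad.find("seller").get("phone")   # attribute value of <seller>

print(make, phone)
```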
DTD
Document Type Definition (DTD), defined slightly differently within
the XML and SGML (the language XML was derived from)
specifications, is one of several SGML and XML schema languages,
and is also the term used to describe a document or portion thereof
that is authored in the DTD language.
A DTD is primarily used for the expression of a schema via a set of
declarations that conform to a particular markup syntax and that
describe a class, or type, of SGML or XML documents, in terms of
constraints on the structure of those documents.
As an expression of a schema, a DTD specifies, in effect, the syntax of
an "application" of SGML or XML, such as the derivative language
HTML or XHTML. This syntax is usually a less general form of the
syntax of SGML or XML.
In a DTD, the structure of a class of documents is described via
element and attribute-list declarations.
Element declarations name the allowable set of elements within the
document, and specify whether and how declared elements and runs
of character data may be contained within each element.
Attribute-list declarations name the allowable set of attributes for each
declared element, including the type of each attribute value, if not an
explicit set of valid value(s).
Associating DTDs with
documents
A DTD is associated with an XML document via a Document Type Declaration,
which is a tag that appears near the start of the XML document. The declaration
establishes that the document is an instance of the type defined by the
referenced DTD.
The declarations in a DTD are divided into an internal subset and an external
subset. The declarations in the internal subset are embedded in the Document
Type Declaration in the document itself. The declarations in the external subset
are located in a separate text file. The external subset may be referenced via a
public identifier and/or a system identifier. Programs for reading documents may
not be required to read the external subset.
Examples
Here is an example of a Document Type Declaration containing both public and
system identifiers:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Here is an example of a Document Type Declaration that encapsulates an
internal subset consisting of a single entity declaration:
<!DOCTYPE foo [ <!ENTITY greeting "hello"> ]>
<!DOCTYPE bar [ <!ENTITY greeting "hello"> ]>
An XML DTD example
An example of an XML file that uses and conforms to the
people_list DTD (given under "XML DTD Example") follows. It
assumes the DTD is identifiable by the relative URI reference "example.dtd":
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE people_list SYSTEM "example.dtd">
<people_list>
  <person>
    <name>Fred Bloggs</name>
    <birthdate>27/11/2008</birthdate>
    <gender>Male</gender>
  </person>
</people_list>
DTD problems:
Create a DTD for a catalog of cars where each car has the child
elements make, model, year, color, engine, number_of_doors,
transmission_type and accessories. The engine element has the child
elements number_of_cylinders and fuel_system (carbureted or fuel
injected). The accessories element has the attributes radio,
air_conditioning, power_windows, power_steering and power_brakes,
each of which is required and has the possible values yes and no. Entities
must be declared for the names of popular car makes.
Create an XML document with at least three instances of the car element
defined in the DTD above. Process this document using the DTD and
produce a display of the raw XML document.
Design an XML document to store information about patients in a
hospital. Information about patients must include name (in three parts),
social security number, age, room number, primary insurance company
(including member id number, group number, phone number, and
address), secondary insurance company (in the same sub-parts as for
the primary insurance company), known medical problems, and known
drug allergies. Both attributes and nested tags must be included. Make up
sample data for at least four patients.
Write a DTD for the document described above, with the following
restrictions: the name, social security number, age, room number, and
primary insurance company are required. All the other elements are
optional, as are middle names.
Valid XML Documents
XML documents do not carry information about how to display the data.
Since XML tags are "invented" by the author of the XML document, browsers do not know if a
tag like <table> describes an HTML table or a dining table.
Without any information about how to display the data, most browsers will just display the
XML document as it is.
Displaying XML documents
with CSS
A CSS file that holds style information for the elements of an XML
document can be developed.
The other way is to use XSLT style sheet technology.
XSLT provides far more control over the appearance of the
document's display.
XSLT is not supported by all browsers.
The form of a CSS style sheet for an XML document is simple.
It is just a list of element names, each followed by a brace-delimited
set of that element's CSS properties.
Planes.css
/* planes.css */
ad { display: block; margin-top: 15px; color: blue; }
year, make, model { color: red; font-size: 16pt; }
Using the style sheet in an XML document
<?xml-stylesheet type = "text/css" href = "planes.css"?>
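The xml-stylesheet line is a processing instruction placed before the root element. As a rough sketch, it can also be attached programmatically; the fragment below assumes Python's standard xml.dom.minidom module and uses an empty planes_for_sale element as a stand-in for the full document.

```python
from xml.dom import minidom

# Stand-in document: only the root element of planes.xml.
doc = minidom.parseString("<planes_for_sale/>")

# Build the processing instruction: target first, then its data.
pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'type="text/css" href="planes.css"'
)

# The instruction must come before the root element.
doc.insertBefore(pi, doc.documentElement)

xml_text = doc.toxml()
print(xml_text)
```

A browser that supports CSS for XML would then apply planes.css when it opens the resulting file.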
Displaying your XML Files
with CSS?
It is possible to use CSS to format an XML document.
Below is an example of how to use a CSS style sheet to format an XML document:
Below is a fraction of the XML file. The second line, <?xml-stylesheet type="text/css"
href="cd_catalog.css"?>, links the XML file to the CSS file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  . . . .
</CATALOG>
Note: Formatting XML with CSS is NOT the future of how to style XML documents. XML
documents should be styled by using the W3C's XSL standard!
Displaying XML with XSL
Microsoft's XML Parser
Microsoft's XML parser is a COM component that comes with Internet Explorer 5 and higher.
Once you have installed Internet Explorer, the parser is available to scripts.
Microsoft's XML parser supports all the necessary functions to traverse the node tree, access
the nodes and their attribute values, insert and delete nodes, and convert the node tree back
to XML.
To create an instance of Microsoft's XML parser, use the following code:
JavaScript:
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
VBScript:
set xmlDoc = CreateObject("Microsoft.XMLDOM")
ASP:
set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
The following code fragment loads an existing XML document ("note.xml") into
Microsoft's XML parser:
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;
xmlDoc.load("note.xml");
The first line of the script above creates an instance of the XML parser. The
second line turns off asynchronous loading, to make sure that the parser will not
continue execution of the script before the document is fully loaded. The third line
tells the parser to load an XML document called "note.xml".
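The four capabilities listed for the parser (traverse the node tree, access nodes, insert and delete nodes, convert the tree back to XML) are standard DOM operations. As an illustration only, the sketch below performs them with Python's xml.dom.minidom rather than Microsoft's parser; the note content is taken from the earlier example and the priority element is made up.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>"
)
root = doc.documentElement

# Traverse the node tree: collect the names of the child elements.
child_names = [n.nodeName for n in root.childNodes
               if n.nodeType == n.ELEMENT_NODE]

# Access a node's value: the text node inside <to>.
to_text = root.getElementsByTagName("to")[0].firstChild.data

# Insert a node: a hypothetical <priority> element appended to the note.
priority = doc.createElement("priority")
priority.appendChild(doc.createTextNode("high"))
root.appendChild(priority)

# Delete a node: remove <heading> from its parent.
heading = root.getElementsByTagName("heading")[0]
root.removeChild(heading)

# Convert the node tree back to XML text.
xml_text = root.toxml()
print(child_names, to_text)
print(xml_text)
```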
XML DTD Example
A very simple XML DTD to describe a list of persons is given below:
<!ELEMENT people_list (person*)>
<!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT birthdate (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT socialsecuritynumber (#PCDATA)>
Taking this line by line, it says:
people_list is a valid element name, and an instance of such an element
contains any number of person elements. The * denotes there can be 0 or more
person elements within the people_list element.
person is a valid element name, and an instance of such an element contains
one element named name, followed by one named birthdate (optional), then
gender (also optional) and socialsecuritynumber (also optional). The ? indicates
that an element is optional. The reference to the name element name has no ?,
so a person element must contain a name element.
name is a valid element name, and an instance of such an element contains
parsed character data (#PCDATA).
birthdate is a valid element name, and an instance of such an element contains
character data.
gender is a valid element name, and an instance of such an element contains
character data.
socialsecuritynumber is a valid element name, and an instance of such an
element contains character data.
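A content model such as (name, birthdate?, gender?, socialsecuritynumber?) behaves like a regular expression over the sequence of child element names. The Python sketch below hand-codes that idea for this one DTD. It is an approximation for illustration, not a validating parser: it does not read the DTD, and it ignores stray text inside people_list and person. The function name conforms is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# person (name, birthdate?, gender?, socialsecuritynumber?) written as a
# regular expression over comma-terminated child tag names.
PERSON_MODEL = re.compile(
    r"name,(birthdate,)?(gender,)?(socialsecuritynumber,)?"
)


def conforms(xml_text):
    """Check a document against the people_list content models (sketch)."""
    root = ET.fromstring(xml_text)
    if root.tag != "people_list":
        return False
    for person in root:
        # people_list (person*): every child must be a person element.
        if person.tag != "person":
            return False
        sequence = "".join(child.tag + "," for child in person)
        if not PERSON_MODEL.fullmatch(sequence):
            return False
        # (#PCDATA): leaf elements hold character data only, no child elements.
        if any(len(child) > 0 for child in person):
            return False
    return True


valid = ("<people_list><person><name>Fred Bloggs</name>"
         "<birthdate>27/11/2008</birthdate><gender>Male</gender>"
         "</person></people_list>")
no_name = "<people_list><person><gender>Male</gender></person></people_list>"

print(conforms(valid), conforms(no_name))
```

For real validation against a DTD, a validating parser should be used instead of a hand-written check like this one.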