Beruflich Dokumente
Kultur Dokumente
1) What is XML?
Extensible Markup Language (XML) is used to describe data. The XML standard is a flexible way to create
information formats and electronically share structured data via the public Internet, as well as via corporate
networks.
Short for Extensible Markup Language, a specification developed by the W3C. XML is a pared-down
version of SGML, designed especially for Web documents. It allows designers to create their own
customized tags, enabling the definition, transmission, validation, and interpretation of data between
applications and between organizations.
XML BASICS
XML, or eXtensible markup language, is all about creating a universal way for both formatting
and presenting data. Once data is coded or marked up with XML tags, data can then be used in
many different ways.
Main features of XML:
XML files are text files, which can be managed by any text editor.
XML is very simple, because it has less than 10 syntax rules.
XML provides a basic syntax that can be used to share information between different
kinds of computers, different applications, and different organizations. XML data is
stored in plain text format.
With XML, your data can be available to all kinds of "reading machines" (Handheld
computers, voice machines, news feeds, etc), and make it more available for blind people,
or people with other disabilities.
Databases can trade tables, business applications can trade updates, and document
systems can share information.
It supports Unicode, allowing almost any information in any written human language to
be communicated.
Its self-documenting format describes structure and field names as well as specific
values.
Content-based XML markup enhances searchability, making it possible for agents and
search engines to categorize data instead of wasting processing power on context-based
full-text searches.
XML is heavily used as a format for document storage and processing, both online and
offline.
THETOPPERSWAY.COM
9.HTML is about displaying data, hence static but XML is about carrying information, hence
dynamic.
Thus,it can be said that HTML and XML are not competitors but rather complement to each
other and clearly serving altogether different purposes.
The following is the same document as above but with an added reference to a DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
3. Introduction to DTD
The purpose of a DTD is to define the legal building blocks of an XML document. It defines the
document structure with a list of legal elements. A DTD can be declared inline in your XML
document, or as an external reference.
Internal DTD
This is an XML document with a Document Type Definition
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to
(#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
!ELEMENT note (in line 2) defines the element "note" as having four elements:
"to,from,heading,body".
!ELEMENT to (in line 3) defines the "to" element to be of the type "CDATA".
!ELEMENT from (in line 4) defines the "from" element to be of the type "CDATA"
and so on.....
External DTD
This is the same XML document with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
This is a copy of the file "note.dtd" containing the Document Type Definition:
<?xml version="1.0"?>
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
A lot of forums are emerging to define standard DTDs for almost everything in the areas of data
exchange.
4. XML Schema
A schema formally describes what a given XML document contains, in the same way a database schema
describes the data that can be contained in a database (table structure, data types). An XML schema
describes the coarse shape of the XML document, what fields an element can contain, which sub elements
it can contain, and so forth. It also can describe the values that can be placed into any element or attribute.
An XML Schema is a language for expressing constraints about XML documents. There are several
different schema languages in widespread use, but the main ones are Document Type Definitions
(DTDs), Relax-NG, Schematron and W3C XSD (XML Schema Definitions).
iii)
iv)
v)
vi)
vii)
viii)
DTD
DTD stands for Document Type
Definition.
DTDs are derived from SGML syntax.
XSD
XSD stands for XML Schema Definition.
XSDs are written in XML.
The name of the element is "img". The name of the attribute is "src". The value of the attribute is
"computer.gif". Since the element itself is empty it is closed by a " /".
PCDATA
PCDATA stands for Parsed Character data. PCDATA is the text that will be parsed by a parser.
Tags inside the PCDATA will be treated as markup and entities will be expanded.
CDATA
CDATA: (Unparsed Character data): CDATA contains the text which is not parsed further in an
XML document. Tags inside the CDATA text are not treated as markup and entities will not be
expanded.
Entities
Entities as variables used to define common text. Entity references are references to entities.
Most of you will known the HTML entity reference: " " that is used to insert an extra
space in an HTML document. Entities are expanded when a document is parsed by an XML
parser.
The following entities are predefined in XML:
:
Entity References
Character
<
<
>
>
&
&
"
"
'
'
Wrapping
If the DTD is to be included in your XML source file, it should be wrapped in a DOCTYPE
definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
example:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to
(#CDATA)>
<!ELEMENT from (#CDATA)>
<!ELEMENT heading (#CDATA)>
<!ELEMENT body (#CDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
6. XMl Parsers
Parsing XML refers to going through XML document to access data or to modify data in one or
other way. Parser has the job of reading the XML, checking it for errors, and passing it on to the
intended application. If no DTD or schema is provided, the parser simply checks that the XML is
well-formed. If a DTD is provided then the parser also determines whether the XML is valid, i.e.
that the tags, attributes, and content meet the specifications found in the DTD, before passing it on
to the application.
Simple API for XML (SAX) parsing is different from DOM as it parses the XML files step by
step and is event based model. The SAX parser triggers an event when they encounter an
opening tag, element or attribute.
Unlike in DOM parser it is advisable to use the SAX parser for parsing large XML documents as
it does not load the complete XML file in the memory. This parser parses node by node so it can
read large XML files in smaller parts.
Difference between SAX and DOM parsers.
DOM
Tree model parser (Tree of nodes)
DOM loads the file into the memory and
then parse the file
SAX
Event based parser (Sequence of events)
SAX parses the file at it reads i.e. Parses
node by node
Thats all on difference between SAX and DOM parsers in Java, now its up to you on which
XML parser you going to choose. DOM parser is recommended over SAX parser if XML file is
small enough and go with SAX parser if you dont know size of xml files to be processed or they
are large.
What are the usual application for a DOM parser and for a SAX parser?
In the following cases, using SAX parser is advantageous than using DOM parser.
The input document is too big for available memory
You can process the document in small contiguous chunks of input. You do not need the
entire document before you can do useful work
You just want to use the parser to extract the information of interest, and all your
computation will be completely based on the data structures created by yourself.
In the following cases, using DOM parser is advantageous than using SAX parser.
Your application needs to access widely separately parts of the document at the same
time.
Your application may probably use an internal data structure which is almost as
complicated as the document itself.
Your application has to modify the document repeatedly.
Your application has to store the document for a significant amount of time through many
method calls.
Sample document for the example
<?xml version="1.0"?>
<!DOCTYPE shapes [
<!ELEMENT shapes (circle)*>
<!ELEMENT circle (x,y,radius)>
<!ELEMENT x (#PCDATA)>
<!ELEMENT y (#PCDATA)>
<!ELEMENT radius (#PCDATA)>
<!ATTLIST circle color CDATA #IMPLIED>
]>
<shapes>
<circle color="BLUE">
<x>20</x>
<y>20</y>
<radius>20</radius>
</circle>
<circle color="RED" >
<x>40</x>
<y>40</y>
<radius>20</radius>
</circle>
</shapes>
<MANUFACTURER>Creative Labs</MANUFACTURER>
<MODEL>Sound Blaster Live</MODEL>
<COST> 80.00</COST>
</PART>
<PART>
<ITEM inch Monitor</ITEM>
<MANUFACTURER>LG Electronics</MANUFACTURER>
<MODEL> 995E</MODEL>
<COST> 290.00</COST>
</PART>
</PARTS>
The root element is PARTS. The root element may contain no TITLE element or one TITLE
element along with any number of PART elements. The PART element must contain one each of
items ITEM, MANUFACTURER, MODEL, and COST in order. The PART element may
contain an attribute called "type" which may have a value of "computer", "auto", or "airplane".
The elements TITLE, ITEM, MANUFACTURER, MODEL, and COST all contain PCDATA
which is parsed character data.
B. Example(Program) on Schema
This example attempts to validate an XML data file (books.xml) against an XML schema definition file
(books.xsd). According to books.xsd, each <book> element must have a <pub_date> child element. The
second <book> element in books.xml does not have this required element. Therefore, when we attempt to
validate the XML file , we should get a validation error.
XML file(books.xml)
<?xml version="1.0"?>
<x:books xmlns:x="urn:books">
<book id="bk001">
<author>Writer</author>
<title>The First Book</title>
<genre>Fiction</genre>
<price>44.95</price>
<pub_date>2000-10-01</pub_date>
<review>An amazing story of nothing.</review>
</book>
<book id="bk002">
<author>Poet</author>
<title>The Poet's First Poem</title>
<genre>Poem</genre>
<price>24.95</price>
<review>Least poetic poems.</review>
</book>
</x:books>
XSD file(books.xsd)
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="urn:books"
xmlns:bks="urn:books">
<xsd:element name="books" type="bks:BooksForm"/>
<xsd:complexType name="BooksForm">
<xsd:sequence>
<xsd:element name="book"
type="bks:BookForm"
minOccurs="0"
maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="BookForm">
<xsd:sequence>
<xsd:element name="author" type="xsd:string"/>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="genre" type="xsd:string"/>
<xsd:element name="price" type="xsd:float" />
<xsd:element name="pub_date" type="xsd:date" />
<xsd:element name="review" type="xsd:string"/>
</xsd:sequence>
<xsd:attribute name="id" type="xsd:string"/>
</xsd:complexType>
</xsd:schema>
**********************************************************
Dynamic HTML
Dynamic HTML is a collective term for a combination of Hypertext Markup Language (HTML) tags and
options that can make Web pages more animated and interactive than previous versions of HTML. Much
of dynamic HTML is specified in HTML 4.0. Simple examples of dynamic HTML capabilities include
having the color of a text heading change when a user passes a mouse over it and allowing a user to "drag
and drop" an image to another place on a Web page. Dynamic HTML can allow Web documents to look
and act like desktop applications or multimedia productions.
When capitalized, Dynamic HTML refers to new HTML extensions that will enable a Web page to react
to user input without sending requests to the Web server.
i)
ii)
iii)
With DHTML, we can easily add effects to our pages that previously were difficult to achieve.
For example, we can:
Hide content until a given time elapses or the user interacts with the page.
Animate text and images in our document, independently moving each element from any
starting point to any ending point, following a predetermined path or one chosen by the
user.
Embed a ticker that automatically refreshes its content with the latest news, stock quotes,
or other data.
Use a form to capture user input, and then instantly process and respond to that data.
In short, DHTML eliminates the shortcomings of static pages. We can create innovative Web sites, on the
Internet or on an intranet, without having to sacrifice performance for interactivity. Not only does
DHTML enhance the user's perception of your documents, it also improves server performance by
reducing requests to the server.