Extensible Markup Language (XML) is used to describe data. The XML standard is a
flexible way to create information formats and electronically share structured data via
the public Internet, as well as via corporate networks.
XML, a formal recommendation from the World Wide Web Consortium (W3C), is
similar to Hypertext Markup Language (HTML). Both XML and HTML contain markup
symbols to describe page or file contents. HTML describes Web page content
(mainly text and graphic images) only in terms of how it is to be displayed and
interacted with. XML data, by contrast, is known as self-describing or self-defining, meaning that the
structure of the data is embedded with the data. When the data arrives, there is no
need to pre-build a structure to store it; the structure is dynamically understood from the
XML itself. XML is a simpler and easier-to-use subset of the Standard Generalized
Markup Language (SGML), the standard for defining document structure.
The basic building block of an XML document is an element, defined by tags. An
element has a beginning and an ending tag. All elements in an XML document are
contained in an outermost element known as the root element. XML can also
support nested elements, or elements within elements. This ability allows XML to
support hierarchical structures. Element names describe the content of the element,
and the structure describes the relationship between the elements. An XML document is
considered to be "well formed" (that is, able to be read and understood by an
XML parser) if its format complies with the XML specification, if it is properly marked up,
and if elements are properly nested.
For example:
<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>
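As a rough illustration of the root element and nested elements described above, the sketch below parses a small conversation document with Python's standard xml.etree.ElementTree module. The variable names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

# A small document: one root element containing two nested child elements.
DOCUMENT = """<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>"""

# fromstring() raises ET.ParseError if the text is not well formed.
root = ET.fromstring(DOCUMENT)

# The outermost element is the root; iterating over it yields its children.
child_names = [child.tag for child in root]
greeting_text = root.find("greeting").text

print(root.tag)       # conversation
print(child_names)    # ['greeting', 'response']
print(greeting_text)  # Hello, world!
```

Because XML names are case-sensitive, changing the start tag to Conversation while leaving the end tag as conversation would make the parser reject the document.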
XML Features
• Excellent for handling data with a complex structure or atypical data.
• Data described using markup language.
• Text data description.
• Human- and computer-friendly format.
• Handles data in a tree structure having one, and only one, root element.
• Excellent for long-term data storage and data reusability.
XML Components
The most basic components of an XML document are elements,
attributes, and comments.
XML Elements: Elements are the building blocks of an XML document. They can behave as
containers that hold text, other elements, attributes, media objects, or all of these.
Each XML document contains one or more elements, the scope of each being
delimited either by start and end tags or, for empty elements, by an empty-element tag.
Syntax
Following is the syntax to write an XML element –
<element-name attribute1 attribute2>
....content
</element-name>
Where,
• element-name is the name of the element. The name, including its case, must match in
the start and end tags.
• attribute1, attribute2 are attributes of the element, separated by white space.
An attribute defines a property of the element. It associates a name with a value, which
is a string of characters.
An attribute is written as −
name = "value"
The name is followed by an = sign and a string value inside double (" ") or single (' ') quotes.
Empty Element
An empty element (an element with no content) has the following syntax −
<name attribute1 attribute2.../>
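The element, attribute, and empty-element syntax above can be sketched with Python's standard xml.etree.ElementTree module. The element and attribute names used here (book, id, lang, cover, src) are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# An element with two attributes and text content.
book = ET.Element("book", {"id": "b1", "lang": "en"})
book.text = "XML Basics"

# An empty element: no text and no children, only an attribute.
cover = ET.Element("cover", {"src": "cover.png"})

# Serialize both elements back to XML text.
book_xml = ET.tostring(book, encoding="unicode")
cover_xml = ET.tostring(cover, encoding="unicode")

print(book_xml)   # <book id="b1" lang="en">XML Basics</book>
print(cover_xml)  # <cover src="cover.png" />
```

The serializer writes the empty element with an empty-element tag rather than a start and end tag pair; both forms are equivalent to an XML parser.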
XML Attributes: Attributes are part of XML elements. An element can have multiple
attributes, each with a unique name. Attributes give more information about XML elements.
Syntax
An XML attribute has the following syntax −
<element-name attribute1 attribute2>
....content
</element-name>
where attribute1 and attribute2 have the following form −
name = "value"
The value has to be in double (" ") or single (' ') quotes. Here, attribute1 and attribute2 are
unique attribute labels. Attributes are used to add a unique label to an element, place
the label in a category, add a Boolean flag, or otherwise associate it with some string of
data. The following example demonstrates the use of attributes −
<?xml version = "1.0" encoding = "UTF-8"?>
<!DOCTYPE garden [
<!ELEMENT garden (plants)*>
<!ELEMENT plants (#PCDATA)>
<!ATTLIST plants category CDATA #REQUIRED>
]>
<garden>
<plants category = "flowers" />
<plants category = "shrubs">
</plants>
</garden>
Attributes are used to distinguish among elements of the same name, when you do not
want to create a new element for every situation. Hence, the use of an attribute can add
a little more detail in differentiating two or more similar elements.
In the above example, we have categorized the plants by including the attribute category
and assigning a different value to each of the elements. Hence, we have two categories
of plants, one flowers and the other shrubs. Thus, we have two plants elements with different
attribute values.
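To show how an attribute distinguishes elements that share a name, the following sketch reads the category attribute of each plants element. It uses Python's standard xml.etree.ElementTree module on a simplified copy of the garden document (the DTD is left out, since this parser does not validate).

```python
import xml.etree.ElementTree as ET

GARDEN = """<garden>
  <plants category="flowers" />
  <plants category="shrubs"></plants>
</garden>"""

garden = ET.fromstring(GARDEN)

# Both children are named "plants"; only the attribute value tells them apart.
categories = [plant.get("category") for plant in garden.findall("plants")]

# Select only the elements whose category attribute equals "shrubs".
shrubs = garden.findall("plants[@category='shrubs']")

print(categories)   # ['flowers', 'shrubs']
print(len(shrubs))  # 1
```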
XML Document Structure
The XML Recommendation states that an XML document has both logical and physical
structure. Physically, it is composed of storage units called entities, each of which may
refer to other entities, similar to the way that #include works in the C language.
Logically, an XML document consists of declarations, elements, comments, character
references, and processing instructions, collectively known as the markup.
An XML document consists of three parts, in the order given:
1. An XML declaration (which is technically optional, but recommended in most
normal cases)
2. A document type declaration that refers to a DTD (which is optional, but required if
you want validation)
3. A body or document instance (which is required)
Collectively, the XML declaration and the document type declaration are called the XML
prolog.
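The three parts can be inspected programmatically. The sketch below is one possible illustration using Python's standard xml.dom.minidom module; the note document is hypothetical, and the attribute names version, standalone, doctype and documentElement are those exposed by minidom.

```python
from xml.dom import minidom

# 1. XML declaration, 2. document type declaration, 3. document instance.
DOCUMENT = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>Remember the milk</note>"""

doc = minidom.parseString(DOCUMENT)

# Values taken from the XML declaration.
declared_version = doc.version
declared_standalone = doc.standalone

# Name given in the document type declaration.
doctype_name = doc.doctype.name

# Root element of the document instance (the body).
root_name = doc.documentElement.tagName

print(declared_version, declared_standalone, doctype_name, root_name)
```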
XML Declaration
The XML declaration is a piece of markup (which may span multiple lines of a file) that
identifies this as an XML document. The declaration also indicates whether the
document can be validated by referring to an external Document Type Definition (DTD).
The minimal XML declaration is:
<?xml version="1.0" ?>
XML is case-sensitive (more about this in the next subsection), so it's important that you
use lowercase for xml and version. The quotes around the value of the version attribute
are required, as are the ? characters. At the time of this writing, "1.0" is the only
acceptable value for the version attribute, but this is certain to change when a
subsequent version of the XML specification appears.
NOTE
Do not include a space before the string xml or between the question mark and the
angle brackets. The strings <?xml and ?> must appear exactly as indicated. The space
before the ?> is optional. No blank lines or space may precede the XML declaration;
adding white space here can produce strange error messages.
In most cases, this XML declaration is present. If so, it must be the very first line of the
document and must not have leading white space. This declaration is technically
optional; cases where it may be omitted include when combining XML storage units to
create a larger, composite document.
Actually, the formal definition of an XML declaration, according to the XML 1.0
specification is as follows:
XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
This Extended Backus-Naur Form (EBNF) notation, characteristic of many W3C
specifications, means that an XML declaration consists of the literal sequence '<?xml',
followed by the required version information, followed by optional encoding and
standalone declarations, followed by an optional amount of white space, and
terminating with the literal sequence '?>'. In this notation, a question mark not contained
in quotes means that the term that precedes it is optional.
The following declaration means that there is an external DTD on which this document
depends. See the next subsection for the DTD that this negative standalone value
implies.
<?xml version="1.0" standalone="no" ?>
On the other hand, if your XML document has no associated DTD, the correct XML
declaration is:
<?xml version="1.0" standalone="yes" ?>
The XML 1.0 Recommendation states: "If there are external markup declarations but
there is no standalone document declaration, the value 'no' is assumed."
The optional encoding part of the declaration tells the XML processor (parser) how to
interpret the bytes based on a particular character set. The default encoding is UTF-8,
which is one of seven character-encoding schemes used by the Unicode standard, also
used as the default for Java. In UTF-8, one byte is used to represent each ASCII
character and two to four bytes are used for all other characters. UTF-8 is
an efficient form of Unicode for ASCII-based documents. In fact, UTF-8 is a superset of
ASCII.
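To make the byte counts concrete, the sketch below measures how many bytes UTF-8 needs for a few sample characters (chosen arbitrarily for this illustration). The lengths range from one byte for ASCII characters up to four bytes.

```python
# Sample characters, written as escapes so the source file stays pure ASCII.
samples = {
    "LATIN CAPITAL LETTER A": "A",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "GRINNING FACE": "\U0001F600",
}

# Number of bytes each character occupies once encoded as UTF-8.
byte_counts = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
for name, count in byte_counts.items():
    print(name, count)

# Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8,
# which is what "UTF-8 is a superset of ASCII" means in practice.
ascii_text = "<greeting>Hello</greeting>"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)  # True
```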
In an XML declaration, single and double quotes can be used
interchangeably, provided they are of matching kind around any particular attribute
value. (Although there is rarely a good reason to use double quotes for
version and single quotes for another attribute, you may need to do so if the attribute value
already contains the kind of quotes you prefer.) The lack of a blank space
between a final value such as 'no' and ?> is not a problem. Note, however, that unlike ordinary
element attributes, whose order does not matter, the parts of the declaration must follow the
order fixed by the grammar given above: version first, then encoding, then standalone.
An XML declaration is invalid if it makes either of the following mistakes.
The first is using uppercase: the attribute names version, encoding, and standalone must be
lowercase, as must "xml". The second is miscapitalizing the value of the standalone
attribute, which must be literally "yes" or "no", not "No". (Do I dare call this a "no No"?)
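The rules for the XML declaration can be checked against a real parser. The sketch below is an illustration only: the candidate declarations are written for this example, and declaration_is_accepted is a hypothetical helper that reports whether Python's standard xml.etree.ElementTree parser accepts a document starting with the given declaration.

```python
import xml.etree.ElementTree as ET

def declaration_is_accepted(declaration: str) -> bool:
    """Return True if the parser accepts a document that starts with this text."""
    try:
        ET.fromstring(declaration + "<root/>")
        return True
    except ET.ParseError:
        return False

# Candidate declarations written for this illustration.
candidates = [
    '<?xml version="1.0" standalone="no"?>',   # accepted
    "<?xml version=\"1.0\" standalone='no'?>",  # accepted: mixed quote styles
    '<?XML VERSION="1.0" STANDALONE="no"?>',   # rejected: uppercase names
    '<?xml version="1.0" standalone="No"?>',   # rejected: value must be yes or no
    '<?xml standalone="no" version="1.0"?>',   # rejected: version must come first
    ' <?xml version="1.0"?>',                  # rejected: leading white space
]

results = [declaration_is_accepted(text) for text in candidates]
for text, accepted in zip(candidates, results):
    print(repr(text), "accepted" if accepted else "rejected")
```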
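XML Comments
Comments can be used to include related links, information, and terms. They are visible
only in the source of the document; an XML processor does not treat them as part of the
document's content. Comments may appear anywhere in a document outside other markup,
but not before the XML declaration.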
Syntax
XML comment has the following syntax −
<!--Your comment-->
A comment starts with <!-- and ends with -->. You can add textual notes as comments
between these delimiters. You must not nest one comment inside another, and the
string -- must not appear within a comment.
Example
The following example demonstrates the use of comments in an XML document −
<?xml version = "1.0" encoding = "UTF-8" ?>
<!--Students grades are uploaded by months-->
<class_list>
<student>
<name>Tanmay</name>
<grade>A</grade>
</student>
</class_list>
Any text between <!-- and --> characters is considered as a comment.
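As a small illustration of how a parser treats comments, the sketch below parses a class list with Python's standard xml.etree.ElementTree module, which discards comments by default. The parses helper is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET

CLASS_LIST = """<!--Students grades are uploaded by months-->
<class_list>
  <student>
    <!-- a comment between elements -->
    <name>Tanmay</name>
    <grade>A</grade>
  </student>
</class_list>"""

root = ET.fromstring(CLASS_LIST)

# Comments do not show up as children of the student element.
student_children = [child.tag for child in root.find("student")]
print(student_children)  # ['name', 'grade']

def parses(text: str) -> bool:
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# A double hyphen inside a comment (as nesting would produce) is rejected.
double_hyphen_ok = parses("<a><!-- outer <!-- inner --> --></a>")
print(double_hyphen_ok)  # False
```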
Advantages of Schemas
• Define the characteristics and syntax of a set of documents.
• Independent groups can have a common format for interchanging XML documents.
• Software applications that process the XML documents know what to expect if the
documents adhere to a formal schema.
• XML documents can be validated to verify that they conform to a given schema.
• Validation can be used as a debugging tool, directing the designer to items in a
document that violate the schema.
• A schema can act as documentation for users defining or reading some set of XML
documents.
• A schema can increase the reliability, consistency, and accuracy of exchanged
documents.
Validation
Validation is the process of checking an XML document against a set of rules. An XML document is
said to be valid if its elements and attributes match the declarations in its associated
document type definition (DTD), and if the document complies with the constraints
expressed in it. An XML parser checks a document at two levels −
• Well-formed XML document
• Valid XML document
Well-formed XML Document
An XML document is said to be well-formed if it adheres to the following rules −
• Documents without a DTD may use only the predefined character entities
amp (&), apos (single quote), gt (>), lt (<), and quot (double quote); any other
entity must be declared.
• Tags must be properly ordered, i.e., an inner tag must be closed before
the outer tag is closed.
• Each opening tag must have a closing tag, or it must be a self-closing
tag (<title>....</title> or <title/>).
• An attribute name may appear only once in a start tag, and its value must be quoted.
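The rules above can be tried out with a non-validating parser. The following sketch uses Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper and the sample fragments are invented for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample fragment per rule, with the expected outcome in the comment.
examples = {
    "proper nesting": "<a><b>text</b></a>",            # well formed
    "improper nesting": "<a><b>text</a></b>",          # not well formed
    "missing end tag": "<a><b>text</a>",               # not well formed
    "empty-element tag": "<a><title/></a>",            # well formed
    "duplicate attribute": '<a id="1" id="2"/>',       # not well formed
    "unquoted attribute": "<a id=1/>",                 # not well formed
    "predefined entity": "<a>Tom &amp; Jerry</a>",     # well formed
    "undeclared entity": "<a>Tom &nbsp; Jerry</a>",    # not well formed
}

report = {label: is_well_formed(text) for label, text in examples.items()}
for label, ok in report.items():
    print(label, "->", "well formed" if ok else "not well formed")
```

Note that this only checks well-formedness. Checking validity against a DTD requires a validating parser, which the module used here is not.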
Example
Following is an example of a well-formed XML document −
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address
[
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
The above example is said to be well-formed as −
• It defines the type of document. Here, the document type is element type.
• It includes a root element named as address.
• Each of the child elements among name, company and phone is enclosed in its
self-explanatory tag.
• Order of the tags is maintained.
A well-formed document can then be read through the DOM. The following HTML page loads
such an address document from /xml/address.xml and displays its values −
<!DOCTYPE html>
<html>
<body>
<h1>TutorialsPoint DOM example </h1>
<div>
<b>Name:</b> <span id = "name"></span><br>
<b>Company:</b> <span id = "company"></span><br>
<b>Phone:</b> <span id = "phone"></span>
</div>
<script>
// Create the request object; variables are declared explicitly to avoid implicit globals.
var xmlhttp;
if (window.XMLHttpRequest)
{// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp = new XMLHttpRequest();
}
else
{// code for IE6, IE5
xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
// Fetch the XML file synchronously (third argument false) and get the parsed document.
xmlhttp.open("GET","/xml/address.xml",false);
xmlhttp.send();
var xmlDoc = xmlhttp.responseXML;
// Copy the text of the first name, company and phone elements into the matching spans.
document.getElementById("name").innerHTML=
xmlDoc.getElementsByTagName("name")[0].childNodes[0].nodeValue;
document.getElementById("company").innerHTML=
xmlDoc.getElementsByTagName("company")[0].childNodes[0].nodeValue;
document.getElementById("phone").innerHTML=
xmlDoc.getElementsByTagName("phone")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
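The same DOM calls can be tried outside a browser. The sketch below is an illustration using Python's standard xml.dom.minidom module; it parses the address document from a string instead of fetching /xml/address.xml, and first_text is a hypothetical helper.

```python
from xml.dom import minidom

ADDRESS_XML = """<address>
  <name>Tanmay Patil</name>
  <company>TutorialsPoint</company>
  <phone>(011) 123-4567</phone>
</address>"""

xml_doc = minidom.parseString(ADDRESS_XML)

def first_text(tag_name: str) -> str:
    """Return the text of the first element with the given tag name."""
    # Same navigation as the JavaScript: element list -> first element
    # -> first child (a text node) -> its value.
    return xml_doc.getElementsByTagName(tag_name)[0].childNodes[0].nodeValue

name = first_text("name")
company = first_text("company")
phone = first_text("phone")

print(name)     # Tanmay Patil
print(company)  # TutorialsPoint
print(phone)    # (011) 123-4567
```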
XSL
Before learning XSLT, we should first understand XSL, which stands for
Extensible Stylesheet Language. XSL is to XML what CSS is to HTML.
Need for XSL
In the case of an HTML document, tags such as table, div, and span are predefined; the
browser knows how to style them and display them using CSS styles. But in
the case of XML documents, tags are not predefined. In order to understand and style an
XML document, the World Wide Web Consortium (W3C) developed XSL, which acts as
an XML-based stylesheet language. An XSL document specifies how a browser should
render an XML document.
Following are the main parts of XSL −
• XSLT − used to transform an XML document into various other types of document.
• XPath − used to navigate an XML document.
• XSL-FO − used to format an XML document.
XSLT
XSLT, Extensible Stylesheet Language Transformations, provides the ability to
transform XML data from one format to another automatically.
How XSLT Works
An XSLT stylesheet defines the transformation rules to be applied to the
source XML document. The stylesheet is itself written in XML. An XSLT processor
takes the stylesheet, applies the transformation rules to the source XML
document, and generates a result document in XML, HTML, or
text format. This result document is then used by the XSLT formatter to generate the
actual output that is displayed to the end user.
Advantages
Here are the advantages of using XSLT −
• Independent of programming. Transformations are written in a separate XSL file
which is again an XML document.
• Output can be altered by simply modifying the transformations in XSL file. No
need to change any code. So Web designers can edit the style sheet and can
see the change in the output quickly.
XSLT Example
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
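Python's standard library has no XSLT processor, so the sketch below does not run the stylesheet. Instead it mimics, in plain Python with xml.etree.ElementTree, what the for-each and value-of instructions do: one table row per cd element. The catalog document is hypothetical, built to match the catalog/cd/title/artist structure that the select expressions assume.

```python
import xml.etree.ElementTree as ET

# Hypothetical input with the structure the stylesheet expects.
CATALOG = """<catalog>
  <cd><title>Album One</title><artist>Artist A</artist></cd>
  <cd><title>Album Two</title><artist>Artist B</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)

# Static part of the template: the table and its header row.
table = ET.Element("table", {"border": "1"})
header = ET.SubElement(table, "tr")
ET.SubElement(header, "th").text = "Title"
ET.SubElement(header, "th").text = "Artist"

# Counterpart of <xsl:for-each select="catalog/cd">: one row per cd element.
for cd in catalog.findall("cd"):
    row = ET.SubElement(table, "tr")
    # Counterpart of <xsl:value-of select="title"/> and select="artist".
    ET.SubElement(row, "td").text = cd.findtext("title")
    ET.SubElement(row, "td").text = cd.findtext("artist")

html_table = ET.tostring(table, encoding="unicode")
print(html_table)
```

With a real XSLT processor the same mapping is expressed declaratively in the stylesheet, so the output can be changed without touching program code.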