UNIT - IV
XML DATABASES:XML
XML Hierarchical data model, XML Documents, DTD, XML Schema, XML
Querying, XHTML, Illustrative Experiments.
• Traversing a tree structure is simple and fast because of its one-to-many
relationship format. Most major programming languages provide
functionality to read tree-structured databases.
• Easy to understand due to its one-to-many relationships.
• Key disadvantages of hierarchical databases are:
• Its rigid one-to-many relationship format. That means it does not allow
a child to have more than one parent.
• Multiple nodes with the same parent will add redundant data.
• Moving a record from one level to another can be challenging.
Disadvantages:
• This type of database cannot support complex relationships, and there is also
a problem of redundancy, which can result in producing inaccurate
information due to the inconsistent recording of data at various sites.
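The traversal advantage noted above can be illustrated with a minimal sketch in Python, using the standard xml.etree.ElementTree module. The company/department/employee hierarchy and the helper name walk are invented for illustration and do not come from these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-to-many hierarchy: one company, many departments,
# each department with many employees.
DOC = """
<company>
  <department name="Sales">
    <employee>Ann</employee>
    <employee>Bob</employee>
  </department>
  <department name="Research">
    <employee>Cid</employee>
  </department>
</company>
"""

def walk(node, depth=0):
    """Depth-first traversal: yield (depth, tag) for every node."""
    yield depth, node.tag
    for child in node:  # every child has exactly one parent
        yield from walk(child, depth + 1)

root = ET.fromstring(DOC)
visited = list(walk(root))
for depth, tag in visited:
    print("  " * depth + tag)
```

Because each node has exactly one parent, a simple recursive walk reaches every record exactly once. The same property is what makes it impossible to attach one employee to two departments without duplicating the record.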
XML
The name XML stands for eXtensible Markup Language. Markup is the metadata
that describes the data.
XML is a metalanguage that allows languages with user-defined tags to be defined. An XML
schema is the definition of such a tailored language.
Example:
<?xml version="1.0"?>
<greeting kind="succinct">Hello, World</greeting>
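As a small sketch of how a program reads this document, the Python snippet below parses the greeting with the standard xml.etree.ElementTree module and pulls out the tag name, the attribute and the text content. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The greeting document from the example, with straight quotes and a
# properly closed XML declaration.
GREETING = '<?xml version="1.0"?><greeting kind="succinct">Hello, World</greeting>'

root = ET.fromstring(GREETING)   # the root element of the document
tag = root.tag                   # element name
kind = root.get("kind")          # attribute value
text = root.text                 # character data inside the element
print(tag, kind, text)
```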
Document Prolog comes at the top of the document, before the root element. This
section contains :
• XML declaration
• Document type declaration
Document Elements are the building blocks of XML. These divide the document into
a hierarchy of sections, each serving a specific purpose. You can separate a document into
multiple sections so that they can be rendered differently, or used by a search engine. The
elements can be containers, with a combination of text and other elements.
XML elements can be defined as building blocks of an XML. Elements can behave as
containers to hold text, elements, attributes, media objects or all of these.
Each XML document contains one or more elements, the scope of each being
delimited either by start and end tags or, for empty elements, by an empty-element tag.
<element-name attribute1 attribute2>
....content
</element-name>
XML Database
The main purpose of a DTD (Document Type Definition) is to define the structure of an XML
document. It contains a list of legal elements and defines the structure with the help of
them. The XML specification provides:
• The basic rules for defining and using markup languages based on XML - how
to write markups, character data, comments etc.
• The rules for how a program is designed to process XML documents (XML
parser).
• The rules for defining DTD
The revised version of our PARTS relation is given below:
In the internal case, the DTD precedes the root element. It must be enclosed in the
delimiters [ ] inside the document type declaration. In the external case, the delimiters
do not appear; instead, the document type declaration carries a reference to the external file.
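The two cases can be sketched as follows. The DTD content, the element names and the file name parts.dtd are assumptions made for illustration, since the original PARTS listing is not reproduced in these notes. The snippet uses Python's standard xml.dom.minidom only to read back the document type declaration. It does not validate the document against the DTD.

```python
from xml.dom import minidom

# Internal case: the DTD sits inside the DOCTYPE declaration,
# between square brackets, ahead of the root element.
INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE PartsRelation [
  <!ELEMENT PartsRelation (PartTuple*)>
  <!ELEMENT PartTuple EMPTY>
  <!ATTLIST PartTuple PNUM ID #REQUIRED>
]>
<PartsRelation><PartTuple PNUM="P1"/></PartsRelation>"""

# External case: no brackets, only a reference to a (hypothetical) file.
EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE PartsRelation SYSTEM "parts.dtd">
<PartsRelation><PartTuple PNUM="P1"/></PartsRelation>"""

internal_doctype = minidom.parseString(INTERNAL).doctype
external_doctype = minidom.parseString(EXTERNAL).doctype
print(internal_doctype.name)      # name of the declared root element
print(external_doctype.systemId)  # the external file reference
```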
DTDs do not support certain kinds of integrity constraints (for example, on the legal values
of attributes). However, DTDs do support certain uniqueness and referential constraints:
attributes of type ID behave like primary keys, and attributes of type IDREF behave like
foreign keys. For example, PNUM is a required attribute because it is the primary key of the
PARTS relation, and each PNUM in the SHIPMENTS relation should match an entry in the PARTS
relation (referential integrity).
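The primary-key and foreign-key behaviour described above can be emulated in application code. The sketch below is not DTD validation. It is a hand-written check in Python over two small documents whose contents (P1, P2, P9, S1) are invented for illustration, and the helper name check_keys is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: PNUM plays the role of an ID in PARTS
# and of an IDREF in SHIPMENTS.
PARTS = ET.fromstring(
    '<PartsRelation>'
    '<PartTuple PNUM="P1"/><PartTuple PNUM="P2"/>'
    '</PartsRelation>'
)
SHIPMENTS = ET.fromstring(
    '<ShipmentsRelation>'
    '<ShipmentTuple SNUM="S1" PNUM="P1"/>'
    '<ShipmentTuple SNUM="S1" PNUM="P9"/>'
    '</ShipmentsRelation>'
)

def check_keys(parts, shipments):
    """Return (duplicate PNUMs, PNUM references with no matching part)."""
    seen, duplicates = set(), set()
    for part in parts.iter("PartTuple"):
        pnum = part.get("PNUM")
        if pnum in seen:          # uniqueness violation (ID-like)
            duplicates.add(pnum)
        seen.add(pnum)
    referenced = {s.get("PNUM") for s in shipments.iter("ShipmentTuple")}
    return duplicates, referenced - seen  # dangling references (IDREF-like)

duplicates, dangling = check_keys(PARTS, SHIPMENTS)
print(duplicates, dangling)
```

Here the shipment that refers to P9 violates referential integrity, because no part with that PNUM exists.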
Limitations of DTDs
DTD support for integrity constraints is very weak. The following are the other
disadvantages:
• They do not use XML syntax, so they cannot be processed as ordinary XML documents.
For example, !ELEMENT and #PCDATA are not legal XML element or attribute names.
551103 Advanced DBMS UNIT IV Notes by Prof. T. MEYYAPPAN, DCS, ALU
• An XML schema is used to describe and validate the structure and the content of XML data.
• XML schema defines the elements, attributes and data types.
• It is similar to a database schema that describes the data in a database.
• XML schema provides more extensive constraints than a DTD could.
• XML schema provides a set of built-in primitive types (string, boolean,
decimal) and derived data types (integer, positiveInteger, negativeInteger and
so on).
• Data types can be simple or complex.
</xsd:element>
<xsd:element ref="NOTE" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="CITY" type="City"/>
<xsd:attribute name="COLOR" type="Color" default="Red"/>
</xsd:complexType>
XQuery is derived from an earlier language, Quilt. Quilt was influenced by SQL, OQL,
XQL, XML-QL and Lorel.
• XQuery is read only.
• Updates to data must be done with DOM or by some proprietary
(vendor-specific) facilities.
• XML documents are essentially character strings that are meant to be
readable by humans.
Examples:
/PartsRelation/PartTuple returns the sequence of nodes corresponding to the PartTuple
elements. The / operator is equivalent to the . (dot) of relational calculus.
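A rough equivalent of this path expression can be sketched in Python with the limited XPath support of the standard xml.etree.ElementTree module. The sample document, including the part names Nut and Bolt, is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical PARTS document with one PartTuple element per tuple.
DOC = """<PartsRelation>
  <PartTuple><PNUM>P1</PNUM><PART_NAME>Nut</PART_NAME></PartTuple>
  <PartTuple><PNUM>P2</PNUM><PART_NAME>Bolt</PART_NAME></PartTuple>
</PartsRelation>"""

root = ET.fromstring(DOC)  # root is the PartsRelation element
# ElementTree paths are relative to the element they are called on, so
# /PartsRelation/PartTuple becomes "PartTuple" evaluated at the root.
part_tuples = root.findall("PartTuple")
names = [p.findtext("PART_NAME") for p in part_tuples]
print(names)
```

The result is a list of element nodes in document order, which corresponds to the node sequence that the path expression returns.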
XQuery Example
To get the supplier name, part name, and shipment quantity for every shipment of
parts from suppliers, the XQuery is formulated as shown below:
<Result>
{
  (: doc() is the XQuery 1.0 function; older drafts wrote document() :)
  for $spx in doc("ShipmentsRelation.xml")//ShipmentTuple
  for $sx in doc("SuppliersRelation.xml")//SupplierTuple[SNUM = $spx/SNUM]
  for $px in doc("PartsRelation.xml")//PartTuple[PNUM = $spx/PNUM]
  order by $sx/SUPPLIER_NAME, $px/PART_NAME
  return
    <ResultTuple>
      { $sx/SUPPLIER_NAME, $px/PART_NAME, $spx/QUANTITY }
    </ResultTuple>
}
</Result>
The // operator returns the tuples of the specified relation: starting from the root node of
the SHIPMENTS document, it selects the ShipmentTuple elements (children of the root node).
$spx, $sx and $px are range variables over the shipments, suppliers and parts relations.
Result of the query is converted into XML form for target consumers.
The above XQuery is equivalent to the relational calculus query shown below:
{SX.SUPPLIER_NAME, PX.PART_NAME, SPX.QUANTITY} WHERE SX.SNUM = SPX.SNUM AND PX.PNUM = SPX.PNUM
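The same three-way join can be sketched procedurally. The Python snippet below is an illustration under stated assumptions, not an XQuery implementation: the sample tuples (Smith, Nut, Bolt and the quantities) are invented, the helper name join is hypothetical, and the dictionary lookups assume that SNUM and PNUM are unique keys in their relations.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample relations, one element per tuple.
SHIPMENTS = ET.fromstring("""<ShipmentsRelation>
  <ShipmentTuple><SNUM>S1</SNUM><PNUM>P2</PNUM><QUANTITY>200</QUANTITY></ShipmentTuple>
  <ShipmentTuple><SNUM>S1</SNUM><PNUM>P1</PNUM><QUANTITY>300</QUANTITY></ShipmentTuple>
</ShipmentsRelation>""")
SUPPLIERS = ET.fromstring("""<SuppliersRelation>
  <SupplierTuple><SNUM>S1</SNUM><SUPPLIER_NAME>Smith</SUPPLIER_NAME></SupplierTuple>
</SuppliersRelation>""")
PARTS = ET.fromstring("""<PartsRelation>
  <PartTuple><PNUM>P1</PNUM><PART_NAME>Nut</PART_NAME></PartTuple>
  <PartTuple><PNUM>P2</PNUM><PART_NAME>Bolt</PART_NAME></PartTuple>
</PartsRelation>""")

def join(shipments, suppliers, parts):
    """Join shipments with suppliers and parts; return a <Result> element."""
    supplier_name = {s.findtext("SNUM"): s.findtext("SUPPLIER_NAME")
                     for s in suppliers.iter("SupplierTuple")}
    part_name = {p.findtext("PNUM"): p.findtext("PART_NAME")
                 for p in parts.iter("PartTuple")}
    rows = []
    for sp in shipments.iter("ShipmentTuple"):   # range variable spx
        snum, pnum = sp.findtext("SNUM"), sp.findtext("PNUM")
        if snum in supplier_name and pnum in part_name:  # join conditions
            rows.append((supplier_name[snum], part_name[pnum],
                         sp.findtext("QUANTITY")))
    rows.sort(key=lambda r: (r[0], r[1]))        # order by names
    result = ET.Element("Result")
    for supplier, part, quantity in rows:
        tup = ET.SubElement(result, "ResultTuple")
        ET.SubElement(tup, "SUPPLIER_NAME").text = supplier
        ET.SubElement(tup, "PART_NAME").text = part
        ET.SubElement(tup, "QUANTITY").text = quantity
    return result

result = join(SHIPMENTS, SUPPLIERS, PARTS)
print(ET.tostring(result, encoding="unicode"))
```

As in the XQuery version, the result is itself an XML fragment, so it can be passed on to a consumer without further conversion.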
desired format. In contrast, data can be retrieved from a relational database as the result of
some query and converted into XML form, so that it can be transmitted to some consumer.
The shredding approach does not involve any new data type. Instead, XML documents are shredded into
pieces and stored as values of various attributes of various relations in various places in the
database. In this case, the database does not contain XML documents. XML can be formed
by combining certain values in some ways.
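Shredding can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. The table name parts, its columns and the sample tuples are assumptions made for illustration. A real system would choose its own mapping from elements to relations.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical PARTS document to be shredded.
DOC = """<PartsRelation>
  <PartTuple><PNUM>P1</PNUM><PART_NAME>Nut</PART_NAME><COLOR>Red</COLOR></PartTuple>
  <PartTuple><PNUM>P2</PNUM><PART_NAME>Bolt</PART_NAME><COLOR>Green</COLOR></PartTuple>
</PartsRelation>"""

def shred(xml_text, conn):
    """Store each PartTuple as one row; the XML document itself is not kept."""
    conn.execute(
        "CREATE TABLE parts (pnum TEXT PRIMARY KEY, part_name TEXT, color TEXT)"
    )
    rows = [
        (t.findtext("PNUM"), t.findtext("PART_NAME"), t.findtext("COLOR"))
        for t in ET.fromstring(xml_text).iter("PartTuple")
    ]
    conn.executemany("INSERT INTO parts VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
count = shred(DOC, conn)
stored = conn.execute(
    "SELECT pnum, part_name, color FROM parts ORDER BY pnum"
).fetchall()
print(stored)
```

After shredding, only ordinary column values remain in the database, which is why no new data type is needed.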
Application programs can create XML documents from regular database data using
the results of a query. This operation is called publishing. It provides the ability to have
XML views of non-XML data. The shred and publish approach is sometimes referred to as XML
collection.
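Publishing is the reverse direction. The sketch below builds an XML document from the rows returned by a query, again using Python's standard sqlite3 and xml.etree.ElementTree modules. The table layout, the sample rows and the helper name publish are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational data with no XML stored anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (pnum TEXT PRIMARY KEY, part_name TEXT)")
conn.executemany("INSERT INTO parts VALUES (?, ?)",
                 [("P1", "Nut"), ("P2", "Bolt")])

def publish(conn):
    """Build an XML view of the parts table from a query result."""
    root = ET.Element("PartsRelation")
    query = "SELECT pnum, part_name FROM parts ORDER BY pnum"
    for pnum, part_name in conn.execute(query):
        tup = ET.SubElement(root, "PartTuple")
        ET.SubElement(tup, "PNUM").text = pnum
        ET.SubElement(tup, "PART_NAME").text = part_name
    return ET.tostring(root, encoding="unicode")

xml_text = publish(conn)
print(xml_text)
```

The XML exists only as the output of the function, which matches the idea of an XML view over non-XML data.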
XML Database
There are three different types of XML databases:
1. Native XML Database (NXD)
2. XML Enabled Database (XEDB)
3. Hybrid XML Database (HXD)
A native XML database (NXD) defines a (logical) model for an XML document and stores and
retrieves documents according to that model. Examples of such models are the XPath data
model and the XML Infoset. A native XML database has the following properties:
a) The model must include elements, attributes, PCDATA, and document order.
b) It must have an XML document as its fundamental unit of (logical) storage, just as a
relational database has a row in a table as its fundamental unit of (logical) storage.
c) It is not required to have any particular underlying physical storage model. It can be
built on a relational, hierarchical, or object-oriented database, or use a proprietary
storage format such as indexed, compressed files.
An XML-enabled database (XEDB) is a database that has an added XML mapping layer, provided
either by the database vendor or by a third party. This mapping layer manages the storage and
retrieval of XML data.
Data that is mapped into the database is mapped into application specific formats and the
original XML meta-data and structure may be lost. Data manipulation may occur via either
XML specific technologies (e.g. XPath, XSLT, DOM or SAX) or other database technologies
(e.g. SQL). The fundamental unit of storage in an XML Enabled Database is implementation
dependent.