Introduction
In real-world applications, designing a data
sharing and integration system is a
challenging task. Current systems are used
only to integrate and query structured data.
To solve this problem, a system is required
for accessing, querying and sharing
unstructured file-based data.
Introduction to Heterogeneous Database
A distributed database has to be
constructed by linking multiple already-existing
database systems together, each
with its own schema and possibly running
different database management software
such as SQL Server, Oracle or MS Access.
Such systems are called heterogeneous
distributed database systems. In a
heterogeneous distributed database
system, sites may run different DBMS
products.
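The idea of linking existing databases that each keep their own schema can be sketched as follows. This is a minimal illustration, not the system described in these slides: two in-memory SQLite databases stand in for sites running different DBMS products, and the table names, column names and the `WRAPPERS` list are all hypothetical. Each wrapper query maps a local schema to a shared (customer, amount) view, and the mediator aggregates across sites.

```python
import sqlite3

# Two hypothetical sites with different local schemas. In-memory SQLite
# stands in for the different DBMS products a real federation would link.
site_a = sqlite3.connect(":memory:")
site_a.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
site_a.executemany("INSERT INTO orders VALUES (?, ?)",
                   [("alice", 10.0), ("bob", 5.0)])

site_b = sqlite3.connect(":memory:")
site_b.execute("CREATE TABLE sales (cust_name TEXT, total_cents INTEGER)")
site_b.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("alice", 250), ("carol", 700)])

# One wrapper query per site maps the local schema to a shared
# (customer, amount) view; site B stores cents, so it is converted here.
WRAPPERS = [
    (site_a, "SELECT customer, amount FROM orders"),
    (site_b, "SELECT cust_name, total_cents / 100.0 FROM sales"),
]


def total_per_customer():
    """Aggregate amounts per customer across all sites."""
    totals = {}
    for conn, query in WRAPPERS:
        for customer, amount in conn.execute(query):
            totals[customer] = totals.get(customer, 0.0) + amount
    return totals
```

Adding another site under this sketch only means appending one more connection and wrapper query; the local databases themselves are left unchanged.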
Architecture of Heterogeneous Distributed Database System
Data spaces
XML:
Has a structure, but it may differ
from one instance to another.
Word & Txt file:
May not have any structure at all, or
may be structured by headings and paragraphs.
A data space may also contain well-structured
databases and files, but it has no single
well-defined structure or schema.
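The difference between these source types can be sketched in code. The example below is an illustration under assumptions, not part of the original slides: `xml_to_record` and `text_to_record` are hypothetical helpers that turn an XML instance (whose tags may vary from one instance to another) and a plain text file (structured only by a heading and paragraphs) into simple dictionaries.

```python
import xml.etree.ElementTree as ET


def xml_to_record(xml_text):
    """Flatten one XML instance into a dict of tag -> text for leaf elements.

    No fixed schema is assumed, so two instances may yield different keys.
    """
    root = ET.fromstring(xml_text)
    record = {}
    for elem in root.iter():
        # A leaf element has no children; keep it only if it carries text.
        if len(elem) == 0 and elem.text and elem.text.strip():
            record[elem.tag] = elem.text.strip()
    return record


def text_to_record(text):
    """Treat the first non-empty line as a heading and the rest as body."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return {}
    return {"heading": lines[0], "body": " ".join(lines[1:])}
```

Because the resulting records do not share a fixed set of keys, any layer built on top of them has to tolerate missing fields, which is the situation the data space description refers to.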
Objectives
To compute aggregates over transactions
from many different local databases.
Unstructured data is usually managed by
the operating system, so the aim is to
manage unstructured data with any
relational database.
To retrieve and share unstructured data in
a heterogeneous database.
To establish a technique that integrates
and analyzes unstructured data
collected in different formats.
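One way to read the objective of managing unstructured data with a relational database is sketched below. It is a minimal example under assumptions: the `documents` table, its columns and the helper functions are hypothetical, and SQLite is used only because it ships with Python. File contents are stored as text rows so that they can be retrieved with ordinary SQL.

```python
import sqlite3


def make_store():
    """Create an in-memory relational store for unstructured file contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, name TEXT, format TEXT, body TEXT)"
    )
    return conn


def add_document(conn, name, body):
    """Store one file; the format is taken from the file extension."""
    fmt = name.rsplit(".", 1)[-1].lower() if "." in name else "unknown"
    conn.execute(
        "INSERT INTO documents (name, format, body) VALUES (?, ?, ?)",
        (name, fmt, body),
    )


def search(conn, keyword):
    """Return the names of documents whose body contains the keyword."""
    rows = conn.execute(
        "SELECT name FROM documents WHERE body LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [name for (name,) in rows]
```

A keyword search with `LIKE` is the simplest possible retrieval; a full-text index would be the natural next step for larger collections.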
Example of Unstructured Data
Federated DBMS Architecture
Known Issues
Security is an issue for a federated
database. Users cleared at different
security levels are expected to access
and share a database consisting of data
at different sensitivity levels.
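The security issue can be made concrete with a small sketch. It assumes a simple ordered set of levels and a "no read up" rule, neither of which is specified in the slides; the `LEVELS` mapping and the sample `RECORDS` are hypothetical.

```python
# Hypothetical ordered security levels; a higher number is more sensitive.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}

# Hypothetical shared data, each row labeled with its own sensitivity level.
RECORDS = [
    {"id": 1, "text": "cafeteria menu", "sensitivity": "public"},
    {"id": 2, "text": "salary table", "sensitivity": "confidential"},
    {"id": 3, "text": "merger plan", "sensitivity": "secret"},
]


def visible_records(clearance, records=RECORDS):
    """Return only the rows at or below the user's clearance (no read up)."""
    limit = LEVELS[clearance]
    return [r for r in records if LEVELS[r["sensitivity"]] <= limit]
```

In a federated database the same filter would have to be enforced consistently at every participating site, which is where the difficulty lies.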
MAGIC Solution
Built on a Bayesian framework, MAGIC
(Multisource Association of Genes by
Integration of Clusters) has a distributed
design that promotes flexibility for adding
new input methods and datasets.
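How a Bayesian framework makes it easy to add new inputs can be illustrated with a simplified sketch. This is not MAGIC's actual model, whose network structure and probabilities are not given in these slides. It assumes instead that each input method contributes an independent likelihood ratio for the hypothesis that two genes are related (a naive Bayes simplification), and the function name and numbers are hypothetical.

```python
def posterior(prior, likelihood_ratios):
    """Combine a prior probability with independent evidence sources.

    Each likelihood ratio is P(evidence | related) / P(evidence | unrelated).
    Independence between sources is an assumption of this sketch.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)
```

Under this simplification, adding a new input method or dataset only appends one more likelihood ratio to the list, without changing how the existing sources are combined.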
Known Issues
In a MAGIC DBMS, as in a distributed DBMS,
access to replicated data has to be
controlled in multiple locations.
MAGIC works either on microarray data or
only on non-expression data.
Thank you