
Unstructured based Heterogeneous Database Integration System Design and Implementation

Prepared by: Vimal Vaiwala (M.Sc. (I.T.))
Assistant Professor

Introduction
In real-world applications, designing a data sharing
and integration system is a challenging task. The
current system can only integrate and query
structured data. To solve this problem, a system is
required for accessing, querying and sharing
unstructured file-based data.

Introduction of Heterogeneous Database

A distributed database has to be constructed by
linking multiple already-existing database systems
together, each with its own schema and possibly
running different database management software
such as SQL Server, Oracle or MS Access. Such
systems are called heterogeneous distributed
database systems. In a heterogeneous distributed
database system, sites may run different DBMS
products.
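
The idea of linking existing databases that each keep their own schema can be illustrated with a minimal sketch. It assumes two in-memory SQLite databases standing in for the different DBMS products; the table names, column names and the `global_customers` helper are hypothetical and are not taken from the slides.

```python
import sqlite3

# Two hypothetical local sites, each with its own schema.
site_a = sqlite3.connect(":memory:")
site_a.execute("CREATE TABLE customers (id INTEGER, full_name TEXT)")
site_a.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Asha"), (2, "Ravi")])

site_b = sqlite3.connect(":memory:")
site_b.execute("CREATE TABLE clients (client_no INTEGER, name TEXT)")
site_b.executemany("INSERT INTO clients VALUES (?, ?)", [(10, "Meera")])

# Each entry pairs a site with the local query that maps its own schema
# onto one assumed global schema: (id, name).
SITES = [
    (site_a, "SELECT id, full_name FROM customers"),
    (site_b, "SELECT client_no, name FROM clients"),
]

def global_customers():
    """Run the local query at every site and merge the rows."""
    rows = []
    for conn, local_query in SITES:
        rows.extend(conn.execute(local_query).fetchall())
    return sorted(rows)

print(global_customers())
```

In a real deployment each site would be reached through its own driver (for SQL Server, Oracle and so on); only the per-site mapping query would change.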

Architecture of Heterogeneous Distributed Database System

Ref: Paper by Zhan Liu, Anne Le Calve, Fabian Cretton, Nicole Glassey,
Institute of Business Information System
Data spaces

Data spaces are collections of heterogeneous and
partially unstructured data. Heterogeneous is an
adjective used to describe an object or system
consisting of multiple items having a large number
of structural variations.

XML:
Has a structure, but it may differ from one
instance to another.
Word & Txt file:
May not have any structure at all. May be
structured by headings and paragraphs.
May also have well-structured databases and files,
but no one well-defined structure or schema.
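
The point that XML has a structure which may differ from one instance to another can be seen in a small sketch. The two sample documents and the `to_record` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML instances of the same kind of record.
# Both are structured, but they do not share the same fields.
DOCS = [
    "<person><name>Asha</name><email>asha@example.org</email></person>",
    "<person><name>Ravi</name><phone>555-0100</phone><city>Surat</city></person>",
]

def to_record(xml_text):
    """Flatten the direct children of the root element into a dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

records = [to_record(doc) for doc in DOCS]

# The union of fields is only known after reading every instance;
# there is no single schema declared up front.
all_fields = sorted({field for record in records for field in record})
print(all_fields)
```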

Objectives
To compute aggregations over transactions from
many different local databases.
Unstructured data is usually managed by the
operating system, so the aim is to manage
unstructured data with any relational database.
To retrieve and share unstructured data in a
heterogeneous database.
To establish a technique that integrates and
analyzes unstructured data collected in different
formats.
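
As a rough sketch of these objectives, the following stores file contents in a relational table so they can be searched and aggregated with ordinary SQL. It assumes SQLite as the relational database and uses in-memory strings in place of real files; the `documents` table and all function names are hypothetical.

```python
import sqlite3

def make_store():
    """Create an in-memory relational store for file-based data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (name TEXT PRIMARY KEY, format TEXT, content TEXT)"
    )
    return conn

def add_document(conn, name, content):
    """Record a file's name, its format (from the extension) and its text."""
    fmt = name.rsplit(".", 1)[-1].lower() if "." in name else "unknown"
    conn.execute("INSERT INTO documents VALUES (?, ?, ?)", (name, fmt, content))

def search(conn, keyword):
    """Return names of documents whose content contains the keyword."""
    cur = conn.execute(
        "SELECT name FROM documents WHERE content LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [row[0] for row in cur]

def count_by_format(conn):
    """Aggregate: number of stored documents per format."""
    return dict(conn.execute("SELECT format, COUNT(*) FROM documents GROUP BY format"))

store = make_store()
add_document(store, "notes.txt", "Quarterly sales report for Surat")
add_document(store, "report.xml", "<report><title>Sales</title></report>")
add_document(store, "memo.txt", "Meeting moved to Friday")

print(search(store, "sales"))
print(count_by_format(store))
```

SQLite's `LIKE` is case-insensitive for ASCII text by default, which is why the keyword matches both "sales" and "Sales" here.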

Example of Unstructured Data

Ref: Paper by David A. Maluf

Federated DBMS Architecture

Known Issues
Security is an issue for a federated database.
In such database systems, users cleared at
different security levels are expected to access
and share a database consisting of data at
different sensitivity levels.
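
One common way to handle users with different clearances is to let a user read only data at or below their own level. The sketch below assumes that rule; the level names, the sample records and the `visible_records` helper are hypothetical and are not described in the slides.

```python
# Assumed ordering of sensitivity levels, lowest to highest.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}

# Hypothetical shared records, each labelled with a sensitivity level.
RECORDS = [
    {"id": 1, "text": "Office hours", "sensitivity": "public"},
    {"id": 2, "text": "Salary data", "sensitivity": "confidential"},
    {"id": 3, "text": "Audit findings", "sensitivity": "secret"},
]

def visible_records(clearance):
    """Return ids of records at or below the user's clearance level."""
    limit = LEVELS[clearance]
    return [r["id"] for r in RECORDS if LEVELS[r["sensitivity"]] <= limit]

print(visible_records("confidential"))
```

In a federated setting the difficulty is that every participating database has to agree on, and enforce, a comparable set of levels.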

MAGIC Solution
Within a Bayesian framework, MAGIC (Multisource
Association of Genes by Integration of Clusters)
has a distributed design that promotes flexibility
for adding new input methods and datasets.

Known Issues
In a distributed DBMS such as MAGIC, access to
replicated data has to be controlled in multiple
locations.
MAGIC works on microarray data or only on
non-expression data.

Thank you
