
ADBT Assign-1

1. Explain the architecture of parallel database with advantages and disadvantages.

Parallel database: A parallel database system improves performance by parallelizing operations
such as loading data, building indexes, and evaluating queries, using multiple CPUs and disks in parallel.
Benefits of parallel database:

Improves response time (by intraquery parallelism)

Improves throughput (by interquery parallelism)

Three main architectures have been proposed for building parallel databases:
1. Shared memory
2. Shared disk
3. Shared nothing

Shared memory:
In a shared memory system, multiple CPUs are attached to an interconnection network and can access
a common region of main memory.
Advantages:
1. Closer to a conventional machine and easy to program
2. Low communication overhead
3. Operating system services are leveraged to utilise the additional CPUs
Disadvantages:
1. Memory contention becomes a bottleneck as the number of CPUs grows
2. Expensive to build
3. Less sensitive to partitioning

(Fig: Shared Memory architecture: interconnecting network, global shared memory)

http://way2mca.com

Shared Disk:
In a shared disk system, each CPU has a private memory and direct access to all disks through an
interconnection network.

Advantages:
1. Closer to a conventional machine and easy to program
2. Low overhead
3. OS services are leveraged to utilise the additional CPUs
Disadvantages:
1. More interference among CPUs as their number grows
2. Increased demand on network bandwidth, since data is shipped over the interconnection network
3. Less sensitive to partitioning

(Fig: Shared Disk Architecture)


Shared Nothing:
In a shared nothing system, each CPU has local main memory and local disk space; no two
CPUs can access the same storage area, and all communication between CPUs is through a
network connection.
Advantages:
1. Linear speed-up and scale-up
2. Good partitioning
3. Cheap to build

Disadvantages:
1. Hard to program
2. Adding new nodes requires reorganizing the data

(Fig: Shared Nothing architecture)
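
The reorganization cost of adding a node to a shared nothing system can be illustrated with a small sketch. It assumes tuples are partitioned across nodes by `key % num_nodes`; this scheme and the function names `partition` and `moved_keys` are illustrative assumptions, not something the notes above specify.

```python
def partition(keys, num_nodes):
    """Assign each key to a node with simple modulo hashing (an assumed scheme)."""
    nodes = {n: [] for n in range(num_nodes)}
    for key in keys:
        nodes[key % num_nodes].append(key)
    return nodes


def moved_keys(keys, old_nodes, new_nodes):
    """Return the keys whose home node changes when the node count changes."""
    return [k for k in keys if k % old_nodes != k % new_nodes]


keys = list(range(12))            # hypothetical tuple keys
before = partition(keys, 3)       # layout with three nodes
after = partition(keys, 4)        # layout after adding a fourth node
relocated = moved_keys(keys, 3, 4)
print(len(relocated), "of", len(keys), "keys must move")
```

With this naive scheme most keys change nodes when a node is added, which is one way to see why growing a shared nothing system involves reorganizing the stored data.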

Q-2) Explain distributed Query Processing:


Ans:
When estimating the cost of an evaluation strategy, in addition to counting the number of page I/Os,
we must count the number of pages sent from one site to another (the shipping cost).
1. Non-join queries in a distributed DBMS
Simple operations such as scanning a relation, selection, and projection are affected by
fragmentation and replication.
Consider the following query:
SELECT s.age
FROM sailors s
WHERE s.rating >3 AND s.rating < 7
- Suppose that the sailors relation is horizontally fragmented, with all tuples having a rating less than
5 at Shanghai and all tuples having a rating greater than 5 at Tokyo.

The DBMS must answer this query by evaluating it at both sites and taking the union of the
answers.
- If the SELECT clause contained AVG(s.age), combining the answers could not be done by simply
taking the union; the DBMS must compute the sum and count of age values at the two sites and
use this information to compute the average age of all sailors.
- If the WHERE clause contained just the condition s.rating > 6, on the other hand, the DBMS should
recognise that this query can be answered by executing it at Tokyo alone.
- As another example, suppose that the sailors relation were vertically fragmented, with the sid and
rating fields at Shanghai and the sname and age fields at Tokyo. No field is stored at both sites. This
vertical fragmentation would therefore be a lossy decomposition, except that a field containing the
id of the corresponding sailors tuple is included by the DBMS in both fragments. The DBMS now has
to reconstruct the sailors relation by joining the two fragments on the common tuple-id field and
execute the query over this reconstructed relation.
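
The fragment-handling cases above can be sketched in a few lines. This is a minimal illustration, not a DBMS implementation: the sample rows, the tuple-id values, and the helper names (`ages_in_range`, `partial_avg`, `reconstruct`) are all made up for the example.

```python
# Horizontal fragments of sailors as (sid, rating, age); the rows are invented.
shanghai = [(1, 2, 30.0), (2, 4, 25.0), (5, 1, 20.0)]   # rating < 5
tokyo = [(3, 6, 40.0), (4, 8, 35.0)]                    # rating > 5


def ages_in_range(fragment, low, high):
    """Local evaluation of: SELECT age ... WHERE rating > low AND rating < high."""
    return [age for (_sid, rating, age) in fragment if low < rating < high]


def partial_avg(fragment):
    """Each site returns (sum, count) so a global average can be combined."""
    ages = [age for (_sid, _rating, age) in fragment]
    return sum(ages), len(ages)


# Case 1: evaluate at both sites and take the union of the answers.
union_ages = ages_in_range(shanghai, 3, 7) + ages_in_range(tokyo, 3, 7)

# Case 2: AVG(age) is computed from per-site sums and counts, not a union.
parts = [partial_avg(shanghai), partial_avg(tokyo)]
avg_age = sum(s for s, _ in parts) / sum(c for _, c in parts)

# Case 3: vertical fragments share a system-added tuple id (hypothetical values).
frag_shanghai = {101: (1, 2), 102: (3, 6)}                # tid -> (sid, rating)
frag_tokyo = {101: ("Ann", 30.0), 102: ("Bo", 40.0)}      # tid -> (sname, age)


def reconstruct(left, right):
    """Rebuild full tuples by joining the two fragments on the tuple id."""
    return [left[tid] + right[tid] for tid in sorted(left.keys() & right.keys())]


sailors = reconstruct(frag_shanghai, frag_tokyo)
```

Note that averaging the two per-site averages would give a different (wrong) answer whenever the fragments differ in size, which is why the sites return sums and counts.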
2. Joins in a distributed DBMS
Joins of relations at different sites can be very expensive.
- Here we consider the cost of various strategies for computing Sailors ⋈ Reserves,
where the Sailors relation is stored at London and the Reserves relation at Paris.
Fetch as needed
We could do a page-oriented nested loops join in London with Sailors as the outer relation and, for each
Sailors page, fetch all Reserves pages from Paris.
Therefore cost = cost of scanning Sailors + (number of Sailors pages) × (cost of scanning and shipping all of Reserves)
- If the query was not submitted at the London site, we must add the cost of shipping the result to
the query site; this cost depends on the size of the result.
- In this example, if the query site is neither London nor Paris, the cost of shipping the result is greater
than the cost of shipping both Sailors and Reserves to the query site. Therefore, it would be cheaper
to ship both relations to the query site and compute the join there.
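
The fetch-as-needed cost and the shipping comparison can be worked through numerically. The sketch below is only illustrative: the page counts, the result size, and the per-page costs `t_d` (disk I/O) and `t_s` (shipping) are assumed values chosen for the example, and the function names are hypothetical.

```python
def fetch_as_needed_cost(outer_pages, inner_pages, t_d, t_s):
    """Page-oriented nested loops join with the inner relation at a remote site."""
    scan_outer = outer_pages * t_d
    # For every outer page, all inner pages are read remotely and shipped over.
    fetch_inner = outer_pages * inner_pages * (t_d + t_s)
    return scan_outer + fetch_inner


def shipping_cost(pages, t_s):
    """Cost of sending the given number of pages from one site to another."""
    return pages * t_s


# Assumed sizes and unit costs (not taken from the notes).
sailors_pages, reserves_pages, result_pages = 500, 1000, 4000
t_d, t_s = 1, 10

join_cost = fetch_as_needed_cost(sailors_pages, reserves_pages, t_d, t_s)
ship_result = shipping_cost(result_pages, t_s)
ship_both = shipping_cost(sailors_pages + reserves_pages, t_s)

print("fetch as needed:", join_cost)
print("ship result:", ship_result, "ship both relations:", ship_both)
```

Under these assumed numbers the join result is larger than the two input relations combined, so shipping both relations to a third query site is cheaper than shipping the result, matching the argument above.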
