
Query Processing

Query processing is the procedure of transforming a high-level query into a correct and efficient execution plan, expressed in a low-level language, that performs the required retrievals and manipulations in the database.
[Figure: stages of query processing]
General high-level query: transformed into a standard query language, for example SQL.
Scanning, parsing, validating: syntax checking and verification by the parser portion of the query processor; validating whether the relations and attributes used in the query are defined in the database. Output: correct query.
Query decomposer: translation of the relational calculus query into a relational algebra query using equivalency rules, idempotency rules, transformation rules, etc. from the global database dictionary (database catalog). Output: algebraic expression.
Query optimizer: optimization by substituting equivalent expressions for those in the query, using statistical data and the estimation formulas of the cost module. Output: execution plan.
Query code generator: generating code for the queries. Output: code to execute the query.
Runtime database processor (main processor, with join manager): estimation of each access plan, selection of the optimal plan, and execution against the database. Output: query result.
As shown in the figure:
The user submits a query request, which may be in QBE or another form.
The request is first transformed into a standard high-level query language, such as SQL.
The SQL query is read by the syntax analyzer, which checks it for correctness.
The correct query is then passed to the query decomposer, which produces the algebraic expression of the query.
This expression is then passed to the query optimizer.
After optimization, the query optimizer generates an action plan.
The action plans are converted into query code, which is finally executed by the runtime database processor.
The runtime database processor estimates the cost of each access plan and chooses the optimal one for execution.
1. Syntax Analyzer
The syntax analyzer takes the query from the user, splits it into tokens, and analyses the tokens and their order to make sure they comply with the rules of the language grammar.
If an error is found in the submitted query, the query is rejected, and an error code is returned to the user together with an explanation of why the query was rejected.
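The tokenize-then-check behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grammar is a tiny SELECT ... FROM ... WHERE subset without parentheses, and the names `QueryError`, `check_syntax` and the error codes `E001` to `E007` are hypothetical, not taken from any real DBMS.

```python
import re

class QueryError(Exception):
    """Carries an error code together with a human-readable explanation."""
    def __init__(self, code, reason):
        super().__init__(f"{code}: {reason}")
        self.code, self.reason = code, reason

# Hypothetical token classes: numbers, (qualified) names, quoted strings, operators.
TOKEN = re.compile(r"\d+|\w+(?:\.\w+)?|'[^']*'|<=|>=|<>|[=<>,]")
NAME = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)?")
COMPARATORS = {"=", "<", ">", "<=", ">=", "<>"}

def tokenize(query):
    tokens, pos = [], 0
    while pos < len(query):
        if query[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(query, pos)
        if match is None:
            raise QueryError("E001", f"unexpected character {query[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens

def check_name_list(items, code, what):
    # Expected shape: name (, name)*
    for i, tok in enumerate(items):
        ok = NAME.fullmatch(tok) if i % 2 == 0 else tok == ","
        if not ok:
            raise QueryError(code, f"malformed {what} near {tok!r}")
    if len(items) % 2 == 0:
        raise QueryError(code, f"{what} is empty or ends with a comma")

def check_condition(items):
    # Expected shape: operand comparator operand ((AND | OR) operand comparator operand)*
    i = 0
    while True:
        triple = items[i:i + 3]
        if (len(triple) < 3 or triple[1] not in COMPARATORS
                or any(t == "," or t in COMPARATORS for t in (triple[0], triple[2]))):
            raise QueryError("E006", "malformed comparison in WHERE clause")
        i += 3
        if i == len(items):
            return
        if items[i].upper() not in ("AND", "OR"):
            raise QueryError("E007", f"expected AND or OR, found {items[i]!r}")
        i += 1

def check_syntax(query):
    """Return the token list, or raise QueryError with a code and an explanation."""
    tokens = tokenize(query)
    upper = [t.upper() for t in tokens]
    if not upper or upper[0] != "SELECT":
        raise QueryError("E002", "query must start with SELECT")
    if "FROM" not in upper:
        raise QueryError("E003", "missing FROM clause")
    f = upper.index("FROM")
    w = upper.index("WHERE") if "WHERE" in upper else len(tokens)
    check_name_list(tokens[1:f], "E004", "select list")
    check_name_list(tokens[f + 1:w], "E005", "relation list")
    if w < len(tokens):
        check_condition(tokens[w + 1:])
    return tokens
```

A caller would wrap `check_syntax` in a `try`/`except QueryError` block and report `err.code` and `err.reason` back to the user. A real analyzer would use a full grammar; this sketch only checks the order of tokens.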
2. Query Decomposition
The aims of query decomposition are:
To transform a high-level query into a relational algebra query.
To check whether the query is syntactically and semantically correct.
To transform the high-level query into a query graph of low-level operations (an algebraic expression).
The query decomposer goes through five stages of processing to decompose the query into a low-level algebraic expression:
Query analysis
Query normalization
Semantic analysis
Query simplifier
Query restructuring
[Figure: the query decomposer takes an SQL query as input and, using the data dictionary together with equivalence rules, idempotency rules and transformation rules, produces an algebraic expression.]
2.1 Query Analysis
At the end of the analysis phase, the high-level query (SQL) is transformed into an internal representation that is more suitable for processing.
This internal representation is a kind of query tree: a tree data structure that corresponds to a relational algebra expression.
It is also called a relational algebra tree.
Relational Algebra Tree
Leaf nodes of the tree represent the base input relations of the query.
Internal nodes represent intermediate relations, each the result of applying an operation of the algebra.
The root of the tree represents the result of the query.
The sequence of operations is directed from the leaves to the root.
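For example, a query on the projects located in Mumbai, their controlling departments and the department managers can be built up step by step:
Mumbai_proj ← σ proj_loc=“Mumbai” (project)
Control_dept ← Mumbai_proj ⋈ deptno=dno (department)
Proj_de_mgr ← Control_dept ⋈ mgrid=empid (employee)
Result ← ∏ proj_no,deptno,name,add,dob (Proj_de_mgr)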
The corresponding relational algebra tree, with the root at the top and the base relations at the leaves:

∏ proj_no,deptno,name,add,dob
    ⋈ mgrid=empid
        ⋈ deptno=dno
            σ proj_loc=“Mumbai”
                project
            department
        employee
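To make the leaves-to-root evaluation order concrete, here is a minimal sketch of the tree as a Python data structure with a recursive evaluator. The `Node` class, the `evaluate` function and the sample rows are hypothetical illustrations; the sketch keeps duplicates in projections (bag semantics) for brevity.

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str                 # "rel", "select", "join" or "project"
    arg: object = None      # relation name, predicate, or attribute list
    children: tuple = ()

def evaluate(node, db):
    """Evaluate bottom-up: the children (toward the leaves) first, then this node."""
    inputs = [evaluate(child, db) for child in node.children]
    if node.op == "rel":        # leaf: a base relation
        return db[node.arg]
    if node.op == "select":
        return [t for t in inputs[0] if node.arg(t)]
    if node.op == "join":
        return [{**l, **r} for l in inputs[0] for r in inputs[1] if node.arg(l, r)]
    if node.op == "project":    # duplicates are not removed in this sketch
        return [{a: t[a] for a in node.arg} for t in inputs[0]]
    raise ValueError(f"unknown operation {node.op!r}")

# Made-up sample data, only for illustration.
db = {
    "project": [
        {"proj_no": 1, "proj_loc": "Mumbai", "deptno": 10},
        {"proj_no": 2, "proj_loc": "Pune", "deptno": 20},
    ],
    "department": [{"dno": 10, "mgrid": 100}, {"dno": 20, "mgrid": 200}],
    "employee": [
        {"empid": 100, "name": "Asha", "add": "Mumbai", "dob": "1980-01-01"},
        {"empid": 200, "name": "Ravi", "add": "Pune", "dob": "1975-05-05"},
    ],
}

tree = Node("project", ["proj_no", "deptno", "name", "add", "dob"], (
    Node("join", lambda l, r: l["mgrid"] == r["empid"], (
        Node("join", lambda l, r: l["deptno"] == r["dno"], (
            Node("select", lambda t: t["proj_loc"] == "Mumbai", (Node("rel", "project"),)),
            Node("rel", "department"),
        )),
        Node("rel", "employee"),
    )),
))

result = evaluate(tree, db)
```

The root node is evaluated last, so `result` holds the answer of the whole query, while each recursive call returns one intermediate relation.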
Query Graph Notation
In query graph representation, the relations in
the query are represented by relation nodes.
Relation nodes are displayed as single circles.
The constant values from the query selection
are represented by the constant nodes,
displayed as double circles.
The selection and join conditions are
represented by the graph edges.
The attributes to be retrieved from each
relation are displayed in square brackets above
each relation.
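[Figure: query graph. Relation nodes are shown in single parentheses, the constant node in double parentheses.]

[p.proj_no, p.deptno]                               [e.ename, e.add, e.dob]
(P) ---- p.deptno=d.deptno ---- (D) ---- d.mgrid=e.empid ---- (E)
 |
 p.proj_loc=“Mumbai”
 |
((“Mumbai”))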
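One possible in-memory form of such a query graph is sketched below. The `QueryGraph` class and its field names are assumptions made for this illustration; the node and edge labels are the ones from the figure.

```python
from dataclasses import dataclass, field

@dataclass
class QueryGraph:
    relations: dict = field(default_factory=dict)  # relation node -> attributes to retrieve
    constants: set = field(default_factory=set)    # constant nodes (double circles)
    edges: list = field(default_factory=list)      # (node, node, condition)

    def join_edges(self):
        """Edges whose two endpoints are both relation nodes."""
        return [e for e in self.edges if e[0] in self.relations and e[1] in self.relations]

    def selection_edges(self):
        """Edges that connect a relation node to a constant node."""
        return [e for e in self.edges if e[0] in self.constants or e[1] in self.constants]

graph = QueryGraph(
    relations={"P": ["proj_no", "deptno"], "D": [], "E": ["ename", "add", "dob"]},
    constants={"Mumbai"},
    edges=[
        ("P", "D", "p.deptno=d.deptno"),
        ("D", "E", "d.mgrid=e.empid"),
        ("P", "Mumbai", 'p.proj_loc="Mumbai"'),
    ],
)
```

The structure stores which conditions link which nodes, but nothing in it says which condition should be evaluated first.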

Disadvantages of query graph notation
It corresponds to a relational calculus expression.
It does not indicate the order in which the operations are to be performed, as a query tree does.
Query Normalization
The primary goal of normalization is to avoid redundancy.
In the normalization phase, a set of equivalency rules is applied so that the projection and selection operations included in the query are simplified.
Conjunctive normal form (CNF): a sequence of boolean expressions connected by conjunction (AND), where each expression consists of comparison terms connected by disjunction (OR). For example:
(emp_desig=“programmer” ∨ empsal>40000) ∧ loc=“mumbai”
Disjunctive normal form (DNF): a sequence of boolean expressions connected by disjunction (OR), where each expression consists of comparison terms connected by conjunction (AND). For example:
(emp_desig=“programmer” ∧ loc=“mumbai”) ∨ (empsal>40000 ∧ loc=“mumbai”)
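The two example conditions are equivalent: the DNF is obtained from the CNF by distributing AND over OR. A minimal sketch of that step is shown below, assuming a condition is given as a list of clauses and each atom is an opaque string; the function name `cnf_to_dnf` is hypothetical.

```python
from itertools import product

def cnf_to_dnf(cnf):
    """cnf: list of clauses (joined by AND), each clause a list of atoms (joined by OR).
    Returns a list of conjunctions (joined by OR), each a list of atoms (joined by AND)."""
    dnf = []
    # Picking one atom from every clause yields one conjunction of the DNF.
    for choice in product(*cnf):
        conjunction = list(dict.fromkeys(choice))  # drop repeated atoms, keep order
        if conjunction not in dnf:
            dnf.append(conjunction)
    return dnf

cnf = [['emp_desig="programmer"', 'empsal>40000'], ['loc="mumbai"']]
dnf = cnf_to_dnf(cnf)
```

For the example, `dnf` holds the two conjunctions of the DNF shown above. The number of conjunctions grows with the product of the clause sizes, which is why a conversion of this kind can become expensive for large conditions.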
Example
Consider the following two relations stored in a distributed database:
Employee (empid, ename, salary, designation, deptno)
Department (deptno, dname, location)
and the following query:
“Retrieve the names of all employees whose designation is Manager and department name is Production or Printing”.
In SQL, this query can be written as:
Select ename from Employee, Department
where designation = “Manager” and Employee.deptno = Department.deptno
and (dname = “Production” or dname = “Printing”)
The parentheses around the two dname tests are needed to express the intended condition.
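SQL evaluates AND before OR, so the grouping of the two dname tests changes the result. The sketch below runs both variants with Python's built-in `sqlite3` module on a few made-up rows; the sample data is an assumption for illustration, and string literals use single quotes as standard SQL requires.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Employee (empid INTEGER, ename TEXT, salary INTEGER, designation TEXT, deptno INTEGER);
CREATE TABLE Department (deptno INTEGER, dname TEXT, location TEXT);
INSERT INTO Department VALUES (1, 'Production', 'Pune');
INSERT INTO Department VALUES (2, 'Printing', 'Mumbai');
INSERT INTO Department VALUES (3, 'Sales', 'Delhi');
INSERT INTO Employee VALUES (1, 'Asha', 90000, 'Manager', 1);
INSERT INTO Employee VALUES (2, 'Ravi', 50000, 'Clerk', 2);
INSERT INTO Employee VALUES (3, 'Meera', 80000, 'Manager', 3);
""")

def names(where_clause):
    sql = ("SELECT DISTINCT ename FROM Employee, Department WHERE "
           + where_clause + " ORDER BY ename")
    return [row[0] for row in con.execute(sql)]

# Intended meaning: managers whose department is Production or Printing.
grouped = names("designation = 'Manager' AND Employee.deptno = Department.deptno "
                "AND (dname = 'Production' OR dname = 'Printing')")

# Without parentheses the last OR applies to the whole preceding conjunction,
# so every employee paired with the Printing department row qualifies.
ungrouped = names("designation = 'Manager' AND Employee.deptno = Department.deptno "
                  "AND dname = 'Production' OR dname = 'Printing'")
```

With this data, `grouped` contains only Asha, while `ungrouped` contains all three employees, including a clerk and a manager from Sales.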
The conjunctive normal form of the query condition is:
designation = “Manager” ∧ Employee.deptno = Department.deptno ∧ (dname = “Production” ∨ dname = “Printing”)
The disjunctive normal form of the same condition is:
(designation = “Manager” ∧ Employee.deptno = Department.deptno ∧ dname = “Production”) ∨ (designation = “Manager” ∧ Employee.deptno = Department.deptno ∧ dname = “Printing”)
Hence, in the disjunctive normal form, each disjunct connected by the ∨ (OR) operator can be processed as an independent conjunctive subquery.
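As a small check of this idea, the sketch below evaluates the condition once in CNF over all Employee/Department pairs, and once as two independent conjunctive subqueries whose results are united. The rows and helper names are made up for illustration.

```python
employees = [
    {"ename": "Asha", "designation": "Manager", "deptno": 1},
    {"ename": "Ravi", "designation": "Manager", "deptno": 2},
    {"ename": "Meera", "designation": "Manager", "deptno": 3},
    {"ename": "Kiran", "designation": "Clerk", "deptno": 1},
]
departments = [
    {"deptno": 1, "dname": "Production"},
    {"deptno": 2, "dname": "Printing"},
    {"deptno": 3, "dname": "Sales"},
]

# Atomic predicates over an (employee, department) pair.
def manager(e, d): return e["designation"] == "Manager"
def same_dept(e, d): return e["deptno"] == d["deptno"]
def production(e, d): return d["dname"] == "Production"
def printing(e, d): return d["dname"] == "Printing"

pairs = [(e, d) for e in employees for d in departments]

# CNF: one pass with the disjunction nested inside the conjunction.
cnf_result = {e["ename"] for e, d in pairs
              if manager(e, d) and same_dept(e, d) and (production(e, d) or printing(e, d))}

# DNF: each disjunct is a purely conjunctive subquery, evaluated on its own.
def run_conjunctive(predicates):
    return {e["ename"] for e, d in pairs if all(p(e, d) for p in predicates)}

disjuncts = [[manager, same_dept, production], [manager, same_dept, printing]]
dnf_result = set().union(*(run_conjunctive(c) for c in disjuncts))
```

Both evaluations return the same set of names. In a distributed setting the subqueries could be run separately, at the cost of evaluating the shared predicates (here `manager` and `same_dept`) once per disjunct.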
Equivalency Rules
An equivalence rule states that expressions of two forms are equivalent.
By applying these rules, we can transform the RE into an equivalent CNF or DNF.
CNF: the result contains only the tuples that satisfy all of the expressions.
DNF: the result is the union of the tuples that satisfy each of the expressions.
1. Commutativity of UNARY operations:
UNARYOP1 UNARYOP2 REL <-> UNARYOP2 UNARYOP1 REL
σθ1(σθ2(E)) = σθ2(σθ1(E))
2. Associativity of BINARY operations:
REL1 BINOP (REL2 BINOP REL3) <-> (REL1 BINOP REL2) BINOP REL3
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
3. Idempotency of UNARY operations:
UNARYOP1 UNARYOP2 REL <-> UNARYOP REL
4. Distributivity of UNARY operations with respect to BINARY operations:
UNARYOP (REL1 BINOP REL2) <-> UNARYOP (REL1) BINOP UNARYOP (REL2)
∏L(E1 ∪ E2) = (∏L(E1)) ∪ (∏L(E2))
5. Factorisation of UNARY operations:
UNARYOP (REL1) BINOP UNARYOP (REL2) <-> UNARYOP (REL1 BINOP REL2)
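The rules can be spot-checked on small relations. The sketch below models a relation as a Python set of tuples and compares both sides of four of the rules, taking selection and projection as the unary operations and natural join and union as the binary ones. The helper names and the sample relations are assumptions for illustration, and agreement on one instance is of course not a proof.

```python
def rel(*rows):
    """A relation as a set of tuples; each tuple is a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def select(pred, r):
    return {t for t in r if pred(dict(t))}

def project(attrs, r):
    return {frozenset((a, dict(t)[a]) for a in attrs) for t in r}

def join(r, s):
    """Natural join on the attributes the two relations share."""
    out = set()
    for t in map(dict, r):
        for u in map(dict, s):
            if all(t[k] == u[k] for k in t.keys() & u.keys()):
                out.add(frozenset({**t, **u}.items()))
    return out

E1 = rel({"a": 1, "b": 1}, {"a": 2, "b": 2})
E2 = rel({"b": 1, "c": 5}, {"b": 2, "c": 6})
E3 = rel({"c": 5, "d": 9})
R = rel({"a": 3, "b": 1})          # same schema as E1, used for the union

def p1(t): return t["a"] >= 1
def p2(t): return t["b"] == 2

checks = {
    "commutativity of selection":
        select(p1, select(p2, E1)) == select(p2, select(p1, E1)),
    "associativity of join":
        join(join(E1, E2), E3) == join(E1, join(E2, E3)),
    "idempotency of projection":
        project(["a"], project(["a", "b"], E1)) == project(["a"], E1),
    "distributivity of projection over union":
        project(["b"], E1 | R) == project(["b"], E1) | project(["b"], R),
}
```

Every entry of `checks` evaluates to `True` for these relations. Reading the distributivity check from right to left gives the factorisation rule.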
Semantic Analyser
Semantic analysis is applied to normalized queries.
It rejects incorrectly formulated queries, in which components of the condition do not contribute to the generation of the result.
It rejects contradictory queries, in which the qualification condition cannot be satisfied by any tuple.
Incorrectness and contradiction in a query are detected from the corresponding query graph or relation connection graph.
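As a simplified illustration of a contradictory query, consider a conjunction of equality tests: if the same attribute is required to equal two different constants, no tuple can satisfy the condition. The sketch below checks only this special case and does not use the graph-based method mentioned above; the function name is hypothetical.

```python
def is_contradictory(conjunction):
    """conjunction: iterable of (attribute, constant) equality tests joined by AND.
    Returns True if some attribute is equated to two different constants."""
    required = {}
    for attribute, constant in conjunction:
        # setdefault returns the constant already required for this attribute, if any.
        if required.setdefault(attribute, constant) != constant:
            return True
    return False

contradictory = is_contradictory([("year", 2008), ("duration", 4), ("year", 2009)])
satisfiable = is_contradictory([("year", 2008), ("duration", 4)])
```

Here `contradictory` is `True` because `year` cannot be both 2008 and 2009, while `satisfiable` is `False`. Range conditions and join predicates would need a more general test.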
Relation Connection Graph for Incorrectness
A node is created in the query graph for the result and for each base relation specified in the query.
An edge is drawn between two nodes for each join operation and for each project operation in the query. An edge between two nodes that are not the result node represents a join operation, while an edge whose destination is the result node represents a project operation.
A node that is not the result node is labelled with any select or self-join operation specified on it in the query.
A join graph for a query is a subgraph of the relation connection graph that represents only the join operations specified in the query; it can be derived from the corresponding query graph.
Example
Consider the following two relations:
Student (s-id, sname, address, course-id, year)
Course (course-id, course-name, duration, course-fee, intake-no, coordinator)
and the query “Retrieve the names, addresses and course names of all students whose year of admission is 2008 and whose course duration is 4 years”.
In SQL, this query can be written as:
Select sname, address, course-name from Student, Course
where year = 2008 and duration = 4
and Student.course-id = Course.course-id
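[Figure 11.4(a) Query Graph: the node Student (labelled year = 2008) and the node Course (labelled duration = 4) are connected by an edge labelled Student.course-id = Course.course-id. An edge labelled sname, address runs from Student to the Result node, and an edge labelled course-name runs from Course to the Result node.]
[Figure 11.4(b) Join Graph: the nodes Student and Course are connected by an edge labelled Student.course-id = Course.course-id.]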


In the above SQL query, if the join condition between the two relations (that is, Student.course-id = Course.course-id) is missing, then there is no edge between the nodes representing the relations Student and Course in the corresponding query graph (figure 11.4(a)). Hence, the SQL query is semantically incorrect, since the relation connection graph is disconnected. In this case, either the query is rejected or an implicit Cartesian product between the relations is assumed.
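The connectivity test can be sketched as a graph traversal. The example below is an assumption-laden illustration: it checks the join graph only (relation nodes and join edges, without the result node), and `is_connected` is a hypothetical helper name.

```python
def is_connected(nodes, edges):
    """Return True if every node can be reached from any other node over the edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Depth-first traversal starting from an arbitrary node.
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == nodes

relations = {"Student", "Course"}

# With the join condition Student.course-id = Course.course-id present.
with_join = is_connected(relations, [("Student", "Course")])

# With the join condition missing there is no edge between the two relations.
without_join = is_connected(relations, [])
```

`with_join` is `True` and `without_join` is `False`, so a decomposer following the text would either reject the second query or fall back to a Cartesian product.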
