
Database Study Material

Prepared by iTech Team


A database is a collection of data organized in a particular way.

Databases are created and managed with a database management system (DBMS), e.g. Oracle, MySQL,
Microsoft SQL Server.

Structure
There are a number of possible ways to structure the data, such as a flat file or an RDBMS.

RDBMS:
In an RDBMS (relational database management system), data is stored in tables. Each table consists of
rows and columns.

Row : A record (horizontal). If you store the details of 10 students in a table, it will have 10 rows.

Column : A vertical group of values within a table. It contains the values of a single field (e.g. S. No.).

Null : A field with a NULL value is one that has been left blank during record creation. NULL means
"no value"; it is not the same as zero or an empty string.

Constraints : Rules enforced on columns of the table.

• Not Null – The column can’t have a NULL value
• Primary Key – Uniquely identifies a record in a table
• Foreign Key – A column (or columns) of a table that points to the primary key of another table
• Unique – Ensures that all values in the column are unique
• Composite Key – A key made of more than one column that together uniquely identify a record in a table
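The constraints above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample values (department, student, enrollment, Asha, Ravi) are assumptions made up for illustration, and other database systems may report violations differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical schema: one table per constraint type listed above.
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,                     -- primary key
    dept_name TEXT NOT NULL UNIQUE                     -- not null + unique
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    dept_id    INTEGER REFERENCES department(dept_id)  -- foreign key
);
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course     TEXT,
    PRIMARY KEY (student_id, course)                   -- composite key
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Physics')")
conn.execute("INSERT INTO student VALUES (10, 'Asha', 1)")
conn.execute("INSERT INTO enrollment VALUES (10, 'Maths')")

def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "not_null": violates("INSERT INTO student VALUES (11, NULL, 1)"),
    "primary_key": violates("INSERT INTO student VALUES (10, 'Ravi', 1)"),
    "foreign_key": violates("INSERT INTO student VALUES (12, 'Ravi', 99)"),
    "unique": violates("INSERT INTO department VALUES (2, 'Physics')"),
    "composite_key": violates("INSERT INTO enrollment VALUES (10, 'Maths')"),
}
print(results)  # every entry is True: each insert was rejected
```

Each rejected insert breaks exactly one rule, so the dictionary shows which constraint stopped which statement.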

SQL Commands
SQL commands can be divided into four subgroups: DDL, DML, DCL and DQL.

DDL : Data Definition Language, used to define the database structure

E.g. CREATE, ALTER, DROP, TRUNCATE

DML : Data Manipulation Language, used to store, modify & delete data in the database

E.g. INSERT, UPDATE, DELETE (SELECT is often counted as DML as well; here it is listed separately under DQL)

DCL : Data Control Language, used to give or take away rights/permissions for data access

E.g. GRANT, REVOKE

DQL : Data Query Language, used to read data from the database

E.g. SELECT
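As a rough illustration of the command groups, the sketch below runs DDL, DML and DQL statements against an in-memory SQLite database. The student table and its rows are assumed sample data. DCL and TRUNCATE are left out because SQLite does not provide GRANT, REVOKE or TRUNCATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change the structure
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")

# DML: store, modify and delete data
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'Chennai')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'Madurai')")
conn.execute("UPDATE student SET city = 'Salem' WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 1")

# DQL: read data
rows = conn.execute("SELECT id, name, city FROM student").fetchall()
print(rows)  # [(2, 'Ravi', 'Salem')]

# DDL again: remove the table
conn.execute("DROP TABLE student")
```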



Normalization
Normalization is the process of organizing data in a database to avoid data redundancy (repetition of
data).

Why do we need normalization?

If data is redundant, we face three kinds of anomalies.

Update anomaly : When we update a data item whose copies are scattered over several places, a few
copies get updated properly while others are left with old values. So the data becomes inconsistent.

Deletion anomaly : When deleting one piece of data unintentionally removes another. Consider a
Student table with course details in the same table. If I delete the last student enrolled in a course, the
details of that course are lost as well.

Insertion anomaly : Consider a Student table with course details in the same table. If I want to add a
course, I need to give student details as well. So I can’t add a new course without a student record. This
is called an insertion anomaly.
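A small sketch of an update anomaly, assuming a hypothetical student_course table in which the teacher of a course is repeated in every student row (the student names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_course (student TEXT, course TEXT, teacher TEXT)")
conn.executemany(
    "INSERT INTO student_course VALUES (?, ?, ?)",
    [("Asha", "Java", "Vicky"), ("Ravi", "Java", "Vicky")],
)

# The teacher of the Java course changes, but only one copy is updated.
conn.execute(
    "UPDATE student_course SET teacher = 'Lakshmanan' "
    "WHERE course = 'Java' AND student = 'Asha'"
)

teachers = conn.execute(
    "SELECT DISTINCT teacher FROM student_course "
    "WHERE course = 'Java' ORDER BY teacher"
).fetchall()
print(teachers)  # [('Lakshmanan',), ('Vicky',)]: two teachers for one course
```

If the teacher were stored once in a separate course table, a single update would be enough and the inconsistency could not arise.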

There are 3 important normal forms: 1st, 2nd and 3rd. Additional forms such as Boyce–Codd Normal Form
are also available.

In short:

Normal Form        Description
1st Normal Form    Data should be atomic (only one value per field)
2nd Normal Form    Every non-prime attribute should depend on the whole key, not on a part of it
3rd Normal Form    Avoid transitive dependency (A → B and B → C, so A → C)

1st Normal Form:


All the attributes in a relation must have atomic values (meaning only one value in one field).

Staff Name    Subject
Lakshmanan    C, C++, Java
Vicky         Java

After applying 1st normal form

Staff Name    Subject
Lakshmanan    C
Lakshmanan    C++
Lakshmanan    Java
Vicky         Java
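The same conversion can be sketched in a few lines of Python. The helper name to_first_normal_form is made up for this example, and it assumes the subjects are separated by commas as in the table above.

```python
# Unnormalized rows: the Subject field holds several values.
staff = [
    ("Lakshmanan", "C, C++, Java"),
    ("Vicky", "Java"),
]

def to_first_normal_form(rows):
    """Emit one (name, subject) row per subject so every field is atomic."""
    return [
        (name, subject.strip())
        for name, subjects in rows
        for subject in subjects.split(",")
    ]

normalized = to_first_normal_form(staff)
for row in normalized:
    print(row)
# ('Lakshmanan', 'C')
# ('Lakshmanan', 'C++')
# ('Lakshmanan', 'Java')
# ('Vicky', 'Java')
```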



2nd Normal Form:
Every non-prime attribute should be fully dependent on the whole key (every prime attribute), not on
just a part of it.

(Example taken from internet)

Student Table:

Student_Id Project_Id Student_Name Project_Name

Here Student_Id & Project_Id are prime attributes (together they form the primary key), whereas
Student_Name & Project_Name are non-prime attributes (not part of the primary key).

According to 2nd Normal Form, every non-prime attribute should depend on all the prime attributes in
the table. In the above example, Student_Name depends only on Student_Id, and Project_Name depends
only on Project_Id.

For 2nd Normal Form, Student_Name & Project_Name should each depend on both Student_Id &
Project_Id. An attribute that does not depend on both is moved into a new table.

Student table:

Student_Id Student_Name

Project table:

Project_Id Project_Name

Student_Project table (which student works on which project):

Student_Id Project_Id
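A minimal sketch of a 2NF decomposition of this example with sqlite3. It keeps the student-to-project assignments in a separate table, here called student_project (a name assumed for illustration), so that Student_Name depends only on Student_Id and Project_Name only on Project_Id. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL
);
CREATE TABLE project (
    project_id   INTEGER PRIMARY KEY,
    project_name TEXT NOT NULL
);
CREATE TABLE student_project (
    student_id INTEGER REFERENCES student(student_id),
    project_id INTEGER REFERENCES project(project_id),
    PRIMARY KEY (student_id, project_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO project VALUES (100, 'Library System');
INSERT INTO student_project VALUES (1, 100), (2, 100);
""")

# The project name is stored once, however many students work on it.
rows = conn.execute("""
    SELECT s.student_name, p.project_name
    FROM student_project sp
    JOIN student s ON s.student_id = sp.student_id
    JOIN project p ON p.project_id = sp.project_id
    ORDER BY s.student_name
""").fetchall()
print(rows)  # [('Asha', 'Library System'), ('Ravi', 'Library System')]
```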

3rd Normal Form:


In 3rd Normal Form, we avoid transitive dependency. 2nd Normal Form should be applied before applying
3rd Normal Form.

Transitive dependency: If A determines B & B determines C, then C transitively depends on A (through B).

Book_id    Book_type_id    Book_type      Price
1          1               History        Rs. 200
2          21              Economics      Rs. 250
3          18              Programming    Rs. 300

Here Book_id determines Book_type_id, and Book_type_id determines Book_type. So Book_type
transitively depends on Book_id. The rule of 3rd Normal Form is: if a transitive dependency appears in a
table, separate the table into two tables.



Book table:

Book_id    Book_type_id    Price
1          1               Rs. 200
2          21              Rs. 250
3          18              Rs. 300

Book type table:

Book_type_id    Book_type
1               History
21              Economics
18              Programming
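The split can be sketched with sqlite3 as follows. The lower-case table and column names are assumptions for the example, and the price is stored as a plain number of rupees. Joining the two tables gives back the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book_type (
    book_type_id INTEGER PRIMARY KEY,
    book_type    TEXT NOT NULL
);
CREATE TABLE book (
    book_id      INTEGER PRIMARY KEY,
    book_type_id INTEGER REFERENCES book_type(book_type_id),
    price        INTEGER              -- price in Rs.
);
INSERT INTO book_type VALUES (1, 'History'), (21, 'Economics'), (18, 'Programming');
INSERT INTO book VALUES (1, 1, 200), (2, 21, 250), (3, 18, 300);
""")

# Joining the two tables reproduces the original four-column rows.
rows = conn.execute("""
    SELECT b.book_id, b.book_type_id, t.book_type, b.price
    FROM book b
    JOIN book_type t ON t.book_type_id = b.book_type_id
    ORDER BY b.book_id
""").fetchall()
print(rows)
# [(1, 1, 'History', 200), (2, 21, 'Economics', 250), (3, 18, 'Programming', 300)]
```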

BCNF:
It is a slightly stronger version of 3rd Normal Form, and is sometimes called 3.5 Normal Form.

A relational schema R is in Boyce–Codd normal form if and only if, for every one of its functional
dependencies X → Y, at least one of the following conditions holds:

• X → Y is a trivial functional dependency (Y ⊆ X)
• X is a super key for schema R

For an example that gives a better understanding of BCNF, please read this web page:

http://www.vertabelo.com/blog/technical-articles/boyce-codd-normal-form-bcnf
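The two BCNF conditions can be checked mechanically. The sketch below is one possible way, not taken from the linked article: it computes the attribute closure of X and treats X as a super key when the closure covers the whole relation. The function names closure and is_bcnf are made up, and the dependency Book_id → Price is an assumption added to the book example from the 3rd Normal Form section.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(relation, fds):
    """True if every given X -> Y is trivial or X is a super key."""
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        super_key = closure(lhs, fds) == relation
        if not (trivial or super_key):
            return False
    return True

# Book example before the split (Book_id -> Price is assumed).
book = {"book_id", "book_type_id", "book_type", "price"}
fds = [
    (frozenset({"book_id"}), frozenset({"book_type_id", "price"})),
    (frozenset({"book_type_id"}), frozenset({"book_type"})),
]
before = is_bcnf(book, fds)

# Book table after the split: only Book_id -> Book_type_id, Price remains.
book_only = {"book_id", "book_type_id", "price"}
after = is_bcnf(book_only, fds[:1])

print(before, after)  # False True
```

Before the split, Book_type_id determines Book_type without being a super key, so the check fails. After the split the only determinant left is the key itself.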

SQL Clauses:
WHERE clause : Used to specify a condition while retrieving data from a table.

LIKE clause : Used as a condition in an SQL query. It compares data with a pattern using wildcard
characters.

Percent sign % : represents zero, one or more characters.

Underscore _ : represents exactly one character.

ORDER BY : Arranges the retrieved data in sorted order. The default sort is ascending order. To sort data
in descending order, the DESC keyword is used with the ORDER BY clause.

GROUP BY : Groups the results of a SELECT query based on one or more columns.

DISTINCT : Used with a SELECT statement to retrieve unique values from the table.
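A short sketch of these clauses with sqlite3, using a made-up student table (names, cities and marks are assumed sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, city TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Asha", "Chennai", 82), ("Arun", "Salem", 67),
     ("Ravi", "Chennai", 74), ("Anu", "Salem", 91)],
)

def first_column(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

where_rows = first_column("SELECT name FROM student WHERE marks > 80 ORDER BY name")
like_percent = first_column("SELECT name FROM student WHERE name LIKE 'A%' ORDER BY name")
like_underscore = first_column("SELECT name FROM student WHERE name LIKE 'A__' ORDER BY name")
order_desc = first_column("SELECT name FROM student ORDER BY marks DESC")
distinct_city = first_column("SELECT DISTINCT city FROM student ORDER BY city")
group_rows = conn.execute(
    "SELECT city, COUNT(*) FROM student GROUP BY city ORDER BY city"
).fetchall()

print(where_rows)       # ['Anu', 'Asha']
print(like_percent)     # ['Anu', 'Arun', 'Asha']
print(like_underscore)  # ['Anu']  ('A' followed by exactly two characters)
print(order_desc)       # ['Anu', 'Asha', 'Ravi', 'Arun']
print(distinct_city)    # ['Chennai', 'Salem']
print(group_rows)       # [('Chennai', 2), ('Salem', 2)]
```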

Aggregate Functions:
These functions return a single value calculated from a group of values.

avg() – returns the average value

count() – returns the number of rows

max() – returns the maximum value

min() – returns the minimum value

sum() – returns the total of the selected column
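All five functions can be seen in one query. This is a small sketch with sqlite3; the marks table and its three rows are assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (student TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?)",
    [("Asha", 80), ("Ravi", 70), ("Anu", 90)],
)

row = conn.execute(
    "SELECT AVG(score), COUNT(*), MAX(score), MIN(score), SUM(score) FROM marks"
).fetchone()
print(row)  # (80.0, 3, 90, 70, 240)
```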



Join Query
An SQL JOIN is a way to retrieve data from two or more database tables.

• INNER JOIN : Returns only the matched rows from two tables
• OUTER JOIN : Returns the matched rows from two tables plus unmatched rows, with the missing
columns filled with NULL
o LEFT JOIN (or left outer join) : Returns matched rows from both tables and
unmatched rows from the LEFT table
o RIGHT JOIN (or right outer join) : Returns matched rows from both tables and
unmatched rows from the RIGHT table
o FULL OUTER JOIN : Returns matched and unmatched rows from both tables
• CROSS JOIN (Cartesian join) : Returns the Cartesian product of two tables
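A minimal sketch of three of these joins with sqlite3, using two made-up tables (student and mark). RIGHT JOIN and FULL OUTER JOIN are left out because older SQLite versions do not support them; they follow the same pattern in systems that do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER, name TEXT);
CREATE TABLE mark (student_id INTEGER, score INTEGER);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO mark VALUES (1, 80), (3, 55);   -- student 3 does not exist
""")

# INNER JOIN: only rows that match in both tables
inner = conn.execute("""
    SELECT s.name, m.score FROM student s
    INNER JOIN mark m ON m.student_id = s.id
""").fetchall()

# LEFT JOIN: every student, with NULL (None) where no mark matches
left = conn.execute("""
    SELECT s.name, m.score FROM student s
    LEFT JOIN mark m ON m.student_id = s.id
    ORDER BY s.id
""").fetchall()

# CROSS JOIN: every student paired with every mark
cross = conn.execute(
    "SELECT s.name, m.score FROM student s CROSS JOIN mark m"
).fetchall()

print(inner)       # [('Asha', 80)]
print(left)        # [('Asha', 80), ('Ravi', None)]
print(len(cross))  # 4
```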

Transaction
A transaction is a group of tasks. A single task is the minimum processing unit, which cannot be divided
further (e.g. an Update, Insert or Delete is a task that can’t be divided further).

ACID property : Every transaction should possess the ACID properties.

Before reading about the ACID properties, please understand that a transaction usually contains more
than one insert, update or delete statement, or a mix of them.

• Atomicity – Either all of the operations (statements) in a transaction are executed or none are. There
is no half-done transaction.
• Consistency – Any transaction brings the database from one valid state to another.
• Isolation – Concurrent execution of more than one transaction results in the same system state that
would be obtained if the transactions were executed one after another.
• Durability – Once a transaction has been committed, it remains so, even in the event of power loss,
crashes, or errors.
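Atomicity can be observed with a small sketch in sqlite3. The account table, the transfer function and the amounts are assumptions for illustration. The connection is used as a context manager, which commits when the block succeeds and rolls back when it raises an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    """Move amount from A to B; both updates succeed or neither does."""
    try:
        with conn:  # commit on success, roll back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30)       # both updates are applied
failed = transfer(500)  # second update breaks the CHECK, so the first is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)  # True False {'A': 70, 'B': 80}
```

The failed transfer leaves no trace: B's balance does not keep the 500 that was added before the error.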

Deadlock
In a multi-process system, deadlock is an unwanted situation in which a transaction waits indefinitely
for a resource that is held by another transaction.

E.g. there are three transactions T1, T2 and T3.

T1 holds resource X and is waiting for resource Y, which is held by T2.

T2 holds resource Y and is waiting for resource Z, which is held by T3.

T3 holds resource Z and is waiting for resource X, which is held by T1.

None of the three can proceed, so they all wait forever.



How to detect deadlock?
Wait-for graph
A wait-for graph is used to detect deadlock. It’s a directed graph (which means its edges are arrows). It
only helps to detect a deadlock; it does not recover from one that has already happened.

For every transaction that enters the system (let’s say Ti), create a node in the graph (a circle). If Ti
requests a resource X which is held by transaction Tj, draw a directed edge from Ti to Tj.

If the graph contains a cycle, there is a deadlock, and the system detects it by checking the graph for
cycles.
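Cycle detection on a wait-for graph can be sketched with a depth-first search. This is one common approach, not necessarily what a particular database system does. The function name has_cycle and the dictionary layout (each transaction mapped to the transactions it waits for) are assumptions for the example. The first graph is the T1, T2, T3 example from the deadlock section.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in wait_for}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # edge back to the current path: a cycle
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(wait_for))

# T1 waits for T2, T2 waits for T3, T3 waits for T1.
deadlocked = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
# Same transactions, but T3 is not waiting for anyone.
not_deadlocked = {"T1": ["T2"], "T2": ["T3"], "T3": []}

print(has_cycle(deadlocked))      # True
print(has_cycle(not_deadlocked))  # False
```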

