VEC – CSE IV Semester – II Year – CS8492 – DBMS
CS8492 – DATABASE MANAGEMENT SYSTEMS
UNIT 1 RELATIONAL DATABASES 10

Purpose of Database System – Views of data – Data Models – Database System
Architecture – Introduction to relational databases – Relational Model – Keys –
Relational Algebra – SQL fundamentals – Advanced SQL features – Embedded
SQL – Dynamic SQL
DBMS – Definition
 A database-management system (DBMS) is a collection of interrelated data
and a set of programs to access those data. The collection of data, usually
referred to as the database, contains information relevant to an enterprise.
 The primary goal of a DBMS is to provide a way to store and retrieve
database information that is both convenient and efficient.
Database Applications
Databases are widely used. Here are some representative applications:
• Enterprise Information
◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets and other
accounting information.
◦ Human resources: For information about employees, salaries, payroll
taxes, and benefits, and for generation of paychecks.
◦ Manufacturing: For management of the supply chain.
◦ Online retailers: For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations.
• Banking and Finance
◦ Banking: For customer information, accounts, loans, and banking
transactions.
◦ Credit card transactions: For purchases on credit cards and generation of
monthly statements.
◦ Finance: For storing information about holdings, sales, and purchases of
financial instruments such as stocks and bonds.
• Universities: For student information, course registrations, and grades (in
addition to standard enterprise information such as human resources and
accounting).
• Airlines: For reservations and schedule information. Airlines were among the
first to use databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly
bills, maintaining balances on prepaid calling cards, and storing information about
the communication networks.
Purpose of Database System

• Database systems arose in response to early methods of computerized
management of commercial data, which relied on file-processing systems.
• A typical file-processing system is supported by a conventional operating
system.
• The system stores permanent records in various files, and it needs different
application programs to extract records from, and add records to, the
appropriate files.
• Before database management systems (DBMSs) were introduced,
organizations usually stored information in such systems.
Keeping organizational information in a file-processing system has a number of
major disadvantages:
 Data redundancy and inconsistency.
Redundancy leads to higher storage and access cost. In addition, it may lead to
data inconsistency; that is, the various copies of the same data may no longer
agree. For example, a changed student address may be reflected in the Music
department records but not elsewhere in the system.
 Difficulty in accessing data.
The point here is that conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner. More
responsive data-retrieval systems are required for general use.
 Data isolation.
Because data are scattered in various files, and files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.
 Integrity problems.
The data values stored in the database must satisfy certain types of consistency
constraints.
 Atomicity problems.
A computer system, like any other device, is subject to failure. In many
applications, it is crucial that, if a failure occurs, the data be restored to the
consistent state that existed prior to the failure.
 Concurrent-access anomalies.
For the sake of overall performance of the system and faster response, many
systems allow multiple users to update the data simultaneously.
 Security problems.
Not every user of the database system should be able to access all the data.
These difficulties, among others, prompted the development of database
systems. A DBMS provides the concepts and algorithms that enable database
systems to solve the problems of file-processing systems.
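The atomicity problem above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical account table; the table, the account names, and the amounts are made up for illustration and are not part of these notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure would leave a half-done transfer
            # if the two updates were not treated as a single unit.
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account ("
             "id TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # debit violates the CHECK; credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer, the credit to B has been rolled back along with the rejected debit, so the database is back in the consistent state it had before the failure.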
Views of DATA
A major purpose of a database system is to provide users with an abstract view of
the data. That is, the system hides certain details of how the data are stored and
maintained.
Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for
efficiency has led designers to use complex data structures to represent data in
the database. Since many database-system users are not computer trained,
developers hide the complexity from users through several levels of abstraction,
to simplify users’ interactions with the system:
 Physical level. The lowest level of abstraction describes how the data are
actually stored. The physical level describes complex low-level data
structures in detail.
 Logical level. The next-higher level of abstraction describes what data are
stored in the database, and what relationships exist among those data. The
logical level thus describes the entire database in terms of a small number
of relatively simple structures.
 View level. The highest level of abstraction describes only part of the entire
database. Even though the logical level uses simpler structures, complexity
remains because of the variety of information stored in a large database.
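Many high-level programming languages support structured types. For example,
a record may be described as follows:
type instructor = record
    ID : char (5);
    name : char (20);
    dept_name : char (20);
    salary : numeric (8,2);
end;
This code defines a new record type called instructor with four fields.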
Instances and Schemas

 Databases change over time as information is inserted and deleted.
 The collection of information stored in the database at a particular moment
is called an instance of the database.
 The overall design of the database is called the database schema. Schemas
are changed infrequently, if at all.
 Database systems have several schemas, partitioned according to the levels
of abstraction.
 The physical schema describes the database design at the physical level,
while the logical schema describes the database design at the logical level.
• A database may also have several schemas at the view level, sometimes
called subschemas, that describe different views of the database.
 Application programs are said to exhibit physical data independence if they
do not depend on the physical schema, and thus need not be rewritten if
the physical schema changes.
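The difference between schema and instance, and the idea of physical data independence, can be sketched as follows. This is only an illustration using Python's sqlite3 module; the sample rows and the index name idx_instructor_dept are assumptions, not taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: the design of the relation. It changes rarely.
conn.execute("CREATE TABLE instructor ("
             "ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)")

def instance(conn):
    """Return the current instance: the rows stored at this moment."""
    return conn.execute(
        "SELECT ID, name, dept_name, salary FROM instructor ORDER BY ID"
    ).fetchall()

before = instance(conn)  # empty instance
conn.execute("INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000)")
conn.execute("INSERT INTO instructor VALUES ('12121', 'Wu', 'Finance', 90000)")
after_insert = instance(conn)  # same schema, different instance

# A physical-level change: add an index. The query in instance() is not
# rewritten, and it returns the same rows as before.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
after_index = instance(conn)
```

The function instance() never mentions how the rows are stored, which is why it keeps working unchanged after the index is added.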
Data Models
Underlying the structure of a database is the data model: a collection of
conceptual tools for describing data, data relationships, data semantics, and
consistency constraints. A data model provides a way to describe the design of a
database at the physical, logical, and view levels.
There are a number of different data models. They can be classified into four
categories:
Relational Model.
• The relational model uses a collection of tables to represent both data and
the relationships among those data. Each table has multiple columns, and
each column has a unique name. Tables are also known as relations.
• The relational model is an example of a record-based model.
Table: Student
Student ID   Student Name   Department   Date of Birth
111          Ajay           CSE          23-Jun-1999
112          Aravind        CSE          20-Jan-1998
113          Balakumaran    CSE          21-Jun-1999
Student(StudentID, StudentName, Department, DOB)

StudentID is the primary key (it is underlined in the original schema).
• Advantages
  • Structural independence
  • Conceptual simplicity
  • Ease of design, implementation and maintenance
• Disadvantages
  • Significant hardware and software overhead
  • Not as good for transaction process modeling as the hierarchical and
    network models
  • May have slower processing times than the hierarchical and network
    models
Entity-Relationship Model.
The entity-relationship (E-R) data model uses a collection of basic objects, called
entities, and relationships among these objects. An entity is a “thing” or “object”
in the real world that is distinguishable from other objects. The entity-relationship
model is widely used in database design. An E-R diagram uses the following
symbols:
• Rectangles: entity sets
• Ellipses: attributes
• Diamonds: relationship sets
• Lines: link attributes to entity sets and entity sets to relationship sets
• Double ellipses: multivalued attributes
• Dashed ellipses: derived attributes
• Double rectangles: weak entity sets
• Double lines: total participation of an entity set in a relationship set
[Figure: Components of an ER diagram]
• Advantages
o It is easy to develop a relational model from an ER model
o The ER model specifies mapping cardinalities
o It specifies keys, such as primary keys and foreign keys
• Disadvantages
o It is used for design purposes only, not for implementation
Object-Based Data Model.

 Object-oriented programming (especially in Java, C++, or C#) has become
the dominant software-development methodology.
 This led to the development of an object-oriented data model that can be
seen as extending the E-R model with notions of encapsulation, methods
(functions), and object identity.
 The object-relational data model combines features of the object-oriented
data model and relational data model.
Semistructured Data Model.

• The semistructured data model permits the specification of data where
individual data items of the same type may have different sets of attributes.
• This is in contrast to the data models mentioned earlier, where every data
item of a particular type must have the same set of attributes.
• The Extensible Markup Language (XML) is widely used to represent
semistructured data.
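As a small illustration of semistructured data, the sketch below parses an XML fragment in which two items of the same type carry different sets of attributes. The element names and the e-mail address are hypothetical; the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different sets of attributes:
# the first has a department, the second has an email instead.
doc = """
<students>
  <student><id>111</id><name>Ajay</name><department>CSE</department></student>
  <student><id>112</id><name>Aravind</name><email>aravind@example.org</email></student>
</students>
"""

root = ET.fromstring(doc)

# Collect the attribute (child element) names each item actually carries.
attribute_sets = [sorted(child.tag for child in student) for student in root]
```

In a relational table both students would need the same columns, with nulls filling the gaps; here each item simply lists the attributes it has.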
Historically, the network data model and the hierarchical data model preceded
the relational data model. These models were tied closely to the underlying
implementation, and complicated the task of modeling data. As a result they are
used little now, except in old database code that is still in service in some places.
Database Languages
A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and
updates. In practice, the data-definition and data-manipulation languages are not
two separate languages; instead they simply form parts of a single database
language, such as the widely used SQL language.
Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or
manipulate data as organized by the appropriate data model. The types of access
are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
There are basically two types:

 Procedural DMLs require a user to specify what data are needed and how
to get those data.
 Declarative DMLs (also referred to as nonprocedural DMLs) require a user
to specify what data are needed without specifying how to get those data.
A query is a statement requesting the retrieval of information. The portion of a
DML that involves information retrieval is called a query language. Although
technically incorrect, it is common practice to use the terms query language and
data-manipulation language synonymously.
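The contrast between procedural and declarative access can be sketched as below. The first half spells out how to find the data; the second half only states what is wanted and leaves the "how" to the database system. The student rows are made-up sample data, and sqlite3 stands in for a full DBMS.

```python
import sqlite3

rows = [(111, "Ajay", "CSE"), (112, "Aravind", "CSE"), (113, "Meena", "EEE")]

# Procedural style: the program says how to get the data (scan, test, collect).
procedural = []
for student_id, name, dept in rows:
    if dept == "CSE":
        procedural.append(name)

# Declarative style: the SQL query says what data are needed; the system
# decides how to retrieve them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, department TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
declarative = [name for (name,) in conn.execute(
    "SELECT name FROM student WHERE department = 'CSE' ORDER BY id")]
```

Both approaches produce the same list of names; only the declarative one leaves the choice of access strategy to the system.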
Data-Definition Language
• A database schema is specified by a set of definitions expressed in a special
language called a data-definition language (DDL). The DDL is also used to
specify additional properties of the data.
 Storage structure and access methods used by the database system are
specified by a set of statements in a special type of DDL called a data
storage and definition language.
 These statements define the implementation details of the database
schemas, which are usually hidden from the users.
 The data values stored in the database must satisfy certain consistency
constraints. For example, suppose the university requires that the account
balance of a department must never be negative.
Domain Constraints. A domain of possible values must be associated with every
attribute (for example integer types, character types, date/time types). Declaring
an attribute to be of a particular domain acts as a constraint on the values that it
can take. Domain constraints are the most elementary form of integrity
constraint.
Referential Integrity. There are cases where we wish to ensure that a value that
appears in one relation for a given set of attributes also appears in a certain set of
attributes in another relation (referential integrity).
Assertions. An assertion is any condition that the database must always satisfy.
Domain constraints and referential-integrity constraints are special forms of
assertions.
Authorization. We may want to differentiate among the users as far as the type
of access they are permitted on various data values in the database. These
differentiations are expressed in terms of authorization, the most common being:
• read authorization, which allows reading, but not modification, of data;
• insert authorization, which allows insertion of new data, but not modification
of existing data;
• update authorization, which allows modification, but not deletion, of data;
• delete authorization, which allows deletion of data.
Data Storage and Querying

A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a database
system can be broadly divided into the storage manager and the query processor
components.
Database System Architecture
• The architecture of a database system is greatly influenced by the
underlying computer system on which the database system runs.
• Database systems can be centralized, or client-server, where one server
machine executes work on behalf of multiple client machines. Database
systems can also be designed to exploit parallel computer architectures.
Storage Manager
 The storage manager is the component of a database system that provides
the interface between the low-level data stored in the database and the
application programs and queries submitted to the system.
 The storage manager is responsible for the interaction with the file
manager. The raw data are stored on the disk using the file system
provided by the operating system. The storage manager translates the
various DML statements into low-level file-system commands.
The storage manager components include:

• Authorization and integrity manager, which tests for the satisfaction of
integrity constraints and checks the authority of users to access data.
• Transaction manager, which ensures that the database remains in a
consistent (correct) state despite system failures, and that concurrent
transaction executions proceed without conflicting.
• File manager, which manages the allocation of space on disk storage and the
data structures used to represent information stored on disk.
• Buffer manager, which is responsible for fetching data from disk storage into
main memory, and deciding what data to cache in main memory. The buffer
manager is a critical part of the database system, since it enables the database
to handle data sizes that are much larger than the size of main memory.
The storage manager implements several data structures as part of the
physical system implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure of the database,
in particular the schema of the database.
• Indices, which can provide fast access to data items. Like the index in a
textbook, a database index provides pointers to those data items that hold a
particular value.
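The effect of an index can be observed with a small experiment. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module; the table contents and the index name idx_student_dept are assumptions for illustration, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student ("
             "id INTEGER PRIMARY KEY, name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO student (name, department) VALUES (?, ?)",
    [(f"s{i}", "CSE" if i % 2 else "EEE") for i in range(1000)],
)

query = "SELECT * FROM student WHERE department = 'CSE'"

def plan(conn, sql):
    """Return SQLite's textual description of how it would evaluate `sql`."""
    # The last column of each EXPLAIN QUERY PLAN row holds the description.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_without_index = plan(conn, query)  # typically a full scan of the table
conn.execute("CREATE INDEX idx_student_dept ON student (department)")
plan_with_index = plan(conn, query)     # typically a search using the index
```

The query text is identical in both cases; only the access path chosen by the system changes once the index exists.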
The Query Processor

The query processor components include:
• DDL interpreter, which interprets DDL statements and records the
definitions in the data dictionary.
• DML compiler, which translates DML statements in a query language into an
evaluation plan consisting of low-level instructions that the query evaluation
engine understands.
The DML compiler also performs query optimization; that is, it picks the
lowest-cost evaluation plan from among the alternatives.
• Query evaluation engine, which executes low-level instructions generated by
the DML compiler.
In a two-tier architecture, the application resides at the client machine, where it
invokes database system functionality at the server machine through query
language statements. Application program interface standards like ODBC and
JDBC are used for interaction between the client and the server.
In contrast, in a three-tier architecture, the client machine acts as merely a front
end and does not contain any direct database calls. Instead, the client end
communicates with an application server, usually through a forms interface. The
application server in turn communicates with a database system to access data.
The business logic of the application, which says what actions to carry out under
what conditions, is embedded in the application server, instead of being
distributed across multiple clients. Three-tier applications are more appropriate
for large applications, and for applications that run on the World Wide Web.
Database Users and Administrators

A primary goal of a database system is to retrieve information from and
store new information into the database.
Database Users and User Interfaces
There are four different types of database-system users, differentiated by the way
they expect to interact with the system.
1. Naive users are unsophisticated users who interact with the system by
invoking one of the application programs that have been written
previously.
2. Application programmers are computer professionals who write
application programs.
3. Sophisticated users interact with the system without writing programs.
Instead, they form their requests either using a database query language or
by using tools such as data analysis software.
4. Specialized users are sophisticated users who write specialized database
applications that do not fit into the traditional data-processing framework.
Database Administrator
One of the main reasons for using DBMS is to have central control of both the
data and the programs that access those data. A person who has such central
control over the system is called a database administrator (DBA).
The functions of a DBA include:
 Schema definition. The DBA creates the original database schema by
executing a set of data definition statements in the DDL.
 Storage structure and access-method definition.
 Schema and physical organization modification. The DBA carries out
changes to the schema and physical organization to reflect the changing
needs of the organization, or to alter the physical organization to improve
performance.
 Granting of authorization for data access. By granting different types of
authorization, the database administrator can regulate which parts of the
database various users can access.
 Routine maintenance. Examples of the database administrator’s routine
maintenance activities are:
o Periodically backing up the database, either onto tapes or onto
remote servers, to prevent loss of data in case of disasters such as
flooding.
o Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required.
o Monitoring jobs running on the database and ensuring that
performance is not degraded by very expensive tasks submitted by
some users.
Introduction to Relational Databases

In the relational model:
• relation refers to a table
• tuple refers to a row
• attribute refers to a column of a table
• relation instance refers to a specific instance of a relation, that is, the set of
rows it contains at a given time
 For each attribute of a relation, there is a set of permitted values, called the
domain of that attribute.
 A domain is atomic if elements of the domain are considered to be
indivisible units.
o For example, suppose the table instructor had an attribute
phone_number, which can store a set of phone numbers corresponding to
the instructor. Then the domain of phone_number would not be
atomic, since an element of the domain is a set of phone numbers,
and it has subparts, namely the individual phone numbers in the set.
 The null value is a special value that signifies that the value is unknown or
does not exist.
o For example, suppose as before that we include the attribute
phone_number in the instructor relation. It may be that an instructor does
not have a phone number at all, or that the telephone number is
unlisted.
• Degree – the total number of attributes (columns) in a relation.
• Cardinality – the total number of tuples (rows) in a relation.
Database Schema
 Database schema, which is the logical design of the database
 Database instance, which is a snapshot of the data in the database at a
given instant in time.
 The concept of a relation corresponds to the programming-language notion
of a variable, while the concept of a relation schema corresponds to the
programming-language notion of type definition.
For example, a university database might contain relations with the following
schemas:
o student (ID, name, dept_name, tot_cred)
o advisor (s_id, i_id)
o takes (ID, course_id, sec_id, semester, year, grade)
o classroom (building, room_number, capacity)
o time_slot (time_slot_id, day, start_time, end_time)
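Relation: Student

ID    Name      Department   Email          Credits
111   Priya     CSE          Shan@vec.in    9.2
112   Shan      EEE          shanv@vec.in   8.7
113   Ajay      CSE          ajay@vec.in    8.2
114   Aravind   EEE          arav@vec.in    7.7
115   Pooja     CSE          Pooja@vec.in   9.1

Attributes: ID, Name, Department, Email, Credits (the columns)
Tuple: one row of the relation, for example
112   Shan      EEE          shanv@vec.in   8.7
Relation instance: the set of rows in the relation at a given moment. A query
also returns a set of rows; for example, the following query returns the two
tuples with ID 111 and ID 112:
Select * from student where id = 111 or id = 112;
Degree = Total Number of Columns = 5
Cardinality = Total Number of Rows = 5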
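Degree and cardinality can be checked mechanically. The following sketch loads the Student relation shown above into an in-memory SQLite database (using Python's sqlite3 module, which is an assumption of this example and not part of the notes) and counts columns and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student ("
             "id INTEGER, name TEXT, department TEXT, email TEXT, credits REAL)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    (111, "Priya", "CSE", "Shan@vec.in", 9.2),
    (112, "Shan", "EEE", "shanv@vec.in", 8.7),
    (113, "Ajay", "CSE", "ajay@vec.in", 8.2),
    (114, "Aravind", "EEE", "arav@vec.in", 7.7),
    (115, "Pooja", "CSE", "Pooja@vec.in", 9.1),
])

cursor = conn.execute("SELECT * FROM student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)

# The query from the notes selects two of the five tuples.
selected_ids = [row[0] for row in conn.execute(
    "SELECT * FROM student WHERE id = 111 OR id = 112 ORDER BY id")]
```

Inserting or deleting a row changes the cardinality but not the degree; the degree changes only if the schema itself is altered.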
Keys
 We must have a way to specify how tuples within a given relation are
distinguished.
 This is expressed in terms of their attributes.
• That is, the attribute values of a tuple must be such that they can uniquely
identify the tuple.
 In other words, no two tuples in a relation are allowed to have exactly the
same value for all attributes.
A superkey is a set of one or more attributes that, taken collectively, allow us to
identify uniquely a tuple in the relation. For example, the ID attribute of the
relation instructor is sufficient to distinguish one instructor tuple from another.
Thus, ID is a superkey. The name attribute of instructor, on the other hand, is not
a superkey, because several instructors might have the same name.
A superkey may contain extraneous attributes. For example, the combination of
ID and name is a superkey for the relation instructor. If K is a superkey, then so is
any superset of K. We are often interested in superkeys for which no proper
subset is a superkey. Such minimal superkeys are called candidate keys.
Candidate key = minimal superkey (a superkey with no extraneous attributes)
The term primary key is used to denote a candidate key that is chosen by the
database designer as the principal means of identifying tuples within a relation.
A key (whether primary, candidate, or super) is a property of the entire relation,
rather than of the individual tuples.
In summary, a superkey of a relation is a set of one or more attributes whose
values are guaranteed to identify tuples in the relation uniquely. A candidate key
is a minimal superkey, that is, a set of attributes that forms a superkey, but none
of whose proper subsets is a superkey. One of the candidate keys of a relation is
chosen as its primary key.
A foreign key is a set of attributes in a referencing relation, such that for each
tuple in the referencing relation, the values of the foreign key attributes are
guaranteed to occur as the primary key value of a tuple in the referenced relation.
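The definitions of superkey and candidate key can be tried out on a small relation instance. The sketch below is illustrative only: the helper names is_superkey and candidate_keys and the sample instructor rows are assumptions. Since a key is a property of the relation as a whole, a single instance can only show that an attribute set is not a key; it cannot prove that it is one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of `attrs`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that act as superkeys for this instance."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

# Hypothetical instance: two instructors share a name and a department.
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance"},
    {"ID": "15151", "name": "Wu", "dept_name": "Finance"},
]

id_name_is_superkey = is_superkey(instructor, ("ID", "name"))  # superset of ID
name_is_superkey = is_superkey(instructor, ("name",))          # duplicate names
keys = candidate_keys(instructor, ("ID", "name", "dept_name"))
```

Here (ID, name) is a superkey but not a candidate key, because its proper subset (ID) is already a superkey.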
A schema diagram is a pictorial depiction of the schema of a database that shows
the relations in the database, their attributes, and primary keys and foreign keys.
The relational query languages define a set of operations that operate on tables,
and output tables as their results. These operations can be combined to get
expressions that express desired queries.
The relational algebra provides a set of operations that take one or more
relations as input and return a relation as an output. Practical query languages
such as SQL are based on the relational algebra, but add a number of useful
syntactic features.
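To make the idea of "relations in, relation out" concrete, here is a minimal sketch of three relational-algebra operations (selection, projection and natural join) over relations represented as lists of Python dictionaries. The representation, the function names and the sample department data are assumptions made for illustration; a real DBMS implements these operations very differently.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attribute names."""
    result = []
    for t in r:
        for u in s:
            shared = set(t) & set(u)
            if all(t[a] == u[a] for a in shared):
                result.append({**t, **u})
    return result

student = [
    {"ID": 111, "name": "Priya", "dept_name": "CSE"},
    {"ID": 112, "name": "Shan", "dept_name": "EEE"},
    {"ID": 113, "name": "Ajay", "dept_name": "CSE"},
]
department = [
    {"dept_name": "CSE", "building": "Block A"},
    {"dept_name": "EEE", "building": "Block B"},
]

# Operations compose because every result is again a relation.
cse_names = project(select(student, lambda t: t["dept_name"] == "CSE"), ["name"])
joined = natural_join(student, department)
```

The expression for cse_names corresponds roughly to the SQL query select name from student where dept_name = 'CSE'.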
SQL Fundamentals
SQL (Structured Query Language) is a database language designed for the
retrieval and management of data in a relational database.
What is SQL?
SQL is a computer language for storing, manipulating and retrieving data stored
in a relational database.
SQL is the standard language for relational database systems. Relational
database management systems such as MySQL, MS Access, Oracle, Sybase,
Informix, PostgreSQL and SQL Server all use SQL as their standard database
language.
However, they use different dialects, such as:
 MS SQL Server using T-SQL,
 Oracle using PL/SQL,
 MS Access version of SQL is called JET SQL (native format) etc.
Why SQL?
 Allows users to access data in relational database management systems.
 Allows users to describe the data.
 Allows users to define the data in database and manipulate that data.
 Allows embedding within other languages using SQL modules, libraries &
pre-compilers.
 Allows users to create and drop databases and tables.
 Allows users to create view, stored procedure, functions in a database.
 Allows users to set permissions on tables, procedures, and views
History:
• 1970 -- Dr. Edgar F. "Ted" Codd of IBM, known as the father of relational
databases, described a relational model for databases.
• 1974 -- Structured Query Language appeared.
• 1978 -- IBM worked to develop Codd's ideas and released a product named
System/R.
• 1986 -- IBM developed the first prototype of a relational database, and SQL
was standardized by ANSI. The first relational database was released by
Relational Software, which later became Oracle.
SQL Process:
When you execute an SQL command for any RDBMS, the system determines
the best way to carry out your request, and the SQL engine figures out how to
interpret the task. Several components are involved in this process, including
the Query Dispatcher, Optimization Engines, Classic Query Engine and SQL Query
Engine. The classic query engine handles all non-SQL queries, but the SQL query
engine does not handle logical files.
[Figure: SQL architecture]
Overview of the SQL Query Language

The SQL language has several parts:
• Data-definition language (DDL). The SQL DDL provides commands for
defining relation schemas, deleting relations, and modifying relation
schemas.
• Data-manipulation language (DML). The SQL DML provides the ability to
query information from the database and to insert tuples into, delete
tuples from, and modify tuples in the database.
• Integrity. The SQL DDL includes commands for specifying integrity
constraints that the data stored in the database must satisfy. Updates that
violate integrity constraints are disallowed.
• View definition. The SQL DDL includes commands for defining views.
• Transaction control. SQL includes commands for specifying the beginning
and ending of transactions.
• Embedded SQL and dynamic SQL. Embedded and dynamic SQL define how
SQL statements can be embedded within general-purpose programming
languages, such as C, C++, and Java.
• Authorization. The SQL DDL includes commands for specifying access rights
to relations and views.
Basic Types
The SQL standard supports a variety of built-in types, including:
 char(n): A fixed-length character string with user-specified length n. The full
form, character, can be used instead.
 varchar(n): A variable-length character string with user-specified maximum
length n. The full form, character varying, is equivalent.
• int: An integer (a finite subset of the integers that is machine dependent).
The full form, integer, is equivalent.
• smallint: A small integer (a machine-dependent subset of the integer type).
• numeric(p,d): A fixed-point number with user-specified precision. The
number consists of p digits (plus a sign), and d of the p digits are to the right
of the decimal point. Thus, numeric(3,1) allows 44.5 to be stored exactly, but
neither 444.5 nor 0.32 can be stored exactly in a field of this type.
 real, double precision: Floating-point and double-precision floating-point
numbers with machine-dependent precision.
 float(n): A floating-point number, with precision of at least n digits.
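The rule for numeric(p,d) can be checked with a short sketch. The helper fits_numeric below is hypothetical (it is not an SQL function); it uses Python's decimal module to test whether a value has at most d digits after the decimal point and at most p digits in total.

```python
from decimal import Decimal

def fits_numeric(value, p, d):
    """True if `value` (given as a string) can be stored exactly in numeric(p, d)."""
    number = Decimal(value)
    scaled = number.scaleb(d)  # shift the decimal point d places to the right
    if scaled != scaled.to_integral_value():
        return False           # more than d digits after the decimal point
    return abs(scaled) < Decimal(10) ** p  # at most p digits in total

fits_44_5 = fits_numeric("44.5", 3, 1)    # 3 digits, 1 after the point
fits_444_5 = fits_numeric("444.5", 3, 1)  # needs 4 digits
fits_0_32 = fits_numeric("0.32", 3, 1)    # needs 2 digits after the point
```

This reproduces the example in the notes: 44.5 fits in numeric(3,1), while 444.5 and 0.32 do not.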
Integrity Constraints
 Integrity constraints ensure that changes made to the database by
authorized users do not result in a loss of data consistency.
 Thus, integrity constraints guard against accidental damage to the
database.
Integrity constraints include
o not null
o unique
o check(<predicate>)
1. Not Null Constraint

name varchar(20) not null
budget numeric(12,2) not null
The not null specification prohibits the insertion of a null value for the attribute.
Any database modification that would cause a null to be inserted in an attribute
declared to be not null generates an error diagnostic.
2. Unique Constraint
SQL also supports an integrity constraint:
unique (Aj1, Aj2,...,Ajm)
The unique specification says that attributes Aj1, Aj2,...,Ajm form a candidate key;
that is, no two tuples in the relation can be equal on all the listed attributes.
However, candidate key attributes are permitted to be null unless they have
explicitly been declared to be not null.
3. The check Clause

 The clause check(P) specifies a predicate P that must be satisfied by every
tuple in a relation.
 A common use of the check clause is to ensure that attribute values satisfy
specified conditions, in effect creating a powerful type system.
• For instance, a clause check (budget > 0) in the create table command for
relation department would ensure that the value of budget is positive.
As another example, consider the following:
create table section
   (course_id varchar (8),
    sec_id varchar (8),
    semester varchar (6),
    year numeric (4,0),
    building varchar (15),
    primary key (course_id, sec_id, semester, year),
    check (semester in ('Fall', 'Winter', 'Spring', 'Summer')));
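The three constraints discussed above (not null, unique and check) can be seen in action with the following sketch. It uses SQLite through Python's sqlite3 module, and the department rows are made-up sample data; other database systems report constraint violations with different error types and messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE department (
        dept_name TEXT NOT NULL,
        building  TEXT,
        budget    NUMERIC NOT NULL CHECK (budget > 0),
        UNIQUE (dept_name)
    )
""")

def try_insert(conn, row):
    """Insert one department row and report whether the constraints accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO department VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(conn, ("Physics", "Watson", 70000)),  # accepted
    try_insert(conn, (None, "Taylor", 50000)),       # rejected: not null
    try_insert(conn, ("Physics", "Taylor", 50000)),  # rejected: unique
    try_insert(conn, ("Music", "Packard", 0)),       # rejected: check (budget > 0)
]
row_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
```

Only the first row is stored; each of the other inserts is refused by the database itself, without any checking code in the application.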
4. Referential Integrity
• We often wish to ensure that a value that appears in one relation for a given
set of attributes also appears for a certain set of attributes in another relation.
• This condition is called referential integrity.
More generally, let r1 and r2 be relations whose sets of attributes are R1 and R2,
respectively, with primary keys K1 and K2. We say that a subset α of R2 is a foreign
key referencing K1 in relation r1 if it is required that, for every tuple t2 in r2, there
must be a tuple t1 in r1 such that t1.K1 = t2.α. Requirements of this form are called
referential-integrity constraints, or subset dependencies.
create table course
   (course_id varchar (8),
    title varchar (50),
    dept_name varchar (20),
    credits numeric (2,0) check (credits > 0),
    primary key (course_id),
    foreign key (dept_name) references department);
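The foreign key declared above can be exercised with a short sketch. It runs the course definition in SQLite via Python's sqlite3 module; the one-column department table and the sample courses are assumptions made to keep the example small. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY)")
conn.execute("""
    CREATE TABLE course (
        course_id VARCHAR(8),
        title     VARCHAR(50),
        dept_name VARCHAR(20),
        credits   NUMERIC(2,0) CHECK (credits > 0),
        PRIMARY KEY (course_id),
        FOREIGN KEY (dept_name) REFERENCES department
    )
""")

conn.execute("INSERT INTO department VALUES ('Comp. Sci.')")
# Accepted: 'Comp. Sci.' appears as a primary key value in department.
conn.execute("INSERT INTO course VALUES "
             "('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4)")

try:
    # Rejected: there is no 'Biology' tuple in department.
    conn.execute("INSERT INTO course VALUES "
                 "('BIO-101', 'Intro. to Biology', 'Biology', 4)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```

In terms of the definition above, dept_name in course plays the role of α, and department is the referenced relation r1.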
DDL (Data Definition Language): DDL consists of the SQL commands that can be
used to define the database schema. It deals with descriptions of the database
schema and is used to create and modify the structure of database objects in the
database.
Examples of DDL commands:
• CREATE – is used to create the database or its objects (such as tables, indexes,
functions, views, stored procedures and triggers).
• DROP – is used to delete objects from the database.
• ALTER – is used to alter the structure of the database.
• TRUNCATE – is used to remove all records from a table, including all the
space allocated for the records.
• COMMENT – is used to add comments to the data dictionary.
• RENAME – is used to rename an object existing in the database.
DML (Data Manipulation Language): The SQL commands that deal with the
manipulation of data present in the database belong to DML, and this includes
most of the SQL statements.
Examples of DML commands:
• SELECT – is used to retrieve data from a database.
• INSERT – is used to insert data into a table.
• UPDATE – is used to update existing data within a table.
• DELETE – is used to delete records from a database table.
DCL(Data Control Language) : DCL includes commands such as GRANT and

REVOKE which mainly deals with the rights, permissions and other controls of the
database system.
Examples of DCL commands:
 GRANT-gives user’s access privileges to database.

 REVOKE-withdraw user’s access privileges given by using the GRANT
command.
TCL(transaction Control Language) : TCL commands deals with the transaction

within the database.
Examples of TCL commands:
 COMMIT– commits a Transaction.

 ROLLBACK– rollbacks a transaction in case of any error occurs.
 SAVEPOINT–sets a save point within a transaction.
 SET TRANSACTION–specify characteristics for the transaction.
Underlined Column names are Primary Key Attributes
DDL (Data Definition Language)
CREATE TABLE
 Specifies a new base relation by giving it a name, and specifying each of its
attributes and their data types (INTEGER, FLOAT, DECIMAL(i,j), CHAR(n),
VARCHAR(n))
 A constraint NOT NULL may be specified on an attribute
 In SQL2, the CREATE TABLE command can also specify the primary
key attributes, secondary keys, and referential integrity constraints (foreign
keys)
 Key attributes can be specified via the PRIMARY KEY and UNIQUE phrases
CREATE TABLE DEPARTMENT
( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,
MGRSSN CHAR(9),
MGRSTARTDATE CHAR(9) );
CREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,
MGRSSN CHAR(9),
MGRSTARTDATE CHAR(9),
PRIMARY KEY (DNUMBER),
UNIQUE (DNAME),
FOREIGN KEY (MGRSSN) REFERENCES EMP );
DROP TABLE
 Used to remove a relation (base table) and its definition
 The relation can no longer be used in queries, updates, or any other
commands since its description no longer exists
DROP TABLE DEPENDENT;
TRUNCATE
TRUNCATE removes all rows from a table. In many systems (Oracle, for example)
the operation cannot be rolled back and no DELETE triggers are fired. As such,
TRUNCATE is faster and doesn't use as much undo space as a DELETE.
TRUNCATE TABLE emp;
ALTER TABLE
 Used to add an attribute to one of the base relations
 The new attribute will have NULLs in all the tuples of the relation right
after the command is executed; hence, the NOT NULL constraint is not
allowed for such an attribute (unless a default value is supplied)
ALTER TABLE EMPLOYEE ADD JOB VARCHAR (12);
The database users must still enter a value for the new attribute JOB for each
EMPLOYEE tuple. This can be done using the UPDATE command.
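The effect of ALTER TABLE followed by UPDATE can be sketched as follows, assuming an in-memory SQLite database driven from Python's sqlite3 module. The two employee rows and the reduced EMPLOYEE column list are hypothetical and kept small for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, LNAME VARCHAR(15))")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [("111", "Smith"), ("222", "Wong")])

# Add the new attribute; every existing tuple gets NULL (None in Python) for JOB.
conn.execute("ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12)")
after_alter = conn.execute("SELECT SSN, JOB FROM EMPLOYEE ORDER BY SSN").fetchall()

# Values for the new attribute are then supplied with UPDATE.
conn.execute("UPDATE EMPLOYEE SET JOB = 'Engineer' WHERE SSN = '111'")
after_update = conn.execute("SELECT SSN, JOB FROM EMPLOYEE ORDER BY SSN").fetchall()

print(after_alter)
print(after_update)
```

Only the tuple named in the UPDATE receives a JOB value; the other tuple keeps NULL until it is updated too.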
REFERENTIAL INTEGRITY OPTIONS
We can specify RESTRICT, CASCADE, SET NULL or SET DEFAULT on referential
integrity constraints (foreign keys)
CREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,
MGRSSN CHAR(9),
MGRSTARTDATE CHAR(9),
PRIMARY KEY (DNUMBER),
UNIQUE (DNAME),
FOREIGN KEY (MGRSSN) REFERENCES EMP
ON DELETE SET DEFAULT ON UPDATE CASCADE );
CREATE TABLE EMP
( ENAME VARCHAR(30) NOT NULL,
ESSN CHAR(9),
BDATE DATE,
DNO INTEGER DEFAULT 1,
SUPERSSN CHAR(9),
PRIMARY KEY (ESSN),
FOREIGN KEY (DNO) REFERENCES DEPT
ON DELETE SET DEFAULT ON UPDATE CASCADE,
FOREIGN KEY (SUPERSSN) REFERENCES EMP
ON DELETE SET NULL ON UPDATE CASCADE );
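To see the referential actions at work, here is a small sketch that assumes an in-memory SQLite database accessed through Python's sqlite3 module. DEPT is reduced to two columns and the sample rows are hypothetical; the EMP foreign keys are declared as in the definition above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys

conn.execute("CREATE TABLE DEPT (DNUMBER INTEGER PRIMARY KEY, DNAME VARCHAR(10) UNIQUE)")
conn.execute("""
    CREATE TABLE EMP (
        ENAME    VARCHAR(30) NOT NULL,
        ESSN     CHAR(9),
        DNO      INTEGER DEFAULT 1,
        SUPERSSN CHAR(9),
        PRIMARY KEY (ESSN),
        FOREIGN KEY (DNO) REFERENCES DEPT
            ON DELETE SET DEFAULT ON UPDATE CASCADE,
        FOREIGN KEY (SUPERSSN) REFERENCES EMP
            ON DELETE SET NULL ON UPDATE CASCADE
    )""")

conn.executemany("INSERT INTO DEPT VALUES (?, ?)", [(1, "HQ"), (5, "Research")])
conn.execute("INSERT INTO EMP VALUES ('Boss', '100', 5, NULL)")
conn.execute("INSERT INTO EMP VALUES ('Worker', '200', 5, '100')")

# ON UPDATE CASCADE: renumbering department 5 to 7 renumbers DNO in EMP as well.
conn.execute("UPDATE DEPT SET DNUMBER = 7 WHERE DNUMBER = 5")
dno_after_update = [r[0] for r in conn.execute("SELECT DNO FROM EMP ORDER BY ESSN")]

# ON DELETE SET DEFAULT: deleting department 7 resets DNO to its default, 1.
conn.execute("DELETE FROM DEPT WHERE DNUMBER = 7")
dno_after_delete = [r[0] for r in conn.execute("SELECT DNO FROM EMP ORDER BY ESSN")]

# ON DELETE SET NULL: deleting the supervisor sets SUPERSSN of the supervised tuple to NULL.
conn.execute("DELETE FROM EMP WHERE ESSN = '100'")
remaining = conn.execute("SELECT ENAME, DNO, SUPERSSN FROM EMP").fetchall()

print(dno_after_update, dno_after_delete, remaining)
```

SET DEFAULT succeeds here only because a department with DNUMBER 1 exists; otherwise the delete would itself violate the foreign key.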
DML (Data Manipulation Language)
 SQL has one basic statement for retrieving information from a database: the
SELECT statement
 This is not the same as the SELECT operation of the relational algebra
 Important distinction between SQL and the formal relational model: SQL
allows a table (relation) to have two or more tuples that are identical in all their
attribute values
 Hence, an SQL relation (table) is a multi-set (sometimes called a bag) of
tuples; it is not a set of tuples
 SQL relations can be constrained to be sets by specifying PRIMARY KEY or
UNIQUE attributes, or by using the DISTINCT option in a query
 Basic form of the SQL SELECT statement is called a mapping or a
SELECT-FROM-WHERE block
SELECT <attribute list>
FROM <table list>
WHERE <condition>
 <attribute list> is a list of attribute names whose values are to be retrieved by
the query
 <table list> is a list of the relation names required to process the query
 <condition> is a conditional (Boolean) expression that identifies the tuples to be
retrieved by the query
Query 0: Retrieve the birthdate and address of the employee whose name is
'John B. Smith'.
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith'
This is similar to a SELECT-PROJECT pair of relational algebra operations: the
SELECT clause specifies the projection attributes and the WHERE clause specifies
the selection condition. However, the result of the query may contain duplicate
tuples.
Query 1: Retrieve the name and address of all employees who work for the
'Research' department.
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND DNUMBER=DNO
Query 2: For every project located in 'Stafford', list the project number, the
controlling department number, and the department manager's last name,
address, and birthdate.
SELECT PNUMBER, DNUM, LNAME, BDATE, ADDRESS
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN AND
PLOCATION='Stafford'
In Q2, there are two join conditions
The join condition DNUM=DNUMBER relates a project to its controlling
department
The join condition MGRSSN=SSN relates the controlling department to the
employee who manages that department
Query 3: For each employee, retrieve the employee's name, and the name of
his or her immediate supervisor.
SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN=S.SSN
In Q3, the alternate relation names E and S are called aliases or tuple
variables for the EMPLOYEE relation
Query 4: Retrieve the SSN values for all employees.
SELECT SSN
FROM EMPLOYEE
Query 5: Retrieve every combination of employee SSN and department name.
SELECT SSN, DNAME
FROM EMPLOYEE, DEPARTMENT
If more than one relation is specified in the FROM-clause and there is no join
condition, then the CARTESIAN PRODUCT of tuples is selected
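The difference a join condition makes can be checked with a small sketch, assuming an in-memory SQLite database and Python's sqlite3 module. The three employees and two departments are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN CHAR(9), DNO INTEGER)")
conn.execute("CREATE TABLE DEPARTMENT (DNAME VARCHAR(10), DNUMBER INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [("111", 5), ("222", 5), ("333", 4)])
conn.executemany("INSERT INTO DEPARTMENT VALUES (?, ?)",
                 [("Research", 5), ("Admin", 4)])

# No join condition: every EMPLOYEE tuple is paired with every DEPARTMENT tuple.
product = conn.execute("SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT").fetchall()

# With the join condition, only the matching pairs remain.
joined = conn.execute(
    "SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT WHERE DNO = DNUMBER").fetchall()

print(len(product), len(joined))
```

With 3 employees and 2 departments the product has 3 × 2 = 6 tuples, while the joined result has one tuple per employee.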
To retrieve all the attribute values of the selected tuples, a * is used, which
stands for all the attributes
Examples:
SELECT *
FROM EMPLOYEE
WHERE DNO=5
SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND
DNO=DNUMBER
USE OF DISTINCT
 SQL does not treat a relation as a set; duplicate tuples can appear
 To eliminate duplicate tuples in a query result, the keyword DISTINCT is
used
 For example, the result of Q6 may have duplicate SALARY values whereas
Q6A does not have any duplicate values
Q6: SELECT SALARY
FROM EMPLOYEE
Q6A: SELECT DISTINCT SALARY
FROM EMPLOYEE
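A quick way to compare Q6 and Q6A is the sketch below, which assumes an in-memory SQLite database used through Python's sqlite3 module. The salary figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, SALARY INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [("111", 30000), ("222", 40000), ("333", 30000)])

# Q6 keeps one result tuple per employee, so 30000 appears twice.
q6 = [row[0] for row in conn.execute("SELECT SALARY FROM EMPLOYEE")]

# Q6A removes the duplicate salary value.
q6a = [row[0] for row in conn.execute("SELECT DISTINCT SALARY FROM EMPLOYEE")]

print(sorted(q6), sorted(q6a))
```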
SET OPERATIONS
 SQL has directly incorporated some set operations
 There is a union operation (UNION), and in some versions of SQL there
are set difference (EXCEPT in the SQL standard, MINUS in Oracle) and
intersection (INTERSECT) operations
 The resulting relations of these set operations are sets of tuples;
duplicate tuples are eliminated from the result
 The set operations apply only to union compatible relations; the two
relations must have the same attributes and the attributes must appear
in the same order
Query 7: Make a list of all project names for projects that involve an
employee whose last name is 'Smith' as a worker or as a manager of the
department that controls the project.
(SELECT PNAME
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN AND
LNAME='Smith')
UNION
(SELECT PNAME
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE PNUMBER=PNO AND ESSN=SSN AND
LNAME='Smith')
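The duplicate elimination performed by UNION can be sketched as follows, assuming an in-memory SQLite database and Python's sqlite3 module. The tables MANAGED and WORKED are hypothetical stand-ins for the results of the two halves of Query 7. UNION ALL, which is not covered in these notes, is shown only for contrast because it keeps duplicates. SQLite does not accept parentheses around the operands of UNION, so they are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MANAGED (PNAME VARCHAR(15))")  # projects Smith manages
conn.execute("CREATE TABLE WORKED (PNAME VARCHAR(15))")   # projects Smith works on
conn.executemany("INSERT INTO MANAGED VALUES (?)", [("ProductX",), ("ProductY",)])
conn.executemany("INSERT INTO WORKED VALUES (?)", [("ProductY",), ("ProductZ",)])

# UNION returns a set: ProductY appears once although both operands contain it.
union = [row[0] for row in conn.execute(
    "SELECT PNAME FROM MANAGED UNION SELECT PNAME FROM WORKED")]

# UNION ALL keeps every tuple from both operands, duplicates included.
union_all = [row[0] for row in conn.execute(
    "SELECT PNAME FROM MANAGED UNION ALL SELECT PNAME FROM WORKED")]

print(sorted(union), len(union_all))
```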
NESTING OF QUERIES
 A complete SELECT query, called a nested query, can be specified within
the WHERE-clause of another query, called the outer query
Query 8: Retrieve the name and address of all employees who work for the
'Research' department.
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE
WHERE DNO IN (SELECT DNUMBER
FROM DEPARTMENT
WHERE DNAME='Research' )
 The nested query selects the number of the 'Research' department
 The outer query selects an EMPLOYEE tuple if its DNO value is in the
result of the nested query
 The comparison operator IN compares a value v with a set (or multi-set)
of values V, and evaluates to TRUE if v is one of the elements in V
CORRELATED NESTED QUERIES
 If a condition in the WHERE-clause of a nested query references an
attribute of a relation declared in the outer query, the two queries are
said to be correlated
 The result of a correlated nested query is different for each tuple (or
combination of tuples) of the relation(s) in the outer query
Query 9: Retrieve the name of each employee who has a dependent with the
same first name as the employee.
Q9: SELECT E.FNAME, E.LNAME
FROM EMPLOYEE AS E
WHERE E.SSN IN
(SELECT ESSN
FROM DEPENDENT
WHERE ESSN=E.SSN AND
E.FNAME=DEPENDENT_NAME)
 In Q9, the nested query has a different result for each tuple in the outer
query
 A query written with nested SELECT... FROM... WHERE... blocks and
using the = or IN comparison operators can always be expressed as a
single block query. For example, Q9 may be written as in Q9A
Q9A
SELECT E.FNAME, E.LNAME
FROM EMPLOYEE E, DEPENDENT D
WHERE E.SSN=D.ESSN AND
E.FNAME=D.DEPENDENT_NAME
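As a rough check that the nested and single-block forms agree, the sketch below runs both against the same data, assuming an in-memory SQLite database and Python's sqlite3 module. The employees and dependents are hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (FNAME VARCHAR(15), LNAME VARCHAR(15), "
             "SSN CHAR(9) PRIMARY KEY)")
conn.execute("CREATE TABLE DEPENDENT (ESSN CHAR(9), DEPENDENT_NAME VARCHAR(15))")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [("John", "Smith", "1"), ("Alice", "Wong", "2")])
conn.executemany("INSERT INTO DEPENDENT VALUES (?, ?)",
                 [("1", "John"), ("1", "Mary"), ("2", "Bob")])

# Correlated nested form: the inner query refers to E.SSN and E.FNAME of the outer tuple.
nested = conn.execute("""
    SELECT E.FNAME, E.LNAME
    FROM EMPLOYEE AS E
    WHERE E.SSN IN (SELECT ESSN
                    FROM DEPENDENT
                    WHERE ESSN = E.SSN AND E.FNAME = DEPENDENT_NAME)""").fetchall()

# Single-block form: the same conditions written as a join.
single_block = conn.execute("""
    SELECT E.FNAME, E.LNAME
    FROM EMPLOYEE E, DEPENDENT D
    WHERE E.SSN = D.ESSN AND E.FNAME = D.DEPENDENT_NAME""").fetchall()

print(nested, single_block)
```

On this data both forms return the same single tuple. If an employee had two dependents with the matching name, the join form would return that employee twice unless DISTINCT were added.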
Advanced SQL Features
SQL provides a powerful declarative query language. Writing queries in SQL is
usually much easier than coding the same queries in a general-purpose
programming language. However, a database programmer must have access to a
general-purpose programming language for at least two reasons:
1. Not all queries can be expressed in SQL, since SQL does not provide
the full expressive power of a general-purpose language. That is,
there exist queries that can be expressed in a language such as C,
Java, or Cobol that cannot be expressed in SQL. To write such
queries, we can embed SQL within a more powerful language.
2. Nondeclarative actions (such as printing a report, interacting with a
user, or sending the results of a query to a graphical user interface)
cannot be done from within SQL. Applications usually have several
components, and querying or updating data is only one component;
other components are written in general-purpose programming
languages. For an integrated application, there must be a means to
combine SQL with a general-purpose programming language.
There are two approaches to accessing SQL from a general-purpose programming
language:
 Dynamic SQL: A general-purpose program can connect to and
communicate with a database server using a collection of functions (for
procedural languages) or methods (for object-oriented languages). Dynamic
SQL allows the program to construct an SQL query as a character string at
runtime, submit the query, and then retrieve the result into program
variables a tuple at a time. The dynamic SQL component of SQL allows
programs to construct and submit SQL queries at runtime.
 Embedded SQL: Like dynamic SQL, embedded SQL provides a means by
which a program can interact with a database server. However, under
embedded SQL, the SQL statements are identified at compile time using a
preprocessor. The preprocessor submits the SQL statements to the
database system for precompilation and optimization; then it replaces the
SQL statements in the application program with appropriate code and
function calls before invoking the programming-language compiler.
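The dynamic SQL idea (build the query text at runtime, submit it, fetch the result a tuple at a time) can be sketched in a few lines. This example assumes Python's sqlite3 module with an in-memory database rather than JDBC or ODBC; the function find_ssns, the SEARCHABLE whitelist and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, "
             "LNAME VARCHAR(15), DNO INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [("111", "Smith", 5), ("222", "Wong", 5), ("333", "Smith", 4)])

# Column names the program is willing to splice into the query text.
SEARCHABLE = {"LNAME", "DNO"}


def find_ssns(column, value):
    """Build a query string at runtime, submit it, and fetch one tuple at a time."""
    if column not in SEARCHABLE:
        raise ValueError("unsupported column: " + column)
    # The column is chosen at runtime; the value is passed as a parameter (?)
    # so that it is never pasted into the SQL text.
    query = "SELECT SSN FROM EMPLOYEE WHERE " + column + " = ? ORDER BY SSN"
    cursor = conn.execute(query, (value,))
    ssns = []
    row = cursor.fetchone()
    while row is not None:
        ssns.append(row[0])
        row = cursor.fetchone()
    return ssns


print(find_ssns("LNAME", "Smith"))
print(find_ssns("DNO", 5))
```

The whitelist matters because a column name cannot be supplied as a query parameter; checking it before concatenation keeps arbitrary user input out of the SQL string.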
Dynamic SQL: JDBC – Java Database Connectivity
The JDBC standard defines an application program interface (API) that Java
programs can use to connect to database servers.
Establishing a JDBC connection is fairly simple and involves four steps −
 Import JDBC Packages: Add import statements to your Java program to
import required classes in your Java code.
 Register JDBC Driver: This step causes the JVM to load the desired driver
implementation into memory so it can fulfill your JDBC requests.
 Database URL Formulation: This is to create a properly formatted address
that points to the database to which you wish to connect.
 Create Connection Object: Finally, code a call to the DriverManager
object's getConnection( ) method to establish actual database connection.
Import JDBC Packages
The Import statements tell the Java compiler where to find the classes you
reference in your code and are placed at the very beginning of your source code.
To use the standard JDBC package, which allows you to select, insert, update, and
delete data in SQL tables, add the following imports to your source code −
import java.sql.* ; // for standard JDBC programs
import java.math.* ; // for BigDecimal and BigInteger support
Register JDBC Driver
You must register the driver in your program before you use it. Registering the
driver is the process by which the Oracle driver's class file is loaded into the
memory, so it can be utilized as an implementation of the JDBC interfaces.
You need to do this registration only once in your program. You can register a
driver in one of two ways.
Approach I - Class.forName()
The most common approach to register a driver is to use Java's Class.forName()
method to dynamically load the driver's class file into memory, which
automatically registers it. This method is preferable because it allows you to make
the driver registration configurable and portable.
The following example uses Class.forName( ) to register the Oracle driver −
try {
Class.forName("oracle.jdbc.driver.OracleDriver");
}
catch(ClassNotFoundException ex) {
System.out.println("Error: unable to load driver class!");
System.exit(1);
}
You can use the newInstance() method to work around noncompliant JVMs, but then
you'll have to code for two extra Exceptions as follows −
try {
Class.forName("oracle.jdbc.driver.OracleDriver").newInstance();
}
catch(ClassNotFoundException ex) {
System.out.println("Error: unable to load driver class!");
System.exit(1);
}
catch(IllegalAccessException ex) {
System.out.println("Error: access problem while loading!");
System.exit(2);
}
catch(InstantiationException ex) {
System.out.println("Error: unable to instantiate driver!");
System.exit(3);
}
Approach II - DriverManager.registerDriver()
The second approach you can use to register a driver is to use the static
DriverManager.registerDriver() method.
You should use the registerDriver() method if you are using a non-JDK compliant
JVM, such as the one provided by Microsoft.
The following example uses registerDriver() to register the Oracle driver −
try {
Driver myDriver = new oracle.jdbc.driver.OracleDriver();
DriverManager.registerDriver( myDriver );
}
catch(SQLException ex) {
System.out.println("Error: unable to register driver!");
System.exit(1);
}
Database URL Formulation
After you've loaded the driver, you can establish a connection using the
DriverManager.getConnection() method. For easy reference, let me list the three
overloaded DriverManager.getConnection() methods −
 getConnection(String url)
 getConnection(String url, Properties prop)
 getConnection(String url, String user, String password)
Here each form requires a database URL. A database URL is an address that points
to your database.
Formulating a database URL is where most of the problems associated with
establishing a connection occur.
The following table lists popular JDBC driver names and database URL formats.
RDBMS | JDBC driver name | URL format
MySQL | com.mysql.jdbc.Driver | jdbc:mysql://hostname/databaseName
ORACLE | oracle.jdbc.driver.OracleDriver | jdbc:oracle:thin:@hostname:portNumber:databaseName
DB2 | COM.ibm.db2.jdbc.net.DB2Driver | jdbc:db2:hostname:portNumber/databaseName
Sybase | com.sybase.jdbc.SybDriver | jdbc:sybase:Tds:hostname:portNumber/databaseName
In each URL format, the leading part (for example jdbc:mysql://) is static; you
need to change only hostname, portNumber and databaseName as per your
database setup.
Create Connection Object
We have listed three forms of the DriverManager.getConnection() method for
creating a connection object.
Using a Database URL with a username and password
The most commonly used form of getConnection() requires you to pass a
database URL, a username, and a password.
Assuming you are using Oracle's thin driver, you'll specify a
host:port:databaseName value for the database portion of the URL.
If you have a host at TCP/IP address 192.0.0.1 with a host name of amrood, and
your Oracle listener is configured to listen on port 1521, and your database name
is EMP, then the complete database URL would be −
jdbc:oracle:thin:@amrood:1521:EMP
Now call the getConnection() method with the appropriate username and
password to get a Connection object as follows −
String URL = "jdbc:oracle:thin:@amrood:1521:EMP";
String USER = "username";
String PASS = "password";
Connection conn = DriverManager.getConnection(URL, USER, PASS);
Using Only a Database URL
A second form of the DriverManager.getConnection( ) method requires only a
database URL −
DriverManager.getConnection(String url);
However, in this case, the database URL includes the username and password and
has the following general form −
jdbc:oracle:driver:username/password@database
So, the above connection can be created as follows −
String URL = "jdbc:oracle:thin:username/password@amrood:1521:EMP";
Connection conn = DriverManager.getConnection(URL);
Using a Database URL and a Properties Object
A third form of the DriverManager.getConnection( ) method requires a database
URL and a Properties object −
DriverManager.getConnection(String url, Properties info);
A Properties object holds a set of keyword-value pairs. It is used to pass driver
properties to the driver during a call to the getConnection() method.
To make the same connection made by the previous examples, use the following
code −
import java.util.*;
String URL = "jdbc:oracle:thin:@amrood:1521:EMP";

Properties info = new Properties( );
info.put( "user", "username" );
info.put( "password", "password" );
Connection conn = DriverManager.getConnection(URL, info);
Closing JDBC Connections
At the end of your JDBC program, you should explicitly close all
connections to the database to end each database session. If you
forget, Java's garbage collector may eventually close the connection when it
cleans up stale objects.
Relying on garbage collection, especially in database programming, is a very
poor programming practice. Make a habit of always closing the
connection with the close() method of the Connection object.
To ensure that a connection is closed, you can put the call in a 'finally' block.
A finally block always executes, whether or not an exception occurs.
To close the connection opened above, call the close() method as follows −
conn.close();
Explicitly closing a connection conserves DBMS resources, which will make your
database administrator happy.
ODBC
The Open Database Connectivity (ODBC) standard defines an API that
applications can use to open a connection with a database, send queries and
updates, and get back results.
Each database system supporting ODBC provides a library that must be linked
with the client program. When the client program makes an ODBC API call, the
code in the library communicates with the server to carry out the requested
action and fetch results.
Once the connection is set up, the program can send SQL commands to the
database by using SQLExecDirect. C language variables can be bound to attributes
of the query result, so that when a result tuple is fetched using SQLFetch, its
attribute values are stored in corresponding C variables.
ADO.NET
The ADO.NET API, designed for the Visual Basic .NET and C# languages, provides
functions to access data, which at a high level are not dissimilar to the JDBC
functions, although details differ. Like JDBC and ODBC, the ADO.NET API allows
access to results of SQL queries, as well as to metadata, but is considerably
simpler to use than ODBC. A database that supports ODBC can be accessed using
the ADO.NET API, and the ADO.NET calls are translated into ODBC calls.
Embedded SQL
The SQL standard defines embeddings of SQL in a variety of programming
languages, such as C, C++, Cobol, Pascal, Java, PL/I, and Fortran. A language in
which SQL queries are embedded is referred to as a host language, and the SQL
structures permitted in the host language constitute embedded SQL.
An embedded SQL program must be processed by a special preprocessor
prior to compilation. The preprocessor replaces embedded SQL requests with
host-language declarations and procedure calls that allow runtime execution of
the database accesses. Then, the resulting program is compiled by the
host-language compiler. This is the main distinction between embedded SQL and
JDBC or ODBC.
To identify embedded SQL requests to the preprocessor, we use the EXEC SQL
statement; it has the form:
EXEC SQL <embedded SQL statement >;
The exact syntax for embedded SQL requests depends on the language in which
SQL is embedded. In some languages, such as Cobol, the semicolon is replaced
with END-EXEC.
Before executing any SQL statements, the program must first connect to the
database. This is done using:
EXEC SQL connect to server user user-name using password;
Host-language variables used within embedded SQL statements are preceded by a
colon (:) and must be declared within a DECLARE section. The syntax for
declaring the variables, however, follows the usual host language syntax.
EXEC SQL BEGIN DECLARE SECTION;
int credit_amount;
EXEC SQL END DECLARE SECTION;
Embedded SQL statements are similar in form to regular SQL statements.
Consider the university schema. Assume that we have a host-language variable
credit_amount in our program, declared as we saw earlier, and that we wish to
find the names of all students who have taken more than credit_amount credit
hours. We can write this query as follows:
EXEC SQL
declare c cursor for
select ID, name from student
where tot_cred > :credit_amount;
SQLJ
The Java embedding of SQL, called SQLJ, provides the same features as other
embedded SQL implementations, but using a different syntax that more closely
matches features already present in Java, such as iterators. For example, SQLJ
uses the syntax #sql instead of EXEC SQL, and instead of cursors, uses the Java
iterator interface to fetch query results.
The code snippet below illustrates the use of iterators.
#sql iterator deptInfoIter ( String dept_name, int avgSal);
deptInfoIter iter = null;
#sql iter = { select dept_name, avg(salary) from instructor group by dept_name };
while (iter.next()) {
String deptName = iter.dept_name();
int avgSal = iter.avgSal();
System.out.println(deptName + " " + avgSal);
}
iter.close();
File System vs DBMS – Difference between File System and DBMS
Super Key Vs Candidate Key
BASIS FOR COMPARISON | SUPER KEY | CANDIDATE KEY
Basic | A single attribute or a set of attributes that uniquely identifies each tuple in a relation is a super key. | A super key none of whose proper subsets is also a super key (a minimal super key) is a candidate key.
One in other | Not every super key is a candidate key. | All candidate keys are super keys.
Selection | The set of super keys forms the base for selection of candidate keys. | The set of candidate keys forms the base for selection of a single primary key.
Count | There are comparatively more super keys in a relation. | There are comparatively fewer candidate keys in a relation.