Syllabus Data Base Management System: Unit-I

SYLLABUS
DATA BASE MANAGEMENT SYSTEM
Unit- I
Database System Architecture - Basic Concepts : Data System, Operational Data, Data
Independence, Architecture for a Database System, Distributed Databases, Storage Structures :
Representation of Data. Data Structures and Corresponding Operators: Introduction, Relation
Approach, Hierarchical Approach, Network Approach.
Unit - II
Relational Approach : Relational Data Structure : Relation, Domain, Attributes, Key. Relational
Algebra - Introduction, Traditional Set Operations. Attribute names for derived
relations - Special Relational Operations.
Unit - III
Embedded SQL: Introduction – Operations not involving cursors, involving cursors – Dynamic
statements, Query by Example – Retrieval operations, Built-in Functions, update operations -
QBE Dictionary. Normalization : Functional dependency, First, Second, Third normal forms,
Relations with more than one candidate key, Good and bad decomposition.
Unit - IV
Hierarchical Approach : IMS data structure - Physical Database, Database Description -
Hierarchical sequence - External level of IMS : Logical Databases, the program communication
block. IMS Data manipulation : Defining the Program Communication Block : DL/I Examples.
Unit - V
Network Approach : Architecture of DBTG System. DBTG Data Structure : The set construct,
Singular sets, Sample Schema, the external level of DBTG – DBTG Data Manipulation.
REFERENCE BOOKS
1. C.J.Date - An introduction to Database Systems, Seventh Edition
2. Abraham Silberschatz, Henry F Korth- Database Systems Concepts
3. Bipin C Desai - An introduction to Database Systems
Unit- I
DATABASE
Data - In computing, data is information or raw facts that have been translated into a form
that is more convenient to move or process.
Database - A database is a collection of data that is organized so that its contents can easily
be accessed, managed, and updated.
Database Management System - A database management system (DBMS) is a software
package designed to define, manipulate, retrieve and manage data in a database.
BASIC CONCEPTS
PURPOSE OF DATABASE SYSTEMS

A DBMS has evolved into a complex software system, and its development typically
requires thousands of person-years of effort. Some general-purpose DBMSs
such as Adabas, Oracle and DB2 have been undergoing upgrades since the 1970s. General-
purpose DBMSs aim to meet the needs of as many applications as possible, which adds to the
complexity. However, the fact that their development cost can be spread over a large number of
users means that they are often the most cost-effective approach.
However, a general-purpose DBMS is not always the optimal solution: in some cases a
general-purpose DBMS may introduce unnecessary overhead. Therefore, there are many
examples of systems that use special-purpose databases. A common example is an email system:
email systems are designed to optimize the handling of email messages, and do not need
significant portions of a general-purpose DBMS functionality.
Many databases have application software that accesses the database on behalf of end-
users, without exposing the DBMS interface directly. Application programmers may use a wire
protocol directly, or more likely through an application programming interface. Database
designers and database administrators interact with the DBMS through dedicated interfaces to
build and maintain the applications' databases, and thus need some more knowledge and
understanding about how DBMSs operate and the DBMSs' external interfaces and tuning
parameters.
General-purpose databases are usually developed by one organization or community of
programmers, while a different group builds the applications that use it. In many companies,
specialized database administrators maintain databases, run reports, and may work on code that
runs on the databases themselves (rather than in the client application).
Consider a savings bank enterprise that keeps information about customers and savings
accounts. One way to keep the information on a computer is to store it in permanent system
files.
To allow users to manipulate the information, the system has a number of application
programs that manipulate the files, including:
• A program to debit or credit an account
• A program to add new accounts
• A program to find the balance of an account
• A program to generate monthly statements
These application programs are written by system programmers in response to the needs of the
bank organization.
New application programs are added to the system as the need arises. As a result, new files are
created and new application programs are needed. As time passes, more files and more
application programs are added to the system.
Such a ‘File Processing System’ is supported by a conventional operating system. Permanent records are
stored in various files, and different programs are written to extract records from, and to add
records to, the appropriate files.
Before the advent of the DBMS, organizations used only file processing systems.
The Disadvantages of File Processing System:

1. Data Redundancy & Inconsistency
2. Difficulty in accessing the data
3. Data Isolation
4. Integrity Problems
5. Atomicity Problems
6. Concurrent Access Anomalies
7. Security Problems
1. Data Redundancy & Inconsistency
Since the files and application programs are created by different programmers over a long
period, the various files are likely to have different formats and the programs may be written in
several programming languages.
The same information may be duplicated in several places (copied into several files).
For example, the address and telephone number of a particular customer may appear in a file
that consists of savings account records and in a file that contains checking account records.
This redundancy leads to higher storage and access costs.
In addition, it leads to ‘Data Inconsistency’, i.e. the various copies of the same data may no longer
agree. For example, a changed customer address may be reflected in the savings account records but not
elsewhere.
2. Difficulty in accessing the data.

Consider the following example:
Suppose one of the bank’s officers needs to find the names of all customers who live within the
area with PIN code 641 042. The officer asks the data processing department to generate such a list. Because this
request was not anticipated when the original system was designed, there is no application program
to meet it.
There is, however, a program to generate the list of all customers. The officer has two choices:
• Obtain the list of all customers and extract the needed information manually.
• Ask the data processing department to have a system programmer write the necessary application
program.
Both alternatives are unsatisfactory. Conventional file processing environments do not allow
the needed data to be retrieved in a convenient and efficient manner.
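By contrast, a DBMS with a query language can answer such an unanticipated request without a new application program. The sketch below uses Python's built-in sqlite3 module; the table name, column names and sample rows are assumptions made up for illustration and are not part of the bank example above.

```python
import sqlite3

# Hypothetical schema and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, pincode TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        ("Anitha", "Coimbatore", "641042"),
        ("Bala", "Coimbatore", "641001"),
        ("Chitra", "Coimbatore", "641042"),
    ],
)

# The officer's request becomes one declarative query instead of a new program.
rows = conn.execute(
    "SELECT name FROM customer WHERE pincode = ? ORDER BY name", ("641042",)
).fetchall()
names = [name for (name,) in rows]
print(names)
```

The query states what is wanted (customers with a given PIN code) and leaves the search strategy to the DBMS.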
3. Data Isolation
Because the data are scattered in various files, and the files may be in different formats, it is very difficult
to write new application programs to retrieve the appropriate data.
4. Integrity Problems
The data values stored in the database must satisfy certain types of ‘Consistency Constraints’.
E.g. the balance of a bank account may never fall below a prescribed amount (say $25).
Developers enforce these constraints in the system by adding appropriate code to the various
application programs. When new constraints are added, it is difficult to change the programs to enforce them.
The problem is compounded when the constraint involves several data items from different files.
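In a DBMS, a constraint of this kind can be declared once with the data instead of being repeated in every program. The following is a minimal sketch using sqlite3; the table and column names are hypothetical, and the $25 minimum is the figure from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is declared with the table, not coded into each application.
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES ('A', 100)")

try:
    # This would leave a balance of 10, below the prescribed minimum.
    conn.execute("UPDATE account SET balance = balance - 90 WHERE acc_no = 'A'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute(
    "SELECT balance FROM account WHERE acc_no = 'A'"
).fetchone()[0]
print(rejected, balance)
```

Every program that updates the table is subject to the same check, so a new constraint is added in one place.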
5. Atomicity Problems
A computer system is subject to failure. In many applications, it is crucial to ensure that,
once a failure has occurred and has been detected, the data are restored to the consistent state that existed prior
to the failure.
E.g. consider a program to transfer $50 from account A to account B. If a system failure occurs during
the execution of the program, it is possible that the $50 was removed from account A but not
credited to account B, resulting in an inconsistent database state.
That is, a transfer (or any such operation) must happen fully or not at all. This property is very difficult to
achieve in conventional file processing systems.
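The all-or-nothing behaviour can be sketched with a transaction in sqlite3. The function name transfer and the fail_midway flag are hypothetical; the flag only simulates a failure between the debit and the credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


transfer(conn, "A", "B", 50, fail_midway=True)
after_failure = dict(conn.execute("SELECT acc_no, balance FROM account"))

transfer(conn, "A", "B", 50)
after_success = dict(conn.execute("SELECT acc_no, balance FROM account"))
print(after_failure, after_success)
```

After the simulated failure the debit is rolled back, so account A is not left $50 short.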
6. Concurrent Access Anomalies
Many systems allow multiple users to access and update the data simultaneously.
In such an environment, the interaction of concurrent updates may result in inconsistent data.
E.g. consider bank account A, containing $500. If two customers withdraw funds (say $50
and $100) from account A at about the same time, the result of the concurrent executions may leave the account
in an incorrect or inconsistent state.
Suppose that the program executing on behalf of each withdrawal reads the old balance,
reduces that value by the amount being withdrawn, and writes the result back.
If the two programs run concurrently, they may both read the value $500 and write back $450 and
$400 respectively. Depending on which one writes last, the account holds $450 or $400 instead of the correct $350.
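The lost update can be reproduced in a few lines. The first half of this sketch fixes one bad interleaving by hand so that the outcome is deterministic; the second half serializes the read-modify-write with a lock, which is only one simple way of controlling concurrency and is an assumption of the example, not a description of how any particular DBMS does it.

```python
import threading

# Bad interleaving: both withdrawals read the balance before either writes.
balance = 500
read_1 = balance          # withdrawal of $50 reads 500
read_2 = balance          # withdrawal of $100 also reads 500
balance = read_1 - 50     # first program writes 450
balance = read_2 - 100    # second program writes 400, overwriting the first
lost_update_result = balance

# Serialized version: the whole read-modify-write runs under a lock.
account = {"balance": 500}
lock = threading.Lock()


def withdraw(amount):
    with lock:
        current = account["balance"]
        account["balance"] = current - amount


threads = [threading.Thread(target=withdraw, args=(amt,)) for amt in (50, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
serialized_result = account["balance"]
print(lost_update_result, serialized_result)
```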
7. Security Problems
Not every user of the database should be able to access all the data. For example, in a banking system, payroll personnel need to see only the part of the
database that holds payroll information. They do not need access to information
about customer accounts.
These difficulties, among others, have prompted the development of DBMS.
DATA SYSTEM
A DBMS contains information about a particular enterprise. It consists of:
• A collection of interrelated data
• A set of programs to access the data
• An environment that is both convenient and efficient to use
Database Design
It is important to design the database in such a way that:
• A specific item can be reached easily (maximum guarantee that the desired record will be
reached)
• The database can respond to the user’s different questions easily
(the necessary relationships are provided)
• The database occupies minimum storage space (the choice of data types and of how to express a
certain concept is important)
• The database contains no unnecessary data (storing the gross salary is enough if the net salary
can be calculated from the gross salary)
• Data can be added and updated easily without causing mistakes
(no data redundancy)
Steps In Database Design
• Requirement analysis
What does the user want?
• Conceptual database design
Defining the entities, their attributes, and the relationships between them (the E-R model)
• Physical database design
Implementation of the conceptual design using a Database Management System
Foundation Data Concept

A hierarchy of several levels of data has been devised that differentiates between different
groupings, or elements, of data. Data are logically organized into:
Character
It is the most basic logical data element. It consists of a single alphabetic, numeric, or other
symbol.
Field
It consists of a grouping of characters. A data field represents an attribute (a characteristic or
quality) of some entity (object, person, place, or event).
Record
The related fields of data are grouped to form a record. Thus, a record represents a collection
of attributes that describe an entity. Fixed-length records contain a fixed number of fixed-length
data fields. Variable-length records contain a variable number of fields and field lengths.
File
A group of related records is known as a data file, or table. Files are frequently classified by
the application for which they are primarily used, such as a payroll file or an inventory file, or by the
type of data they contain, such as a document file or a graphical image file. Files are also
classified by their permanence, for example, a master file versus a transaction file. A transaction
file contains records of all transactions occurring during a period, whereas a master file contains all the permanent
records. A history file is an obsolete transaction or master file retained for backup purposes or
for long-term historical storage, called archival storage.
Database
It is an integrated collection of logically related records or objects. A database consolidates
records previously stored in separate files into a common pool of data records that provides data
for many applications. The data stored in a database is independent of the application programs
using it and of the type of secondary storage devices on which it is stored.
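The character, field, record and file levels can be made concrete with a fixed-length record layout. This is a minimal sketch using Python's struct module; the layout (a 10-byte name, a 4-byte integer roll number and an 8-byte balance) is an assumption chosen for illustration.

```python
import struct

# Hypothetical fixed-length record: three fixed-length fields, no padding.
RECORD = struct.Struct("<10sid")  # 10-byte name, 4-byte int, 8-byte float


def pack(name, roll_no, balance):
    """Build one record from its fields; short names are padded with zero bytes."""
    return RECORD.pack(name.encode("ascii"), roll_no, balance)


def unpack(raw):
    """Split one record back into its fields."""
    name, roll_no, balance = RECORD.unpack(raw)
    return name.rstrip(b"\x00").decode("ascii"), roll_no, balance


# A "file" is a sequence of records laid end to end.
file_bytes = pack("Ravi", 1, 250.0) + pack("Meena", 2, 900.5)

# Because every record has the same length, record n starts at n * RECORD.size.
second = unpack(file_bytes[RECORD.size:2 * RECORD.size])
print(RECORD.size, second)
```

With variable-length records this direct offset calculation is no longer possible, which is one reason the distinction matters for storage structures.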
CHARACTERISTICS OF DBMS
1. Self-contained nature
A DBMS contains the data plus a full description of the data (called “metadata”).
Metadata is data about data: data formats, record structures, locations, access methods, and indexes.
Metadata is stored in a catalog and is used by the DBMS software to know how to access the data.
Contrast this with the file processing approach, where application programs need to know the
structure and format of records and data.
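As an illustration of a catalog, SQLite keeps its metadata in a table called sqlite_master and exposes column descriptions through PRAGMA table_info. The student table below is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)

# The catalog lists the tables that exist in the database.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
print(tables, columns)
```

A program can therefore discover the record structure from the DBMS itself instead of having it hard-coded.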
2. Program-data independence
Data independence is the immunity of application programs to changes in storage structures and
access techniques, e.g. adding a new field, changing an index structure, or changing a data format. In a
DBMS environment these changes are reflected in the catalog, and applications are not affected.
Traditional file processing programs would all have to
change, possibly substantially.
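A small sketch of this immunity, using sqlite3: the function employee_names stands for an existing application, and it keeps working after a field and an index are added. The table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Japneet')")


def employee_names(conn):
    """An 'application program' that names only the fields it needs."""
    query = "SELECT name FROM employee ORDER BY emp_no"
    return [name for (name,) in conn.execute(query)]


before = employee_names(conn)

# Storage-level changes: a new field and a new index.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

after = employee_names(conn)
print(before, after)
```

A file processing program that read fixed record layouts directly would have to be changed for the new field.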
3. Data abstraction
A DBMS provides users with a conceptual representation of data (for example, as objects
with properties and inter-relationships). Storage details are hidden. Conceptual representation is
provided in terms of a data model.
4. Support for multiple views

A DBMS may allow different users to see different “views” of the database, according to the
perspective each one requires. A view may be:
• A subset of the data. For example, the people using the
payroll system need not (and should not) see data about students and class schedules.
• Data presented in a different form from the way it is stored. For example, someone interested in
student transcripts might get a view formed by combining information from separate
files or tables.
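The transcript case can be sketched as an SQL view that combines two stored tables. The table names, columns and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grade (roll_no INTEGER, course TEXT, mark INTEGER);
    INSERT INTO student VALUES (1, 'Ravi');
    INSERT INTO student VALUES (2, 'Meena');
    INSERT INTO grade VALUES (1, 'DBMS', 82);
    INSERT INTO grade VALUES (2, 'DBMS', 91);

    -- The view is stored as a definition, not as a copy of the data.
    CREATE VIEW transcript AS
        SELECT s.name, g.course, g.mark
        FROM student s JOIN grade g ON g.roll_no = s.roll_no;
    """
)

# A user of the view queries it like a table and never sees the join.
rows = conn.execute(
    "SELECT name, course, mark FROM transcript ORDER BY name"
).fetchall()
print(rows)
```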
5. Centralized control of the data resource

The DBMS provides centralized control of data in an organization.
This brings a number of advantages:
(a) reduces redundancy
(b) avoids inconsistencies
(c) data can be shared
(d) standards can be enforced
(e) security restrictions can be applied
(f) integrity can be maintained
DATABASE APPLICATIONS
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
The interactions catered for by most existing DBMSs fall into four main groups:
• Data definition: defining new data structures for a database, removing data structures from
the database, modifying the structure of existing data.
• Update: inserting, modifying, and deleting data.
• Retrieval: obtaining information either for end-user queries and reports or for processing by
applications.
• Administration: registering and monitoring users, enforcing data security, monitoring
performance, maintaining data integrity, dealing with concurrency control, and recovering
information if the system fails.
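The first three groups map directly onto SQL statements, as the sketch below shows with sqlite3 and a hypothetical product table. Administration is left out because SQLite, being an embedded library, has no user accounts to register or monitor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a new data structure.
conn.execute(
    "CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT, price INTEGER)"
)

# Update: insert, modify and delete data.
conn.execute("INSERT INTO product VALUES ('P1', 'Pen', 10)")
conn.execute("INSERT INTO product VALUES ('P2', 'Book', 60)")
conn.execute("UPDATE product SET price = 12 WHERE code = 'P1'")
conn.execute("DELETE FROM product WHERE code = 'P2'")

# Retrieval: obtain information for a query or report.
products = conn.execute("SELECT code, name, price FROM product").fetchall()
print(products)
```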
TYPES OF DATABASES
Continuing developments in information technology and its business applications have
resulted in the evolution of several major types of databases. Several major conceptual
categories of databases that may be found in computer-using organizations include:
Operational Databases
These databases store the detailed data needed to support the operations of the entire organization.
They are also called subject area databases (SADB), transaction databases, and production
databases. Examples are customer databases, personnel databases, inventory databases, and
other databases containing data generated by business operations.
Distributed Databases
Many organizations replicate and distribute copies or parts of databases to network
servers at a variety of sites. These distributed databases can reside on network servers
on the World Wide Web, on corporate intranets or extranets, or on other company networks.
Distributed databases may be copies of operational or analytical databases,
hypermedia or discussion databases, or any other type of database. Replication and distribution
of databases is done to improve database performance and security.
External Databases
Access to external, privately owned online databases or data banks is available for a fee to
end users and organizations from commercial online services, and with or without charge from
many sources on the Internet, especially the Web.
Hypermedia Databases
A hypermedia database consists of hyperlinked pages of multimedia (text, graphics,
photographic images, video clips, audio segments, etc.).
VARIOUS COMPONENTS OF DBMS

A database system has four components. These four
components are important for understanding and designing the database system. These
are:
1. Data
2. Hardware
3. Software
4. Users
1. Data
As discussed above, data is raw, first-hand information that has been collected. Data is made up
of data items and data aggregates. A data item is the smallest unit of named data; it may consist of
bits or bytes. A data item is often referred to as a field or data element. A data aggregate is a
collection of data items within a record, which is given a name and referred to as a whole. Data
can be collected orally or in writing. The data in a database can be both integrated and shared. Data stored in a
system may be partitioned into one or more databases, so that if data is lost or damaged at one place
it can still be accessed from another place by using the sharing facility of the database system.
Shared data can also be reused according to each user’s requirements. Data must also be in
integrated form. Integration means that data is held in a unique form, i.e. it is collected in
a well-defined manner with no redundancy. For example, roll numbers in a class are non-redundant
because each one is unique. Names in a class, however, may repeat, and this
can create many problems later when the data is used and accessed.
2. Hardware
Hardware is a primary part of the database system; without hardware nothing can be
done. Hardware is whatever we can touch and see, i.e. whatever has physical
existence. All physical items fall into this category, for example the
input/output and storage devices commonly used with a computer system: keyboard, mouse,
scanner, monitor, and storage devices (hard disk, floppy disk, magnetic disk, and magnetic
drum).
3. Software
Software is another major part of the database system. It is the counterpart of hardware:
hardware and software are two sides of the same coin and work together.
Software is subdivided into two categories. The first is system software (operating systems,
languages, system packages, etc.) and the second is application
software (payroll, electricity billing, hospital management, hostel administration, etc.).
Software can be defined as that which we cannot touch or see; it can only be executed. By using
software, data can be manipulated, organized and stored.
4. Users
Without users, all of the above components (data, hardware and software) are meaningless.
Users collect the data and operate the hardware. An operator also enters the data and
arranges it in order by executing the software.
Other components
1. People - Database administrator; system developer; end user.
2. CASE tools: Computer-aided Software Engineering (CASE) tools.
3. User interface - Microsoft Access; PowerBuilder.
4. Application Programs - PowerBuilder script language; Visual Basic; C++; COBOL.
5. Repository - Stores definitions of data (called METADATA), screen and report formats,
menu definitions, etc.
6. Database - Stores the actual occurrences of data.
7. DBMS - Provides tools to manage all of this: creating data, maintaining data, controlling security
access to the data and to the repository, etc.
Advantages of a DBMS
One of the major advantages of using a database system is that the organization’s data
can be handled easily, with centralized management and control over the data by
the DBA. The main advantages of a DBMS are:
• Data independence
• Efficient data access
• Data integrity and security
• Data administration
• Concurrent access and crash recovery
• Reduced application development time
These and further advantages are described below.
1. Controlling Redundancy
In a DBMS, redundancy (duplicate data) can be controlled. If duplicate data arises, the
DBA can detect it and arrange the data in a non-redundant way. Data is stored on the basis of a
primary key, which is always unique and so identifies each record without duplication. For
example, roll number is the primary key used to store student data.
In traditional file processing, every user group maintains its own files. Each group independently
keeps files on the same subject, e.g. students. Therefore, much of the data is stored twice or more.
Redundancy leads to several problems:
• Duplication of effort
• Storage space is wasted when the same data is stored repeatedly
• Files that represent the same data may become inconsistent (since updates are applied
independently by each user group)
A DBMS may instead use controlled redundancy.
2. Restricting Unauthorized Access
A DBMS should provide a security and authorization subsystem.
Some database users will not be authorized to access all information in the database (e.g. financial data).
Some users are allowed only to retrieve data.
Some users are allowed both to retrieve and to update the database.
3. Providing Persistent Storage for Program Objects and Data Structures

The data structures provided by the DBMS must be compatible with the programming language’s
data structures. E.g. object-oriented DBMSs are compatible with programming languages such as
C++ and Smalltalk, and the DBMS software automatically performs conversions between
programming data structures and file formats.
4. Permitting Inferencing and Actions Using Deduction Rules

Deductive database systems provide capabilities for defining deduction rules for inferencing
new information from the stored database facts.
5. Inconsistency can be reduced

Inconsistency is a consequence of redundancy (duplicacy). Suppose that the fact that an
employee “Japneet” works in department “Computer” is
represented by two distinct entries in a database. If one entry is updated and the other is not,
the two entries no longer agree and the database is inconsistent. The DBA can use the DBMS to
remove such duplicate entries, or to ensure that a change to one is also applied to the other.
6. Data can be shared

In a database system, data can be easily shared by different users. For example, student data
can be shared by the teaching departments, the administrative block, the accounts branch, the laboratories, etc.
7. Standard can be enforced or maintained

By using a database system, standards can be maintained in an organization. The DBA is the overall
controller of the database system. Standards are hard to apply to records kept manually, but when the DBA uses a DBMS and
the data is entered into the computer, standards can be enforced and maintained through the
computerized system.
8. Security can be maintained
Passwords can be applied in a database system, and files can be secured by the DBA. A
database system also offers different coding techniques that protect the data from
unauthorized access, and a login facility that guards the data against both
accidental and intentional threats. Recovery procedures can also be maintained so that
the data remains accessible through the DBMS.
9. Integrity can be maintained
In a database system, data can be stored in an integrated way. Integration means
unification and sequencing of data. Integrity can be defined as “the data contained in the
database is both accurate and consistent”. Data can be accessed reliably if it is
held in a unique form, and a primary key and secondary keys can be used to integrate the
data. Centralized control can also ensure that adequate checks are
incorporated in the DBMS to provide data integrity.
10. Conflicts can be removed

In a database system, data is arranged in a well-defined manner by the DBA, so
conflicts between databases are avoided. The DBA selects the best file structure and access
strategy to get better performance for the representation and use of the
data.
11. Providing Multiple User Interfaces

For example: query languages, programming language interfaces, forms, menu-driven
interfaces, etc.
12. Representing Complex Relationships Among Data

A DBMS can represent complex relationships among data.
13. Providing Backup and Recovery

The DBMS also provides back up and recovery features.
Disadvantages of a DBMS
A database management system has many advantages, but some major problems
arise in using it, so it also has disadvantages. These are explained below:
1. Cost
A significant disadvantage of a DBMS is cost. In addition to the cost of purchasing or
developing the software, the organization will also have to purchase or upgrade the hardware,
so it becomes a costly system. Additional cost also arises from migrating data
from one DBMS environment to another.
2. Problems associated with centralization

Centralization means that data is accessible from a single source. Because
centralized data can be reached by every user, it is exposed to unauthorized
access unless access is controlled, and data can be damaged or lost.
3. Complexity of backup and recovery
Backup and recovery are fairly complex in a DBMS environment. Taking a
backup of the data may affect a multi-user database system that is in operation.
A damaged database can be recovered from the backup medium, but reloading it into a
concurrent multi-user database system may introduce duplicate data.
4. Confidentiality, Privacy and Security

When information is centralized and is made available to users from remote locations, the
possibilities of abuse are often more than in a conventional system. To reduce the chances of
unauthorized users accessing sensitive information, it is necessary to take technical,
administrative and, possibly, legal measures. Most databases store valuable information that
must be protected against deliberate trespass and destruction.
5. Data Quality
Since the database is accessible to users remotely, adequate controls are needed to control
users updating data and to control data quality. With increased number of users accessing data
directly, there are enormous opportunities for users to damage the data. Unless there are suitable
controls, the data quality may be compromised.
6. Data Integrity
Since a large number of users could be using a database concurrently, technical safeguards
are necessary to ensure that the data remain correct during operation. The main threat to data
integrity comes from several different users attempting to update the same data at the same time.
The database therefore needs to be protected against inadvertent changes by the users.
7. Enterprise Vulnerability
Centralizing all data of an enterprise in one database may mean that the database becomes an
indispensable resource. The survival of the enterprise may depend on reliable information being
available from its database. The enterprise therefore becomes vulnerable to the destruction of the
database or to unauthorized modification of the database.
8. The Cost of using a DBMS

Conventional data processing systems are typically designed to run a number of well-
defined, preplanned processes. Such systems are often “tuned” to run efficiently for the
processes that they were designed for. Although the conventional systems are usually fairly
inflexible in that new applications may be difficult to implement and/or expensive to run, they
are usually very efficient for the applications they are designed for.
The database approach on the other hand provides a flexible alternative where new
applications can be developed relatively inexpensively. The flexible approach is not without its
costs and one of these costs is the additional cost of running applications that the conventional
system was designed for. Using standardized software is almost always less machine efficient
than specialized software.
INSTANCES AND SCHEMAS

• Databases change over time as information is inserted and deleted.
• The collection of information stored in the database at a particular moment is called an
‘Instance’ of the database.
• The overall design of the database is called the ‘Database Schema’. Schemas are changed
infrequently, if at all.
• E.g. if we consider a student database, the overall design of the student database is called the ‘Schema’
of the application.
Analogies to the concepts of data types, variables and values in programming languages are
useful.
E.g. consider a record type ‘customer’ defined in Pascal.
In declaring the type ‘customer’, no variables are declared. To declare such a variable, we write:
var customer1: customer;
The variable customer1 now corresponds to an area of storage containing a ‘customer’ type record, i.e. every field
declared in the customer type definition is present in this variable also.
A database schema corresponds to a programming language type definition, such as a ‘type’ definition in Pascal or a
‘struct’ definition in C.
A variable of a given type has a particular value at a given instant. The value of a variable in
a programming language corresponds to an ‘instance’ of a database schema.
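The same analogy can be sketched in Python, where a dataclass plays the role of the type definition. The field names (name, street, city) and the sample values are assumptions; the notes do not give the fields of the ‘customer’ type.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Plays the role of the schema: it says which fields exist, not their values."""
    name: str
    street: str
    city: str


# Plays the role of an instance: the values held at one moment.
customer1 = Customer("Ravi", "Main Road", "Coimbatore")
snapshot_before = (customer1.name, customer1.street, customer1.city)

# The instance changes over time; the schema stays the same.
customer1.city = "Chennai"
snapshot_after = (customer1.name, customer1.street, customer1.city)

field_names = [f.name for f in fields(Customer)]
print(field_names, snapshot_before, snapshot_after)
```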
Database systems have several schemas,
partitioned according to the levels of abstraction:
Lowest Level        Physical Schema
Intermediate Level  Logical Schema
Highest Level       Subschema
In general, a database system supports one physical schema, one logical schema and several
subschemas.
MODEL
A Data Model is defined as “a collection of conceptual tools for (1) describing data, (2) describing
data relationships, (3) describing the syntax and semantics of the data and (4) describing
consistency constraints”.
The various data models fall into 3 different groups:
• Object Based Logical Models
• Record Based Logical Models
• Physical Models
Object Based Logical Models
These are used in describing data at the ‘Logical’ and ‘View’ levels.
They are characterized by the fact that they provide fairly flexible structuring capabilities and
allow data constraints to be specified explicitly.
The models in this category include:
• Entity Relationship Model
• Object Oriented Model
• Semantic Data Model
• Functional Data Model
ENTITY RELATIONSHIP MODEL
The E-R model is based on a perception of the real world as a collection of basic
objects, called ‘entities’, and of ‘relationships’ among those objects.
An entity is a “thing” or “object” in the real world that is distinguishable from other objects,
e.g. each person is an entity.
Entities are described in a database by a set of attributes.
A ‘Relationship’ is an association among several entities.
The set of all entities of the same type is called an ‘Entity Set’, and the set of all
relationships of the same type is called a ‘Relationship Set’. In addition to entities and relationships, the E-R model represents
certain constraints to which the contents of a database must conform.
‘Mapping Cardinalities’ are an important constraint: they express the number of entities
to which another entity can be associated via a relationship set.
The overall logical structure of a database can be expressed graphically by an E-R diagram,
which is built up from the following components:
• Rectangles: entity sets
• Ellipses: attributes
• Diamonds: relationships among the entity sets
• Lines: link attributes to entity sets and entity sets to relationships
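The entity set, relationship set and mapping cardinality ideas above can be sketched with plain Python sets. The Customer and Account entity sets, the depositor relationship set and the helper is_one_to_many are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    cust_id: str
    name: str


@dataclass(frozen=True)
class Account:
    acc_no: str
    balance: int


# Entity sets: all entities of the same type.
customers = {Customer("C1", "Ravi"), Customer("C2", "Meena")}
accounts = {Account("A1", 500), Account("A2", 900), Account("A3", 100)}

# Relationship set: (cust_id, acc_no) pairs associating customers with accounts.
depositor = {("C1", "A1"), ("C1", "A2"), ("C2", "A3")}


def is_one_to_many(pairs):
    """True if every right-hand entity is linked to at most one left-hand entity."""
    owner = {}
    for left, right in pairs:
        if owner.setdefault(right, left) != left:
            return False
    return True


holds = is_one_to_many(depositor)
violated = is_one_to_many(depositor | {("C2", "A1")})
print(holds, violated)
```

Here one customer may hold many accounts, but an account shared by two customers breaks the one-to-many cardinality.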
OBJECT ORIENTED MODEL
Like the E-R model, the OO model is based on a ‘Collection of Objects’. An object contains
values stored in ‘Instance Variables’ within the object.
An object also contains ‘Bodies of Code’ that operate on the object. These bodies of code are called
‘Methods’.
Objects that contain the same type of values and same methods are grouped into ‘Classes’.
The only way in which one object can access the data of another object is ‘By invoking a
method’ of that other object.
Object DBMSs add database functionality to object programming languages. They bring much
more than persistent storage of programming language objects. Object DBMSs extend the
semantics of the C++, Smalltalk and Java object programming languages to provide full-featured
database programming capability, while retaining native language compatibility. A major benefit
of this approach is the unification of the application and database development into a seamless
data model and language environment. As a result, applications require less code, use more
natural data modeling, and code bases are easier to maintain. Object developers can write
complete database applications with a modest amount of additional effort.
According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination
of object-oriented programming language (OOPL) systems and persistent systems. The power of
the OODB comes from the seamless treatment of both persistent data, as found in databases, and
transient data, as found in executing programs."
In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into tables or joined together from those tables to form the in-memory structure, an object DBMS avoids this translation overhead when storing or retrieving a web or hierarchy of interrelated objects. This one-to-one mapping of object programming language objects to database objects has two benefits
over other storage approaches: it provides higher performance management of objects, and it
enables better management of the complex interrelationships between objects. This makes object
DBMSs better suited to support applications such as financial portfolio risk analysis systems,
telecommunications service applications, world wide web document structures, design and
manufacturing systems, and hospital patient record systems, which have complex relationships
between data.
Invoking a method of another object is called ‘Sending a Message’ to the object. The instance variables and method code are not visible externally; only the messages an object responds to are visible. Hence there are 2 levels of abstraction.
e.g.
Consider an object representing a bank account. The object contains ‘instance variables’ account
number and balance. It contains a ‘method’ – pay interest – which adds interest to that balance.
Assume that the bank has been paying 6% interest on all accounts, but is now changing its policy to pay 5% if the balance is less than 1000 and 6% if the balance is 1000 or more.
Under most data models, making this adjustment involves changing code in one or more application programs.
Under the OO model, the only change is made within the ‘pay interest’ method. The external interface to the object remains unchanged.
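The bank account example can be sketched in Python as follows. This is only an illustration: the class name `BankAccount`, the method name `pay_interest`, the sample account numbers and balances, and the choice to pay 6% on a balance of exactly 1000 are assumptions made for the sketch.

```python
class BankAccount:
    """Object with two instance variables and one method (hypothetical sketch)."""

    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance

    def pay_interest(self):
        # New policy: 5% below 1000, 6% otherwise.
        # Treating a balance of exactly 1000 as earning 6% is an assumption.
        rate = 0.05 if self.balance < 1000 else 0.06
        self.balance += self.balance * rate


# Callers only send the "pay_interest" message; they never see the rate logic.
small = BankAccount("A-101", 500.0)
large = BankAccount("A-102", 2000.0)
small.pay_interest()
large.pay_interest()
print(small.balance, large.balance)
```

Because the policy lives inside the method, a later change of rates would touch only `pay_interest`, and the two calling lines would stay as they are.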
The basic unit that an object-oriented (OO-DBMS) manages is the object. It is based on four
basic concepts of abstraction:
 Classification: Mapping of several objects (instances) to a common class
 Generalization: Grouping several classes that have properties in common into a more general class (e.g. roads and railways into a transportation network)
 Association: A relation between similar objects is considered a higher-level set object
 Aggregation: Objects that consist of several other objects (composed objects)
The OO model uses objects rather than records to manage data. An object is a collection of data elements and operations that together are considered a single entity.
An object has associated with it a set of variables that contain the data for the object, a set of messages to which the object responds, and a method that responds to each message. Once the structure is set up, its details need not be visible to the user. This approach has the attraction that querying is very natural.
Objects are typed and the format and operations of an object instance are the same as some
object prototype
An example of an object might be a lake:
 List of border chains: C1, C2, C3, ..., Cn
 List of nodes: N1, N2, N3, ..., Nn
 Attributes: Depth, soil type
Object oriented Model Examples
[Figure: class hierarchy with Student at the top and First year and Second year below it]
For example, Student can be a superclass. First-year and second-year students may be represented by classes that are specializations of the Student class. Variables and methods specific to first-year students are associated with the First year student class. Variables and methods that apply to both first- and second-year students are associated with the Student class. The variables associated with each class may be:
Student: Name, ID, address
First year student: Subject
Second year student: Practical course
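The student hierarchy can be sketched with Python classes. The class names, the `describe` method, and the sample values are hypothetical and chosen only to mirror the variables listed above.

```python
class Student:
    """Superclass: variables shared by all students."""

    def __init__(self, name, student_id, address):
        self.name = name
        self.student_id = student_id
        self.address = address

    def describe(self):
        # A method defined once here is inherited by both subclasses.
        return f"{self.name} ({self.student_id})"


class FirstYearStudent(Student):
    """Specialization that adds a first-year-only variable."""

    def __init__(self, name, student_id, address, subject):
        super().__init__(name, student_id, address)
        self.subject = subject


class SecondYearStudent(Student):
    """Specialization that adds a second-year-only variable."""

    def __init__(self, name, student_id, address, practical_course):
        super().__init__(name, student_id, address)
        self.practical_course = practical_course


fy = FirstYearStudent("Asha", "S001", "12 Lake Road", "Mathematics")
sy = SecondYearStudent("Ravi", "S002", "4 Hill Street", "Database Lab")
print(fy.describe(), fy.subject)
print(sy.describe(), sy.practical_course)
```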
RECORD BASED LOGICAL MODELS

Used in describing data at the logical and view level.
They are used both to specify the overall logical structure of the database and to provide a higher-level description of the implementation.
Record based models are so named because the database is structured in ‘fixed format’ records of several types.
Each record type defines a fixed number of fields or attributes, and each field is of a fixed length.
The 3 most widely accepted record based data models are:
Relational
Network (used in a large number of older databases)
Hierarchical (used in a large number of older databases)
Relational model
The ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects. The Entity Relationship Diagram (ER Diagram) is used to represent the requirement analysis at the conceptual design stage. The database is designed from the ER Diagram; in other words, the ER diagram is converted to the database. Each entity in the ER diagram corresponds to a table in the database. The attributes of an entity correspond to the fields of a table.
For the database designer, the utility of the ER model is:
 It maps well to the relational model. The constructs used in the ER model can easily be
transformed into relational tables.
 It is simple and easy to understand with a minimum of training. Therefore, the model can be
used by the database designer to communicate the design to the end user.
 In addition, the model can be used as a design plan by the database developer to implement a
data model in specific database management software.
Elements of E-R
A relational database consists of a collection of tables, each of which is assigned a unique name. The relational model differs from the network and hierarchical models in that it does not use pointers or links. Instead, the relational model relates records by the values they contain. This freedom from the use of pointers allows a formal mathematical foundation to be defined.
Examples of RDBMSs are Oracle, Informix, and Sybase.
Independence of physical data storage and logical database structure: users do not need to understand the underlying physical layout of the data to access data from a logical structure, such as a table.
Variable and easy access to all data: access to data is not predefined as in hierarchical databases, in which users must understand and navigate through the hierarchy to retrieve data.
Flexibility in database design: complex objects are expressed as simple tables and relationships.
Reduced redundancy: applying relational design methods (normalization) reduces data redundancy and storage requirements.
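The idea that relational tables are related by the values they contain, and not by pointers, can be illustrated with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account  (account_no TEXT PRIMARY KEY, customer_id INTEGER, balance REAL);
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO account  VALUES ('A-101', 1, 500.0), ('A-102', 2, 2000.0), ('A-103', 1, 750.0);
""")

# The rows are related only by matching customer_id values, not by stored pointers.
rows = conn.execute("""
    SELECT c.name, a.account_no
    FROM customer AS c JOIN account AS a ON a.customer_id = c.customer_id
    ORDER BY a.account_no
""").fetchall()
print(rows)
```

The query names no access path; it only states which values must match, and the DBMS decides how to find them.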
Advantages and Disadvantages of E-R Data Model
Advantages of an E-R Model
Straightforward relation representation: Having designed an E-R diagram for a database
application, the relational representation of the database model becomes relatively
straightforward.
Easy conversion from E-R to other data models: Conversion from an E-R diagram to a network or hierarchical data model can easily be accomplished.
Graphical representation for better understanding: An E-R model gives a graphical and diagrammatical representation of the various entities, their attributes, and the relationships between entities. This in turn helps in the clear understanding of the data structure and in minimizing redundancy and other problems.
Disadvantages of an E-R Model:
No industry standard for notation: There is no industry standard notation for developing an E-R diagram.
Popular for high-level design: The E-R data model is especially popular for high-level database design.
HIERARCHICAL MODEL
A hierarchical database model is a data model in which the data is organized into a tree-like structure. The structure allows information to be represented using parent/child relationships: each parent can have many children, but each child has only one parent (also known as a 1-to-many relationship). All attributes of a specific record are listed under an entity type.
In a database an entity type is the equivalent of a table. Each individual record is represented as a
row, and each attribute as a column. Entity types are related to each other using 1:N mappings,
also known as one-to-many relationships. This model is recognized as the first database model
created by IBM in the 1960s.
 A parent may have an arrow pointing to a child, but a child must have an arrow pointing to
its parent. The database schema is represented as a collection of tree-structure diagrams.
 Each diagram corresponds to a single instance of a database tree.
 The root of this tree is a dummy node.
 The children of that node are actual instances of the appropriate record type.
 When transforming E-R diagrams to corresponding tree-structure diagrams, we must ensure
that the resulting diagrams are in the form of rooted trees.
A hierarchical database consists of a collection of records that are connected to each other through links. A record is similar to a record in the network model. Each record is a collection of fields (attributes), each of which contains only one data value. A link is an association between precisely two records. Thus, a link here is similar to a link in the network model.
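A rooted tree of records, in which every child has exactly one parent, can be sketched as follows. The `Record` class, the `preorder` helper, and the customer/account sample data are hypothetical and are not taken from any particular hierarchical DBMS.

```python
class Record:
    """A record with fields and at most one parent (hypothetical sketch)."""

    def __init__(self, record_type, **fields):
        self.record_type = record_type
        self.fields = fields
        self.parent = None
        self.children = []

    def add_child(self, child):
        # The hierarchical model forbids a second parent for the same child.
        if child.parent is not None:
            raise ValueError("a child record may have only one parent")
        child.parent = self
        self.children.append(child)
        return child


def preorder(record, depth=0):
    """Yield (depth, record) pairs in top-down, left-to-right order."""
    yield depth, record
    for child in record.children:
        yield from preorder(child, depth + 1)


root = Record("root")  # dummy root node
cust = root.add_child(Record("customer", name="Asha"))
cust.add_child(Record("account", number="A-101"))
cust.add_child(Record("account", number="A-103"))

for depth, rec in preorder(root):
    print("  " * depth + rec.record_type, rec.fields)
```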
NETWORK MODEL
The network model is a database model in which data is represented as records connected by links. Network data models are normally used for representing complex data relationships effectively, improving database performance, and establishing a database standard. The network model replaces the hierarchical tree with a graph, thus allowing more general connections among the nodes.
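The difference between a graph and a tree can be shown with a small sketch in which one account record is linked to two customer records, something a strict tree would not allow. The record labels and the `connect` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Each link connects exactly two records; a record may take part in many links.
links = defaultdict(set)


def connect(a, b):
    """Record an undirected link between two records."""
    links[a].add(b)
    links[b].add(a)


connect("customer:Asha", "account:A-201")
connect("customer:Ravi", "account:A-201")  # joint account: linked to two customers
connect("customer:Asha", "account:A-101")

owners_of_joint = sorted(r for r in links["account:A-201"] if r.startswith("customer:"))
print(owners_of_joint)
```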
Advantages and Disadvantages of Network Model
The Network model retains almost all the advantages of the hierarchical model while eliminating
some of its shortcomings.
The main advantages of the network model are
Conceptual simplicity: Just like the hierarchical model, the network model is also conceptually simple and easy to design.
Capability to handle more relationship types: The network model can handle one-to-many (1:N) and many-to-many (M:N) relationships, which is a real help in modeling real-life situations.
Ease of data access: The data access is easier and flexible than the hierarchical model.
Data Integrity: The network model does not allow a member to exist without an owner. Thus, a
user must first define the owner record and then the member record. This ensures the data
integrity.
Data independence: The network model is better than the hierarchical model in isolating the
programs from the complex physical storage details.
Database Standards: One of the major drawbacks of the hierarchical model was the non-availability of universal standards for database design and modeling. The network model is based on the standards formulated by the DBTG and augmented by ANSI/SPARC (American National Standards Institute/Standards Planning and Requirements Committee) in the 1970s. All the network database management systems conformed to these standards. These standards included a Data Definition Language (DDL) and a Data Manipulation Language (DML), thus greatly enhancing database administration and portability.
Disadvantages of Network Model
Even though the network database model was significantly better than the hierarchical database
model, it also had many drawbacks. Some of them are:
System complexity: All the records are maintained using pointers and hence the whole database
structure becomes very complex.
Operational Anomalies: As discussed earlier, the network model's insertion, deletion and updating operations on any record require a large number of pointer adjustments, which makes its implementation very complex.
Absence of structural independence: Since the data access method in the network database
model is a navigational system, making structural changes to the database is very difficult in
most cases and impossible in some cases. If changes are made to the database structure then all
the application programs need to be modified before they can access data. Thus, even though the
network database model succeeds in achieving data independence, it still fails to achieve
structural independence.
Because of the disadvantages mentioned and the implementation and administration complexities, the relational database model replaced both the hierarchical and network database models in the 1980s. The evolution of the relational database model is considered one of the greatest events, a major breakthrough, in the history of database management.
OPERATIONAL DATA
1. Data for the daily operations of the business.
2. Mostly stored in relational databases, optimized to support transactions representing daily operations (CRUD: create, read, update, delete).
3. Data and knowledge are probably among the most important assets an organization has. Data is stored for operational, backup, or archive purposes.
4. Typically stored in databases (data from transaction processing systems or customer data) or in files (business documents, images, or company brochures) on disk storage.
5. Common problems: lack of data standards and definitions, poor-quality data, and formats and organization that are not useful.
6. A database for transaction processing systems can use data warehouse concepts to provide clean data at lower cost.
7. Contains data that are continually updated as transactions are processed.
8. This type of data includes such information as the number of aircraft on the mission, lost, damaged, or aborted.
9. Data that specifies, as far as possible, every operation, procedure, and instrument needed to measure a construct.
10. Characterized by time span, granularity, and dimensionality.
11. Is sometimes too large to be queried and may bring the whole system down.
12. Examples: sales, costs, inventory.
13. Informatics.
14. Data used for operational needs: data that helps a function or operation follow through, produced through queries.
15. Continuous; covers expense areas such as marketing, HR, and accounting.
16. What it takes to run the day-to-day business.
17. Serves the needs of day-to-day operations.
DATA INDEPENDENCE
The ability to modify a schema definition in one level without affecting a schema definition in
the next higher level is called as ‘Data Independence’
There are 2 levels of Data Independence:
Physical Data Independence :
Protection from changes in physical structure of data
Logical Data Independence :
Protection from changes in logical structure of data.
Physical Data Independence

It is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. Put another way, it is the ability to change the internal schema without having to change the conceptual schema. By extension, the external schemas should not need to change either.
Physical file reorganization to improve performance (such as creating access structures) results
in a change to the internal schema. If the same data as before remains in the database, the
conceptual schema should not change.
For example, providing an access path to improve the retrieval speed of section records by semester and year should not require a query to be changed, although the query should become more efficient by utilizing the access path. With a multi-level DBMS, the catalogue must be expanded to include information on how to map requests and data among the levels. The DBMS uses additional software to accomplish the mappings.
Data independence occurs because when the schema is changed at some level, the schema at the
next higher level remains unchanged. Only the mapping between the levels is changed.
e.g. Changing the width of a particular column, i.e. changing it from varchar2(10) to varchar2(25): the application programs are not affected and need not be changed.
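Physical data independence can be demonstrated with a minimal `sqlite3` sketch: an access path (an index) is added, and the same query text returns the same rows before and after. The `section` table, its columns, the index name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE section (section_id INTEGER, semester TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?)",
    [(1, "Fall", 2023), (2, "Spring", 2024), (3, "Fall", 2024)],
)

QUERY = "SELECT section_id FROM section WHERE semester = ? AND year = ? ORDER BY section_id"

before = conn.execute(QUERY, ("Fall", 2024)).fetchall()

# Physical-level change: add an access path. The query text is untouched.
conn.execute("CREATE INDEX idx_section_sem_year ON section (semester, year)")

after = conn.execute(QUERY, ("Fall", 2024)).fetchall()
print(before, after)
```

Only the way the DBMS locates the rows can change; the application sees identical results.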
Logical Data Independence
The ability to modify the logical schema without causing application programs to be rewritten.
Modifications at the logical level are necessary whenever the logical structure of the database is altered. Logical data independence is more difficult to achieve than physical data independence, since application programs are heavily dependent on the logical structure of the data that they access.
The concept of data independence is similar in many respects to the concept of ‘Abstract Data Types’ in programming languages. Both of them ‘hide implementation details’ from the users, to allow the users to concentrate on the general structure rather than on low-level implementation details.
Logical data independence is the ability to change the conceptual schema without having to change the external schemas or application programs. When data is added or removed, only the view definition and the mappings need to be changed in a DBMS that supports logical data independence.
If the conceptual schema undergoes a logical reorganization, application programs that reference the external schema constructs should continue to work as before.
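Logical data independence can be sketched in the same spirit: an application reads through a view, the underlying table gains a column, and the application query keeps working unchanged. The `student` table, the `student_names` view, and the added `login` column are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT);
    INSERT INTO student VALUES ('S001', 'Asha');
    CREATE VIEW student_names AS SELECT sid, name FROM student;
""")

# The application only ever refers to the view (its external schema).
APP_QUERY = "SELECT sid, name FROM student_names ORDER BY sid"
before = conn.execute(APP_QUERY).fetchall()

# Logical-level change: the conceptual schema gains an attribute.
conn.execute("ALTER TABLE student ADD COLUMN login TEXT")

after = conn.execute(APP_QUERY).fetchall()
print(before, after)
```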
VIEW OF THE DATA
A major purpose of a database system is to provide users with an “Abstract View” of the data, i.e. the system hides certain details of how the data are stored and maintained.
Data Abstraction
For the system to be usable, it must retrieve the data efficiently. This led to the design of
complex data structures for the representation of data in the database.
Since many database system users are not computer trained, developers hide the complexity
from users through several levels of abstraction, to simplify users’ interaction with the system.
Physical Level
The Lowest level of abstraction describes “how” the data are actually stored. At the Physical
level, complex low level data structures are described in detail.
Logical Level
The next higher level of abstraction. It describes what data are stored in the database and what relationships exist among the data.
The entire database is expressed in terms of a small number of simple structures.
Although implementation of the simple structures at the logical level may involve complex physical level structures, the user of the logical level does not need to be aware of this complexity. The logical level of abstraction is used by the DBA, who must decide what information is to be kept in the database.
View Level
Highest level of abstraction.
Describes only part of the entire database.
Some complexity remains even at the logical level. Many users of the database system will not be concerned with all this information. Instead, such users need to access only part of the database; so that their interaction with the system is simplified, the view level of abstraction is defined.
The system provides many views for the same database. The 3 levels of data abstraction can be pictured as a stack, with the views at the top, the logical level in the middle, and the physical level at the bottom.
An analogy with the concept of ‘datatypes’ in programming languages helps to clarify the distinction among the various levels of abstraction.
E.g. consider a pascal ‘record’ data type (Similar to Structure data type of C)
type customer = record
customer_name : string;
customer_street: string;
customer_city : string;
end;
Example: University Database
Conceptual schema
Students(sid: string, name: string, login: string, age: integer, gpa:real)
Courses(cid: string, cname:string, credits:integer)
Enrolled(sid:string, cid:string, grade:string)
Physical schema
Relations stored as unordered files.
Index on first column of Students.
External Schema (View):
Course_info(cid:string,enrollment:integer)
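The University Database example can be tried out with Python's `sqlite3` module. The sketch below assumes that `Course_info` reports the number of enrolled students per course, which the schema above suggests but does not state; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema
    CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- External schema: one view over the conceptual schema (definition assumed)
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

    INSERT INTO Students VALUES ('S001', 'Asha', 'asha', 19, 3.6), ('S002', 'Ravi', 'ravi', 20, 3.2);
    INSERT INTO Courses  VALUES ('C10', 'Databases', 4), ('C20', 'Networks', 3);
    INSERT INTO Enrolled VALUES ('S001', 'C10', 'A'), ('S002', 'C10', 'B'), ('S001', 'C20', 'A');
""")

# A user of the view sees enrollment counts, not the individual Enrolled rows.
course_info = conn.execute("SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```

The physical schema does not appear in the code at all: how the tables are laid out in files and which indexes exist is left to the DBMS.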
ARCHITECTURE FOR A DATABASE SYSTEM
A Database system is partitioned into modules that deal with each of the responsibilities of the
overall system. The design of a database system must include consideration of the interface
between the database system and the operating system. The functional components of a database
system are broadly divided into:
 QUERY PROCESSOR COMPONENTS
 STORAGE MANAGER COMPONENTS

QUERY PROCESSOR INCLUDES
1. DML Compiler
2. Embedded DML Precompiler
3. DDL Interpreter
4. Query Evaluation Engine.
DML compiler
Translates the DML statements in a query language into low-level instructions that the query evaluation engine can understand. The DML compiler also attempts to transform a user’s request into an equivalent but more efficient form.
Embedded DML Precompiler
Converts DML statements embedded in an application program to normal procedure calls in the
host language. The precompiler must interact with the DML compiler to generate appropriate
code.
DDL Interpreter
Interprets DDL statements and records them in a set of tables containing metadata.
Query Evaluation Engine
Executes low level instructions generated by the DML compiler.
STORAGE MANAGER COMPONENTS INCLUDE:
1. Authorisation and Integrity manager
2. Transaction Manager
3. File Manager
4. Buffer Manager.
Authorisation and Integrity Manager
Tests for the satisfaction of integrity constraints and checks the authority of users to access the
data.
Transaction Manager
Ensures that the database remains in a consistent (correct) state despite system failures and that concurrent transaction executions proceed without conflicting.
File Manager
Manages the allocation of space on the disk storage and the data structures used to represent the
information stored on disk.
Buffer Manager:
Responsible for fetching data from the disk storage into main memory and deciding what data to
cache in memory.
Database System Environment
DBMS Components
Stored Data Manager
 The database and the database catalogue are stored on disk
 Access to the disk is handled by the Operating System.
 A higher-level stored data manager controls access to DBMS information that is stored on
disk, whether part of the database or the catalogue.
 The stored data manager may use basic OS services for carrying out low-level data transfer,
such as handling buffers.
 Once data is in buffers, the other DBMS modules, as well as other application programs can
process it.
DDL Compiler
Processes the schema definitions and stores the descriptions (meta-data) in the catalogue.
Runtime Database Processor
 Handles database access at runtime.
 Receives retrieval or update operations and carries them out on the database.
 Access to the disk goes through the stored data manager.
Query Compiler
Handles high-level queries entered interactively.
Parses, analyzes and interprets a query, then generates calls to the runtime processor for
execution.
Precompiler
Extracts DML commands from an application program written in a host language.
Commands are sent to DML compiler for compilation into code for database access. The rest is
sent to the host language compiler.
Client Program
Runs on a computer separate from the one on which the database resides and accesses the DBMS from there. The computer running the client program is called the client computer, and the other is the database server. In some cases the client accesses a middle computer, called the application server, which in turn accesses the database server.
Database System Utilities

DBMSs have database utilities that help the DBA manage the system. Functions include:
Loading - used to load existing text/sequential files into the database. Source format and desired
target file are specified to the utility, and the utility reformats the data to load into a table.
Backup – creates a backup copy of the database, usually by dumping database onto tape. Can
be used to restore the database in case of failure. Incremental backup can be used which records
only the changes since the last backup.
File Reorganization – reorganize database files into different file organizations to improve
performance.
Performance Monitoring – monitors database usage and provides statistics to the DBA. DBA
uses the statistics for decision-making.
Tools, Environments and Communication Facilities
CASE Tools – used in the design phase to help speed up the development process.
Data dictionary system – stores catalog information about schemas and constraints, as well as design decisions, usage standards, application program descriptions, and user information. Also called an information repository. Can be accessed directly by the DBA or users when needed.
Application development environments – (e.g. JBuilder) provide an environment for developing database applications, and include facilities to help in database design, GUI development, querying and updating, and application development.
Communication software – allows users at remote locations to access the database through computer terminals, workstations or personal computers. These are connected to the database through data communications hardware such as phone lines, local area networks etc.
DISTRIBUTED DATABASES
A distributed database is not stored in its entirety at a single physical location. Instead, it is
spread across a network of computers that are geographically dispersed and connected via
communications links.
A distributed database allows faster local queries and can reduce network traffic. With these benefits comes the issue of maintaining data integrity. During the 1950s and 1960s the trend was to use independent or decentralized systems, which led to duplication of hardware and facilities. In a centralized database system, the DBMS and the data reside at a single location and all control is exercised from that location, although the PCs that access it may be distributed geographically. A distributed system is a form of parallel computing that uses multiple independent computers communicating over a network to accomplish a common objective or task. The type of hardware, programming languages, operating systems and other resources may vary drastically. It is similar to computer clustering, with the main difference being a wide geographic dispersion of the resources.
For example, an organization may have an office in one building and many sub-buildings that are connected using a LAN. The current trend is towards distributed systems, in which a central system is connected to intelligent remote sites. Each remote site has its own storage and processing capabilities, whereas in a centralized system there is a single storage location.
A key objective for a distributed system is that it looks like a centralized system to the user. The
user should not need to know where a piece of data is stored physically.
A database user accesses the distributed database through:
Local applications
applications which do not require data from other sites.
Global applications
applications which do require data from other sites.
A homogeneous distributed database has identical software and hardware running all database instances, and may appear through a single interface as if it were a single database.
A heterogeneous distributed database may have different hardware, operating systems, database
management systems, and even data models for different databases.
Forms of Distributed Data

There are five categories of distributed data:
 Replicated data,
 Horizontally fragmented data,
 Vertically fragmented data,
 Reorganized data,
 Separate-schema data.
Replicated Data
Replicated data means that copies of the same data are maintained in more than one location.
Data may be replicated across multiple machines to avoid transmitting data between systems.
Replicas can be read only or writable. Read only replicas have changes made to the original
and then propagated outwards to the replicas. Writable replicas propagate changes back to the
original using either a "write through" or a "write back" strategy.
Replicated data is most effective when data is not updated frequently.
Horizontally Fragmented Data

Horizontally fragmented data means that the rows of a table are distributed across different sites based on the values of one or more primary keys. This type of data distribution is typical where, for example, branch offices in an organization deal mostly with a set of local customers and the related customer data need not be accessed by other branch offices.
Vertically Fragmented Data

Vertically fragmented data is data that has been split by columns across multiple systems.
The primary key is replicated at each site. For example, a district office may maintain client
information such as name and address keyed on client number while head office maintains client
account balance and credit information, also keyed on the same client number.
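Horizontal and vertical fragmentation, and the way the original table can be rebuilt from the fragments, can be sketched with plain Python lists. The client rows, the `branch` column used to pick a site, and the variable names are assumptions made for illustration.

```python
clients = [
    {"client_no": 1, "name": "Asha", "branch": "North", "balance": 500.0},
    {"client_no": 2, "name": "Ravi", "branch": "South", "balance": 2000.0},
    {"client_no": 3, "name": "Meena", "branch": "North", "balance": 750.0},
]

# Horizontal fragmentation: whole rows are assigned to sites (here, by branch).
horizontal = {}
for row in clients:
    horizontal.setdefault(row["branch"], []).append(row)

# Vertical fragmentation: columns are split, and the key is kept in every fragment.
district = [{"client_no": r["client_no"], "name": r["name"], "branch": r["branch"]} for r in clients]
head_office = [{"client_no": r["client_no"], "balance": r["balance"]} for r in clients]

# Reconstruction: union of horizontal fragments, join of vertical fragments on the key.
union = sorted((r for rows in horizontal.values() for r in rows), key=lambda r: r["client_no"])
balances = {r["client_no"]: r["balance"] for r in head_office}
joined = [{**r, "balance": balances[r["client_no"]]} for r in district]

print(union == clients, joined == clients)
```

The replicated key is what makes the vertical fragments joinable; without `client_no` in both fragments the rows could not be matched up again.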
Reorganized Data
Reorganized data is data that has been derived, summarized, or otherwise manipulated in
some way. This type of data organization is common where decision-support processing is
performed.
There may be some instances where the on-line transaction processing (OLTP) and decision-
support database management systems are different. Decision-support typically requires better
query optimization and ad hoc SQL support than does OLTP. OLTP usually requires
optimization for high-volume transaction processing.
Separate-Schema Data
Separate-schema data maintains separate databases and application programs for different
systems. For example, one system may manage inventory and one may handle customer orders.
There may be a certain amount of duplication with separate-schema data.
ADVANTAGES OF DISTRIBUTED DATABASE

 Management of distributed data with different levels of transparency like network
transparency, fragmentation transparency, replication transparency, etc.
 Increase reliability and availability
 Easier expansion
 Reflects organizational structure — database fragments potentially stored within the
departments they relate to
 Local autonomy or site autonomy — a department can control the data about them (as they
are the ones familiar with it)
 Protection of valuable data — if there were ever a catastrophic event such as a fire, all of the
data would not be in one place, but distributed in multiple locations
 Improved performance — data is located near the site of greatest demand, and the database
systems themselves are parallelized, allowing load on the databases to be balanced among
servers. (A high load on one module of the database won't affect other modules of the database
in a distributed database)
 Economics — it may cost less to create a network of smaller computers with the power of a
single large computer
 Modularity — systems can be modified, added and removed from the distributed database
without affecting other modules (systems)
 Reliable transactions - due to replication of the database
 Hardware, operating-system, network, fragmentation, DBMS, replication and location
independence
 Continuous operation, even if some nodes go offline (depending on design)
 Distributed query processing can improve performance
 Distributed transaction management
 A single-site failure need not bring down the whole system, since the other sites can continue to operate.
 All transactions follow the A.C.I.D. properties:
A-atomicity, the transaction takes place as a whole or not at all
C-consistency, maps one consistent DB state to another
I-isolation, each transaction sees a consistent DB
D-durability, the results of a transaction must survive system failures
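Atomicity, the first of these properties, can be illustrated on a single-site database with Python's `sqlite3` module; a distributed DBMS has to provide the same guarantee across sites, which this sketch does not attempt. The `account` table, the `transfer` function, and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500.0), ("A-102", 2000.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE account_no = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE account_no = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A-101", "A-102", 200.0)
failed = transfer(conn, "A-101", "A-102", 1000.0)  # would overdraw A-101
balances = dict(conn.execute("SELECT account_no, balance FROM account"))
print(ok, failed, balances)
```

In the second call the credit to `A-102` has already been applied when the debit violates the `CHECK` constraint, so the rollback is what keeps the database in a consistent state.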
STORAGE STRUCTURES: REPRESENTATION OF DATA

 Physical Database Design
 Key issues are efficiency & performance
 How to store records efficiently on disk.
 Disk Access, Physical Sequence, Virtual Sequence
 Available Storage Mechanisms
 ISAM, B-Trees, Hashing
 Disk Access Review
 Physical database design is the process of selecting the appropriate storage representation
for database tables.
 requires details & frequency of common accesses
 Basic Storage Concepts (Hard Disk)
 disk access time = seek time + rotational delay + data transfer time
 Disk access times are much slower than access to main memory.
 Overriding DBMS performance objective is to minimize the number of disk accesses (disk
I/Os)
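The access-time formula above can be tried with numbers. This is a minimal sketch, assuming an average rotational delay of half a revolution and treating transfer time as an optional extra term; the function name and the 9 ms seek time and 7200 rpm figures are illustrative values, not taken from the text.

```python
def disk_access_time_ms(seek_ms, rpm, transfer_ms=0.0):
    """Estimate the average time for one disk access, in milliseconds."""
    # One revolution takes 60,000 / rpm milliseconds; on average the wanted
    # sector is half a revolution away from the head.
    rotational_delay_ms = 0.5 * 60_000.0 / rpm
    return seek_ms + rotational_delay_ms + transfer_ms

# Hypothetical drive: 9 ms average seek, 7200 revolutions per minute.
t = disk_access_time_ms(seek_ms=9.0, rpm=7200)
```

Under these assumptions a single access costs a little over 13 ms, which is why the DBMS tries to minimize the number of disk I/Os.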
[Figure: example of a hard disk]
Data representation is, in general, how information is conceived, manipulated, and recorded. The
term can also be defined as the form in which data and information are kept in a certain
environment. How data is stored varies from one environment to another, with each environment
having its own set of rules and standards.
Data is raw, unprocessed facts. In and of itself it may not mean much.
Information, on the other hand, is processed data that has meaning.
Examples:
Data: A person's age, a person's gender, or the color of a car. Individually, these do not mean much.
Information: If a program takes the above data and processes it, the results take on meaning.
Example: A 25-year-old man likes to drive a red car.
DATA STRUCTURES AND CORRESPONDING OPERATORS
Introduction
A Data Structure Diagram (DSD) is a diagram of a conceptual data model which documents
the entities and their relationships, as well as the constraints that bind them.
The basic graphic notation elements of DSDs are boxes, representing entities, and arrows,
representing relationships. Data structure diagrams are most useful for documenting complex
data entities.
Overview
[Figure: Data Structure Diagram]
A data structure diagram is a diagram type that is used to depict the structure of data elements in
the data dictionary. It is a graphical alternative to the composition
specifications within such data dictionary entries.
The data structure diagram is a predecessor of the entity-relationship model (E-R model). In
DSDs, attributes are specified inside the entity boxes rather than outside of them, while
relationships are drawn as boxes composed of attributes which specify the constraints that bind
entities together. DSDs differ from the E-R model in that the E-R model focuses on the
relationships between different entities, whereas DSDs focus on the relationships of the elements
within an entity.
There are several styles for representing data structure diagrams, with the notable difference in
the manner of defining cardinality. The choices are between arrow heads, inverted arrow heads
(crow's feet), or numerical representation of the cardinality.
Corresponding Operators
 Traversing.
 Searching.
 Inserting.
 Deleting.
 Sorting.
 Merging.
Traversing
Accessing each record exactly once so that certain items in the record may be processed. (This
accessing and processing is sometimes called "visiting" the records.)
Searching
Finding the location of the record with a given key value, or finding the locations of all records
which satisfy one or more conditions.
Inserting
Adding a new record to the structure.
Deleting
Removing an existing record from the structure.
Sorting
Arranging the records in some order (either ascending or descending).
Merging
Combining the records in two sorted files into a single sorted file.
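The six operators can be tried on a small in-memory file. This is a minimal sketch in Python, assuming each record is a (key, name) pair held in a list; the sample records are invented for illustration.

```python
import bisect
from heapq import merge

# A small "file" of (key, name) records; the values are hypothetical.
records = [(3, "Chitra"), (1, "Arun"), (2, "Bala")]

# Traversing: visit each record exactly once.
names = [name for _, name in records]

# Sorting: arrange the records in ascending key order.
records.sort()

# Searching: locate the record with a given key (binary search on sorted keys).
keys = [k for k, _ in records]
pos = bisect.bisect_left(keys, 2)
found = records[pos] if pos < len(keys) and keys[pos] == 2 else None

# Inserting: add a new record while keeping the file sorted.
bisect.insort(records, (4, "Deepa"))

# Deleting: remove an existing record.
records.remove((1, "Arun"))

# Merging: combine two sorted files into a single sorted file.
other = [(0, "Zoya"), (5, "Esha")]
merged = list(merge(records, other))
```

Merging relies on both inputs already being sorted, which is why it is listed separately from sorting.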
RELATIONAL APPROACH
The relational model for database management is a database model based on first-order
predicate logic, first formulated and proposed in 1969 by Edgar F. Codd. In the relational model
of a database, all data is represented in terms of tuples, grouped into relations. A database
organized in terms of the relational model is a relational database.
In data modeling, an entity-relationship model (ERM) is a representation of structured
data; entity-relationship modeling is the process of generating these models. The end-product of
the modeling process is an entity-relationship diagram (ERD), a type of conceptual data model
or semantic data model. The first stage of information system design uses these models to
describe information needs or the type of information that is to be stored in a database during the
requirements analysis.
An Entity is an object that exists and is distinguishable from other objects. An entity may
be concrete (a person or a book, for example) or abstract (like a holiday or a concept).
An entity set is a set of entities of the same type (e.g., all persons having an account at a bank).
Entity sets need not be disjoint. For example, the entity set employee (all employees of a bank)
and the entity set customer (all customers of the bank) may have members in common.
The purpose of the relational model is to provide a declarative method for specifying data and
queries: users directly state what information the database contains and what information they
want from it, and let the database management system software take care of describing data
structures for storing the data and retrieval procedures for answering queries.
Most implementations of the relational model use the SQL data definition and query language. A
table in an SQL database schema corresponds to a predicate variable; the contents of a table to a
relation; key constraints, other constraints, and SQL queries correspond to predicates. However,
SQL databases, including DB2, deviate from the relational model in many details; Codd fiercely
argued against deviations that compromise the original principles.
Diagram of an example database according to the Relational model
An entity-relationship diagram is a graphical representation of the entities and the relationships
between them. Entity-relationship diagrams are a useful medium to achieve a common understanding
of data among users and application developers.
3 Operations:
 Insert
 Delete
 Update
All operations can be performed in this approach.
Properties of Relational Tables
 Values Are Atomic
 Each Row is Unique
 Column Values Are of the Same Kind
 The Sequence of Columns is Insignificant
 The Sequence of Rows is Insignificant
 Each Column Has a Unique Name
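Some of these properties can be observed in any SQL system. This is a minimal sketch using Python's built-in sqlite3 module; the supplier table and its rows are assumptions made for illustration, and the primary key is what enforces row uniqueness here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier (sno TEXT PRIMARY KEY, sname TEXT NOT NULL, city TEXT NOT NULL)")
conn.execute("INSERT INTO supplier VALUES ('S1', 'Smith', 'London')")
conn.execute("INSERT INTO supplier VALUES ('S2', 'Jones', 'Paris')")

# Each row is unique: a second row with the same key is rejected.
try:
    conn.execute("INSERT INTO supplier VALUES ('S1', 'Smith', 'London')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Update and delete address rows by value, never by position in the table.
conn.execute("UPDATE supplier SET city = 'Athens' WHERE sno = 'S2'")
conn.execute("DELETE FROM supplier WHERE sno = 'S1'")

# The sequence of columns is insignificant: columns are selected by name.
rows = conn.execute("SELECT city, sno FROM supplier").fetchall()
```

The same three operations (insert, delete, update) are applied here without any reference to where a row is stored.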
Rules for Relational DBMS
1. The Information rule
2. The Guaranteed Access rule
3. The Systematic Treatment of Null Values rule
4. The Dynamic Online Catalog Based on the Relational Model rule
5. The Comprehensive Data Sublanguage rule
6. The View Updating rule
7. The High-level Insert, Update, and Delete rule
8. The Physical Data Independence rule

9. The Logical Data Independence rule
10. The Integrity Independence rule
11. The Distribution Independence rule
12. The No subversion rule
HIERARCHICAL APPROACH
A hierarchical database model is a data model in which the data is organized into a tree-like
structure. The structure allows representing information using parent/child relationships: each
parent can have many children, but each child has only one parent (also known as a
1-to-many relationship). All attributes of a specific record are listed under an entity type.
Example of a hierarchical model
The data is organized hierarchically, as a downward-growing tree. This model uses pointers to
navigate between stored data. It was the first DBMS model.
In a database an entity type is the equivalent of a table. Each individual record is represented as a
row, and each attribute as a column. Entity types are related to each other using 1:N mappings,
also known as one-to-many relationships. The hierarchical model was created by IBM in the
1960s.
Mother-child relationship: a child may have only one mother, but a mother can have multiple
children. Mothers and children are tied together by links called "pointers". A mother has a
list of pointers to each of her children.
The Hierarchical Data Model is a way of organizing a database with multiple one-to-many
relationships. The structure is based on the rule that one parent can have many children but
children are allowed only one parent. This structure allows information to be repeated through
the parent-child relations. The model was created by IBM and was implemented mainly in its
Information Management System (IMS).
The database keeps track of the different record types, their attributes, and the hierarchical
relationships between them. The attribute which assigns records to levels in the database structure
is called the key.
Features
 a set of record "types"
e.g. supplier record type, department record type, part record type
 a set of links connecting all record types in one data structure diagram (tree)
 at most one link between two record types, hence links need not be named
 for every record, there is only one parent record at the next level up in the tree
e.g. every county has exactly one state, every part has exactly one department
 no connections between occurrences of the same record type
 cannot go between records at the same level unless they share the same parent
3 Operations:
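 Insert - A child record cannot be inserted without a parent record, so it is not possible
without introducing a special dummy parent (for example, a dummy part).
 Delete - Deleting a parent record deletes all the records beneath it, so the information they
held is deleted entirely.
 Update - Data repeated under several parents must be changed in every copy; updating only some
copies leaves the database inconsistent.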
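The single-parent rule and the effect of deletion can be sketched in a few lines of Python. The Record class, the preorder function and the state/county/city occurrences below are hypothetical names chosen for illustration; a real hierarchical DBMS such as IMS stores and navigates its pointers very differently.

```python
class Record:
    """A record occurrence in a hierarchy: one parent, any number of children."""

    def __init__(self, rtype, key, parent=None):
        self.rtype, self.key, self.parent = rtype, key, parent
        self.children = []                 # the parent's pointers to its children
        if parent is not None:
            parent.children.append(self)

    def delete(self):
        """Detach this record; everything beneath it goes with it."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

def preorder(record):
    """Yield keys top-down: the parent first, then each child's subtree."""
    yield record.key
    for child in record.children:
        yield from preorder(child)

state = Record("state", "State A")
county1 = Record("county", "County A1", parent=state)
county2 = Record("county", "County A2", parent=state)
city = Record("city", "City A1a", parent=county1)

before = list(preorder(state))
county1.delete()                           # the city under County A1 is lost too
after = list(preorder(state))
```

Because a city can only be reached through its county, removing the county makes the city unreachable as well, which is the deletion problem listed above.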
Advantages and disadvantages
 data must possess a tree structure
tree structure is natural for geographical data
 data access is easy via the key attribute, but difficult for other attributes
in the business case, easy to find record given its type (department, part or supplier)
in the geographical case, easy to find record given its geographical level (state, county, city,
census tract), but difficult to find it given any other attribute
e.g. find the records with population 5,000 or less
 tree structure is inflexible
cannot define new linkages between records once the tree is established
e.g. in the geographical case, new relationships between objects
 cannot define linkages laterally or diagonally in the tree, only vertically
NETWORK APPROACH
The network model is a database model conceived as a flexible way of representing
objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in
which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy
or lattice. A set consists of an owner record type, a set name, and a member record type. A
member record type can have that role in more than one set, hence the multiparent concept is
supported. An owner record type can also be a member or owner in another set.
3 Operations:
 Insert - A new record can be inserted on its own. Initially there are no connector records for
it, because its chain links only back to the record itself.
 Delete - The entire link (connector record) is deleted, and the two chains it belonged to are
adjusted appropriately (such adjustments may happen automatically).
 Update - Records can be updated, although some problems may arise.
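The set construct (owner record type, set name, member record type) and the multiparent idea can be sketched as follows. This is an illustrative Python model only; the SetType class and the supplier, part and shipment occurrences are assumptions, and a real DBTG system implements sets with pointer chains rather than dictionaries.

```python
from collections import defaultdict

class SetType:
    """A set: a set name, one owner record type and one member record type."""

    def __init__(self, name, owner_type, member_type):
        self.name, self.owner_type, self.member_type = name, owner_type, member_type
        self.members_of = defaultdict(list)   # owner key -> member keys
        self.owner_of = {}                    # member key -> owner key

    def connect(self, owner, member):
        # Within one set type a member occurrence has at most one owner.
        if member in self.owner_of:
            raise ValueError(f"{member} already has an owner in set {self.name}")
        self.members_of[owner].append(member)
        self.owner_of[member] = owner

# The shipment record type is a member in two sets, so each shipment
# occurrence has two parents: a supplier and a part.
s_sp = SetType("S-SP", "supplier", "shipment")
p_sp = SetType("P-SP", "part", "shipment")
s_sp.connect("S1", "SP1")
p_sp.connect("P1", "SP1")
s_sp.connect("S1", "SP2")
p_sp.connect("P2", "SP2")

# Navigate from supplier S1 through its shipments to the parts supplied.
parts_of_s1 = [p_sp.owner_of[sp] for sp in s_sp.members_of["S1"]]
```

The shipment records play the role of the connector records mentioned in the operations above.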

DBMS LANGUAGES
The DBMS mainly provides two database languages, namely, data definition language and data
manipulation language, to implement the databases. Data definition language (DDL) is used for
defining the database schema. The DBMS includes a DDL compiler that identifies and stores the
schema description in the DBMS catalog. Data manipulation language (DML) is used to
manipulate the database.
 DDL – the data definition language, used by the DBA and database designers to define the
conceptual and internal schemas.
 The DBMS has a DDL compiler to process DDL statements in order to identify the schema
constructs, and to store the description in the catalogue.
 In databases where there is a separation between the conceptual and internal schemas, DDL is
used to specify the conceptual schema, and SDL, storage definition language, is used to specify
the internal schema.
 Once the schemas are compiled, and the database is populated with data, users need to
manipulate the database. Manipulations include retrieval, insertion, deletion and modification.
 The DBMS provides operations using the DML, data manipulation language.
 In most DBMSs, the VDL, DDL and DML are not considered separate languages, but a
comprehensive integrated language for conceptual schema definition, view definition and data
manipulation. Storage definition is kept separate to fine-tune the performance, usually done by
the DBA staff.
An example of a comprehensive language is SQL, which represents a VDL, DDL and DML, as well as
statements for constraint specification, etc.
Data Definition Language
In DBMSs where no strict separation between the levels of the database is maintained, the data
definition language is used to define the conceptual and internal schemas for the database. On
the other hand, in DBMSs, where a clear separation is maintained between the conceptual and
internal levels, the DDL is used to specify the conceptual schema only. In such DBMSs, a
separate language, namely, storage definition language (SDL) is used to define the internal
schema. Some of the DBMSs that are based on true three-schema architecture use a third
language, namely, view definition language (VDL) to define the external schema.
The DDL statements are also used to specify the integrity rules (constraints) in order to maintain
the integrity of the database. The various integrity constraints are domain constraints, referential
integrity, assertions and authorization. These constraints are discussed in detail in subsequent
chapters. Like any other programming language, DDL also accepts input in the form of
instructions (statements) and generates the description of schema as output. The output is placed
in the data dictionary, which is a special type of table containing metadata. The DBMS consults the
data dictionary before reading or modifying the data. Note that database users cannot update
the data dictionary; instead it is modified only by the database system itself.
Data Manipulation Language
Once the database schemas are defined and the initial data is loaded into the database, several
operations such as retrieval, insertion, deletion, and modification can be applied to the database.
The DBMS provides data manipulation language (DML) that enables users to retrieve and
manipulate the data. The statement which is used to retrieve the information is called a query.
The part of the DML used to retrieve the information is called a query language. However, the terms
query language and DML are often used synonymously, though this is technically incorrect. DMLs are
of two types, namely, non-procedural DML and procedural DML.
The non-procedural (high-level or declarative) DML enables the user to specify complex
database operations concisely. It requires a user to specify what data is required without
specifying how to retrieve the required data. For example, SQL (Structured Query Language) is
a non-procedural query language, as it enables the user to easily define the structure or modify the
data in the database without specifying the details of how to manipulate the database. High-level
DML statements can be used on their own to specify complex database operations; DBMSs allow them
to be entered interactively from a terminal or to be embedded in a general purpose programming
language. If the commands are embedded in a general purpose programming
language, the statements must be identified so they can be extracted by a pre-compiler and
processed by the DBMS.
The procedural or low-level DML requires the user to specify what data is required and how to
access that data by providing a step-by-step procedure. For example, relational algebra is a
procedural query language, which consists of a set of operations such as select, project, union, etc.,
to manipulate the data in the database. A low-level DML must be embedded in a general purpose
programming language. It typically retrieves individual records or objects from the database and
processes each separately. Therefore it needs to use programming language constructs such as loops.
Low-level DMLs are also called record-at-a-time DMLs because of this. High-level DMLs, such
as SQL, can specify and retrieve many records in a single DML statement, and are called
set-at-a-time or set-oriented DMLs. High-level languages are often called declarative, because the
DML often specifies what to retrieve, rather than how to retrieve it.
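The difference between record-at-a-time and set-at-a-time access can be seen side by side. This is a minimal sketch with Python as the host language and SQLite as the DBMS; the employee table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Sales", 300), ("Ravi", "Sales", 500), ("Mala", "Accounts", 400)])

# Record-at-a-time (procedural style): the host program fetches every record
# and decides in a loop which ones qualify.
procedural = []
for name, dept, salary in conn.execute("SELECT name, dept, salary FROM employee"):
    if dept == "Sales" and salary > 350:
        procedural.append(name)

# Set-at-a-time (declarative style): one statement says what is wanted and
# the DBMS decides how to find it.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE dept = 'Sales' AND salary > 350")]
```

Both approaches return the same answer; only the second leaves the choice of access path to the DBMS.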
DDL
Data Definition Language (DDL) statements are used to define the database structure or schema.
Some examples:
 CREATE - to create objects in the database
 ALTER - alters the structure of the database
 DROP - delete objects from the database
 TRUNCATE - remove all records from a table and release the space allocated for the
records
 COMMENT - add comments to the data dictionary
 RENAME - rename an object
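A few of these DDL statements can be tried in SQLite through Python's sqlite3 module. This is a sketch under assumptions: the student table is invented, and SQLite does not provide TRUNCATE or COMMENT and spells RENAME as ALTER TABLE ... RENAME TO, so only CREATE, ALTER, RENAME and DROP are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def tables(conn):
    """Return the table names recorded in SQLite's catalog (sqlite_master)."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")  # CREATE
conn.execute("ALTER TABLE student ADD COLUMN dept TEXT")                       # ALTER
conn.execute("ALTER TABLE student RENAME TO learner")                          # RENAME
after_rename = tables(conn)
# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [r[1] for r in conn.execute("PRAGMA table_info(learner)")]
conn.execute("DROP TABLE learner")                                             # DROP
after_drop = tables(conn)
```

Each statement changes only the schema description held in the catalog; no user data is involved.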
DML
Data Manipulation Language (DML) statements are used for managing data within schema
objects. When DML commands are embedded in a general purpose programming language, the
programming language is called the host language and the DML is called the data sub-language.
A high-level language used in a standalone, interactive manner is called a query language.
Casual end users use a high-level query language to specify requests, whereas programmers usually
use embedded DML. Parametric end users usually interact with user-friendly interfaces, which
can also be used by casual users who don’t want to learn the high-level languages.
Some examples:
 SELECT - retrieve data from a database
SELECT * FROM table_name;
 INSERT - insert data into a table
 UPDATE - update existing data within a table
 DELETE - delete records from a table (all records if no WHERE clause is given); the space for
the records remains
DELETE FROM table_name;
 MERGE - UPSERT operation (insert or update)
 CALL - call a PL/SQL or Java subprogram
 EXPLAIN PLAN - explain access path to data
 LOCK TABLE - control concurrency
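The four basic DML statements can be run end to end in SQLite through Python's sqlite3 module. This is a minimal sketch; the item table and its rows are invented for illustration, and the Oracle-specific statements in the list (MERGE, CALL, EXPLAIN PLAN, LOCK TABLE) are not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)")

# INSERT: add three rows.
conn.executemany(
    "INSERT INTO item (id, name, qty) VALUES (?, ?, ?)",
    [(1, "pen", 10), (2, "ink", 0), (3, "pad", 5)])

# UPDATE: change existing data in the rows that match the condition.
conn.execute("UPDATE item SET qty = qty + 20 WHERE name = 'ink'")

# DELETE with a WHERE clause removes only the matching rows.
deleted_some = conn.execute("DELETE FROM item WHERE qty < 10").rowcount

# SELECT: retrieve what is left.
remaining = conn.execute("SELECT name, qty FROM item ORDER BY id").fetchall()

# DELETE without a WHERE clause removes every row but keeps the table.
conn.execute("DELETE FROM item")
count_after = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
```

After the final DELETE the table still exists and can be queried, unlike after a DROP.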

DBA (Database Administrator)
A database administrator (short form DBA) is a person responsible for the installation,
configuration, upgrade, administration, monitoring and maintenance of databases in an
organization. The role includes the development and design of database strategies, system
monitoring and improving database performance and capacity, and planning for future expansion
requirements. They may also plan, co-ordinate and implement security measures to safeguard the
database.
RESPONSIBILITIES OF DBA
The DBA is the overall commander of a computer system and so has a number of duties. Some
of the major responsibilities are as follows:
1. The DBA controls the data, hardware, and software and gives instructions to the
application programmer, end user and naive user.
2. The DBA decides the information content of the database, decides the suitable database file
structure for arrangement of data, and uses the proper DDL techniques.
3. The DBA compiles the whole data in a particular order and sequence.
4. The DBA decides where data is stored, i.e. takes decisions about the storage structure.
5. The DBA decides which access strategy and technique should be used for accessing the data.
6. The DBA communicates with users through appropriate meetings and co-operates with them.
7. The DBA defines and applies authorization checks and validation procedures.
8. The DBA takes backups of the data on a backup storage device so that if data is lost
it can be recovered and compiled again. The DBA also recovers damaged data.
9. The DBA changes the environment according to user or industry requirements and monitors
the performance.
10. The DBA should be a good decision-maker. The decisions taken by the DBA should be correct,
accurate and efficient.
11. The DBA should have leadership qualities.
12. The DBA liaises with users in the business to give customers confidence in the availability
of data.
Database User
(a) Naïve user
A naïve user has no knowledge of the database system or its supporting software. Such users
interact with the system only through end-user forms. They are like laymen who have only a little
knowledge of the computer system. These users are mainly employed for collecting data in
notebooks or on pre-designed forms. Automated teller machine (ATM) users are in this category.
A naïve user can work on any simple GUI-based, menu-driven system. Non-technical people who
use the Internet are also in this category.
(b) End User or Data Entry Operators

Data entry operators are basic-level computer users. The function of data entry
operators is only to operate the computer (start/stop the computer), to feed or type the
collected information (data) into a menu-driven application program, and to execute it according to
the analyst's requirements. These users are also called online users. They communicate with the
database directly via an online terminal or indirectly via a user interface. These users require
a certain amount of expertise in a computer programming language, but require complete
knowledge of computer operations.
(c) Application programmer

The application programmer is also called simply a programmer. The work of the application
programmer is to develop a new project, i.e. a program for a particular application, or to modify an
existing program. The application programmer works according to instructions given by the
database administrator (DBA). An application programmer can work in programming languages
such as Fortran, Cobol, dBase, etc.
(d) DBA (Data Base Administrator)

The DBA is a major user. The DBA is either a single person or a group of persons. The DBA is only
the custodian of the business firm or organization, not its owner, just as a bank
manager is the "DBA" of a bank, who takes care of the bank's money but does not use it. Only the DBA
can handle the information collected by end users and give instructions to the application
programmer for developing a new program or modifying an existing program. The DBA is also
called the overall controller of the organization. In the computer department of a firm, either a system
analyst or an EDP (Electronic Data Processing) Manager works as the DBA. In other words, the DBA is
the overall controller of the complete hardware and software.
CERTIFICATION
Employing organizations may require that a database administrator have a certification for the
particular RDBMS being used.
Skills
The skills required to become a database administrator include:
 Communication skills
 Knowledge of database theory
 Knowledge of database design
 Knowledge about the RDBMS itself, e.g. Oracle Database, IBM DB2, Microsoft SQL Server,
MaxDB
 Knowledge of Structured Query Language (SQL) and procedural extension language, e.g.
PL/SQL, SQL/PSM, Transact-SQL
 General understanding of distributed computing architectures, e.g. Client/Server,
Internet/Intranet, Enterprise
 General understanding of the underlying operating system, e.g. Windows, Unix, Linux
 General understanding of storage technologies, memory management, disk arrays.
Duties
A database administrator's responsibilities can include the following tasks:
 Installing and upgrading the database server and application tools
 Allocating system storage and planning future storage requirements for the database system
 Modifying the database structure, as necessary, from information given by application
developers
 Enrolling users and maintaining system security
 Ensuring compliance with database vendor license agreement
 Controlling and monitoring user access to the database
 Monitoring and optimizing the performance of the database
 Planning for backup and recovery of database information
 Maintaining archived data
 Backing up and restoring databases
 Contacting database vendor for technical support
 Game server hosting, administration of games, using a database.
Job titles
DBAs are also known by the titles Database Coordinator or Database Programmer, although a
database programmer requires more advanced skills in SQL programming than a DBA may
have, and a Database Programmer may not have and does not require the skills of database
administration, backing up, restoring, monitoring or tuning to do their job well. The role is
closely related to the other jobs of Database Analyst, Database Modeller, Programmer Analyst,
and Systems Manager.
Some organizations have a hierarchical level of database administrators, generally:
 Data Analysts/Query designers
 Junior DBAs
 Midlevel DBAs
 Senior DBAs
 DBA consultants
 Manager/Director of Database Administration/Information Technology
GENERAL DISCUSSION
Back-end database
A back-end database is a database that is accessed by users indirectly through an external
application rather than by application programming stored within the database itself or by low
level manipulation of the data (e.g. through SQL commands).
A back-end database stores data but does not include end-user application elements such as
stored queries, forms, macros or reports. Front-end and back-end are terms used to characterize
program interfaces and services relative to the initial user of these interfaces and services. (The
"user" may be a human being or a program.) A "front-end" application is one that application
users interact with directly. A "back-end" application or program serves indirectly in support of
the front-end services, usually by being closer to the required resource or having the capability to
communicate with the required resource. The back-end application may interact directly with the
front-end or, perhaps more typically, is a program called from an intermediate program that
mediates front-end and back-end activities.
Back end systems are corporate systems that are used to run a company such as systems to
manage orders, inventory and supply processing. Back end systems support the company's back
office. This system collects input from users or other systems for processing.
Oracle
The Oracle Database (commonly referred to as Oracle RDBMS or simply as Oracle) is
an object-relational database management system[2] produced and marketed by Oracle
Corporation.
Larry Ellison and his friends, former co-workers Bob Miner and Ed Oates, started the
consultancy Software Development Laboratories (SDL) in 1977. SDL developed the original
version of the Oracle software. The name Oracle comes from the code-name of a CIA-funded
project Ellison had worked on while previously employed by Ampex.[3]
STORAGE
The Oracle RDBMS stores data logically in the form of tablespaces and physically in the form of
data files ("datafiles").[6] Tablespaces can contain various types of memory segments, such as
Data Segments, Index Segments, etc. Segments in turn comprise one or more extents. Extents
comprise groups of contiguous data blocks. Data blocks form the basic units of data storage.
A DBA can impose maximum quotas on storage per user within each tablespace.
Partitioning
Newer versions of the database can also include a partitioning feature: this allows the
partitioning of tables based on different set of keys. Specific partitions can then be easily added
or dropped to help manage large data sets.
Monitoring
Oracle database management tracks its computer data storage with the help of information stored
in the SYSTEM tablespace. The SYSTEM tablespace contains the data dictionary and often
(by default) indexes and clusters. A data dictionary consists of a special collection of tables that
contains information about all user-objects in the database. Since version 8i, the Oracle RDBMS
also supports "locally managed" tablespaces which can store space management information in
bitmaps in their own headers rather than in the SYSTEM tablespace (as happens with the default
"dictionary-managed" tablespaces). Version 10g and later introduced the SYSAUX tablespace,
which contains some of the tables formerly stored in the SYSTEM tablespace, along with
objects for other tools such as OEM which previously required its own tablespace.
Disk files
Disk files primarily represent one of the following structures:
Data and index files: These files provide the physical storage of data, which can consist of the
data-dictionary data (associated to the tablespace SYSTEM), user data, or index data. These files
can be managed manually or managed by Oracle itself ("Oracle-managed files"). Note that a
datafile has to belong to exactly one tablespace, whereas a tablespace can consist of multiple
datafiles.
Redo log files, consisting of all changes to the database, used to recover from an instance
failure. Note that often a database will store these files multiple times, for extra security in case
of disk failure. The identical redo log files are said to belong to the same group.
Undo files: These special datafiles, which can only contain undo information, aid in recovery,
rollbacks, and read-consistency.
Archive log files: These files, copies of the redo log files, are usually stored at different
locations. They are necessary (for example) when applying changes to a standby database, or
when performing recovery after a media failure. It is possible to archive to multiple locations.
Tempfiles: These special datafiles serve exclusively for temporary storage data (used for
example for large sorts or for global temporary tables)
Control file, necessary for database startup. "A binary file that records the physical structure of a
database and contains the names and locations of redo log files, the time stamp of the database
creation, the current log sequence number, checkpoint information, and so on."[9]
At the physical level, data files comprise one or more data blocks, where the block size can vary
between data files.
Data files can occupy pre-allocated space in the file system of a computer server, utilize raw disk
directly, or exist within ASM logical volumes
ORACLE ERP
Problems with Non-ERP Systems
 In-house design limits connectivity outside the company
 Tendency toward separate IS’s within firm
 Lack of integration limits communication within the company
 Strategic decision-making not supported
 Long-term maintenance costs high
 Limits ability to engage in process reengineering
ERP
Smooth and seamless flow of information across organizational boundaries. Standardized
environment with shared database independent of applications and integrated applications.
ERP Applications
Core applications
 Online Transaction Processing (OLTP)

 transaction processing systems
 support the day-to-day operational activities of the business
 support mission-critical tasks through simple queries of operational databases
 include Sales and Distribution, Business Planning, Production Planning, Shop Floor
Control, and Logistics modules
Business analysis applications
 Online Analytical Processing (OLAP)

 decision support tool for management-critical tasks through analytical investigation of
complex data associations
 supplies management with “real-time” information and permits timely decisions to
improve performance and achieve competitive advantage
 includes decision support, modeling, information retrieval, ad-hoc reporting/analysis, and
what-if analysis
Evolution of ERP
Oracle ERP Application Developer

The Oracle ERP Application Developer is responsible for the maintenance of Oracle E-Business
applications. An employee in this classification designs, develops, and maintains Oracle E-
Business functionality through the use of Oracle E-Business development tools and
methodology.
The employee will be assigned to perform complex technical duties involving the analysis,
design, implementation and maintenance of the following Oracle E-business suite applications
including but not limited to: Oracle Financials, Oracle Human Resource Management (HRMS),
Oracle Hyperion Budgeting, and Oracle Procurement. The employee provides development
and/or customization support across the organization for the aforementioned modules. Performs
detailed analysis and evaluation, and makes recommendations to resolve simple to complex
business problems with the appropriate technology. This job is not of a routine clerical or
ministerial nature and requires the exercise of independent judgment. An employee in this
classification may be responsible for directing the activities of subordinates. Reports to the
Manager of Enterprise Resource Planning or designee.
Essential Functions
Performs the analysis, development, modification, and maintenance of one or more of the
following Oracle applications: Oracle Financials, Oracle Human Resources Management
(HRMS), Oracle Hyperion Budgeting, and Oracle Procurement.
Meets with department managers, technical and functional resources, and users to help define
business and application requirements and resolve technical questions. Creates and maintains
custom code, custom menus, custom forms, and custom reports based on functional
specifications.
May be responsible for providing technical direction and coordinating the activities of
subordinates participating in the maintenance of Oracle E-Business applications.
Supports applications by troubleshooting systems, working with users, and collaborating with
vendors and consultants.
Provides guidance on the capabilities of the City’s Oracle E-Business applications and their
application in order to ensure compliance with collective bargaining agreements and the City’s
policies and procedures.
Plans and implements solutions to system issues by identifying needs, researching solutions,
creating implementation plans, evaluating alternatives, executing applications, and completing
testing. Prepares, submits and tracks Oracle service requests.
Opens, monitors and tracks progress of Oracle service requests through the use of Metalink.
Performs analysis, design and development of Oracle applications utilizing the following tools:
PL/SQL, BI Publisher, Oracle Reports, Oracle Forms, Oracle Discoverer, Oracle FastFormulas,
Toad, and other suitable tools.
Oracle’s solution lines are:
• ERP (E-Business Suite, JD Edwards, PeopleSoft Finance, Supply Chain …)
• CRM (Siebel, RightNow, ...)
• HCM (Fusion HCM, Taleo, E-Business Suite and PeopleSoft)
• Edge (Product Life Cycle Management, Oracle Transportation Management and eTax)
• Business Analytics (BI, EPM, Data Warehouse)
• IDM/Security
• Fusion Middleware (SOA & Integration, Content Management & User Experience)
• Database & Platform (DB, Options, Hardware Infrastructure & Systems).
SQL
Structured Query Language (SQL) is a special-purpose programming language designed for managing
data held in a relational database management system (RDBMS).
Originally based upon relational algebra and tuple relational calculus, SQL consists of a data
definition language and a data manipulation language. The scope of SQL includes data insert,
query, update and delete, schema creation and modification, and data access control. Although
SQL is often described as, and to a great extent is, a declarative language (4GL), it also
includes procedural elements.
SQL was one of the first commercial languages for Edgar F. Codd's relational model, as
described in his influential 1970 paper, "A Relational Model of Data for Large Shared Data
Banks." Despite not entirely adhering to the relational model as described by Codd, it became the
most widely used database language.
SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the
International Organization for Standardization (ISO) in 1987. Since then, the standard has been
enhanced several times with added features. Despite these standards, code is not completely
portable among different database systems.
Syntax
Language elements
The SQL language is subdivided into several language elements, including:
• Clauses, which are constituent components of statements and queries. (In some cases,
these are optional.)
• Expressions, which can produce either scalar values, or tables consisting
of columns and rows of data.
• Predicates, which specify conditions that can be evaluated to SQL three-valued logic
(3VL) (true/false/unknown) or Boolean truth values and which are used to limit the effects
of statements and queries, or to change program flow.
• Queries, which retrieve the data based on specific criteria. This is an important element
of SQL.
• Statements, which may have a persistent effect on schemata and data, or which may
control transactions, program flow, connections, sessions, or diagnostics.
• SQL statements also include the semicolon (";") statement terminator. Though not
required on every platform, it is defined as a standard part of the SQL grammar.
• Insignificant whitespace is generally ignored in SQL statements and queries, making it
easier to format SQL code for readability.
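The three-valued logic mentioned for predicates can be seen in a short sketch that uses Python's built-in sqlite3 module. The staff table and its rows are hypothetical, and SQLite stands in here for whichever RDBMS is actually in use; other systems follow the same true/false/unknown rules for NULL comparisons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, bonus INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("Asha", 500), ("Ravi", None), ("Mala", 0)],
)

# Comparing a value with NULL yields unknown, which sqlite3 returns as None.
unknown = conn.execute("SELECT 1 = NULL").fetchone()[0]

# WHERE keeps only rows whose predicate is true, so the row with an unknown
# result is dropped by the predicate and by its negation alike.
positive = conn.execute("SELECT name FROM staff WHERE bonus > 0").fetchall()
not_positive = conn.execute(
    "SELECT name FROM staff WHERE NOT (bonus > 0)"
).fetchall()

# IS NULL is the predicate that tests for a missing value directly.
missing = conn.execute("SELECT name FROM staff WHERE bonus IS NULL").fetchall()

print(unknown, positive, not_positive, missing)
```

The row with the NULL bonus appears in neither of the first two results, which is the practical consequence of the unknown truth value.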
Queries
The most common operation in SQL is the query, which is performed with the
declarative SELECT statement. SELECT retrieves data from one or more tables, or expressions.
Standard SELECT statements have no persistent effects on the database. Some non-standard
implementations of SELECT can have persistent effects, such as the SELECT INTO syntax that
exists in some databases.
Queries allow the user to describe desired data, leaving the database management system
(DBMS) responsible for planning, optimizing, and performing the physical operations necessary
to produce that result as it chooses.
A query includes a list of columns to be included in the final result immediately following
the SELECT keyword. An asterisk ("*") can also be used to specify that the query should return
all columns of the queried tables. SELECT is the most complex statement in SQL, with optional
keywords and clauses that include:
• The FROM clause, which indicates the table(s) from which data is to be retrieved.
The FROM clause can include optional JOIN subclauses to specify the rules for joining
tables.
• The WHERE clause includes a comparison predicate, which restricts the rows returned by
the query. The WHERE clause eliminates all rows from the result set for which the
comparison predicate does not evaluate to True.
• The GROUP BY clause is used to project rows having common values into a smaller set of
rows. GROUP BY is often used in conjunction with SQL aggregation functions or to
eliminate duplicate rows from a result set. The WHERE clause is applied before
the GROUP BY clause.
• The HAVING clause includes a predicate used to filter rows resulting from the GROUP
BY clause. Because it acts on the results of the GROUP BY clause, aggregation functions
can be used in the HAVING clause predicate.
• The ORDER BY clause identifies which columns are used to sort the resulting data, and in
which direction they should be sorted (options are ascending or descending). Without
an ORDER BY clause, the order of rows returned by an SQL query is undefined.
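The clauses listed above can be combined in a single query, as in the following sketch run through Python's sqlite3 module. The Departments and Employees tables, their columns, and every row are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, used only to exercise each clause.
conn.executescript(
    """
    CREATE TABLE Departments (DeptId INTEGER PRIMARY KEY, DeptName TEXT);
    CREATE TABLE Employees (EmpName TEXT, DeptId INTEGER, Salary INTEGER);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Stores'), (3, 'Accounts');
    INSERT INTO Employees VALUES
        ('Mani', 1, 12000), ('Joe', 1, 9000), ('Ravi', 1, 10000),
        ('Meena', 2, 15000), ('Kumar', 2, 11000),
        ('Latha', 3, 8000), ('Babu', 3, 10500);
    """
)

query = """
    SELECT d.DeptName, COUNT(*) AS Headcount, AVG(e.Salary) AS AvgSalary
    FROM Employees AS e
    JOIN Departments AS d ON d.DeptId = e.DeptId   -- FROM with a JOIN subclause
    WHERE e.Salary >= 10000                        -- filters rows before grouping
    GROUP BY d.DeptName                            -- one result row per department
    HAVING COUNT(*) >= 2                           -- filters the groups themselves
    ORDER BY AvgSalary DESC                        -- explicit sort order
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Joe and Latha are removed by WHERE before any grouping happens; the Accounts group is then removed by HAVING because only one of its rows survived the WHERE filter.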
Hints
1. What's the difference between a primary key and a unique key?
Both a primary key and a unique key enforce uniqueness of the column on which they are defined.
In SQL Server, a primary key creates a clustered index on the column by default, whereas a
unique key creates a nonclustered index by default. Another major difference is that a primary
key doesn't allow NULLs, whereas a unique key in SQL Server allows one NULL only.
2. What is the difference between a HAVING CLAUSE and a WHERE CLAUSE?
Both specify a search condition, but they apply at different stages of a query. The WHERE
clause is applied to each row before the rows become part of a GROUP BY group, so it cannot
refer to aggregate results. The HAVING clause is applied to the groups produced by GROUP BY,
so its condition can use aggregate functions. HAVING can be used only with the SELECT statement
and is typically used together with GROUP BY. When GROUP BY is not used, HAVING treats the
whole result as a single group.
3. What are the authentication modes in SQL Server? How can they be changed?
SQL Server supports Windows mode and Mixed Mode (SQL Server and Windows). To change the
authentication mode, click Start, Programs, Microsoft SQL Server, and then click SQL Enterprise
Manager to run it from the Microsoft SQL Server program group. Select the server, then from the
Tools menu select SQL Server Configuration Properties and choose the Security page.
4. What is a PRIMARY KEY?
A PRIMARY KEY constraint is a unique identifier for a row within a database table. Every table
should have a primary key constraint to uniquely identify each row, and only one primary key
constraint can be created for each table. The primary key constraints are used to enforce entity
integrity.
5. What is a UNIQUE KEY constraint?
A UNIQUE constraint enforces the uniqueness of the values in a set of columns, so no duplicate
values are entered. Like primary key constraints, unique key constraints are used to enforce
entity integrity.
6. What is a FOREIGN KEY?
A FOREIGN KEY constraint prevents any actions that would destroy links between tables with
the corresponding data values. A foreign key in one table points to a primary key in another
table. Foreign keys prevent actions that would leave rows with foreign key values for which
there is no matching primary key value. The foreign key constraints are used to enforce
referential integrity.
7. What is a CHECK Constraint?
A CHECK constraint is used to limit the values that can be placed in a column. The check
constraints are used to enforce domain integrity.
8. What are the advantages of using Stored Procedures?
• Stored procedures can reduce network traffic and latency, boosting application performance.
• Stored procedure execution plans can be reused, staying cached in SQL Server's memory,
reducing server overhead.
• Stored procedures help promote code reuse.
• Stored procedures can encapsulate logic. You can change stored procedure code without
affecting clients.
• Stored procedures provide better security to your data.
9. Can SQL Server be linked to other servers such as Oracle?
SQL Server can be linked to any server for which an OLE DB provider is available. For example,
Microsoft provides an OLE DB provider for Oracle, which allows an Oracle server to be added as
a linked server to the SQL Server group.
10. Write an SQL Query to display the current date.
SELECT GETDATE();
11. Write an SQL Query to find employees whose Salary is equal to or greater than 10000.
SELECT EmpName FROM Employees WHERE Salary >= 10000;
12. Write an SQL Query to find the names of employees whose name starts with ‘M’.
SELECT EmpName FROM Employees WHERE EmpName LIKE 'M%';
13. Find all Employee records containing the word "Joe", regardless of whether it was stored
as JOE, Joe, or joe.
SELECT * FROM Employees WHERE UPPER(EmpName) LIKE UPPER('%joe%');
14. Write an SQL Query to find the year from a date.
SELECT YEAR(GETDATE()) AS "Year";
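The key and CHECK constraints from hints 4 to 7 and the employee queries from hints 11 to 13 can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table definitions, the EmpId and DeptId columns, the sample rows, and the helper functions is_rejected and names are all hypothetical, and SQLite is used in place of SQL Server, so GETDATE() and YEAR() are not exercised. The "Joe" query assumes a pattern with % on both sides so that the word is matched anywhere in the name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Departments (
        DeptId   INTEGER PRIMARY KEY,                    -- entity integrity
        DeptName TEXT NOT NULL UNIQUE                    -- no duplicate names
    );
    CREATE TABLE Employees (
        EmpId   INTEGER PRIMARY KEY,
        EmpName TEXT NOT NULL,
        Salary  INTEGER CHECK (Salary >= 0),             -- domain integrity
        DeptId  INTEGER REFERENCES Departments (DeptId)  -- referential integrity
    );
    INSERT INTO Departments VALUES (1, 'Sales');
    INSERT INTO Employees VALUES
        (1, 'Mani', 12000, 1), (2, 'JOE', 9000, 1), (3, 'Billy Joe', 10000, 1);
    """
)

def is_rejected(sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        with conn:  # rolls back automatically if the statement fails
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

rejected = {
    "primary key": is_rejected("INSERT INTO Employees VALUES (1, 'Dup', 1, 1)"),
    "unique": is_rejected("INSERT INTO Departments VALUES (2, 'Sales')"),
    "foreign key": is_rejected("INSERT INTO Employees VALUES (4, 'Ann', 1, 99)"),
    "check": is_rejected("INSERT INTO Employees VALUES (5, 'Bob', -5, 1)"),
}

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

well_paid = names(
    "SELECT EmpName FROM Employees WHERE Salary >= 10000 ORDER BY EmpId"
)
starts_with_m = names("SELECT EmpName FROM Employees WHERE EmpName LIKE 'M%'")
has_joe = names(
    "SELECT EmpName FROM Employees "
    "WHERE UPPER(EmpName) LIKE UPPER('%joe%') ORDER BY EmpId"
)

print(rejected)
print(well_paid, starts_with_m, has_joe)
```

Each of the four rejected inserts violates exactly one constraint, so the dictionary shows which kind of integrity stopped it.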
Find out
1. Which TCP/IP port does the SQL Server run on? How can it be Changed?
2. What is the Difference between a Clustered and a Non-clustered Index?
3. What are the Different Index Configurations a Table can have?
4. What are Different Types of Collation Sensitivity?
5. What is OLTP (Online Transaction Processing)?
6. What’s the Difference between a Primary Key and a Unique Key?
7. What is Difference between DELETE and TRUNCATE Commands?
QUESTIONS
Section – A
1) A __________ is basically just a computerized record-keeping system.
2) SQL stands for __________
3) A database schema has _____ , ______ and ________ levels.
4) Database architecture is classified as __________
5) A DBMS is used for __________ and __________ the information.
6) The types of users are __________
7) The major components of a DBMS are __________
8) The collection of information stored in a database at a particular moment is called ________
9) Data independence is divided into __________ and __________
10) What is a datum?
Section – B
1) Define DBMS.
2) What is the importance of a database?
3) Explain instance and schema.
4) Write a short note on Query Language.
5) Discuss the File Manager in Database Architecture.
6) How is data represented in a database?
7) Discuss the Storage Manager in Database Architecture.
8) Discuss the Query Processor in Database Architecture.
9) Write a short note on the Network model.
10) What is operational data?
Section- C
1) Explain DBMS and its purposes.
2) Explain briefly about Data Independence.
3) Describe the three levels in the View of Data with an example.
4) Explain the functions of the Data Administrator.
5) Discuss the three-level architecture of a database.
6) Explain Distributed Databases with a neat diagram.
7) Discuss the applications of a Database Management System.
8) Explain in detail about Operational data.
9) Explain the storage record interface.
10) Explain briefly the relationship between entities.
