Sie sind auf Seite 1von 15

Course Plan DBMS

Introduction
Database and database systems have become an essential component of everyday life in modern society.
In the course of a day, most of us encounter several activities that involve some interaction with a
database. You have heard about the name database in various areas such as institution, library, shops etc.
The database has its own critical role in all area. So let us see what a database is. A database is a
collection of related data. For example, consider a bank. The bank has a number of customers and they
should know each of the customer details such as their name, account number, type of account and so on.
These all data that we want to keep or in other words need for future use are known by the name
database. But it should have some implicit properties
• It should represent some aspect of the real world, called universe of discourse.
• It is a logical collection of data.
• It is designed, built and populated with data for a specific purpose.

E.g.: Student Database


Name Roll No Branch % of marks Grade

What is DBMS?
DBMS is the acronym of Data Base Management System. DBMS is a collection of interrelated data and
a set of programs to access this data in a convenient and efficient way. It controls the organization,
storage, retrieval, security and integrity of data in a database. In other words, it is a collection of
programs that enables users to create and maintain a database. It is a general purpose software that
facilitates the processes of defining, constructing, manipulating and sharing of database among various
users and applications.
Defining a database means specifying the data types, structures and constraints for the data to be stored
in the database. Constructing the database is the process of storing the data itself on some storage
medium that is controlled by the DBMS. Manipulating the database means processing the database. It
includes functions such as querying the database to retrieve specific data, updating the database to reflect
changes in the mini world and generating reports from the data. Sharing a database allows multiple users
and programs to access the database concurrently.

RT503 1 Prepared by Lakshmi K.S


Course Plan DBMS

File system Verses Database approach


One way to store information in a computer system is to store it as in traditional file system. In this
method each data is stored in different files and there should be an application program for each of the
application.
The disadvantages of file system are:
• Data redundancy and inconsistency
In traditional file systems the data may be duplicated. For e.g.: Consider a bank having two accounts
savings bank account and credit check account. In this case, the address of customer is stored in two files:
one with SB account and other with checking record. Thus this duplication will result in need of high
storage space. This will also lead to inconsistency. That is, if the address of a customer changes, then the
change may be reflected only in one account.
• Difficulty in accessing information
Suppose the bank needs a list of customers with an account higher than Rs.10, 000. But, we do not
have an application at hand to list out this request. Thus, to access this information we have two choices.
First one is that list out the SB account customers and then extracts the needed list manually. In the
second option, we have to develop a new program to satisfy the new request. Both are difficult.
• Data Isolation
Data are scattered in different files and files may be in various formats. So it is difficult to extract the
appropriate data.
• Integrity problems
The constraint of data is enforced through the programs by appropriate code. So if we need to add a
new constraint, we have to change the code. Then, it is very difficult to add or change the constraints.
The problem will be compounded when constraints involves several constraints from different files.
• Atomicity problems
Suppose a failure occurs during execution of the program. Then the execution stops in the middle of
the program resulting in an inconsistency. But the execution of a program should end to a consistency
state. For a traditional file system the failure mostly result to an inconsistency state.

RT503 2 Prepared by Lakshmi K.S


Course Plan DBMS

Traditional File System

Application Application Application


Program1+Data Program2+Data Program3+Data

Database System

Database

Application Application Application


Program1 Program2 Program3

Features (characteristics) of DBMS


Database system is
1. Self describing:
The database system contains not only the database itself but also a complete definition or description
and structure of database. This structure is stored in a catalog with type, storage format and constraints.
The information stored in database is called meta-data.
2. Data Security
The DBMS can prevent unauthorized users from viewing or updating the database. Using passwords,
users are allowed access to the entire database or a subset of it known as a "subschema." For example, in
a student database, some users may be able to view payment details while others may view only mark list
of students.
3. Data Integrity
The DBMS can ensure that no more than one user can update the same record at the same time. It
can keep duplicate records out of the database; for example, no two customers with the same
customer number can be entered.

RT503 3 Prepared by Lakshmi K.S


Course Plan DBMS

4. Interactive Query
Most DBMSs provide query languages and report writers that let users interactively interrogate
the database and analyze its data. This important feature gives users access to all management
information as needed. I.e. we will get easily all details of each student at any time.
5. Interactive Data Entry and Updating
Many DBMS provide a way to interactively enter and edit data, allowing you to manage your own files
and databases. However, an interactive operation does not leave an audit trail and does not provide the
controls necessary in a large organization. These controls must be programmed into the data entry and
update programs of the application.
6. Data Independence
With DBMSs, the details of the data structure are not stated in each application program. The program
asks the DBMS for data by field name. Without a DBMS, the programmer must reserve space for the full
structure of the record in the program. Any change in data structure requires changing all application
programs.
DBMS Components
• Data:
Data stored in a database include numerical data which may be integers (whole numbers only) or floating
point numbers (decimal), and non-numerical data such as characters (alphabetic and numeric characters),
date or logical (true or false). More advanced systems may include more complicated data entities such as
pictures and images as data types.
• Standard operations:
Standard operations are provided by most DBMS. These operations provide the user basic capabilities for
data manipulation. Examples of these standard operations are sorting, deleting and selecting records.
• Data definition language (DDL):
DDL is the language used to describe the contents of the database. It is used to describe, for example,
attribute names (field names), data types, location in the database, etc.
• Data manipulation and query language:
Normally a query language is supported by a DBMS to form commands for input, edit, analysis, output,
reformatting, etc. Some degree of standardization has been achieved with SQL (Structured Query
Language).

RT503 4 Prepared by Lakshmi K.S


Course Plan DBMS

• Programming tools:
Besides commands and queries, the database should be accessible directly from application programs
through function calls (subroutine calls) in conventional programming languages.
• File structures:
Every DBMS has its own internal structures used to organize the data although some common data
models are used by most DBMS.

Abstraction
We all know that each application program have some data relevant to a particular task.
And an application program needs to use a portion of data, which is used by some other programs. In
early days of computerization, each application programmer designs the file structure, metadata of the
file and the access method each record. That is, each application program use its own data, details
concerning the structure of data as well as the access and to interpret each data. The application programs
are implemented independently and by hence itself, any change in storage media requires changes to
these structures and access methods. Because the files were structured for one application, it was difficult
to use the data in these files to new applications requiring data from several files belonging to different
existing applications. E.g.: Consider two application programs that require the data on an entity set
EMPLOYEE. The first application program involves the public relation department sending each
employee a newsletter and related material. This application program is interested in the record type
EMPLOYEE, that containing the values for the attributes of EMPL_Name and EMPL_Address.
Architecture of DBMS
The generalized architecture of DBMS is called ANSI/SPARC model. The architecture is divided into
three levels: External level, Conceptual level and Internal level.
A schema describes the view at each of this level. Schema describes the records and its relationships in
the view.
a. External view or User view
It is the highest level of data abstraction. This includes only those portions of database of concern to a
user or Application program. Each user has a different external view and it is described by means of a
scheme called external schema. The schema contains the definition of the logical records and
relationships in external view. It also contains the method of deriving the objects in the external view
from the objects in the conceptual view.

RT503 5 Prepared by Lakshmi K.S


Course Plan DBMS

b. Conceptual view
At this level of database abstraction, all the database entities and the relationship among them are
included. One conceptual view represents the entire database called conceptual schema. It describes the
method of deriving the objects in the conceptual view from the objects in the internal view. And also
specify the checks to retain the data consistency and integrating.
c. Internal view
It is the lowest level of abstraction, closest to the physical storage method. It describes how the data is
stored, what is the structure of data storage and the method of accessing these data. It is represented by
internal schema.
View Level …Defined by User
……………………………………………………………………………………………..

View 1 View 2 View 3 View 4


………………………………………………………………………………………………...

Defined by DBA

Logical Level

Defined by DBA for optimization

Physical Level

Data independence
Data independence of DBMS is the capacity to change the schema at one level of database system
without having to change the next high levels. The three-schema architecture can be used to achieve this
data independence. We can define data independence into two types:
1. Logical data independence
It is the capacity to change the conceptual schema without having to change the external schema.
Sometimes, we may need to change the conceptual schema to expand the database, to change the
constraints, or to reduce the database. Only the view definitions and mappings need to be changed in
DBMS that supporting logical data independence. Application programmer cannot feel any change in the
schema construct of DBMS.

RT503 6 Prepared by Lakshmi K.S


Course Plan DBMS

2. Physical data independence


Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema and external schema. The internal schema may change to improve the performance of
retrieval or update. Then the conceptual schema need not change if the data remains same. For e.g.: We
need not change the Query to retrieve a student progress report even though the DBMS take a new
method to store the student record.

Advantages
1. Controlling Redundancy
In traditional file processing, every user group maintains its own files. Each group independently keeps
files on their db e.g., students. Therefore, much of the data is stored twice or more. And the redundancy
leads to several problems:
a. Duplication of effort - i.e. storage space wasted when the same data is stored repeatedly
b. Files that represent the same data may become inconsistent (since the updates are applied
independently by each users group).

2. Restricting Unauthorized Access


A DBMS should provide a security and authorization subsystem. Some db users will not be authorized to
access all information in the db (e.g., financial data). Some users are allowed only to retrieve data. Some
users are allowed both to retrieve and to update database.

3. Providing Persistent Storage for Program Objects and Data Structures


Data structure provided by DBMS must be compatible with the programming language’s data structures.
E.g., Object oriented DBMS are compatible with programming languages such as C++, SMALLTALK,
and the DBMS software automatically performs conversions between programming data structure and
file formats.
4. Permitting Inference and Actions Using Deduction Rules
Deductive database systems provide capabilities for defining deduction rules to inference new
information from the stored database facts.
5. Providing Multiple User Interfaces
(e.g., query languages, programming languages interfaces, forms, menu-driven interfaces, etc.)

RT503 7 Prepared by Lakshmi K.S


Course Plan DBMS

6. Representing Complex Relationships Between Data - The complex relationship between data is
easily represented.
7. Enforce Integrity Constraints - The integrity constraint for information is reasonably enforced by the
database management system.

DBMS Disadvantages
A database system generally provides on-line access to the database for many users. In contrast, a
conventional system is often designed to meet a specific need and therefore generally provides access to
only a small number of users. Because of the larger number of users accessing the data when a database
is used, the enterprise may involve additional risks as compared to a conventional data processing system
in the following areas.

1. Confidentiality, Privacy and Security - When information is centralized and is made available to
users from remote locations, the possibilities of abuse are often more than in a conventional system. To
reduce the chances of unauthorized users accessing sensitive information, it is necessary to take
technical, administrative and, possibly, legal measures. Most databases store valuable information that
must be protected from deliberate attack and destruction.

2. Data Quality - Since the database is accessible to users remotely, adequate controls are needed to
control users updating data and to control data quality. With increased number of users accessing data
directly, there are enormous opportunities for users to damage the data. Unless there are suitable
controls, the data quality may be compromised.
3. Data Integrity - Since a large number of users could be using a database concurrently, we should
have to ensure that data remain correct during operation. The main threat to data integrity comes from
several different users attempting to update the same data at the same time. The database therefore
needs to be protected against accidental changes by the users.

4. Enterprise Vulnerability - Centralizing all data of an enterprise in one database may mean that the
database becomes critical resource. The survival of the enterprise may depend on reliable information
being available from its database. The enterprise therefore becomes vulnerable to the destruction of the
database or to unauthorized modification of the database.

RT503 8 Prepared by Lakshmi K.S


Course Plan DBMS

5. The Cost of using a DBMS - Conventional data processing systems are typically designed to run a
number of well-defined preplanned processes. Such systems are often "tuned" to run efficiently for the
processes that they were designed for. Although the conventional systems are usually fairly inflexible in
that new applications may be difficult to implement and/or expensive to run, they are usually very
efficient for the applications they are designed for. The database approach on the other hand provides a
flexible alternative where new applications can be developed relatively inexpensively. The flexible
approach is not without its costs and one of these costs is the additional cost of running applications that
the conventional system was designed for. Using standardized software is almost always less machine
efficient than specialized software.

Entities and Attributes - Entities are distinguishable objects of concern and are modeled using their
characteristics or attributes. A database usually contains large number of similar entities. For eg: A
company database consists of a large number of employees may want to store similar information for
each employee. Then each of the employees can be termed as an entity. An entity can be an object with
physical existence. For e.g.: a car, a person or an employee. But each entity will have its own value.
Each entity has properties that describe the entity called attribute of that entity. Collection of entities with
same attributes termed as an entity type.
For e.g.: Employee (Employee_id, Address, Designation, Salary)
Here Employee is an entity and Employee_id, Address, Designation, Salary represents the attribute of
entity Employee. There can be several types of attributes such as Simple versus composite, single-
valued verses multi-valued and stored verses derived.
1. Composite versus Simple - Composite attributes are those attributes that can be divided into smaller
sub parts with independent meaning. Consider the above e.g.: in which the attribute Address can be
divided into small sub parts like City, State and Street_address. The attributes that are not divisible are
called simple or atomic attributes. The value of a composite attribute is the concatenation of the value of
its constituent simple attributes.
2. Single-valued verses multi-valued - Most of the attributes will have only single value for a particular
entity. Such attributes are called single valued. In some cases there may be having more than one value
for an attribute of a particular entity. These attributes are called multi-valued. The attribute age of an

RT503 9 Prepared by Lakshmi K.S


Course Plan DBMS

entity person will have only one value, while the college degree of that person will have more than one
degree. So the attribute age can be consider as single-valued and college degree as multi-valued.

3. Stored verses derived - In some case the attribute values can be related so that one can be
derived from the other. Consider a person as an entity. The attributes age and DateOfBirth of
person is related. i.e. the age of a person can be derived from the current date and his
DateOfBirth. The age attribute hence is called Derived attribute and the DateOfBirth is called
stored attribute from where age of person calculated.

Database Design - Database design is the process of organizing the data into related record types.
The DBMS is chosen so that it can support the organization and can process the transaction for the
organization.
Data model - A data model is a plan for building a database. The model represents data conceptually, the
way the user sees it, rather than how computers store it. Data models focus on required data elements and
associations; most often they are expressed graphically using
Entity-relationship diagrams. On a more abstract level, the term is also used in describing a database's
overall structure. Mostly used data modeling techniques are
1. Entity- Relational model
2. Hierarchical model
3. Network model
4. Object-oriented model
Hierarchical Model
The hierarchical data model organizes data in a tree structure. There is a hierarchy of
parent and child data segments. This structure implies that a record can have repeating information,
generally in the child data segments. Data in a series of records have a set of field values attached to it. It
collects all the instances of a specific record together as a record type. These record types are the
equivalent of tables in the relational model, and with the individual records being the equivalent of rows.
To create links between these record types, the hierarchical model uses Parent Child Relationships
Hierarchical databases link records like an organization chart. A record type can be owned
by only one owner. In the following example, orders are owned by only one customer. Hierarchical

RT503 10 Prepared by Lakshmi


K.S
Course Plan DBMS

structures were widely used with early mainframe systems; however, they are often restrictive in linking
real-world structures.

Customer

Order

Advantages:

• Hierarchical Model is simple to construct and operate on

• Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in

manufacturing, personnel organization in companies

• Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN

PARENT etc.

Disadvantages:
• Navigational and procedural nature of processing
• Database is visualized as a linear arrangement of records
• Little scope for "query optimization"

Network Model

In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the network model.

The basic data-modeling construct in the network model is the set construct. A set consists of an owner

record type, a set name, and a member record type. A member record type can have that role in more than

one set, hence the multiparent concept is supported. An owner record type can also be a member or

RT503 11 Prepared by Lakshmi


K.S
Course Plan DBMS

owner in another set. In network databases, a record type can have multiple owners. In the example

below, orders are owned by both customers and products, reflecting their natural relationship in business.

Customer Product

Order

Advantages:

• Network Model is able to model complex relationships and represents semantics of add/delete on the

relationships.

• Can handle most situations for modeling using record types and relationship types.

• Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within

set, GET etc. Programmers can do optimal navigation through the database.

Disadvantages:

• Navigational and procedural nature of processing

• Database contains a complex array of pointers that thread through a set of records.

• Little scope for automated "query optimization"

Object-Oriented Model
Object DBMSs add database functionality to object programming languages. They bring much more than
persistent storage of programming language objects. Object DBMSs extend the semantics of the C++,
Smalltalk and Java object programming languages to provide full-featured database programming
capability, while retaining native language compatibility. A major benefit of this approach is the
RT503 12 Prepared by Lakshmi
K.S
Course Plan DBMS

unification of the application and database development into a seamless data model and language
environment. As a result, applications require less code, use more natural data modeling, and code bases
are easier to maintain. Object developers can write complete database applications with a modest

Objects

Ord
er
Entity set
An entity set is a set of entities of the same type that share the same properties, or
attributes. It is represented by a set of attributes. An attribute, as used in the E-R model can be
characterized by the following attributes.
• Simple and composite attributes
• Single and multi-valued attributes
• Null attributes
• Derived attributes
A relationship is an association among several entities. And a relationship set is a set of relationships of
the same type.
Keys
Before designing a database we should be able to specify how entities within a given entity set and
relationships within a given relationship set are distinguished. Conceptually the individual entities and
relationships are distinct; but from a database perspective, the difference must be expressed by their
attributes. The concept of key is used to make such distinctions.
Super key is a set of attributes that, taken collectively, to identify uniquely an entity in the entity set.
For e.g.: the social_security_no attribute of the entity set employee is sufficient to distinguish one
employee entity from another. Thus social_security_no is a superkey for the entity set employee.
RT503 13 Prepared by Lakshmi
K.S
Course Plan DBMS

Superkeys with minimal subset is known as the candidate key. For eg: it is possible to combine the
attributes, employ_id & employ_name form a superkey. But the social_security_no is sufficient to
distinguish the two employees. Thus social_security_no is a candidate key. Usually primary key is used
to denote the candidate key that is chosen by the database designer to identify an entity from an entity set.
A key (super, candidate and primary) is a property of the entity set rather than the individual entities.

Entity- Relationship (E-R) Diagram


The overall logical structure of a database can be expressed graphically by an E-R diagram. The diagram
consists of the following major components.
• Rectangles: represent entity set.
• Ellipses: represent attributes.
• Diamonds: represents relationship sets.
• Lines: links attribute set to entity set and entity set to relationship set.
• Double ellipses: represent multi-valued attributes.
• Dashed ellipses: denote derived attributes.
For e.g.: Consider an E-R diagram, which consists of two-entity sets customer and loan.

Addr
Designatio
n Produc
t Location
Emp_i
d Salary

Wor
Employee ks Company
For

Entity relational model (RDBMS - relational database management system)


A database based on the relational model developed by E.F. Codd. A relational database allows the
definition of data structures, storage and retrieval operations and integrity constraints. In such a database

RT503 14 Prepared by Lakshmi


K.S
Course Plan DBMS

the data and relations between them are organized in tables. A table is a collection of records and each
record in a table contains the same fields.
It permits the database designer to create a consistent, logical representation of
information. Consistency is achieved by including declared constraints in the database design, which is
usually referred to as the logical schema. The theory includes a process of database normalization
whereby a design with certain desirable properties can be selected from a set of logically equivalent
alternatives. The access plans and other implementation and operation details are handled by the DBMS
engine, and are not reflected in the logical model. This contrasts with common practice for SQL DBMSs
in which performance tuning often requires changes to the logical model.
The basic relational building block is the domain or data type, usually abbreviated
nowadays to type. A tuple is an unordered set of attribute values. An attribute is an ordered pair of
attribute name and type name. An attribute value is a specific valid value for the type of the attribute.
This can be either a scalar value or a more complex type. Relational databases do not link records
together physically, but the design of the records must provide a common field, such as account number,
to allow for matching. Often, the fields used for matching are indexed in order to speed up the process.
In the following example, customers, orders and products are linked by comparing
data fields and/or indexes when information from more than one record type is needed. This method is
more flexible for ad hoc inquiries. Many hierarchical and network DBMSs also provide this capability.

Relational model

Customer Order Product

RT503 15 Prepared by Lakshmi


K.S

Das könnte Ihnen auch gefallen