FILE-ORIENTED SYSTEM VERSUS DATABASE SYSTEM

Computer-based data processing systems were initially used for scientific and engineering calculations.
As business requirements grew more complex, they were gradually introduced into business
applications. Before that, an organisation kept its records in a manual filing system: separate folders
held all internal and external correspondence relating to a project, activity, client, task, product,
customer or employee. These files or folders were labelled and stored.

The manual system worked well as a data repository as long as the data collection was relatively small and
the organisation’s managers had few reporting requirements. However, as the organisation grew and
the reporting requirements became more complex, it became difficult to keep track of data in the
manual file system. Report generation from a manual file system was also slow and cumbersome.
Thus, the manual filing system was replaced with a computer-based filing system.

Figure 1 illustrates the structure of a file-oriented system, in which application programs are written
specifically for each user department to access its own files. Each set of departmental programs handles
data entry, file maintenance and the generation of a fixed set of specific reports. Here, the physical
structure and storage of the data files and records are defined in the application program.

Figure 1. File-oriented system

Advantages of File-oriented System

Although the file-oriented system is now largely obsolete, studying file-based systems has several
advantages:

It provides a useful historical perspective on how we handle data.

The characteristics of a file-based system help in an overall understanding of the design complexity
of database systems.

Understanding the problems and limitations inherent in the file-based system helps
avoid these same problems when designing database systems, and thereby results in a smooth
transition.

Disadvantages of File-oriented System

A conventional file-oriented system has the following disadvantages:

a. Data redundancy (or duplication): Since a decentralised approach was taken, each department
used its own independent application programs and special files of data. This resulted in
duplication of the same data and information in several files. This redundancy or duplication of data
is wasteful: it requires additional storage space, costs extra time and money, and
requires increased effort to keep all files up to date.

b. Data inconsistency (or loss of data integrity): Data redundancy also leads to data inconsistency
(or loss of data integrity), since either the data formats may be inconsistent or data values
(various copies of the same data) may no longer agree or both.

c. Poor data control: Because a file-oriented system is decentralised, there was no centralised
control at the data element (field) level. It was common for the same data field to have
multiple names, depending on the department that defined it and the file it
was in. This could lead to different meanings of a data field in different contexts and, conversely,
to the same meaning for different fields. The result was poor data control and considerable confusion.

d. Limited data sharing: There are limited data sharing opportunities with the traditional
file-oriented system. Each application has its own private files and users have little opportunity to
share data outside their own applications. Obtaining data from several incompatible files in
separate systems requires a major programming effort. In addition, a major management effort
may also be required since different organisational units may own these different files.

e. Inadequate data manipulation capabilities: File-oriented systems do not provide strong
connections between data in different files, and therefore their data manipulation capability is very
limited.

f. Excessive programming effort: There was a very high interdependence between programs and
data in a file-oriented system, and therefore an excessive programming effort was required for a
new application program to be written. Even though an existing file may contain some of the data
needed, the new application often requires a number of other data fields that may not be available
in the existing file. As a result, the programmer had to rewrite the code for definitions for needed
data fields from the existing file as well as definitions of all new data fields. Therefore, each new
application required that the developers (or programmers) essentially start from scratch by
designing new file formats and descriptions and then write the file access logic for each new
program. Also, both initial and maintenance programming efforts for management information
applications were significant.

g. Security problems: Not every user of the system should be allowed to access all the
data. Each user should be allowed to access only the data concerning his or her area of application.
Since application programs were added to the file-oriented system in an ad hoc manner, it was
difficult to enforce such a security scheme.
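
The redundancy, inconsistency and data-control problems described above can be illustrated with a minimal sketch. The two departmental files, their field names and the sample customer are hypothetical, invented only for this illustration; each department stores its own copy of a customer's address under a different field name.

```python
import csv
import io

# Hypothetical departmental files: each department keeps its own copy of the
# customer's address (redundancy) under its own field names (poor data control).
SALES_FILE = "cust_id,name,address\n101,A. Rao,12 Hill Road\n"
BILLING_FILE = "customer,addr\n101,12 Hill Road\n"


def load(text, key_field, address_field):
    """Read one departmental file into a {customer id: address} mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_field]: row[address_field] for row in reader}


def find_conflicts(sales, billing):
    """Return the customer ids whose address differs between the two copies."""
    shared_ids = sales.keys() & billing.keys()
    return sorted(k for k in shared_ids if sales[k] != billing[k])


sales = load(SALES_FILE, "cust_id", "address")
billing = load(BILLING_FILE, "customer", "addr")
conflicts_before = find_conflicts(sales, billing)

# The sales department records a change of address. Nothing propagates the
# change to the billing file, so the two copies no longer agree.
sales["101"] = "7 Lake View"
conflicts_after = find_conflicts(sales, billing)
```

Before the update the two copies agree; afterwards customer 101 is reported as a conflict. Note that even comparing the files required code that knows each department's private field names.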

DATABASE APPROACH

The problems inherent in file-oriented systems make the database system very desirable. Unlike the
file-oriented system, with its many separate and unrelated files, the database system consists of logically
related data stored in a single data repository. The database approach therefore represents a change in
the way end-user data are stored, accessed and managed. It emphasizes the integration and sharing of data
throughout the organisation. Database systems overcome the disadvantages of the file-oriented system. They
largely eliminate problems related to data redundancy and data control by supporting an integrated and
centralised data structure. Data are controlled via a data dictionary (DD) system, which itself is controlled
by database administrators (DBAs). Figure 2 illustrates the database system.

Figure 2. Database system

The differences between the file processing system and the database approach are shown in Table 2.

Table 2: File-oriented system vs database system

File-based system | Database system
----------------- | ---------------
Data and programs are interdependent. | Data and programs are independent of each other.
Causes data redundancy: the same data may be duplicated in different files. | Controls data redundancy: ideally, each item of data appears only once in the system.
Causes data inconsistency: copies of the same data in different files may differ. | Keeps data consistent, because each item is stored once or its copies are kept in step by the DBMS.
Data cannot be shared easily because it is distributed across different files. | Data is easily shared because it is stored in one place.
Data is widely spread, so security is poor. | Provides many methods to maintain data security.
Does not provide consistency constraints. | Provides consistency constraints to maintain data integrity.
A less complex system. | A very complex system.
Costs less than a database system. | Costs much more than a file processing system.
Takes more storage space; memory is wasted through duplication. | Stores data more efficiently; takes less space and wastes less memory.
Generating different reports to support crucial decisions is very difficult. | Reports can be generated easily in the required format, because data is stored in an organized manner and is easily retrieved.
Does not provide a concurrency facility. | Provides a concurrency facility.
Does not provide data atomicity. | Provides data atomicity.
Difficult to maintain, as it provides fewer control facilities. | Provides many facilities for maintaining programs.
If one application fails, it does not affect other files in the system. | If the database fails, it affects all applications that depend on it.
Hardware cost is lower than for a database system. | Hardware cost is higher than for a file system.

DBMS Functions and Components

The various components of a database system are as follows:

Data description language (DDL), also called data definition language: It allows users to define
the database, that is, to specify the data types, the data structures and the constraints on the data
to be stored in the database. The DDL processor translates the schema written in a source language
into the object schema, thereby creating a logical and physical layout of the database.

Data manipulation language (DML) and query facility: It allows users to insert, update, delete
and retrieve data from the database. A general query facility is commonly provided through
structured query language (SQL).

Software for controlled access of database: It provides controlled access to the database, for
example, preventing unauthorized users from accessing the database, providing a concurrency
control system to allow shared access to the database, and activating a recovery control system to
restore the database to a previous consistent state following a hardware or software failure.
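
As a minimal sketch of the DDL component, the example below uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption made for illustration; the text does not name a product). The `employee` table and its columns are hypothetical. The point is that the definition is handed to the DBMS, which records it in its own catalogue, so a program can ask the DBMS for the layout instead of hard-coding it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, the data types and the constraints.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL
    )
    """
)

# The DBMS now holds the schema. PRAGMA table_info returns one row per column:
# (position, name, declared type, NOT NULL flag, default value, primary-key flag).
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(employee)")
]

# The original CREATE statement is also kept in SQLite's catalogue table.
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'employee'"
).fetchone()[0]
```

Here `columns` lists each column with its declared type and whether NOT NULL was declared, which is information a file-oriented program would have had to embed in its own code.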

The database and the DBMS software together are called a database system. A database system overcomes
the limitations of the traditional file-oriented system, such as the large amount of data redundancy, poor
data control, inadequate data manipulation capabilities and excessive programming effort, by supporting an
integrated and centralized data structure.

Functions of database system (Operations Performed on Database Systems)

As discussed in the previous section, a database system can be regarded as a repository or container for a
collection of computerized data files, a kind of electronic filing cabinet. Users can perform a
variety of operations on database systems. Some of the important operations performed on such files are
as follows:

Inserting new data into existing data files
Adding new files to the database
Retrieving data from existing files
Changing data in existing files
Deleting data from existing files
Removing existing files from the database
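
The six operations listed above map onto ordinary SQL statements if a "file" is read as a table. The following is a sketch under that assumption, again using Python's sqlite3 module; the `product` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Adding a new file (table) to the database
conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, price REAL)")

# Inserting new data into an existing file
conn.executemany("INSERT INTO product VALUES (?, ?)", [("P1", 10.0), ("P2", 25.0)])

# Changing data in an existing file
conn.execute("UPDATE product SET price = ? WHERE code = ?", (12.5, "P1"))

# Deleting data from an existing file
conn.execute("DELETE FROM product WHERE code = ?", ("P2",))

# Retrieving data from an existing file
rows = conn.execute("SELECT code, price FROM product ORDER BY code").fetchall()

# Removing an existing file from the database
conn.execute("DROP TABLE product")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

After these steps `rows` holds the single surviving product with its updated price, and `remaining_tables` is empty because the table has been dropped.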

Components of Database System Environment

A database system refers to an organisation of components that define and regulate the collection, storage,
management and use of data within a database environment. It consists of four main parts:

Data
Hardware
Software
Users (People)

Data: From the user’s point of view, the most important component of a database system is perhaps the
data. The totality of data in the system is stored in a single database. The data in a database are both
integrated and shared. Data integration means that the database is a unification of several otherwise
distinct files, with redundancy among the files at least partly eliminated. Data sharing means that
individual pieces of data in the database can be shared among different users and each of those users can
have access to the same piece of data, possibly for different purposes. Different users can even
access the same piece of data concurrently (at the same time). Such concurrent access of data by different
users is possible because the database is integrated.

Depending on the size and requirement of an organisation or enterprise, database systems are available on
machines ranging from the small personal computers to the large mainframe computers. The requirement
could be a single-user system (in which at most one user can access the database at a given time) or
multi-user system (in which many users can access the database at the same time).

Hardware: All the physical devices of a computer are termed as hardware. The computer can range from
a personal computer (microcomputer), to a minicomputer, to a single mainframe, to a network of
computers, depending upon the organisation’s requirement and the size of the database. From the point of
view of the database system the hardware can be divided into two components:

The processor and associated main memory to support the execution of database system (DBMS)
software and

The secondary (or external) storage devices (for example, hard disk, magnetic disks, compact
disks and so on) that are used to hold the stored data, together with the associated peripherals (for
example, input/output devices, device controllers, input/output channels and so on).

A database system requires a minimum amount of main memory and disk space to run. With a large
number of users, a very large amount of main memory and disk space is required to maintain and control
the huge quantity of data stored in a database. In addition, high-speed computers, networks and
peripherals are necessary to execute the large number of data accesses required to retrieve information in an
acceptable amount of time. Advances in computer hardware technology and the development of
powerful and less expensive computers have resulted in increased development and application of
database technology.

Software: Software is the basic interface (or layer) between the physical database and the users. Its core
is the database management system (DBMS); the software component also comprises the application programs
and the operating system software. All requests from the users to access the database are
handled by the DBMS. The DBMS provides various facilities, such as adding and deleting files, retrieving and
updating data in the files and so on. Application software is generally written by company employees to
solve a specific common problem.

Application programs are typically written in a third-generation programming language (3GL), such as C,
C++, Visual Basic, Java, COBOL, Ada, Pascal or Fortran, or in a fourth-generation language
(4GL), such as SQL, embedded in a third-generation language. Application
programs use the facilities of the DBMS to access and manipulate data in the database, providing reports
or documents needed for the information and processing needs of the organisation. The operating system
software manages all hardware components and makes it possible for all other software to run on the
computers.

Users: The users are the people interacting with the database system in any form. There could be various
categories of users. The first category of users is the application programmers who write database
application programs in some programming language. The second category of users is the end users who
interact with the system from online workstations or terminals and access the database via one of the
online application programs to get information for carrying out their primary business responsibilities.
The third category of users is the database administrators (DBAs) who manage the DBMS and its proper
functioning. The fourth category of users is the database designers who design the database structure.

1.4.3 Advantages of DBMS

Due to the centralised management and control, the database management system (DBMS) has numerous
advantages. Some of these are as follows:

Minimal data redundancy: In a database system, views of different user groups (data files) are
integrated during database design into a single, logical, centralised structure. By having a
centralised database and centralised control of data by the DBA, the unnecessary duplication of
data is avoided. Each primary fact is ideally recorded in only one place in the database. The total
data storage requirement is effectively reduced. It also eliminates the extra processing needed to trace
the required data in a large volume of data. Incidentally, we do not mean or suggest that all
redundancy can or necessarily should be eliminated. Sometimes there are sound business and
technical reasons for maintaining multiple copies of the same data, for example, to improve
performance or to model relationships. In a database system, however, this redundancy can
be carefully controlled. That is, the DBMS is aware of the redundancy, if it exists, and assumes the
responsibility for propagating updates and ensuring that the multiple copies are consistent.

Program-data independence: The separation of metadata (data description) from the application
programs that use the data is called data independence. In the database environment, it allows for
changes at one level of the database without affecting other levels. These changes are absorbed by
the mappings between the levels. With the database approach, metadata are stored in a central
location called the repository. This property of database systems allows an organisation’s data to change
and evolve (within limits) without changing the application programs that process the data.
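
A minimal sketch of program-data independence, using Python's sqlite3 module as a stand-in DBMS (an assumption for illustration; the `customer` table and the `customer_names` function are hypothetical). The application function refers to data by name only; the schema then gains a new field and a new access path, and the function runs unchanged.

```python
import sqlite3


def customer_names(conn):
    """Application code: refers to data by name, not by storage layout."""
    return [row[0] for row in conn.execute("SELECT name FROM customer ORDER BY name")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer (name) VALUES (?)", [("Meera",), ("Arun",)])
names_before = customer_names(conn)

# The data description evolves: a new field is added (logical change) and an
# index is created (physical change). The application code is not touched.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
names_after = customer_names(conn)
```

In a file-oriented program the record layout would be compiled into `customer_names` itself, so adding the `email` field would have required changing and retesting that program.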

Efficient data access: DBMS utilizes a variety of sophisticated techniques to store and retrieve
data efficiently. This feature is especially important if the data is stored on external storage
devices.

Improved data sharing: Since a database system is a centralised repository of data belonging to
the entire organisation (all departments), it can be shared by all authorized users. Existing
application programs can share the data in the database. Furthermore, new
application programs can be developed on the existing data in the database to share the same data
and add only the data that is not currently stored, rather than having to define all data requirements
again. Therefore, more users and applications can share more of the data.

Improved data consistency: Inconsistency is the corollary to redundancy. In the file-oriented
system, when the data is duplicated and the changes made at one site are not propagated to the
other site, the result is inconsistency. Such a system supplies incorrect or contradictory
information to its users. So, if the redundancy is removed or controlled, the chance of having
inconsistent data is also removed or controlled. In a database system, such inconsistencies are
avoided to some extent by making the redundancy known to the DBMS. The DBMS ensures that any change
made to either of the two entries in the database is automatically applied to the other one as well. This
process is known as propagating updates.
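
One way a DBMS can propagate updates is through a trigger. The sketch below uses Python's sqlite3 module; the tables, the trigger name and the sample address are hypothetical, and for brevity the trigger propagates in one direction only (from `customer` to the redundant copy), whereas the text describes propagation in either direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT);

    -- Controlled redundancy: shipping keeps its own copy of the address.
    CREATE TABLE shipping_label (customer_id INTEGER, address TEXT);

    -- The redundancy is made known to the DBMS, which keeps the copy in step.
    CREATE TRIGGER propagate_address
    AFTER UPDATE OF address ON customer
    BEGIN
        UPDATE shipping_label SET address = NEW.address
        WHERE customer_id = NEW.id;
    END;

    INSERT INTO customer VALUES (1, '12 Hill Road');
    INSERT INTO shipping_label VALUES (1, '12 Hill Road');
    """
)

# The application changes only the customer table.
conn.execute("UPDATE customer SET address = ? WHERE id = ?", ("7 Lake View", 1))

label_address = conn.execute(
    "SELECT address FROM shipping_label WHERE customer_id = 1"
).fetchone()[0]
```

After the update, `label_address` holds the new address even though no application statement touched `shipping_label`.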

Improved data integrity: Data integrity means that the data contained in the database is both
accurate and consistent. Integrity is usually expressed in terms of constraints, which are
consistency rules that the database must not violate. For example, an integrity check on
the data field marriage month (MRG-MTH) can restrict its value to the range 01 to 12.
Another integrity check can be incorporated in the database to ensure that if there is a reference to a
certain object, that object must exist. For example, in the case of a bank’s automatic teller machine
(ATM), a user is not allowed to transfer funds from a nonexistent savings account to a checking account.
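
Both kinds of integrity check can be declared to the DBMS as constraints. This is a minimal sketch with Python's sqlite3 module; the table layouts, account numbers and the helper `try_sql` are hypothetical. A CHECK constraint covers the month range and a foreign key covers the "referenced object must exist" rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE account (acct_no TEXT PRIMARY KEY);
    CREATE TABLE transfer (
        from_acct TEXT NOT NULL REFERENCES account (acct_no),
        to_acct   TEXT NOT NULL REFERENCES account (acct_no),
        amount    REAL NOT NULL
    );
    CREATE TABLE marriage (
        person  TEXT,
        mrg_mth INTEGER CHECK (mrg_mth BETWEEN 1 AND 12)
    );
    INSERT INTO account VALUES ('CHK-1');
    """
)


def try_sql(sql, params):
    """Run one statement and report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


valid_month = try_sql("INSERT INTO marriage VALUES (?, ?)", ("A", 6))
invalid_month = try_sql("INSERT INTO marriage VALUES (?, ?)", ("B", 13))
# 'SAV-9' does not exist in the account table, so the transfer is rejected.
bad_transfer = try_sql("INSERT INTO transfer VALUES (?, ?, ?)", ("SAV-9", "CHK-1", 50.0))
```

The rules are stated once, in the schema, and every application that uses the database is subject to them.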

Improved security: Database security is the protection of database from unauthorised users. The
database administrator (DBA) ensures that proper access procedure is followed, including proper
authentication schemes for access to the DBMS and additional checks before permitting access to
sensitive data. A DBA can define (which is enforced by DBMS) user names and passwords to
identify people authorised to use the database. Different levels of security could be implemented
for various types of data and operations. The access of data by authorised user may be restricted
for each type of access (for example, retrieve, insert, modify, update, delete and so on) to each
piece of information in the database. The enforcement of security could be data-value dependent
(for example, a works manager has access to the performance details of employees in his or her
department only), as well as data-type dependent (but the manager cannot access the sensitive
data such as salary details of any employees, including those in his or her department).
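
The two kinds of restriction can be sketched with a view. SQLite, used here through Python's sqlite3 module, has no user accounts or access grants, so this example only shows the view itself; the assumption is that in a multi-user DBMS the DBA would give the manager access to the view rather than to the underlying table. The table, the view name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        name TEXT, dept TEXT, performance TEXT, salary REAL
    );
    INSERT INTO employee VALUES ('Asha',   'Works', 'good',      52000);
    INSERT INTO employee VALUES ('Bala',   'Works', 'excellent', 61000);
    INSERT INTO employee VALUES ('Chitra', 'Sales', 'good',      58000);

    -- Data-value dependent: only rows of the Works department are visible.
    -- Data-type dependent: the salary column is left out entirely.
    CREATE VIEW works_manager_view AS
        SELECT name, performance FROM employee WHERE dept = 'Works';
    """
)

cursor = conn.execute("SELECT * FROM works_manager_view ORDER BY name")
visible_columns = [column[0] for column in cursor.description]
visible_rows = cursor.fetchall()
```

Through the view the works manager sees performance details for the two Works employees only, and no salary figures for anyone.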

Increased productivity of application development: The DBMS provides many of the standard
functions that the application programmer would normally have to write in a file-oriented
application. It provides all the low-level file-handling routines that are typical in application
programs. The provision of these functions allows the application programmer to concentrate on
the specific functionality required by the users without having to worry about low-level
implementation details. DBMSs also provide a high-level (4GL) environment consisting of
productivity tools, such as forms and report generators, to automate some of the activities of
database design and simplify the development of database applications. This results in increased
productivity of the programmer and reduced development time and cost.

Enforcement of standards: With central control of the database, a DBA defines and enforces the
necessary standards. Applicable standards might include any or all of the following:
departmental, installation, organisational, industry, corporate, national or international. Standards
can be defined for data formats to facilitate exchange of data between systems, naming
conventions, display formats, report structures, terminology, documentation standards, update
procedures, access rules and so on. This facilitates communication and cooperation among
various departments, projects and users within the organisation. The data repository provides
DBAs with a powerful set of tools for developing and enforcing these standards.

Economy of scale: Centralising all of the organisation’s operational data into one database and
creating a set of application programs that work on this one source of data can result in substantial cost
savings. The DBMS approach permits consolidation of data and applications, and thus reduces the
amount of wasteful overlap between activities of data-processing personnel in different projects
or departments. This enables the whole organisation to invest in more powerful processors,
storage devices or communication gear, rather than having each department purchase its own
(low-end) equipment. Thus, a single combined budget is required (instead of the larger accumulated
budget that would normally be allocated to each department for a file-oriented system) for the
maintenance and development of the system. This reduces overall costs of operation and
management, leading to an economy of scale.

Balance of conflicting requirements: Knowing the overall requirements of the organisation
(instead of the requirements of individual users), the DBA resolves the conflicting requirements
of various users and applications. A DBA can structure the system to provide an overall service
that is best for the organisation. A DBA can choose the best file structure and access methods to
get optimal performance for the response-critical operations, while permitting less critical
applications to continue to use the database (with a relatively slower response). For example, a
physical representation can be chosen for the data in storage that gives fast access for the most
important applications.

Improved data accessibility and responsiveness: As a result of integration in a database system,
data that crosses departmental boundaries is directly accessible to the end-users. This provides a
system with potentially much more functionality. Many DBMSs provide query languages or
report writers that allow users to ask ad hoc questions and to obtain the required information
almost immediately at their terminal, without requiring a programmer to write some software to
extract this information from the database.

Increased concurrency: DBMSs manage concurrent access to the database and prevent the problems
of loss of information or loss of integrity.
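
A minimal sketch of concurrency control, using two connections to one SQLite database file through Python's sqlite3 module. Locking is only one possible technique and the details are specific to SQLite; the file name, table and amounts are hypothetical. While the first user holds an open writing transaction, a second writer is refused instead of being allowed to overwrite the first user's work.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "bank.db")

    # isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
    # timeout=0: report a lock conflict immediately instead of waiting for the lock.
    first = sqlite3.connect(path, timeout=0, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)

    first.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    first.execute("INSERT INTO account VALUES (1, 100)")

    # The first user starts a writing transaction and holds the write lock.
    first.execute("BEGIN IMMEDIATE")
    first.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")

    # A second writer cannot start until the first transaction has finished.
    try:
        second.execute("BEGIN IMMEDIATE")
        second_was_blocked = False
    except sqlite3.OperationalError:
        second_was_blocked = True

    first.execute("COMMIT")

    # Once the first transaction is committed, the second proceeds and works
    # on the committed balance, so neither withdrawal is lost.
    second.execute("BEGIN IMMEDIATE")
    second.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
    second.execute("COMMIT")
    final_balance = second.execute(
        "SELECT balance FROM account WHERE id = 1"
    ).fetchone()[0]

    first.close()
    second.close()
```

Both withdrawals are reflected in `final_balance`. In practice a non-zero timeout would make the second writer wait briefly rather than fail at once.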

Reduced program maintenance: The problems of high maintenance effort required in file-
oriented system are reduced in database system. In a file-oriented environment, the descriptions
of data and the logic for accessing data are built into individual application programs. As a result,
changes to data formats and access methods inevitably result in the need to modify application
programs. In database environment, data are more independent of the application programs.

Improved backup and recovery services: The DBMS provides facilities for recovering from
hardware or software failures through its backup and recovery subsystem. For example, if the
computer system fails in the middle of a complex update program, the recovery subsystem is
responsible for making sure that the database is restored to the state it was in before the program
started executing. Alternatively, the recovery subsystem ensures that the program is resumed
from the point at which it was interrupted so that its full effect is recorded in the database.
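
The "restore to the state before the program started" behaviour can be sketched with a transaction that is rolled back. This example uses Python's sqlite3 module and simulates the failure with an exception inside the program rather than a real hardware or system crash; the `account` table, the `transfer` function and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("savings", 100), ("checking", 0)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from savings to checking as a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = 'savings'", (amount,)
        )
        if fail_midway:
            raise RuntimeError("simulated failure in the middle of the update")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = 'checking'", (amount,)
        )


def balances():
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


try:
    transfer(40, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances()  # the half-finished update has been undone

transfer(40)
after_success = balances()
```

After the simulated failure the balances are exactly as they were before the transfer began; only the complete transfer changes them.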

Improved data quality: The database system provides a number of tools and processes to
improve data quality.

1.4.4 Disadvantages of DBMS

In spite of the advantages, the database approach entails some additional costs and risks that must be
recognized and managed when implementing DBMS. Following are the disadvantages of using DBMS:

Increased complexity: A multi-user DBMS becomes an extremely complex piece of software
because of the functionality expected of it. Database designers, developers,
database administrators and end-users must understand this functionality to take full advantage of it.
Failure to understand the system can lead to bad design decisions, which can have serious
consequences for an organisation.

Requirement of new and specialized manpower: Because of rapid changes in database
technology and in the organisation’s business needs, the organisation needs to hire, train or retrain its
manpower on a regular basis to design and implement databases, provide database administration
services and manage a staff of new people. Therefore, an organisation needs to maintain
specialized skilled manpower.

Large size of DBMS: The great complexity and wide functionality make the DBMS an
extremely large piece of software. It can occupy many gigabytes of disk space and requires
substantial amounts of main memory to run efficiently.

Increased installation and management cost: The large and complex DBMS software has a
high initial cost. It requires trained manpower to install and operate and also has substantial
annual maintenance and support costs. Installing such a system also requires upgrades to the
hardware, software and data communications systems in the organisation. Substantial training of
manpower is required on an ongoing basis to keep up with new releases and upgrades. Additional
or more sophisticated and costly database software may be needed to provide security and to
ensure proper concurrent updating of shared data.

Additional hardware cost: The cost of DBMS installation varies significantly, depending on the
environment and functionality, size of the hardware (for example, micro-computer, mini-
computer or main-frame computer) and the recurring annual maintenance cost of hardware and
software.

Conversion cost: The cost of conversion (both in terms of money and time) from a legacy system
(old file-oriented and/or older database technology) to a modern DBMS environment is very high.
In some situations, the cost of the DBMS and extra hardware may be insignificant compared with the
cost of conversion. This cost includes the cost of training manpower (staff) to use the new
systems and the cost of employing specialist manpower to help with the conversion and running of
the system.

Need for explicit backup and recovery: For a centralised shared database to be accurate and
available at all times, a comprehensive procedure must be developed and used for providing
backup copies of data and for restoring a database when damage occurs. A modern DBMS
normally automates many more of the backup and recovery tasks than a file-oriented system.

Organisational conflict: A centralised and shared database (which is the case with a DBMS)
requires a consensus on data definitions and ownership as well as responsibilities for accurate
data maintenance. Experience shows that conflicts over data definitions, data formats and coding,
rights to update shared data, and associated issues are
frequent and often difficult to resolve. Organisational commitment to the database approach,
organisationally astute database administrators and a sound evolutionary approach to database
development are required to handle these issues.

1.5 DBMS USERS

In large organizations, many people are involved in the design, use, and maintenance of a large database
with hundreds of users. In this section we identify the people whose jobs involve the day-to-day use of a
large database; we call them the actors on the scene. There are also people who may be called workers
behind the scene: those who work to maintain the database system environment but who are not actively
interested in the database contents as part of their daily job.

Actors on the Scene:

Large organisations must design, develop, maintain and administer a large body of data. The front end
is handled by the people who work directly with the database: those who administer, design, analyse and
test it, and the users for whom the database has been designed.
These front-end users are described in detail below:

Database Administrators

In any organization where many people use the same resources, there is a need for a chief administrator to
oversee and manage these resources. In a database environment, the primary resource is the database
itself, and the secondary resource is the DBMS and related software. Administering these resources is the
responsibility of the database administrator (DBA). The DBA is responsible for authorizing access to
the database, coordinating and monitoring its use, and acquiring software and hardware resources as
needed. The DBA is accountable for problems such as security breaches and poor system response time.
In large organizations, the DBA is assisted by a staff that carries out these functions.

Database Designers

Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store this data. These tasks are mostly undertaken before the
database is actually implemented and populated with data. It is the responsibility of database designers to
communicate with all prospective database users in order to understand their requirements and to create a
design that meets these requirements. In many cases, the designers are on the staff of the DBA and may
be assigned other staff responsibilities after the database design is completed. Database designers
typically interact with each potential group of users and develop views of the database that meet the data
and processing requirements of these groups. Each view is then analyzed and integrated with the views of
other user groups. The final database design must be capable of supporting the requirements of all user
groups.

End Users

End users are the people whose jobs require access to the database for querying, updating, and generating
reports; the database primarily exists for their use. There are several categories of end users:

■ Casual end users occasionally access the database, but they may need different information
each time. They use a sophisticated database query language to specify their requests and are
typically middle- or high-level managers or other occasional browsers.

■ Naive or parametric end users make up a sizable portion of database end users. Their main
job function revolves around constantly querying and updating the database, using standard types
of queries and updates—called canned transactions—that have been carefully programmed and
tested. The tasks that such users perform are varied:
_ Bank tellers check account balances and post withdrawals and deposits.

_ Reservation agents for airlines, hotels, and car rental companies check availability for a given
request and make reservations.

_ Employees at receiving stations for shipping companies enter
package identifications via bar codes and descriptive information through buttons to update a
central database of received and in-transit packages.

■ Sophisticated end users include engineers, scientists, business analysts, and others who
thoroughly familiarize themselves with the facilities of the DBMS in order to implement their
own applications to meet their complex requirements.

■ Standalone users maintain personal databases by using ready-made program packages that
provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax
package that stores a variety of personal financial data for tax purposes.

A typical DBMS provides multiple facilities to access a database. Naive end users need to learn
very little about the facilities provided by the DBMS; they simply have to understand the user
interfaces of the standard transactions designed and implemented for their use. Casual users learn
only a few facilities that they may use repeatedly. Sophisticated users try to learn most of the
DBMS facilities in order to achieve their complex requirements. Standalone users typically
become very proficient in using a specific software package.
System Analysts and Application Programmers (Software Engineers)

System analysts determine the requirements of end users, especially naive and parametric end users, and
develop specifications for standard canned transactions that meet these requirements. Application
programmers implement these specifications as programs; then they test, debug, document, and maintain
these canned transactions. Such analysts and programmers, commonly referred to as software
developers or software engineers, should be familiar with the full range of
capabilities provided by the DBMS to accomplish their tasks.

Workers behind the Scene

In addition to those who design, use, and administer a database, others are associated with the design,
development, and operation of the DBMS software and system environment. These persons are typically
not interested in the database content itself. We call them the workers behind the scene, and they include
the following categories:

DBMS system designers and implementers design and implement the DBMS modules and interfaces
as a software package. A DBMS is a very complex software system that consists of many components, or
modules, including modules for implementing the catalog, query language processing, interface
processing, accessing and buffering data, controlling concurrency, and handling data recovery and
security. The DBMS must interface with other system software such as the operating system and compilers
for various programming languages.

Tool developers design and implement tools, that is, the software
packages that facilitate database modeling and design, database system design, and improved
performance. Tools are optional packages that are often purchased separately. They include packages for
database design, performance monitoring, natural language or graphical interfaces, prototyping,
simulation, and test data generation. In many cases, independent software vendors develop and market
these tools.

Operators and maintenance personnel (system administration personnel) are responsible for the actual
running and maintenance of the hardware and software environment for the database system. Although
these categories of workers behind the scene are instrumental in making the database system available to
end users, they typically do not use the database contents for their own purposes.

DATABASE SCHEMA

The plan (or formulation of a scheme) of the database is known as the schema. A database schema is the
skeleton structure that represents the logical view of the entire database. It defines how the data is
organized and how the relations among the data are associated. It formulates all the constraints that are
to be applied on the data.

A schema is an overall plan of all the data item (field) types and record types stored in the database. A
database schema defines the database name, the record types, the components that make up those records,
the entities and the relationships among them. It contains a descriptive detail of the database, which can
be depicted by means of schema diagrams. Database designers design the schema to help programmers
understand the database and make it useful.

A database schema can be divided broadly into two categories −

Physical Database Schema − this schema pertains to the actual storage of data and its form of
storage like files, indices, etc. It defines how the data will be stored in a secondary storage.

Logical Database Schema − this schema defines all the logical constraints that need to be
applied on the data stored. It defines tables, views, and integrity constraints.
Database Subschema

A subschema is a subset of the schema and thus inherits the same properties that a schema possesses. The
schema for a particular view is considered a subschema, and the collection of these subschemas
constitutes the schema. The subschema refers to the application programmer's view of the data item types
that he or she uses. Individual application programmers can change their respective subschemas without
affecting the subschema views of others. A subschema also acts as a unit for enforcing controlled access
to the database; for example, it can prevent a user of a subschema from updating a certain value in the
database while still allowing that user to read it.
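
The access-control role of a subschema can be illustrated with a small sketch. It assumes that an SQL view stands in for the subschema and uses Python's built-in sqlite3 module; the table name employee, the view name employee_directory and the sample row are hypothetical and not taken from these notes. In SQLite a plain view is read-only, so the view's user can read the exposed columns but cannot update them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Full schema: includes a sensitive salary column (hypothetical example table).
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL NOT NULL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0)")

# Subschema (assumed here to be a view): exposes only emp_id and name.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee"
)

# Reading through the subschema is allowed.
visible_rows = conn.execute("SELECT * FROM employee_directory").fetchall()

# Updating through a plain SQLite view is rejected by the DBMS.
try:
    conn.execute("UPDATE employee_directory SET name = 'X' WHERE emp_id = 1")
    update_allowed = True
except sqlite3.Error:
    update_allowed = False

print(visible_rows, update_allowed)
```

Other DBMSs offer finer-grained controls (for example, per-user privileges), so this is only one way to picture the idea.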

DATABASE INSTANCE

It is important to distinguish between the database schema and the database instance. The database schema
is the skeleton of the database. It is designed before the database exists. Once the database is
operational, it is very difficult to make any changes to the schema. A database schema does not contain
any data or information.

A database instance is the state of an operational database, with its data, at a given point in time. It
is a snapshot of the database. Database instances tend to change with time. A DBMS ensures that every
instance (state) is valid by diligently enforcing all the validations, constraints, and conditions that
the database designers have imposed.
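
The difference between schema and instance can be seen in a short sketch using Python's built-in sqlite3 module. The account table, its columns and the sample values are hypothetical. The schema is declared once and holds no data; each call to the helper function instance returns the current snapshot; and the DBMS refuses an update that would produce an invalid state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: structure and constraints only, no data yet (hypothetical table).
conn.execute(
    "CREATE TABLE account ("
    " acct_no INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)


def instance(connection):
    """Return the current database instance (a snapshot of the rows)."""
    query = "SELECT acct_no, owner, balance FROM account ORDER BY acct_no"
    return connection.execute(query).fetchall()


state_0 = instance(conn)  # empty instance right after the schema is created

conn.execute("INSERT INTO account VALUES (1, 'Ravi', 100.0)")
state_1 = instance(conn)  # the instance changes as data is added

# An update that violates the CHECK constraint is rejected,
# so the instance stays in a valid state.
try:
    conn.execute("UPDATE account SET balance = -50 WHERE acct_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

state_2 = instance(conn)
print(state_0, state_1, rejected, state_2)
```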

CLASSIFICATION OF DATABASE MANAGEMENT SYSTEM

Centralised Database System

A centralised database system consists of a single processor together with its associated data storage
devices and other peripherals. It is physically confined to a single location. The system offers data
processing capabilities to users who are located either at the same site or, through remote terminals, at
geographically dispersed sites. The system and its data are managed and controlled centrally from
one central site. Fig. 2.3 illustrates an example of a centralised database system.

Fig. 2.3 Centralised database system


Advantages of Centralised Database System

1. Most functions, such as update, backup, query and access control, are easier to
accomplish in a centralised database system.

2. The size of the database and of the computer on which it resides has no bearing on
whether the database is centrally located. For example, a small enterprise with its database on a
personal computer (PC) has a centralised database, and so does a large enterprise with many
computers whose database is entirely controlled by a mainframe.

Disadvantages of Centralised Database System

1. When the central site computer or database system goes down, all users are blocked
from using the system until it comes back up.

Client-Server Database System

In a client-server database system, the client's computer is called the front-end. The server and client
computers are connected into a network. The applications and tools act as clients of the DBMS, making
requests for its services. The DBMS, in turn, processes these requests and returns the results to the
client(s). The client handles the graphical user interface (GUI) and does computations and other
programming of interest to the end user. The server handles the parts of the job that are common to many
clients.

Client/server database architecture

As shown in Figure 2.4, the client/server database architecture consists of three components, namely
client applications, a DBMS server and a communication network interface. The client application could be
a tool, a user-written application or a vendor-written application. Client applications issue SQL
statements for data access. The DBMS server stores the related software, processes the SQL statements and
returns results. The communication network interface enables client applications to connect to the server,
send SQL statements and receive results, error messages or error return codes after the server has
processed the SQL statements. In client/server database architecture the majority of the DBMS services are
performed on the server.

Client/server architecture is a part of the open systems architecture in which all computing hardware,
operating systems, network protocols and other software are interconnected as a network and work in
concert to achieve user goals. It is well suited for online transaction processing and decision support
applications, which tend to generate a number of relatively short transactions and require a high degree of
concurrency.

Advantages of Client/server Database System

1. Client/server systems run on less expensive platforms and can support applications that had
previously been running only on large and expensive mini or mainframe computers.

2. Clients offer an icon-based, menu-driven interface, which is superior to the traditional command-line,
dumb terminal interface typical of mini and mainframe computer systems.

3. The client/server environment enables users to work more productively and to make better use
of existing data.

4. A client/server database system is more flexible than a centralised system.

5. Response time is short and throughput is high.

6. The server (database) machine can be custom-built (tailored) to the DBMS function and thus
provide better DBMS performance.

7. The client (application) machine might be a personal workstation, tailored to the needs of the
end users and thus able to provide better interfaces, high availability, faster responses and overall
improved ease of use to the user.

8. A single database (on server) can be shared across several distinct client (application) systems.

Disadvantages of Client/Server Database System

Labor or programming cost is high in client/server environments, particularly in initial phases.

There is a lack of management tools for diagnosis, performance monitoring and tuning and
security control, for the DBMS, client and operating systems and networking environments.

Distributed Database System

Distributed database systems are similar to client/server architecture in a number of ways. Both typically
involve the use of multiple computer systems and enable users to access data from remote systems.
However, a distributed database system broadens the extent to which data can be shared well beyond that
which can be achieved with the client/server system. Figure 2.5 shows a diagram of distributed database
architecture.

As shown in Figure 2.5, in a distributed database system data is spread across a variety of different DBMS
software packages running on a variety of different computing machines supported by a variety of different
operating systems. These machines are spread (or distributed) geographically and connected together by a
variety of communication networks. In a distributed database system, one application can operate on data
that is spread geographically across different machines. Thus the enterprise
data might be distributed on different computers in such a way that data for one portion (or department) of
the enterprise is stored on one computer and the data for another department is stored on another. Each
machine can have data and applications of its own. However, the users on one computer can access
data stored on several other computers. Therefore, each machine will act as a server for some users and a
client for others.
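
The idea that each machine is a server for some users and a client for others can be sketched in a few lines of Python. This is a toy in-memory illustration only: the Site class, its method names and the sample keys are hypothetical, and there is no real network, replication or transaction handling.

```python
class Site:
    """One machine in a toy distributed database (illustrative only)."""

    def __init__(self, name, local_data):
        self.name = name
        self.local_data = dict(local_data)  # data stored at this site
        self.peers = []                     # other sites reachable over the network
        self.remote_requests = 0            # requests received from other sites

    def serve(self, key):
        """Server role: answer a request coming from another site."""
        self.remote_requests += 1
        return self.local_data.get(key)

    def lookup(self, key):
        """Client role: use local data if present, otherwise ask the peers."""
        if key in self.local_data:
            return self.local_data[key], self.name
        for peer in self.peers:
            value = peer.serve(key)
            if value is not None:
                return value, peer.name
        return None, None


# Each department keeps its own data on its own machine.
sales = Site("sales", {"order:1": "10 widgets"})
hr = Site("hr", {"emp:7": "Asha"})
sales.peers = [hr]
hr.peers = [sales]

from_sales = sales.lookup("emp:7")   # sales acts as client, hr as server
from_hr = hr.lookup("order:1")       # roles reversed
local_hit = sales.lookup("order:1")  # answered locally, no remote request
print(from_sales, from_hr, local_hit)
```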
Advantages of Distributed Database System

1. Distributed database architecture provides greater efficiency and better performance.

2. Response time is short and throughput is high.

3. The server (database) machine can be custom-built (tailored) to the DBMS function and thus can
provide better DBMS performance.

4. The client (application) machine might be a personal workstation, tailored to the needs of the
end users and thus able to provide better interfaces, high availability, faster responses and overall
improved ease of use to the user.

5. A single database (on server) can be shared across several distinct client (application) systems.

6. As data volumes and transaction rates increase, users can grow the system incrementally.

7. Adding new locations has less impact on ongoing operations.

8. A distributed database system provides local autonomy.

Figure 2.5: Distributed Database Systems


Disadvantages of Distributed Database System

1. A distributed DBMS that hides the distributed nature from the user is inherently more complex
than a centralized DBMS.

2. Increased complexity means that there will be high procurement and maintenance cost.

3. In a centralized system, access to the data can be easily controlled. However, in a distributed
DBMS not only does access to replicated data have to be controlled in multiple locations, but
the network itself also has to be made secure.
DATA MODELS

Record based data Models

Hierarchical Data Models

The hierarchical data model is represented by an upside-down tree. The user perceives the hierarchical
database as a hierarchy of segments. A segment is the equivalent of a file system's record type. In a
hierarchical model, the relationship between the files or records forms a hierarchy. In other words, the
hierarchical database is a collection of records that is perceived as organised to conform to the
upside-down tree structure. Figure 2.3 shows a hierarchical data model. A tree may be defined as a set of
nodes such that there is one specially designated node called the root (node), which is perceived as the
parent of the segments directly beneath it (as in a family tree with parent-child relationships, or an
organisation with owner-member relationships between record types). The remaining nodes are partitioned
into disjoint sets and are perceived as children of the segment above them. Each disjoint set is in turn a
tree and a sub-tree of the root. At the root of the tree is the single parent. A parent can have zero, one
or more children. A hierarchical model can represent a one-to-many relationship between two entities,
where the two are respectively parent and child. The nodes of the tree represent record types. If we
define the root record type to be at level-0, then its dependent record types are at level-1. The
dependents of the record types at level-1 are said to be at level-2, and so on.

Hierarchical data model

The hierarchical path that traces the parent segments to the child segments, beginning from the left,
defines the tree shown in Figure 2.3. For example, the hierarchical path for segment 'E' can be traced as
ABDE, tracing all segments from the root starting at the leftmost segment. This left-traced path is known
as preorder traversal or hierarchical sequence. Also as shown in Figure 2.3 it can be noted that each parent
can have many children but each child has only one parent.
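
The hierarchical sequence can be reproduced with a short Python sketch. Because the figure is not reproduced in these notes, the tree shape used here is an assumption: root A with children B and C, B with children D and E, and C with child F. The names Segment, preorder and hierarchical_path are hypothetical helpers.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    children: list = field(default_factory=list)  # each child has this one parent


def preorder(segment):
    """Visit the parent first, then each child subtree from left to right."""
    yield segment.name
    for child in segment.children:
        yield from preorder(child)


def hierarchical_path(root, target):
    """Segments visited, in hierarchical sequence, up to and including target."""
    order = list(preorder(root))
    return order[: order.index(target) + 1]


# Assumed tree shape (the original figure is not available).
tree = Segment("A", [
    Segment("B", [Segment("D"), Segment("E")]),
    Segment("C", [Segment("F")]),
])

full_sequence = "".join(preorder(tree))
path_to_e = "".join(hierarchical_path(tree, "E"))
print(full_sequence, path_to_e)
```

With this assumed shape the path to segment E comes out as ABDE, matching the example in the text.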
Advantages of Hierarchical Data Model

Simplicity: Since the database is based on the hierarchical structure, the relationship between the
various layers is logically (or conceptually) simple and design of a hierarchical database is
simple.

Data sharing: Because all data are held in a common database, data sharing becomes practical.

Data security: Hierarchical model was the first database model that offered the data security that
is provided and enforced by the DBMS.

Data independence: The DBMS creates an environment in which data independence can be
maintained. This substantially decreases the programming effort and program maintenance.

Data integrity: Given the parent/child relationship, there is always a link between the parent
segment and its child segments. Because each child segment is automatically tied to its parent, this
model promotes data integrity.

Efficiency: The hierarchical data model is very efficient when the database contains a large
volume of data in one-to-many (1:m) relationships and when the users require large numbers of
transactions using data whose relationships are fixed over time.

Available expertise: Because of the large installed base of mainframe computers,
experienced programmers were available.

Disadvantages of Hierarchical Data Model

Implementation complexity: Although the hierarchical database is conceptually simple, easy to
design and free of data-independence problems, it is quite complex to implement. The DBMS requires
knowledge of the physical level of data storage, and the database designers should have very good
knowledge of the physical data storage characteristics.

Inflexibility: A hierarchical database lacks flexibility. Changes such as adding new relations or
segments often lead to very complex system management tasks. A deletion of one segment may
lead to involuntary deletion of all the segments under it. Such an error could be very costly.

Database management problems: If you make any changes to the database structure of the
hierarchical database, then you need to make the necessary changes in all the application
programs that access the database. Thus, maintaining the database and the applications can
become very difficult.

Lack of structural independence: Structural independence exists when changes to the
database structure do not affect the DBMS's ability to access data. The hierarchical database is
known as a navigational system because data access requires that the preorder traversal (a
physical storage path) be used to navigate to the appropriate segments. The application
programmer must therefore have a good knowledge of the relevant access paths to access the data from
the database. Modifications or changes in the physical structure can lead to problems with
application programs, which will also have to be modified. Thus, in a hierarchical database
system the benefits of data independence are limited by structural dependence.

Application programming complexity: Application programming is very time consuming and
complicated. Due to the structural dependence and the navigational structure, the application
programmers and the end-users must know precisely how the data is distributed physically in the
database and how to write lines of control code in order to
access data. This requires knowledge of complex pointer systems, which is often beyond the
grasp of ordinary users who have little or no programming knowledge.

Implementation limitation: Many common relationships do not conform to the one-to-many
relationship format required by the hierarchical database model. For example, each student
enrolled at a university can take many courses, and each course can have many students. Thus,
such many-to-many (n:m) relationships, which are more common in real life, are very difficult to
implement in a hierarchical data model.

No standards: There is no precise set of standard concepts for the hierarchical data model, nor does
the implementation of the model conform to a specific standard.

Extensive programming efforts: Use of the hierarchical model requires extensive programming activities,
and it has therefore been called a system created by programmers for programmers. Modern data
processing environments do not accept such concepts.

Network Data Model

The Database Task Group of the Conference on Data System Languages (DBTG/CODASYL) formalised
the network data model in the late 1960s. The network data models were eventually standardised as the
CODASYL model. The network data model is similar to a hierarchical model except that a record can
have multiple parents. The network data model has three basic components: record types, data items
(fields), and links. Further, in network model terminology, a relationship is called a set, and each set
is composed of at least two record types. The first record type is called an owner record and is
equivalent to the parent in the hierarchical model. The second record type is called a member record and
is equivalent to the child in the hierarchical model. The connection between an owner and its member
records is identified by a link to which database designers assign a set-name. This set-name is used to
retrieve and manipulate data. Just as branches in hierarchical data models represent access paths, the
links between owners and their members indicate access paths in the network model and are typically
implemented by pointers. A member in the network model can appear in more than one set and thus can have
several owners, which allows many-to-many (n:m) relationships to be represented. Each individual set
represents a one-to-many (1:m) relationship between the owner and the members.
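
A minimal Python sketch of owner-member sets follows. It is an illustration under stated assumptions, not DBTG syntax: the NetworkDB class, the set-names DEPT_FACULTY and PROJECT_STAFF and the sample records are all hypothetical. Each named set links one owner to many members, and a member gains several owners by taking part in more than one set.

```python
from collections import defaultdict


class NetworkDB:
    """Toy network-model store: named sets link one owner to many members."""

    def __init__(self):
        # set-name -> owner record -> list of member records (1:m within a set)
        self.sets = defaultdict(dict)

    def connect(self, set_name, owner, member):
        # Within one set a member may have only one owner.
        for existing_owner, members in self.sets[set_name].items():
            if member in members and existing_owner != owner:
                raise ValueError(f"{member} already has an owner in {set_name}")
        linked = self.sets[set_name].setdefault(owner, [])
        if member not in linked:
            linked.append(member)

    def members(self, set_name, owner):
        """Follow the set's links from an owner to its members."""
        return list(self.sets.get(set_name, {}).get(owner, []))

    def owners(self, member):
        """All (set-name, owner) pairs for a member, across every set."""
        return sorted(
            (set_name, owner)
            for set_name, by_owner in self.sets.items()
            for owner, linked in by_owner.items()
            if member in linked
        )


db = NetworkDB()
db.connect("DEPT_FACULTY", "Physics", "Rao")
db.connect("DEPT_FACULTY", "Physics", "Iyer")
db.connect("PROJECT_STAFF", "Solar Lab", "Rao")  # second set, second owner

physics_staff = db.members("DEPT_FACULTY", "Physics")
rao_owners = db.owners("Rao")
print(physics_staff, rao_owners)
```

A real network DBMS would implement these links with pointers between stored records rather than Python dictionaries.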

Network Data Model


Advantages of Network Data Model

Simplicity: Similar to hierarchical data model, network model is also simple and easy to design.

Facilitating more relationship types: The network model handles both one-to-many
(1:m) and many-to-many (n:m) relationships, which helps in modelling real-life situations.

Superior data access: Data access and flexibility are superior to those found in the
hierarchical data model. An application can access an owner record and all the member records
within a set. If a member record in the set has two or more owners (like a faculty member working for
two departments), then one can move from one owner to another.

Database Integrity: Network model enforces database integrity and does not allow members to
exist without an owner. First of all, the user must define the owner record and then the member.

Data independence: The network data model provides sufficient data independence by at least
partially isolating the programs from complex physical storage details. Therefore, changes in the
data characteristics do not require changes in the application programs.

Database standards: Unlike the hierarchical model, the network data model is based on the universal
standards formulated by DBTG/CODASYL and augmented by ANSI-SPARC. All network
data models conform to these standards, which also include a DDL and a DML.

Disadvantages of Network Data Model

System complexity: Like the hierarchical data model, the network model provides a navigational
access mechanism in which the data are accessed one record at a time. This mechanism
makes the system implementation very complex.

Absence of structural independence: It is difficult to make changes in a network database. If
changes are made to the database structure, all subschema definitions must be revalidated before
any application programs can access the database. In other words, although the network model
achieves data independence, it does not provide structural independence.

Not user-friendly: The network data model is not designed to be user-friendly and requires
highly skilled users.

Operational anomalies: The insertion, deletion and updating of any record require a
large number of pointer adjustments.

Relational Data Model

E. F. Codd of IBM Research first introduced the relational data model in a paper in 1970. The relational
data model is implemented using a very sophisticated Relational Database Management System (RDBMS).
The RDBMS provides some basic functions of the hierarchical and network DBMSs plus a host of other
functions that make the relational data model easier to understand and implement. The relational
data model simplified the user's view of the database by using simple tables instead of the more complex
tree and network structures. A relational database is a collection of tables (also called relations) in
which data is stored. Each table is a matrix of row and column intersections. Tables are related to each
other by sharing common entity characteristics.
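
As a rough illustration of tables related through a shared characteristic, the sketch below uses Python's built-in sqlite3 module. The department and employee tables, their columns and the sample rows are hypothetical. The two tables share the dept_id column, and a query relates them by value without following any physical access path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER REFERENCES department(dept_id)  -- the shared column
);
INSERT INTO department VALUES (10, 'Sales'), (20, 'Research');
INSERT INTO employee VALUES (1, 'Meera', 10), (2, 'John', 20), (3, 'Li', 10);
""")

# The tables are related by matching values of dept_id in the query itself.
rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows)
```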

Advantages of Relational Data Model

Simplicity: A relational data model is even simpler than hierarchical and network models. It frees
the designers from the actual physical data storage details, thereby allowing them to concentrate
on the logical view of the database.

Structural independence: Unlike the hierarchical and network models, the relational database does not
depend on a navigational data access system. Changes in the database structure do not
affect data access.

Ease of design, implementation, maintenance and uses: The relational model provides both
structural independence and data independence. Therefore, it makes the database design,
implementation, maintenance and usage much easier.

Flexible and powerful query capability: It provides very powerful, flexible, and easy-to-use
query facilities. Its structured query language (SQL) capability makes ad hoc queries a reality.

Disadvantages of Relational Data Model

Hardware overheads: Relational model needs more powerful computing hardware and data
storage devices to perform RDBMS assigned tasks.

Easy-to-design capability leads to bad design: Because relational databases are easy to use,
untrained people can generate queries and reports without much understanding of, or
thought for, proper database design, which can lead to degraded performance and
a slower system in the long run.

Object-based Data Model

Entity-Relationship (E-R) Model

The ER model defines the conceptual view of a database. It works around real-world entities and the
associations among them. At the view level, the ER model is considered a good option for designing
databases. It was introduced by Chen in 1976. In the ER model, a collection of objects (entities) of
similar structure is called an entity set. The relationship between entity sets is classified by the
number of entities from one entity set that can be associated with entities of another set:
one-to-one (1:1), one-to-many (1:m), or many-to-many (n:m).

There are building blocks or symbols to represent E-R diagram as shown below:

Using these symbols, one can construct an ER diagram that helps to analyse and create the relational
tables, and later the database, for the organisation. As an example, consider an ER diagram for an
enterprise where customer, order and product are the entities; the diagram would be as shown below:
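
The customer, order and product entities can also be sketched in code. Since the diagram is not reproduced here, the cardinalities are assumptions: one customer places many orders (1:m), and an order can contain many products while a product can appear in many orders (n:m). The attribute names and sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    cust_id: int
    name: str


@dataclass
class Product:
    prod_id: int
    description: str


@dataclass
class Order:
    order_id: int
    customer: Customer                            # each order has one customer
    products: list = field(default_factory=list)  # an order lists many products


def orders_placed_by(customer, orders):
    """1:m relationship: all orders belonging to one customer."""
    return [o.order_id for o in orders if o.customer.cust_id == customer.cust_id]


def orders_containing(product, orders):
    """n:m relationship: all orders in which a product appears."""
    return [
        o.order_id
        for o in orders
        if any(p.prod_id == product.prod_id for p in o.products)
    ]


alice = Customer(1, "Alice")
pen = Product(100, "Pen")
ink = Product(101, "Ink")
orders = [Order(5001, alice, [pen, ink]), Order(5002, alice, [pen])]

alice_orders = orders_placed_by(alice, orders)
pen_orders = orders_containing(pen, orders)
ink_orders = orders_containing(ink, orders)
print(alice_orders, pen_orders, ink_orders)
```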
Advantages of E-R Data Model

Straightforward relational representation: Once an E-R diagram has been designed for an
application, the relational representation of the database model becomes relatively
straightforward.

Easy conversion from E-R to other data models: Conversion from an E-R diagram to a network
or hierarchical data model can easily be accomplished.

Graphical representation for better understanding: An E-R model gives a graphical and
diagrammatical representation of the various entities, their attributes and the relationships between
entities. This in turn helps in the clear understanding of the data structure and in minimizing
redundancy and other problems.

Disadvantages of E-R Data Model

No industry standard for notation: There is no industry standard notation for developing an
E-R diagram.

Popular for high-level design: The E-R data model is especially popular for high-level
database design; its use is largely confined to that level.

Object-oriented Data Model

The object-oriented data model is a logical data model that captures the semantics of objects supported
in object-oriented programming. It is a persistent and sharable collection of defined objects. It has the
ability to model a complete solution. Object-oriented database models represent an entity as a class. A
class represents both the object attributes and the behaviour of the entity. For example, a
CUSTOMER class will have not only the customer attributes such as CUST-ID, CUST-NAME,
CUST-ADDRESS and so on, but also procedures that imitate actions expected of a customer such as
update-order. Instances of the class correspond to individual customers. Within an object, the
class attributes take specific values, which distinguish one customer (object) from another. However,
all the objects belonging to the class share the behaviour pattern of the class. The object-oriented
database maintains relationships through logical containment.

The object-oriented database is based on encapsulation of the data and code related to an object into a
single unit, whose contents are not visible to the outside world. The object-oriented data model therefore
emphasises objects rather than data alone. The object-oriented database management system (OODBMS) is
among the most recent approaches to database management.
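
The CUSTOMER class described above can be sketched in Python. This is an illustration of the idea only, not an OODBMS: the objects are not persistent, the method bodies are assumptions, and identifiers are adapted to Python naming (cust_id for CUST-ID, update_order for update-order).

```python
class Customer:
    """Attributes and behaviour of a customer bundled into one unit."""

    def __init__(self, cust_id, cust_name, cust_address):
        self.cust_id = cust_id
        self.cust_name = cust_name
        self.cust_address = cust_address
        self._orders = {}  # internal state, reached only through methods

    def update_order(self, order_id, quantity):
        """Behaviour shared by every object of the class (assumed semantics)."""
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._orders[order_id] = quantity

    def order_quantity(self, order_id):
        return self._orders.get(order_id, 0)


# Two objects of the same class: same behaviour, different attribute values.
first = Customer("C001", "Asha", "12 Lake Road")
second = Customer("C002", "Ben", "4 Hill Street")

first.update_order("O-1", 3)
first.update_order("O-1", 5)  # a later update replaces the earlier quantity

print(first.order_quantity("O-1"), second.order_quantity("O-1"))
```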
Advantages of Object-oriented Data Model

Capable of handling a large variety of data types: Unlike traditional databases (such as
hierarchical, network or relational), object-oriented databases are capable of storing different
types of data, for example pictures, voice and video, in addition to text, numbers and so on.

Combining object-oriented programming with database technology: The object-oriented data
model is capable of combining object-oriented programming with database technology, thus
providing an integrated application development system.

Improved productivity: Object-oriented data models provide powerful features such as
inheritance, polymorphism and dynamic binding that allow users to compose objects and
provide solutions without writing object-specific code. These features increase the productivity of
database application developers significantly.

Improved data access: The object-oriented data model represents relationships explicitly, supporting
both navigational and associative access to information. It further improves data access
performance over relational value-based relationships.

Disadvantages of Object-oriented Data Model

No precise definition: It is difficult to provide a precise definition of what constitutes an
object-oriented DBMS because the name has been applied to a variety of products and prototypes, some
of which differ considerably from one another.

Difficult to maintain: Object definitions must be changed periodically as organisational
information needs change, and existing databases must be migrated to conform to the new object
definitions. Changing object definitions and migrating databases poses a real challenge.

Not suited for all applications: Object-oriented data models are used where there is a need to
manage complex relationships among data objects. They are especially suited for specific
applications such as engineering, e-commerce, medicine and so on, and not for all applications.
When they are used for ordinary applications, performance degrades and processing requirements are
high.

2.7.3 Physical Data Models

A physical data model describes how data are stored in computer memory, how they are laid out and
ordered in that memory, and how they are retrieved from it. In essence, the physical data model
represents the data at the data layer or internal layer. It represents each table, its columns and their
specifications, and constraints such as primary keys and foreign keys. It shows how the tables are built
and related to each other in the database.
The diagram above shows how a physical data model is designed. It is represented as a UML diagram with
each table and its columns. The primary key is shown at the top. The relationships between the tables are
represented by arrows from table to table. In the diagram, the STUDENT table is related to CLASS, and
SUBJECT is related to CLASS. The diagram depicts CLASS as the parent table with two child tables:
STUDENT and SUBJECT.

In short, a physical data model has the following:

Tables and their specifications: table names and their columns. Columns are shown along with
their datatypes and sizes. In addition, the primary key of each table is shown at the top of the
column list.

Foreign keys, which represent the relationships between the tables. Mappings between the tables
are drawn as arrows between them.

Possibly a denormalized structure, depending on the user requirement. The tables might not be in
normalized form.
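
The elements listed above can be sketched with Python's built-in sqlite3 module. Only the table
names CLASS, STUDENT and SUBJECT and their parent/child relationships come from the text; every
column name, datatype, size and sample row below is an assumption made for illustration.

```python
import sqlite3

# Hypothetical columns; only the table names and the parent/child
# relationships are taken from the text.
DDL = """
CREATE TABLE class (
    class_id   INTEGER PRIMARY KEY,
    class_name VARCHAR(30) NOT NULL
);
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name VARCHAR(50) NOT NULL,
    class_id     INTEGER NOT NULL REFERENCES class (class_id)
);
CREATE TABLE subject (
    subject_id   INTEGER PRIMARY KEY,
    subject_name VARCHAR(50) NOT NULL,
    class_id     INTEGER NOT NULL REFERENCES class (class_id)
);
"""

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(DDL)

conn.execute("INSERT INTO class VALUES (1, 'Grade 10')")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 1)")

# A child row that points at a missing parent row is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ben', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Read the physical structure back from the database catalog.
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
child_tables = sorted(
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    if any(fk[2] == "class" for fk in conn.execute(f"PRAGMA foreign_key_list({name})"))
)
```

Here student_columns lists the columns of STUDENT in declaration order, and child_tables holds the
tables whose foreign keys point at CLASS, which mirrors the arrows in a physical model diagram.
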

The physical data model depends on the RDBMS; that is, it varies with the RDBMS used. In
particular, datatype notation differs between systems: for example, SQL Server and Oracle provide
different datatypes. The physical data model diagram may also be drawn differently while containing
the same information as described above; some notations list primary keys and foreign keys
separately at the end of the column list. How the diagram is specified therefore depends on the
user or designer and on the RDBMS server. The diagram below shows different ways of representing a
table.

Hence the object-based data model is based on the real requirements of the user, whereas the
record-based data model is based on the actual relationships and data in the database. The
physical data model is based on the table structure in the database.
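
To illustrate the RDBMS dependence of datatype notation, the sketch below renders one hypothetical
STUDENT table for two dialects. The column list, the TYPE_MAP dictionary and the
render_create_table helper are assumptions made for this example, and the mapping is a small
illustrative subset of each vendor's datatypes rather than a complete reference.

```python
# Logical column types mapped to vendor type names (illustrative subset only).
TYPE_MAP = {
    "sqlserver": {"integer": "INT", "text": "VARCHAR({size})", "date": "DATE"},
    "oracle": {"integer": "NUMBER(10)", "text": "VARCHAR2({size})", "date": "DATE"},
}

# Hypothetical STUDENT columns: (name, logical type, size or None).
STUDENT_COLUMNS = [
    ("student_id", "integer", None),
    ("student_name", "text", 50),
    ("enrolled_on", "date", None),
]


def render_create_table(table, columns, primary_key, dialect):
    """Return a CREATE TABLE statement using the datatype names of one dialect."""
    types = TYPE_MAP[dialect]
    lines = []
    for name, logical_type, size in columns:
        # Templates without a {size} placeholder are returned unchanged by format().
        lines.append(f"    {name} {types[logical_type].format(size=size)}")
    lines.append(f"    PRIMARY KEY ({primary_key})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n)"


sqlserver_ddl = render_create_table("student", STUDENT_COLUMNS, "student_id", "sqlserver")
oracle_ddl = render_create_table("student", STUDENT_COLUMNS, "student_id", "oracle")
```

The logical design is the same in both cases; only the datatype names in the generated statements
differ, which is the kind of variation the physical model has to record.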

Advantages of Physical Model

You can understand your product better.
You can improve it.
You can have a proof of concept.
You can show your idea to other people.
Sometimes you can use it to predict the behavior of the real product.

Disadvantages of Physical Model

A prototype may be expensive.
It can be inaccurate.

Conceptual Modeling

A conceptual data model identifies the highest-level relationships between the different entities.

Features of a conceptual data model include:

Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.

The figure below is an example of a conceptual data model.

[Figure: conceptual model]

From the figure above, we can see that the only information shown via the conceptual data model is the
entities that describe the data and the relationships between those entities. No other information is
shown through the conceptual data model.
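
A conceptual model of this kind can be sketched as nothing more than named entities and the
relationships between them. The Entity and Relationship classes, the STUDENT, CLASS and SUBJECT
entities and the relationship labels are hypothetical choices for illustration; they are not taken
from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str  # deliberately no attributes and no primary key


@dataclass(frozen=True)
class Relationship:
    label: str
    source: Entity
    target: Entity


student = Entity("STUDENT")
class_ = Entity("CLASS")
subject = Entity("SUBJECT")

# The whole model: entities and the relationships among them, nothing else.
conceptual_model = [
    Relationship("belongs to", student, class_),
    Relationship("is taught in", subject, class_),
]


def describe(model):
    """Render each relationship as a plain sentence."""
    return [f"{r.source.name} {r.label} {r.target.name}" for r in model]


summary = describe(conceptual_model)
```

Columns, datatypes and keys are deliberately absent here; they would be introduced later, in the
more concrete models that build on the conceptual one.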

Advantages of Conceptual Modeling

Since conceptual models are merely representations of abstract concepts and their respective
relationships, the potential advantages of implementing a conceptual model are many, but largely
depend on your own ability to devise a strong model in the first place. Generally speaking, the
primary advantages of a conceptual model include:

Establishes Entities: By establishing and defining all the various entities and concepts that are
likely to come up throughout the course of a software development life cycle, a conceptual model
can help ensure that there are fewer surprises down the road, where entities or relationships
might otherwise have been neglected or forgotten.

Defines Project Scope: A solid conceptual model can be used as a way to define project
scope, which assists with time management and scheduling.

Base Model for Other Models: For most projects, additional, less abstract models will need to
be generated beyond the rough concepts defined in the conceptual model. Conceptual models
serve as a great jumping-off point from which more concrete models can be created, such as
logical data models and the like.

High-Level Understanding: Conceptual models serve as a great tool by providing a high-level
understanding of a system throughout the software development life cycle. This can be
particularly beneficial for managers and executives, who may not be dealing directly with coding
or implementation, but require a solid understanding of the system and the relationships therein.

Disadvantages of Conceptual Modeling

Since a conceptual model is so abstract, and thus, is only as useful as you make it, there can be a few
disadvantages or caveats to watch out for when implementing your own conceptual model:

Creation Requires Deep Understanding: While conceptual models can (and should) be
adaptive, proper creation and maintenance of a conceptual model requires a fundamental and
robust understanding of the project, along with all associated entities and relationships.

Potential Time Sink: Improper modeling of entities or relationships within a conceptual model
may lead to massive time waste and potential sunk costs, where development and planning have
strayed far from what was actually necessary in the first place.

Possible System Clashes: Since conceptual modeling is used to represent such abstract entities
and their relationships, it is possible to create clashes between various components. In this
case, a clash simply indicates that one component may conflict with another component somewhere
down the line. This may be seen when design or coding clashes with deployment, because the
scaling assumptions made during design and coding prove wrong once actual deployment occurs.

Implementation Challenge Scales with Size: While conceptual models are not inherently
ill-suited for large applications, it can be challenging to develop and maintain a proper
conceptual model for particularly complex projects, as the number of potential issues, or clashes,
tends to grow rapidly as the system size increases.