Introduction
Definition of database
Consider a savings bank that keeps information about all customers and savings
accounts in permanent system files at the bank.
The bank will need a number of application programs to manipulate these files.
In such a typical file-processing system, more and more files and application
programs are added to the system over time. Such a scheme has a number of major
disadvantages:
i. Data redundancy and inconsistency - Since the files and application programs are
created by different programmers over a long period of time, the files are likely to have
different formats and the programs may be written in several programming languages.
Moreover, the same piece of information may be duplicated in several files. This
redundancy leads to higher storage and access costs. It may also lead to inconsistency, i.e.
the various copies of the same data may no longer agree.
ii. Difficulty in accessing - Suppose that one of the bank officers needs to find out the
names of all customers who live within the city's 78-phone code. The officer would ask
the data processing department to generate such a list. Such a request may not have been
anticipated when the system was originally designed, and the available options, such as
writing the necessary application program, do not allow the data to be
accessed conveniently and efficiently.
iii. Data isolation - Since data is scattered in various files and files may be in different
formats, it may be difficult to write new application programs to retrieve the appropriate
data.
iv. Concurrent access anomalies - Interaction of concurrent updates may result in
inconsistent data, e.g. if two customers withdraw funds, say 50/= and 100/=, from an account
at about the same time, the result of the concurrent execution may leave the account in an
incorrect state.
v. Security problems - Not every user of the database system should be able to access all
the data. Since application programs are added to the system in an ad-hoc manner, it is
difficult to enforce security constraints.
vi. Integrity - The data values stored in the database must satisfy certain types of consistency
constraints, e.g. the balance of a bank account may never fall below a prescribed value such as
5,000/=. These constraints are enforced in a system by adding appropriate code to the
various application programs. However, when new constraints are added, the other programs
must be changed to enforce them.
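The concurrent access anomaly in item iv can be made concrete with a small simulation. This is only a sketch: the opening balance of 1,000/= and the function names are assumptions, and the interleaving is written out by hand rather than produced by real concurrent programs.

```python
# Illustrative only: a hand-written interleaving of two withdrawals
# against the same account record. The opening balance is an assumption.

def interleaved_withdrawals(balance, first, second):
    """Both programs read the balance before either writes it back."""
    read_by_a = balance            # program A reads the stored balance
    read_by_b = balance            # program B reads the same, still-old balance
    balance = read_by_a - first    # A writes its result
    balance = read_by_b - second   # B overwrites A's update (a lost update)
    return balance


def serial_withdrawals(balance, first, second):
    """Each program finishes before the next one starts."""
    balance = balance - first
    balance = balance - second
    return balance


incorrect = interleaved_withdrawals(1000, 50, 100)
correct = serial_withdrawals(1000, 50, 100)
print(incorrect, correct)  # 900 850
```

The interleaved run leaves 900/= in the account although 150/= was withdrawn in total, which is the kind of incorrect state described above.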
Unlike the file system with many separate and unrelated files, the database consists of logically
related data stored in a single data repository. The problems inherent in file systems make using
a database system very desirable, and the database therefore represents a change in the way
end-user data are stored, accessed and managed.
Advantages of a database
1. Centralized Control - Via the DBA it is possible to enforce centralized management and
control of data. This means that necessary modifications can be made without affecting other
applications, which meets the data independence requirement of a DBMS.
2. Reduction of redundancies - Unnecessary duplication of data is avoided, which reduces the
total amount of data required and consequently the storage space needed. It also
eliminates the extra processing necessary to trace the required data in a large mass of data, and
it eliminates inconsistencies. Any redundancies that do exist in the DBMS are controlled, and
the system ensures that the multiple copies are consistent.
3. Shared data - In a DBMS, data under its control can be shared by a number of application
programs and users, e.g. backups.
4. Integrity - Centralized control can also ensure that adequate checks are incorporated into the
DBMS to provide data integrity. Data integrity means that the data contained in the database is
both accurate and consistent, e.g. an employee's age must fall within a prescribed range.
5. Security - Only authorized people may access confidential data. The DBA ensures that
proper access procedures are followed, including proper authentication schemes for access to
the DBMS and additional checks before permitting access to sensitive data. Different levels
of security can be implemented for various types of data or operations.
6. Conflict Resolution - The DBA is in a position to resolve the conflicting requirements of
various users and applications. The DBA does this by choosing the best file structure and access
method to get optimum response times, for example by classifying applications into
critical and less critical applications.
7. Data Independence - This involves both logical and physical independence. Logical data
independence means that the conceptual schema can be changed without affecting the
existing external schemas. Physical data independence means that the physical storage
structures or devices used for storing the data can be changed without necessitating a change
in the conceptual view or any of the external views.
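The integrity advantage can be illustrated with a rule that is declared once in the database instead of being coded into every application program. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS; the table name account, the helper withdraw and the sample amounts are assumptions, while the 5,000/= minimum balance is the figure used earlier in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The minimum-balance rule is stated once, in the table definition.
conn.execute(
    "CREATE TABLE account ("
    " account_no INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 5000))"
)
conn.execute("INSERT INTO account VALUES (1, 8000)")


def withdraw(conn, account_no, amount):
    """Return True if the DBMS accepted the withdrawal, False otherwise."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_no = ?",
                (amount, account_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = withdraw(conn, 1, 2000)    # would leave 6000, allowed
second_ok = withdraw(conn, 1, 2000)   # would leave 4000, rejected by the CHECK
final_balance = conn.execute("SELECT balance FROM account").fetchone()[0]
print(first_ok, second_ok, final_balance)
```

Any program that updates the table is subject to the same check, so adding a new application does not require the rule to be coded again.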
A disadvantage of centralization is the increased potential severity of security breaches and of
disruption to the operation of the organisation because of downtime and failures.
Hierarchical Model
A hierarchical database model is a data model in which the data is organized into a tree-like
structure. The data is stored as records which are connected to one another through links. A
record is a collection of fields, with each field containing only one value. The entity type of a
record defines which fields the record contains.
In order to retrieve data from a hierarchical database the whole tree needs to be traversed starting
from the root node. This model is recognized as the first database model created by IBM in the
1960s.
Figure 1 shows a hierarchical structure that might be used for a human resources database. The
root segment is Employee, which contains basic employee information such as name, address,
and identification number. Immediately below it are three child segments: Compensation
(containing salary and promotion data), Job Assignments (containing data about job positions
and departments), and Benefits (containing data about beneficiaries and benefit options). The
Compensation segment has two children below it: Performance Ratings (containing data about
employees’ job performance evaluations) and Salary History (containing historical data about
employees’ past salaries). Below the Benefits segment are child segments for Pension, Life
Insurance, and Health, containing data about these benefit plans.
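The root-first access described above can be sketched in a few lines. The segment names are taken from the Figure 1 description; the nested-dictionary representation and the function name path_to are illustrative assumptions and do not reflect how any particular hierarchical DBMS stores its segments.

```python
# Segment names follow the Figure 1 description; the nested-dict layout
# and the function name are assumptions made for illustration.
hierarchy = {
    "Employee": {
        "Compensation": {"Performance Ratings": {}, "Salary History": {}},
        "Job Assignments": {},
        "Benefits": {"Pension": {}, "Life Insurance": {}, "Health": {}},
    }
}


def path_to(segment, tree, path=()):
    """Depth-first search from the root; returns the access path or None."""
    for name, children in tree.items():
        current = path + (name,)
        if name == segment:
            return current
        found = path_to(segment, children, current)
        if found is not None:
            return found
    return None


health_path = path_to("Health", hierarchy)
print(" > ".join(health_path))  # Employee > Benefits > Health
```

Every lookup starts at Employee and follows parent-to-child links, which is why access paths in this model have to be known in advance.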
Hierarchical and network DBMS are considered outdated and are no longer used for building
new database applications. They are much less flexible than relational DBMS and do not support
ad hoc, English language–like inquiries for information. All paths for accessing data must be
specified in advance and cannot be changed without a major programming effort.
Network Model
A network database model is a database model that allows multiple records to be linked to the
same owner file. The model can be seen as an upside-down tree where the branches are the
member information linked to the owner, which is at the bottom of the tree. The multiple linkages
that this structure allows make the network database model very flexible. In addition, the
relationships in the network database model are many-to-many,
because one owner file can be linked to many member files and vice versa.
Evaluate the database application programs
Provide the required information flow
5. Operation
Once the database has passed the evaluation stage, it is considered to be operational. At this point
the database, its management, its users and its application programs constitute a complete
information system (IS). The beginning of the operational phase starts the process of system evaluation.
Preventive Maintenance
Corrective maintenance
Adaptive maintenance
Assignment and maintenance of access permissions for new and old users
Generation of database access statistics to improve the efficiency and usefulness of
system audits and to monitor system performance
Periodic security audits based on the system-generated statistics
Periodic (monthly, quarterly or yearly) system-usage summaries for internal billing or
budgeting purposes
(i) Data Storage Management: It provides a mechanism for management of permanent storage
of the data. The internal schema defines how the data should be stored by the storage
management mechanism and the storage manager interfaces with the operating system to access
the physical storage.
(ii) Data Manipulation Management: A DBMS furnishes users with the ability to retrieve,
update and delete existing data in the database.
(iii) Data Definition Services: The DBMS accepts the data definitions such as external schema,
the conceptual schema, the internal schema, and all the associated mappings in source form.
(iv) Data Dictionary/System Catalog Management: The DBMS provides a data dictionary or
system catalog function in which descriptions of data items are stored and which is accessible to
users.
(v) Database Communication Interfaces: The end-user's requests for database access are
transmitted to DBMS in the form of communication messages.
(vi) Authorization / Security Management: The DBMS protects the database against
unauthorized access, either intentional or accidental. It furnishes mechanism to ensure that only
authorized users can access the database.
(vii) Backup and Recovery Management: The DBMS provides mechanisms for backing up
data periodically and recovering from different types of failures. This prevents the loss of data.
(viii) Concurrency Control Service: Since DBMSs support sharing of data among multiple
users, they must provide a mechanism for managing concurrent access to the database. DBMSs
ensure that the database is kept in a consistent state and that the integrity of the data is preserved.
(ix) Database Access and Application Programming Interfaces: All DBMSs provide interfaces
to enable applications to use DBMS services. They provide data access via Structured Query
Language (SQL). The DBMS query language contains two components: (a) a Data Definition
Language (DDL) and (b) a Data Manipulation Language (DML).
ANSI-SPARC Architecture
The ANSI-SPARC Architecture, where ANSI-SPARC stands for American National Standards
Institute, Standards Planning And Requirements Committee, is an abstract design standard for
a Database Management System (DBMS), first proposed in 1975.
The ANSI-SPARC model of a database identifies three distinct levels at which data items can be
described.
These levels form a three-level architecture comprising:
an external level,
a conceptual level, and
an internal level.
The objective of the three-level architecture is to separate the users’ view(s) of the database from
the way that it is physically represented. This is desirable for the following reasons:
1. It allows independent customised user views. Each user should be able to access the same
data, but have a different customised view of the data. These should be independent:
changes to one view should not affect others.
2. It hides the physical storage details from users. Users should not have to deal with
physical database storage details. They should be allowed to work with the data itself,
without concern for how it is physically stored.
3. The database administrator should be able to change the database storage structures
without affecting the users’ views. From time to time rationalisations or other changes to
the structure of an organisation’s data will be required.
4. The internal structure of the database should be unaffected by changes to the physical
aspects of the storage. For example, a changeover to a new disk.
5. The database administrator should be able to change the conceptual or global structure of
the database without affecting the users. This should be possible while still maintaining
the desired individual users' views.
The Conceptual Level
The conceptual level describes what data is stored in the database and the relationships among
the data. It is a complete view of the data requirements of the organisation that is independent of
any storage considerations.
The conceptual level represents:
All entities, their attributes, and their relationships.
The constraints on the data.
Security and integrity information.
The conceptual level supports each external view, in that any data available to a user must be
contained in, or derivable from, the conceptual level. The description of the conceptual level
must not contain any storage dependent details.
Database Schema
The overall description of a database is called the database schema. There are three different
types of schema corresponding to the three levels in the ANSI-SPARC architecture.
The external schemas describe the different external views of the data. There may be many
external schemas for a given database.
The conceptual schema describes all the data items and relationships between them, together
with integrity constraints (later). There is only one conceptual schema per database. At the
lowest level, the internal schema contains definitions of the stored records, the methods of
representation, the data fields, and indexes. There is only one internal schema per database.
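The separation between the three levels can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the customer table, the customer_contact view, the index name and the sample rows are all made up for illustration. The table plays the role of the conceptual schema, the view is one external schema, and the index is a change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one table describing all customer data (names assumed).
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL, city TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Amina", "Nairobi", 9000), (2, "Brian", "Mombasa", 12000)],
)

# External level: a view that hides the balance from one group of users.
conn.execute("CREATE VIEW customer_contact AS SELECT name, city FROM customer")

query = "SELECT name, city FROM customer_contact ORDER BY name"
before = conn.execute(query).fetchall()

# Internal level: adding an index changes how the data is stored and searched,
# but not what the view returns.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
after = conn.execute(query).fetchall()

print(before == after)  # True
```

Users of the view see the same rows before and after the index is created, which is the physical data independence the architecture aims for.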
Phases of Database design: Conceptual, Logical and Physical design.
Database design is the process of producing a detailed data model of a database to meet end
users' requirements.
It is a process of constructing a data model for each view of the real world problem which is
ER Modelling:
A pictorial representation of the real-world problem in terms of entities (which have
attributes) and the relationships between the entities is referred to as an ER diagram.
It is a process of constructing a model of information, which can then be mapped into storage
objects supported by the Database Management System.
Table Generation From ER Model
The cardinality of relationships among the entities should be considered while deriving the tables.
Normalization of Tables
In most cases in the enterprise world, normalization up to Third Normal Form suffices.
The physical design of the database specifies the physical configuration of the database on the
storage media.
This step involves describing the base relations, file organisations, and index design used to
achieve efficient access to the data, and any associated integrity constraints and security
measures.
ER Symbols
Weak Entity
Relationship
Identifying Relationship
Attribute
Key Attribute
Multivalued Attribute
Composite Attribute
Derived Attribute
Total participation of E2 in R
Relationship types
Example 1
Example 2
A software company keeps details of the computer systems that it develops. Each system is given
a unique number, a description and a scheduled completion date. The development of each
system is divided into a number of tasks, each of which is allocated a task number (which is only
unique within a system), a description and a budget. The company employs a number of
programmers to work on tasks. Each programmer has an employee number and a name. A
programmer is assigned to a number of tasks and some tasks have more than one programmer
assigned to them. When a programmer is allocated to a task, they are given a number of days to
complete that task.
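One possible way of turning this description into tables is sketched below; it is not the only valid design. The table and column names are assumptions chosen to mirror the wording of the example, and Python's built-in sqlite3 module is used only so that the definitions can be run and checked.

```python
import sqlite3

# Table and column names are illustrative assumptions based on Example 2.
schema = """
CREATE TABLE computer_system (
    system_no       INTEGER PRIMARY KEY,
    description     TEXT,
    completion_date TEXT
);
CREATE TABLE task (
    system_no   INTEGER NOT NULL REFERENCES computer_system (system_no),
    task_no     INTEGER NOT NULL,          -- unique only within one system
    description TEXT,
    budget      NUMERIC,
    PRIMARY KEY (system_no, task_no)
);
CREATE TABLE programmer (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT
);
CREATE TABLE assignment (                  -- resolves the many-to-many link
    system_no      INTEGER NOT NULL,
    task_no        INTEGER NOT NULL,
    employee_no    INTEGER NOT NULL REFERENCES programmer (employee_no),
    days_allocated INTEGER,
    PRIMARY KEY (system_no, task_no, employee_no),
    FOREIGN KEY (system_no, task_no) REFERENCES task (system_no, task_no)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Because a task number is unique only within a system, task is identified by the pair (system_no, task_no), and the number of days belongs to the assignment of a programmer to a task rather than to either entity alone.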
Example 3
A marketing company has several branches located throughout the Country. Each branch has
several marketing employees, one of whom is employed as the branch manager. Each branch is
responsible for a group of contracted marketing projects, and any number of the employees,
possibly in different branches, may work on a contracted marketing project. It is also likely that
an employee could be working on many contracted marketing projects at a time, and indeed
could work on the same contracted marketing project at different points during the project’s
lifetime (which could span several months). A contracted marketing project will involve the
development of one or more marketing events, which currently relate to one of four media
alternatives - TV, radio, newspaper or the Internet. It is likely that the number of media
alternatives will increase over time, as new media channels emerge, e.g. interactive TV, wifi, etc.
Refining the Entity-Relationship Diagram (Enhanced Entity Relationship Diagram)
This section discusses three basic rules for modeling relationships
Primary and Foreign Keys
Primary and foreign keys are the most basic components on which relational theory is based.
Primary keys enforce entity integrity by uniquely identifying entity instances. Foreign keys
enforce referential integrity by completing an association between two entities. The next step in
building the basic data model is to:
1. Identify and define the primary key attributes for each entity
2. Validate primary keys and relationships
3. Migrate the primary keys to establish foreign keys
Define Primary Key Attributes
The primary key is an attribute or a set of attributes that uniquely identify a specific instance of
an entity. Every entity in the data model must have a primary key whose values uniquely identify
instances of the entity.
To qualify as a primary key for an entity, an attribute must have the following properties:
• It must have a non-null value for each instance of the entity
• The value must be unique for each instance of an entity
• The values must not change or become null during the life of each entity instance
Name is the least desirable candidate. While it might work for a small department where it would
be unlikely that two people would have exactly the same name, it would not work for a large
organization that had hundreds or thousands of employees. Moreover, there is the possibility that
an employee's name could change because of marriage.
Employee ID would be a good candidate as long as each employee was assigned a unique
identifier at the time of hire. A Social Security number would work best since every employee is
required to have one before being hired.
Composite Keys
Sometimes more than one attribute is required to uniquely identify an entity. A primary key that
is made up of more than one attribute is known as a composite key. Figure 3.4 shows an example
of a composite key. Each instance of the entity Work can be uniquely identified only by a
composite key composed of Employee ID and Project ID.
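A composite key for the Work entity can be tried out as follows. This is a sketch using Python's built-in sqlite3 module; the hours column and the sample values are assumptions added so that there is something to store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work ("
    " employee_id INTEGER NOT NULL,"
    " project_id INTEGER NOT NULL,"
    " hours INTEGER,"                      # assumed non-key attribute
    " PRIMARY KEY (employee_id, project_id))"
)
conn.execute("INSERT INTO work VALUES (1, 10, 5)")
conn.execute("INSERT INTO work VALUES (1, 11, 3)")   # same employee, other project
conn.execute("INSERT INTO work VALUES (2, 10, 8)")   # same project, other employee

try:
    conn.execute("INSERT INTO work VALUES (1, 10, 2)")  # repeats an existing pair
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM work").fetchone()[0]
print(duplicate_accepted, row_count)
```

Neither column is unique on its own; only the combination of the two has to be unique.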
Foreign Keys
A foreign key is an attribute that completes a relationship by identifying the parent entity.
Foreign keys provide a method for maintaining integrity in the data (called referential integrity)
and for navigating between different instances of an entity. Every relationship in the model must
be supported by a foreign key.
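The sketch below shows a foreign key doing both jobs: rejecting a row that points at a non-existent parent, and navigating from a child row to its parent. It uses Python's built-in sqlite3 module; the department and employee tables and the sample rows are assumptions. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dept_id INTEGER NOT NULL REFERENCES department (dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Marketing')")
conn.execute("INSERT INTO employee VALUES (100, 'A. Example', 1)")

try:
    # Department 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO employee VALUES (101, 'B. Example', 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Navigation: follow the foreign key from the child row to its parent.
parent = conn.execute(
    "SELECT d.name FROM employee e JOIN department d ON d.dept_id = e.dept_id"
    " WHERE e.employee_id = 100"
).fetchone()[0]
print(orphan_accepted, parent)
```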
Generalization Hierarchies
Another method of characterizing entities is by both similarities and differences. For example,
suppose an organization categorizes the work it does into internal and external projects. Internal
projects are done on behalf of some unit within the organization.
External projects are done for entities outside of the organization. We can recognize that both
types of projects are similar in that each involves work done by employees of the organization
within a given schedule. Yet we also recognize that there are differences between them. External
projects have unique attributes, such as a customer identifier and the fee charged to the customer.
This process of categorizing entities by their similarities and differences is known as
generalization.
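The project example can be mirrored in code as a supertype with two subtypes. This is only an analogy: class inheritance is used here to stand in for an ER generalization hierarchy, and the attribute names (project_id, schedule, unit, customer_id, fee) are assumptions based on the description above.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Supertype: attributes shared by every project."""
    project_id: int
    schedule: str


@dataclass
class InternalProject(Project):
    """Subtype: work done for a unit inside the organization."""
    unit: str


@dataclass
class ExternalProject(Project):
    """Subtype: adds the attributes that only external projects have."""
    customer_id: int
    fee: float


internal = InternalProject(1, "Q1", unit="Finance")
external = ExternalProject(2, "Q3", customer_id=7, fee=1500.0)

# Both subtypes can be treated as projects; only one of them carries a fee.
shared = [p.project_id for p in (internal, external)]
print(shared, hasattr(internal, "fee"), external.fee)
```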
Types of Hierarchies
A generalization hierarchy can either be overlapping or disjoint. In an overlapping hierarchy an
entity instance can be part of multiple subtypes. For example, to represent people at a university
you have identified the supertype entity PERSON which has three subtypes, FACULTY,
STAFF, and STUDENT. It is quite possible for an individual to be in more than one subtype, a
staff member who is also registered as a student, for example.
In a disjoint hierarchy, an entity instance can be in only one subtype. For example, the entity
EMPLOYEE may have two subtypes, CLASSIFIED and WAGES. An employee may be one
type or the other but not both. Figure 1 shows A) overlapping and B) disjoint generalization
hierarchies.
Disjoint partial (optional)
Structured Query Language
SQL stands for Structured Query Language. It is used for storing, manipulating and retrieving
data in relational databases. SQL statements are used both to retrieve data from a database and
to add and modify the data it holds.
SQL is a powerful and versatile database language. It is declarative and has an English-like
syntax, so it is relatively easy to learn. In this SQL tutorial we use command-line
examples, which take very little time to execute and return results. SQL works well with
languages such as PHP, Python, Java and ASP for building dynamic web applications.
Before starting with SQL, there are several points about relational databases that are
important to keep in mind.
SQL statements are divided into five different categories: Data definition language (DDL),
Data manipulation language (DML), Data Control Language (DCL), Transaction Control
Statement (TCS), Session Control Statements (SCS).
Data definition statements are used to define the database structure or tables.
Statement Description
Data manipulation statements are used for managing data within table objects.
Statement            Description
INSERT               Insert data into a table.
CALL, EXPLAIN PLAN   Supported in PL/SQL only when executed dynamically. CALL runs a
                     PL/SQL program; EXPLAIN PLAN shows the access path to the data.
Data control statements are used to give users privileges to access a limited set of data.
Statement Description
Number Datatypes
Data Type Description
ANSI, DB2 Datatypes                Oracle Data types   Maximum Precision
INTEGER                            NUMBER(p,s)         38 digits
DECIMAL[(precision [, scale ])]    NUMBER(p,s)         38 digits
NUMERIC[(precision [, scale ])]    NUMBER(p,s)         38 digits
BINARY_DOUBLE    The BINARY_DOUBLE datatype uses double binary precision (64-bit).
                 This data type requires 9 bytes including the length byte.
Character Datatypes
Character datatypes are used to store alphabetic or alphanumeric data. The following are the
character data types in Oracle SQL:
SQL CREATE DATABASE Syntax
Example :
SQL> CREATE DATABASE user_data;
Example :
SQL> DROP DATABASE user_data;
);
Example :
SQL> CREATE TABLE users_info(
no NUMBER(3) NOT NULL,
name VARCHAR(30),
address VARCHAR(70),
contact_no VARCHAR(12),
PRIMARY KEY (no)
);
Example :
SQL> INSERT INTO users_info (no,name,address)
VALUES (1, 'Opal Kole', '63 street Ct.');
Example :
SQL> INSERT ALL
INTO users_info (no, name, address, contact_no) VALUES (4, 'Paul Singh', '1343 Prospect St', '000-444-7141')
INTO users_info (no, name, address, contact_no) VALUES (5, 'Ken Myer', '137 Clay Road', '000-444-7084')
INTO users_info (no, name, address, contact_no) VALUES (6, 'Jack Evans', '1365 Grove Way', '000-444-7957')
INTO users_info (no, name, address, contact_no) VALUES (7, 'Reed Koch', '1274 West Street', '000-444-4784')
SELECT * FROM dual;
UPDATE table_name
SET column_name1 = value1, column_name2 = value2 , ...
[ WHERE condition ]
[ LIMIT number ];
Example :
SQL> UPDATE users_info
SET name = 'Beccaa Moss', address = '2500 green city.'
WHERE no = 3;
Example :
SQL> DELETE FROM users_info
WHERE no = 3;
Example :
Add new column to a 'users_info' table
SQL> ALTER TABLE users_info ADD postalcode VARCHAR2(8);