
Chapter 1: Introduction

- Purpose of Database Systems
- View of Data
- Database Languages
- Relational Databases
- Database Design
- Object-based and Semistructured Databases
- Data Storage and Querying
- Transaction Management
- Database Architecture
- Database Users and Administrators
- Overall Structure
- History of Database Systems

Introduction: Any organization, be it a bank, manufacturing company, hospital, university, conglomerate, or government department, requires a large amount of data in one form or another. All such organizations need to collect data, process it, and store it for future use. These organizations require data for a number of purposes, for example:
- Preparing sales reports
- Forecasts
- Accounts payable and receivable
- Medical histories

Database Management System (DBMS)


Data: Data are raw facts that can be recorded and that have some implicit meaning. Ex. names, telephone numbers, and addresses of people you know.

Database: A database is an organized collection of related data with some implicit meaning. The data collected in a database is stored in a standard format designed to be shared by one or multiple users. Ex. a student database.

A database should have the following implicit properties:
- A database represents some aspect of the real world (also called the mini world). Changes to the mini world are reflected in the database.
- A database is a logically interrelated collection of data with some inherent meaning. A random collection of data thus cannot be referred to as a database.
- A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.

Database System: It is the computerized management of interrelated data together with a set of programs to access that data. In other words, it is a computerized record-keeping system that provides an efficient and convenient environment to store and retrieve data. This computerized system is thus responsible for maintaining the information and making that information available on demand. A complete database system involves four major components: Data, Hardware, Software (DBMS), and Users. We will discuss all of these components except hardware.

Figure: Simplified Database System. Users / Programs interact with the DATABASE SYSTEM, which consists of Application Programs / Queries and the DBMS SOFTWARE (Software to Process Queries / Programs, Software to Access Stored Data), on top of the Stored Database Definition (MetaData) and the Stored Data.

Database Management System (DBMS): A DBMS is a collection of programs that enables users to create and maintain a database. The DBMS is hence a general-purpose software system that facilitates the process of defining, constructing, and manipulating databases for various applications. Software packages such as SQL Server, Oracle, MS-Access, and FoxPro are examples of commercially available DBMSs. The three functions of a DBMS are:
- Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database.
- Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS.
- Manipulating a database includes such functions as querying the database to retrieve specific data, updating the database to reflect changes in the real world, and generating reports from the data.

An Example of a Database

STUDENT
Name      Roll No   Class    Major
Amit Kr   17        B.Tech   Comp. Sc
Reha      8         MBA      Marketing

COURSE
Course Name      Course No.   Hours   Department
Intro to IT      101          4       Comp Sc
Data Structure   102          4       Comp Sc
Mathematics      103          3       Math
DBMS             104          3       Comp Sc

SECTION
Section Identifier   Course No.   Semester   Year   Instructor
A                    103          I          91     Mrs. Sarika
B                    101          I          91     Mr. P Kumar
C                    102          II         92     Mr. A. Sinha
D                    103          I          92     Mrs. P Agg.
E                    101          I          92     Mr. P Kumar
F                    104          I          92     Mr. S Kumar

GRADE_REPORT
Student No.   Section Identifier   Grade
17            D
17            E
8             A
8             B
8             C
8             F

PREREQUISITE
Course No.   Prerequisite No.
104          102
104          103
102          101
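The three DBMS activities described earlier (defining, constructing, manipulating) can be sketched against the STUDENT table above. This is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for any DBMS, the column names are adapted from the table headers, and the new major "Finance" is an invented value.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for any DBMS here.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        roll_no    INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        class_name TEXT NOT NULL,
        major      TEXT NOT NULL
    )
    """
)

# Constructing: store the data itself (rows from the STUDENT table).
conn.executemany(
    "INSERT INTO student (roll_no, name, class_name, major) VALUES (?, ?, ?, ?)",
    [(17, "Amit Kr", "B.Tech", "Comp. Sc"), (8, "Reha", "MBA", "Marketing")],
)

# Manipulating: query the data, then update it to reflect a change
# in the mini world ("Finance" is a hypothetical new major).
mba_students = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM student WHERE class_name = ? ORDER BY name", ("MBA",)
    )
]
conn.execute("UPDATE student SET major = ? WHERE roll_no = ?", ("Finance", 8))
new_major = conn.execute(
    "SELECT major FROM student WHERE roll_no = ?", (8,)
).fetchone()[0]
```

The same three steps would apply to the COURSE, SECTION, GRADE_REPORT, and PREREQUISITE tables.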

Database Applications:
- Banking: all transactions
- Airlines: reservations, schedules
- Universities: registration, grades
- Sales: customers, products, purchases
- Online retailers: order tracking, customized recommendations
- Manufacturing: production, inventory, orders, supply chain
- Human resources: employee records, salaries, tax deductions

Purpose of Database Systems


In the early days, database applications were built directly on top of file systems. The data needed for each user application was stored in independent data files. Ex. Consider a bank that keeps information about all its customers and their savings, current, and loan accounts. To allow bank employees to manipulate this information, the system needs a number of application programs, including:

1. A program to debit or credit an account
2. A program to add a new account
3. A program to find the balance of an account
4. A program to generate monthly statements, etc.

System programmers write these application programs to meet the needs of the bank. New application programs are added to the system as the need arises. For example, suppose that the bank decides to offer zero-balance savings accounts to some special customers. As a result, the bank creates new permanent files that contain information about all such customers, and it may have to write new application programs to deal with this situation. Thus, as time goes by, the system acquires more and more files and application programs and becomes bulky and unmanageable.

In database management, the files are stored in a common pool of records, or database, available to many different application programs.

The DBMS serves as a software interface between end users and databases.

Disadvantages of File-processing Systems:

- Data redundancy and inconsistency: Multiple file formats, duplication of information in different files. For example, a changed customer address may be reflected in savings-account records but not elsewhere in the system.
- Difficulty in accessing data: Need to write a new program to carry out each new task. Ex. A bank officer needs to find the names of all customers who live within a particular postal-code area. The bank officer has two choices: either obtain the list of all customers and extract the needed information manually, or ask a system programmer to write the necessary application program. Ex. Later, the officer needs to trim that list to include only those customers who have an account balance of $10,000 or more; once again a new program has to be written.
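For contrast, here is a minimal sketch of how the two requests above look when the data sits in a DBMS: each new request is a change to a query, not a new program. It assumes Python's sqlite3 module; the table layout, postal codes, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (name TEXT, postal_code TEXT, balance INTEGER)"
)
# Hypothetical sample rows, invented for this illustration.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        ("Anil", "110045", 8982),
        ("Ravi", "110045", 21476),
        ("Kapil", "121001", 11260),
    ],
)


def customers_in_area(postal_code, min_balance=0):
    """Names of customers in a postal-code area, optionally filtered by balance."""
    rows = conn.execute(
        "SELECT name FROM customer "
        "WHERE postal_code = ? AND balance >= ? ORDER BY name",
        (postal_code, min_balance),
    )
    return [name for (name,) in rows]


# First request: everyone in the area.
in_area = customers_in_area("110045")
# Second request: the same list, restricted to balances of 10,000 or more.
in_area_large_balance = customers_in_area("110045", min_balance=10000)
```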


- Data isolation: multiple files and formats.
- Integrity problems: Integrity constraints (e.g. account balance > 0) become buried in program code rather than being stated explicitly, so it is hard to add new constraints or change existing ones. For example, the balance of a bank account may never fall below a prescribed amount (say, INR 500). The problem is compounded when constraints involve several data items from different files.
- Atomicity of updates: Failures may leave the database in an inconsistent state with partial updates carried out. Example: a transfer of funds from one account to another should either complete or not happen at all.
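The integrity and atomicity points above can be sketched together. The example assumes Python's sqlite3 module; the minimum balance of 500 comes from the text, while the account numbers, balances, and the transfer function are illustrative. The constraint is declared once in the schema, and a transfer that would violate it is undone as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The integrity constraint is stated once, in the schema, not in program code.
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        INTEGER NOT NULL CHECK (balance >= 500)
    )
    """
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 8982), ("A-102", 1998)]
)
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Current balances as a dict of account_number -> balance."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))


first_ok = transfer("A-101", "A-102", 1000)      # allowed
second_ok = transfer("A-102", "A-101", 5000)     # would leave A-102 below 500
final_balances = balances()
```

In the rejected transfer the credit to A-101 has already been applied when the debit fails, and the rollback removes it again, so no partial update survives.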

- Concurrent access by multiple users: Concurrent access is needed for performance, but uncontrolled concurrent accesses can lead to inconsistencies. Example: two people reading a balance and updating it at the same time.

Consider bank account A, containing $500. If two customers withdraw funds (say $50 and $100, respectively) from account A at about the same time, the result of the concurrent executions may leave the account in an incorrect (or inconsistent) state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being withdrawn, and write the result back. If the two programs run concurrently, they may both read the value $500 and write back $450 and $400, respectively. Depending on which one writes the value last, the account may contain either $450 or $400, rather than the correct value of $350.
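The interleaving described above can be replayed deterministically. This sketch assumes Python's sqlite3 module and simulates the two withdrawals one step at a time on a single connection; it does not use real concurrent processes, and a real DBMS would rely on its concurrency control rather than on this hand-written ordering.

```python
import sqlite3


def fresh_account():
    """Create a database holding account A with a balance of 500."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('A', 500)")
    return conn


def read_balance(conn):
    return conn.execute(
        "SELECT balance FROM account WHERE name = 'A'"
    ).fetchone()[0]


# Uncontrolled interleaving: both withdrawals read before either writes.
conn = fresh_account()
seen_by_first = read_balance(conn)   # first program reads 500
seen_by_second = read_balance(conn)  # second program also reads 500
conn.execute(
    "UPDATE account SET balance = ? WHERE name = 'A'", (seen_by_first - 50,)
)
conn.execute(
    "UPDATE account SET balance = ? WHERE name = 'A'", (seen_by_second - 100,)
)
lost_update_result = read_balance(conn)  # the 50 withdrawal has been lost

# Controlled: each withdrawal is one read-modify-write statement,
# so the second withdrawal always sees the result of the first.
conn = fresh_account()
for amount in (50, 100):
    conn.execute(
        "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
    )
correct_result = read_balance(conn)
```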
- Security problems: It is hard to provide user access to some, but not all, of the data. For example, in a banking system, payroll personnel need to see only that part of the database that has information about the various bank employees.

Solution: Database systems offer common solutions to all of the above problems.

Objectives of Database Approach


- Availability: Data should be available for use by applications (both current and future) and queries.
- Shareability: Data items prepared by one application must be available for use by other applications.
- Data Independence: Programs can be written without worrying about future changes in the data.
- Data Integrity: There should be no redundancy or data duplication.
- Maintainability: The database must be maintainable.

Benefits of Database Approach


- Reduction in Redundancy
- Avoidance of Inconsistency
- Data Can Be Shared
- Backup & Recovery
- Enforcement of Standards
- Maintenance of Integrity
- Database Security
- Data Independence

Controlling Redundancy: Storing the same data multiple times leads to duplication of data and wasted storage space. The same data stored at different places can also end up in different formats; for example, a date of birth may be stored as Jan-19-1974 in one place and as January 19, 1974 in another.

Security (Restricting Unauthorized Access): When multiple users share a database, it is likely that some users will not be authorized to access all information in the database. E.g. financial data is often considered confidential, and hence only authorized persons are allowed to access such data. In addition, some users may be permitted only to retrieve data, whereas others are allowed both to retrieve and to update.

Providing Multi-User Interfaces: Because there are different types of users, with varying levels of technical knowledge about databases, a DBMS provides a variety of user interfaces. These include query languages for casual users, programming language interfaces for application programmers, forms and command codes for parametric users, and menu-driven interfaces and natural language interfaces for stand-alone users.

Representing Complex Relationships Among Data: A database may include different varieties of data that are interrelated in a number of ways. A DBMS has the capability to represent a variety of complex relationships among the data as well as to retrieve and update related data easily and efficiently.

Enforcing Integrity Constraints: A DBMS provides capabilities for defining and enforcing integrity constraints. The simplest type of integrity constraint involves specifying a data type for each data item. Ex. the student name must be a string of not more than 30 alphabetic characters.

Providing Backup and Recovery: A DBMS provides facilities for recovering from hardware or software failures. The backup and recovery subsystem of the DBMS is responsible for recovery. For example, if the computer system fails in the middle of a complex update program, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the program started executing.

Components of a Database Management System


Database Engine: It is responsible for storing, retrieving, and updating the data. It is the component that most affects performance and the ability to handle large problems.

Data Dictionary: The data dictionary holds the definitions of all the data tables. It describes the type of data that is being stored, which allows the DBMS to keep track of the data.

Query Processor: The query processor is a fundamental component of the DBMS. It enables developers and users to store and retrieve data.

Report Writer: A report writer enables you to set up a report on the screen and specify how items will be displayed or calculated. Most of these tasks are performed by dragging data onto the screen. The report writer can be integrated into the DBMS, or it can be a standalone application that the developer uses to generate code to create the needed report.

Form Generator: A form generator helps the developer create input forms. The main task of the form generator is to create forms that represent common user tasks, making it easy for users to enter data. The forms can include graphs and images.

Application Generator: An application is a collection of forms and reports designed for a specific user task. A good DBMS contains an application generator, which consists of tools that assist the developer in creating a complete application package.

Security and Other Utilities: The DBMS has the responsibility to identify the user and then provide or limit access to various parts of the database.

Services Provided by DBMS


Transaction Management: A transaction is a sequence of database operations that represents a logical unit of work; it accesses a database and transforms it from one state to another. A transaction can update a record, delete one, modify a set of records, etc. When the DBMS does a commit, the changes made by the transaction are made permanent.

Concurrency Control: Concurrency control is the database management activity of coordinating the actions of database manipulation processes that operate concurrently, especially those that access shared data and can potentially interfere with one another.

Recovery Management: The recovery management system in a database ensures that aborted or failed transactions have no adverse effects on the database or on other transactions. It makes sure that the database is returned to a consistent state after a transaction fails or aborts.

Security Management: Security refers to the protection of data against unauthorized access. The security mechanisms of a DBMS make sure that only authorized users are given access to the data in the database.

Language Interface: The DBMS provides support languages for the definition and manipulation of the data in the database. Data structures are created using DDL commands. Data manipulation is done using DML commands.

Storage Management: The DBMS provides a mechanism for managing the permanent storage of the data. The internal schema defines how the data should be stored by the storage management mechanism, and the storage manager interfaces with the operating system to access the physical storage.

Data Catalog Management: The data catalog, or data dictionary, is a database that contains descriptions of the data in the database (metadata). It contains information about data, relationships, constraints, and the entire schema that organizes these features into a unified database. The data catalog can be queried to get information about the structures of the database.
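As a small illustration of the language interface and of querying the data catalog, the sketch below assumes Python's sqlite3 module. The sqlite_master table and the PRAGMA table_info statement are SQLite-specific; other DBMSs expose their catalogs through different names. The account table is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the data structure.
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)"
)
# DML: manipulate the data.
conn.execute("INSERT INTO account VALUES ('A-101', 8982)")

# Catalog query: SQLite keeps its schema metadata in the sqlite_master table.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")
]
```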
View of Data / Data Abstraction:

Abstraction is a simplification mechanism used to hide the details of an object from users who are not interested in them. Ex. Car is an abstraction of a transportation vehicle that does not reveal details about model, year, color, and so on. Vehicle is itself an abstraction that includes the types car, truck, and bus.

Levels of Abstraction

Physical level: It describes how the data are actually stored. This level is concerned with the physical storage of the information. At this level, complex low-level data structures are described in detail.

Logical level: This level describes what data are stored in the database and what relationships exist among those data. It is used by database administrators, who must decide what information is to be kept in the database.

type customer = record
    customer_id : string;
    customer_name : string;
    customer_street : string;
    customer_city : string;
end;

View level: This is the highest level of abstraction, which describes only part of the entire database.
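A minimal sketch of the logical and view levels, assuming Python's sqlite3 module. The table mirrors the customer record type above; the view name customer_directory and the sample row are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full customer record, as in the record type above.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id     TEXT PRIMARY KEY,
        customer_name   TEXT,
        customer_street TEXT,
        customer_city   TEXT
    )
    """
)
# Hypothetical sample row, invented for this illustration.
conn.execute(
    "INSERT INTO customer VALUES ('192-83-7465', 'Anil', 'Palam', 'New Delhi')"
)

# View level: expose only part of the database to a group of users.
conn.execute(
    "CREATE VIEW customer_directory AS "
    "SELECT customer_name, customer_city FROM customer"
)

cursor = conn.execute("SELECT * FROM customer_directory")
view_columns = [description[0] for description in cursor.description]
view_rows = cursor.fetchall()
```

Users of customer_directory never see the customer_id or customer_street fields, and the physical level (how SQLite lays out the rows on disk) is hidden from both levels.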

Figure: View of Data. An architecture for a database system.

Instances and Schemas


Schema: The overall design of the database is called the database schema; it is the logical structure of the database.
- Example: The database consists of information about a set of customers and accounts and the relationship between them.
- Analogous to the type information of a variable in a program.
- Physical schema / internal schema: database design at the physical level. The internal schema defines not only the various types of records but also how stored fields are represented and stored in the database system. The internal schema uses the physical data model and describes the complete details of data storage and access paths for the database.
- Logical schema / conceptual schema: database design at the logical level. The conceptual schema includes the definitions of the various types of conceptual data records of the database along with the relationships between different types of records.
- External schema / subschema: At the highest level we have the external schema (subschema) of the database. Each external view is defined by an external schema, which consists of basic definitions of the various types of records found in that external view.

Instance: The actual content of the database at a particular point in time.
- Analogous to the value of a variable.

Data Independence: The ability to modify a schema definition at one level without affecting a schema definition at the next higher level is called data independence.

Physical Data Independence: It is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance.

Logical Data Independence: It is the ability to modify the logical or conceptual schema without causing application programs to be rewritten. Modifications at the logical level are necessary whenever the logical structure of the database is altered.
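Both kinds of independence can be sketched with one unchanged "application program". The example assumes Python's sqlite3 module; the index name, the added branch column, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 8982), ("A-102", 1998)]
)


def large_accounts():
    """The 'application program'; it is not modified anywhere below."""
    rows = conn.execute(
        "SELECT account_number FROM account WHERE balance >= 5000 "
        "ORDER BY account_number"
    )
    return [number for (number,) in rows]


before = large_accounts()

# Physical change: add an index, i.e. a new access path for performance.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")
after_physical_change = large_accounts()

# Logical change: add a column to the logical structure of the table.
conn.execute("ALTER TABLE account ADD COLUMN branch TEXT")
after_logical_change = large_accounts()
```

The program keeps working after the logical change because it names the columns it needs; a program that relied on the exact column list of the table would not enjoy the same independence.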

Data Models

Underlying the structure of a database is the data model: a collection of tools for describing
- Data
- Data relationships
- Data semantics
- Data constraints

There are three different kinds of data models:
- Object-based logical models
- Record-based logical models
- Physical models

Object-based logical models: Object-based logical models are used in describing data at the logical and view levels. There are many different models, such as:
- The entity-relationship model
- The object-oriented model
- The semantic data model
- The functional data model

Entity-Relationship Model: The entity-relationship (E-R) data model is based on a perception of the real world as a collection of basic objects, called entities, and of relationships among these objects. An entity is a "thing" or "object" in the real world that is distinguishable from other objects. For example, a person is an entity, and bank accounts can be considered entities. The set of all entities of the same type and the set of all relationships of the same type are termed an entity set and a relationship set, respectively. The overall logical structure of a database can be expressed graphically by an E-R diagram, which is built up from the following components:
- Rectangles: represent entity sets
- Ellipses: represent attributes
- Diamonds: represent relationships among entity sets
- Lines: link attributes to entity sets and entity sets to relationships

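Figure: Sample E-R diagram. The entity set Customer (attributes Customer_Name, Customer_PAN, Customer_DOB, Customer_Address) is connected through the relationship set Depositor to the entity set Account (attributes A/c No, Balance).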

The Object-Oriented Model: Like the E-R model, the object-oriented model is based on a collection of objects. An object is similar to a variable having some value. Objects also contain methods that operate on them. Ex. JODB, EyeDB, Durus, PostgreSQL.

Record-Based Logical Models: In these models, each record type defines a fixed number of fields (attributes), and each field is usually of a fixed length. There are three types of record-based data models: the Relational Model, the Network Model, and the Hierarchical Model.

Relational Model: This model uses a collection of tables to represent both data and the relationships among them.

Network Model: Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers. The records in the database are organized as collections of arbitrary graphs.

Figure: Sample Network Model. Customer records are connected by links to account records.

Customer records
Customer_Name   Customer_PAN   Customer_Address
Anil            ADYSZ 1971K    Palam, New Delhi
Ravi            ADYSZ 1978K    Palam Vihar, Gurgaon
Kapil           ADYSZ 1973K    Rampura, Faridabad
Neeraj          ADYSZ 1974K    Vikas-puri, New Delhi
Abhishek        ADYSZ 1975K    Mayur Ganj, Jaipur
Mukesh          ADYSZ 1977K    Shahadra, Delhi

Account records
A/c No   Balance
A 101    8982
A 102    1998
A 158    2572
A 197    1903
A 201    21476
A 207    4242
A 211    11260

Hierarchical Model: This model is similar to the network model in the sense that records and links represent data and the relationships among them. It differs from the network model in that the records are organized as collections of trees rather than arbitrary graphs.

Figure: Sample Hierarchical Model. Customer records (Anil, ADYSZ 1971K; Ravi, ADYSZ 1978K; Kapil, ADYSZ 1973K; Neeraj, ADYSZ 1974K; Abhishek, ADYSZ 1975K; Mukesh, ADYSZ 1977K) are the parents of account records A-101 (8982), A-102 (1998), A-158 (2572), A-197 (1903), A-201 (21476), A-207 (4242), and A-211 (11260). The record A-201 (21476) appears twice in the tree.
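The difference between the two record-based models can be sketched with plain Python data structures. The account numbers and balances are taken from the sample figures, but which customer holds which account is an assumption made only for this illustration, as are the function names.

```python
# Account records from the sample figures: account number -> balance.
accounts = {"A-101": 8982, "A-201": 21476, "A-102": 1998}

# ASSUMPTION: which customer holds which account is invented for illustration.
# Network model: records plus links (pointers). Two customers link to the
# same A-201 record, so the structure is a graph, not a tree.
network_links = {"Anil": ["A-101", "A-201"], "Ravi": ["A-201", "A-102"]}

# Hierarchical model: each customer is the root of a tree that owns its
# children, so the shared A-201 record is stored once under each parent.
hierarchy = {
    customer: [{"account": acc, "balance": accounts[acc]} for acc in linked]
    for customer, linked in network_links.items()
}


def update_network(account, balance):
    """One shared record, so a single update is enough."""
    accounts[account] = balance


def update_hierarchy(account, balance):
    """Every duplicated copy must be found and changed; returns the count."""
    touched = 0
    for children in hierarchy.values():
        for child in children:
            if child["account"] == account:
                child["balance"] = balance
                touched += 1
    return touched


update_network("A-201", 20000)
copies_changed = update_hierarchy("A-201", 20000)
```

The duplicated A-201 record is why the same account appears twice in the sample hierarchical figure.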

Data Manipulation Language (DML)


Language for accessing and manipulating the data organized by the appropriate data model
- DML is also known as a query language

Two classes of languages:
- Procedural: the user specifies what data is required and how to get those data
- Declarative (nonprocedural): the user specifies what data is required without specifying how to get those data

SQL is the most widely used query language
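The two classes of languages can be contrasted on one request: find the accounts with a balance above 2000. This is a sketch assuming Python's sqlite3 module; the threshold is arbitrary and the records are taken from the sample figures.

```python
import sqlite3

# (account number, balance) records taken from the sample figures.
records = [("A-101", 8982), ("A-102", 1998), ("A-158", 2572)]

# Procedural style: say what is wanted AND how to get it
# (scan every record, test it, collect it, sort the result).
procedural_result = []
for account_number, balance in records:
    if balance > 2000:
        procedural_result.append(account_number)
procedural_result.sort()

# Declarative style: say only what is wanted; the DBMS decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", records)
declarative_result = [
    number
    for (number,) in conn.execute(
        "SELECT account_number FROM account WHERE balance > 2000 "
        "ORDER BY account_number"
    )
]
```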


Data Definition Language (DDL)


Specification notation for defining the database schema

Example:

create table account (
    account_number char(10),
    balance integer)

- DDL compiler generates a set of tables stored in a data dictionary
- Data dictionary contains metadata (i.e., data about data)
  - Database schema
  - Data storage and definition language: specifies the storage structure and access methods used
  - Integrity constraints: domain constraints, referential integrity (references constraint in SQL), assertions
  - Authorization
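Domain constraints and referential integrity from the list above can be sketched as follows, assuming Python's sqlite3 module. The depositor table, the CHECK condition, and all sample values are illustrative; note that SQLite enforces references constraints only after PRAGMA foreign_keys is switched on, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable references checks
conn.executescript(
    """
    CREATE TABLE account (
        account_number CHAR(10) PRIMARY KEY,
        balance        INTEGER NOT NULL CHECK (balance >= 0)  -- domain constraint
    );
    CREATE TABLE depositor (
        customer_id    CHAR(11) NOT NULL,
        account_number CHAR(10) NOT NULL
            REFERENCES account (account_number)               -- referential integrity
    );
    -- Hypothetical sample rows, invented for this illustration.
    INSERT INTO account VALUES ('A-101', 8982);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101');
    """
)


def is_rejected(sql, params):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# Violates the domain constraint on balance.
negative_balance_rejected = is_rejected(
    "INSERT INTO account VALUES (?, ?)", ("A-999", -1)
)
# Refers to an account that does not exist.
dangling_reference_rejected = is_rejected(
    "INSERT INTO depositor VALUES (?, ?)", ("192-83-7465", "A-404")
)
valid_row_rejected = is_rejected(
    "INSERT INTO account VALUES (?, ?)", ("A-102", 1998)
)
```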

SQL

SQL: widely used non-procedural language

Example: Find the name of the customer with customer-id 192-83-7465

select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'

Example: Find the balances of all accounts held by the customer with customer-id 192-83-7465

select account.balance
from depositor, account
where depositor.customer_id = '192-83-7465' and
      depositor.account_number = account.account_number

Application programs generally access databases through one of:
- Language extensions to allow embedded SQL
- Application program interface (e.g., ODBC/JDBC), which allows SQL queries to be sent to a database
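The second access route, an application program interface, can be sketched with Python's sqlite3 module, which plays a role comparable to ODBC/JDBC here. The two queries are the examples from this slide with the customer-id passed as a parameter; the sample rows and function names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
    -- Hypothetical sample rows, invented for this illustration.
    INSERT INTO customer VALUES ('192-83-7465', 'Anil');
    INSERT INTO account VALUES ('A-101', 8982), ('A-201', 21476), ('A-102', 1998);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101'), ('192-83-7465', 'A-201');
    """
)


def customer_name(customer_id):
    """Name of the customer with the given id, or None if there is none."""
    row = conn.execute(
        "SELECT customer.customer_name FROM customer "
        "WHERE customer.customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


def customer_balances(customer_id):
    """Balances of all accounts held by the given customer, in ascending order."""
    rows = conn.execute(
        "SELECT account.balance FROM depositor, account "
        "WHERE depositor.customer_id = ? "
        "AND depositor.account_number = account.account_number "
        "ORDER BY account.balance",
        (customer_id,),
    )
    return [balance for (balance,) in rows]
```

For example, customer_name("192-83-7465") sends the first query to the database and customer_balances("192-83-7465") sends the second; passing the id as a parameter avoids the quoting mistakes that are easy to make when the value is pasted into the SQL text.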


Types of Database Users

Database Designer: A database designer (DD) is the person who is responsible for designing the actual database. The DD should interact with all the potential groups of users and develop external as well as logical views of the database. The DD's responsibilities include:
a) Identifying the data to be stored in the database.
b) Choosing appropriate structures to represent and store this data. This is done before the database is actually implemented.
c) Communicating with all the prospective database users in order to understand their requirements.
d) Coming up with a design of the database that meets end-user requirements and is capable of performing all data processing functions.

Application Programmer: These are computer professionals who interact with the system through DML calls, which are embedded in programs written in some host language (such as C, VB, D2K, etc.). The programs written by them are known as application programs. It is the job of the application programmer to implement, as programs, the specifications collected by the database designer. For e.g. a banking system would include programs that generate payroll checks, that debit accounts, that credit accounts, or that transfer funds between accounts.

End Users: These are the persons whose jobs require accessing the database for querying, updating, and generating reports.
1. Sophisticated End Users: These users interact with the system without writing programs. Instead, they form their requests in a database query language. Each such query is submitted to a query processor, whose function is to break down DML statements into instructions that the storage manager understands.
2. Specialized End Users: These are knowledgeable users who write specialized database applications. Ex. computer-aided design systems, knowledge-base and expert systems, and systems that store data with complex data types (for e.g. graphics or audio data).
3. Naive End Users: These are unsophisticated users who interact with the system by invoking one of the permanent application programs that have been written previously.

Database Administrator (DBA): In a database environment, the primary resource is the database itself and the secondary resource is the DBMS and related software. Administering these resources is the responsibility of the database administrator (DBA). The DBA's responsibilities include:
a) Schema Definition: The DBA creates the original database schema by writing a set of definitions that is translated by the DDL compiler into a set of tables that is stored permanently in the data dictionary.
b) Storage Structure & Access-method Definition: The DBA creates appropriate storage structures and access methods by writing a set of definitions, which is translated by the data-storage and data-definition-language compiler.
c) Schema Modification: Programmers accomplish the relatively rare modifications either to the database schema or to the description of the physical storage organization by writing a set of definitions that is used by either the DDL compiler or the data-storage and data-definition-language compiler.

d) Data Access Authorization Granting: The granting of different types of authorization allows the database administrator to regulate which parts of the database various users can access.
e) Integrity Constraint Specification: The data values stored in the database must satisfy certain constraints. The database administrator must specify such constraints explicitly. The integrity constraints are kept in a special system structure that is consulted by the database system whenever an update takes place in the system.
f) Data Appraisals: It is the job of the DBA to carry out, from time to time, appraisals of the data held in the database to ensure its completeness and accuracy and to check that data is not being duplicated. The DBA also organizes facilities for the addition of new data to the database and the deletion of data no longer required, in consultation with user departments and systems personnel.
g) Preparing Data Manuals: The DBA prepares manuals to help users make optimal use of the database facilities available.
