DATABASE SYSTEMS SUMMARY

Database Management System


A Database Management System (DBMS) is a collection of interrelated data and a set of programs to access those data. The collection of data is referred to as the database. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.

Data are central to any information system, so they must be managed well. A DBMS contains a query language that makes it possible to give quick and exact answers to ad hoc queries. It helps create an environment in which users have better access to data, which lets end users respond more quickly and efficiently to changes in their environment. A DBMS also gives an integrated view of the organization's operations, so it becomes much easier to see how operations in one unit affect others. Data inconsistency is considerably reduced in a properly designed database.

Database systems are designed to manage large volumes of information. Data management involves both defining the structures for storing information and providing mechanisms for manipulating data. The database system must also ensure the safety of the information stored. The DBMS makes it possible to share the data in the database among multiple applications and users. It serves as the intermediary between the user and the database, translating user requests into database operations and returning exact responses to the user. The DBMS also hides the internal complexities of the database from the application programs that use it.

File Processing System


A file can store records, and these records can be extracted using different application programs. A typical file processing system is supported by a conventional operating system. The system stores permanent records in various files and uses various application programs to extract records from, and add records to, the appropriate files. Before DBMSs were used to store and retrieve data, organizations stored information in file processing systems.

Even the simplest data retrieval task from a file requires extensive programming, which is a time-consuming, high-skill activity. To access the data in a file, the programmer must be aware of the physical structure of the file.

Structural dependence: A change in file structure, such as the addition or deletion of a field, requires the modification of all programs that use that file.

Data dependence: A change in a file data characteristic, such as changing a field's data type from integer to decimal, requires changes in all programs that access the file.

As the number of files in the system grows, system administration becomes difficult too. Each file must have its own file management system, composed of programs that allow the user to create the file structure, add data to the file, delete data from the file, modify the data in the file, and list the file contents. Even a simple file processing system containing 25 files therefore requires 5 * 25 = 125 file management programs. Each department in the organization owns its data by creating its own files, so the number of files can multiply rapidly.

Security features such as effective password protection, locking out parts of files or parts of the system itself, and other data confidentiality measures are difficult to program and are usually omitted. The file system's structure and lack of security make it difficult to pool data. The same basic data is stored in different locations, and it is very unlikely that data stored in different locations will always be updated consistently, so different versions of the same data end up being maintained. The file processing system is simply not suitable for modern data management and information requirements.
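Structural dependence can be illustrated with a small sketch. The record layouts, field names, and the reader function below are hypothetical and chosen only for illustration; the point is that a program written against one fixed-width layout stops working once a field is added to the file.

```python
import struct

# Hypothetical fixed-width layout: 4-byte customer id + 10-byte name.
OLD_LAYOUT = struct.Struct("<i10s")
# The file structure changes: a 4-byte balance field is appended.
NEW_LAYOUT = struct.Struct("<i10si")


def read_customers_old(data: bytes):
    """A program written against the old layout only."""
    return [
        (cid, name.rstrip(b" ").decode())
        for cid, name in OLD_LAYOUT.iter_unpack(data)
    ]


old_file = (
    OLD_LAYOUT.pack(1, b"Alice".ljust(10))
    + OLD_LAYOUT.pack(2, b"Bob".ljust(10))
)
new_file = (
    NEW_LAYOUT.pack(1, b"Alice".ljust(10), 500)
    + NEW_LAYOUT.pack(2, b"Bob".ljust(10), 700)
)

# The old program reads the old file correctly.
old_result = read_customers_old(old_file)

# The same program fails on the new file, because the record size changed
# and the buffer is no longer a multiple of the old record size.
try:
    read_customers_old(new_file)
    broke = False
except struct.error:
    broke = True
```

In this sketch the failure is loud. With other record sizes the old program could instead misread the data silently, which is arguably worse. Either way, every program that reads the file has to be modified.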

Disadvantages of File Processing System


The conventional file processing system suffers from the following shortcomings: data redundancy, data inconsistency, difficulty in accessing data, data isolation, integrity problems, atomicity problems, concurrent access anomalies, and security problems.

Data Redundancy: The same information is duplicated in several files.

Data Inconsistency: Different copies of the same data do not match, so different versions of the same basic data exist. This occurs when an update operation does not update every copy of the data stored at different places. Example: the address of a customer is recorded differently in different files.

Difficulty in Accessing Data: It is not easy to retrieve information using a conventional file processing system. Convenient and efficient information retrieval is almost impossible.

Data Isolation: Data are scattered in various files, and the files may be in different formats, so writing a new application program to retrieve data is difficult.

Integrity Problems: The data values may need to satisfy some integrity constraints. For example, the value of the balance field must be greater than 5000. In a file processing system this has to be handled through program code, whereas in a database the integrity constraints can be declared along with the definition itself.

Atomicity Problem: It is difficult to ensure atomicity in a file processing system. For example, consider transferring $100 from account A to account B. If a failure occurs during execution, $100 may be deducted from account A without being credited to account B.

Concurrent Access Anomalies: If multiple users update the same data simultaneously, the data can be left in an inconsistent state. In a file processing system it is very difficult to handle this using program code.

Security Problems: Enforcing security constraints in a file processing system is very difficult because application programs are added to the system in an ad hoc manner.
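The integrity example above (balance greater than 5000) can be sketched with Python's built-in sqlite3 module. The table and column names are assumptions made for illustration. The constraint is declared once in the table definition, and the database rejects violating rows without any checking code in the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The integrity constraint is part of the table definition itself.
# Table and column names are hypothetical.
conn.execute(
    """
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance > 5000)
    )
    """
)

# A row that satisfies the constraint is accepted.
conn.execute("INSERT INTO account VALUES (1, 6000)")

# A row that violates the constraint is rejected by the database.
try:
    conn.execute("INSERT INTO account VALUES (2, 100)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT account_no, balance FROM account").fetchall()
```

In a file processing system, the equivalent check would have to be repeated in every program that writes to the account file.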

Data Independence
A major purpose of a database system is to provide users with an abstract view of data. To hide complexity from users, a database applies different levels of abstraction: the physical level, the logical level, and the view level.

Physical Level: The physical level is the lowest level of abstraction and defines the storage structure. It describes complex low-level data structures in detail. The database system hides many of the lowest-level storage details from database programmers. Database administrators may be aware of certain details of the physical organization of data.

Logical Level: This is the next higher level of abstraction. It describes what data are stored in the database, the relationships among the data, the types of the data, and so on. Database programmers and DBAs know the logical structure of the data.

View Level: This is the highest level of abstraction. It provides different views to different users. At the view level, users see a set of application programs that hide details such as data types. Each user is given a view of, or access to, only the part of the data permitted by that user's access rights.

Physical Data Independence: Changes at the physical level do not affect, and are not visible at, the logical level.

Logical Data Independence: Changes at the logical level do not affect the view level.
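A rough sketch of both kinds of independence, using Python's built-in sqlite3 module. The employee table, the employee_directory view, and the index name are hypothetical. A query against the view returns the same result after a physical-level change (adding an index) and after a logical-level change (adding a column to the base table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary INTEGER NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Asha', 'Sales', 4000);
    INSERT INTO employee VALUES (2, 'Ben', 'IT', 5000);

    -- View level: this user sees only part of the data (no salary).
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
    """
)


def directory():
    """Query issued by a user who only knows the view."""
    return conn.execute(
        "SELECT emp_id, name, dept FROM employee_directory ORDER BY emp_id"
    ).fetchall()


before = directory()

# Physical-level change: a new index alters how data is accessed on disk.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# Logical-level change: the base table gains a new column.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")

after = directory()
```

The view user's query and its result are unchanged by either modification, which is the behaviour the two independence properties describe.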

KEYS in Database: Primary Key, Candidate Key, Super Key


This section covers super keys, candidate keys, and primary keys.

Super Key: "Super key" stands for superset of a key. A super key is a set of one or more attributes that, taken collectively, uniquely identify a tuple and therefore determine all other attributes of the relation.

Candidate Key: Candidate keys are super keys for which no proper subset is a super key. In other words, candidate keys are minimal super keys.

Primary Key: A primary key is a candidate key chosen by the database designer to identify entities within an entity set. Being a candidate key, it is a minimal super key. In an ER diagram, the primary key is represented by underlining the primary key attribute. Ideally a primary key is composed of a single attribute, but it is possible to have a primary key composed of more than one attribute.

Composite Key: A composite key consists of more than one attribute.

Example: Consider a relation (table) R with attributes A, B, C, D, E, written R(A, B, C, D, E), and the following functional dependencies.

A -> BCDE: The attribute A uniquely determines the other attributes B, C, D, E.
BC -> ADE: The attributes B and C jointly determine all the other attributes A, D, E in the relation.

Primary Key: A
Candidate Keys: A, BC
Super Keys (among others): A, BC, ABC, AD

ABC and AD are not candidate keys, since neither is a minimal super key.
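The example can be checked mechanically. The sketch below assumes the two functional dependencies from the example (A determines BCDE, BC determines ADE) and uses the standard attribute-closure computation. The function names are illustrative, not taken from any library.

```python
from itertools import combinations

ATTRS = frozenset("ABCDE")
# Functional dependencies from the example: A -> BCDE and BC -> ADE.
FDS = [
    (frozenset("A"), frozenset("BCDE")),
    (frozenset("BC"), frozenset("ADE")),
]


def closure(attrs, fds):
    """Return every attribute determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_super_key(attrs, all_attrs=ATTRS, fds=FDS):
    """A super key determines all attributes of the relation."""
    return closure(attrs, fds) == all_attrs


def candidate_keys(all_attrs=ATTRS, fds=FDS):
    """Minimal super keys, found by trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            attrs = frozenset(combo)
            # Skip supersets of a key already found: they are not minimal.
            if any(key <= attrs for key in keys):
                continue
            if is_super_key(attrs, all_attrs, fds):
                keys.append(attrs)
    return sorted("".join(sorted(key)) for key in keys)


keys = candidate_keys()  # ['A', 'BC']
```

Here `is_super_key("ABC")` and `is_super_key("AD")` are both true, but neither set appears in `keys` because each contains a smaller super key.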

Database System Structure


A database system is divided into modules based on their function. The functional components of a database system can be broadly divided into the storage manager and the query processor.

Storage Manager: The storage manager is important because databases typically require a large amount of storage space, so it is very important to use storage efficiently and to minimize the movement of data to and from disk. A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the interaction with the file manager. It translates the various DML statements into low-level file system commands. Thus the storage manager is responsible for storing, retrieving, and updating data in the database.

The storage manager components include the following.

Authorization and Integrity Manager: Tests for the satisfaction of integrity constraints and checks the authority of users to access data.
Transaction Manager: Ensures that the database remains in a consistent state and allows concurrent transactions to proceed without conflicting.
File Manager: Manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
Buffer Manager: Is responsible for fetching data from disk storage into main memory and deciding what data to cache in main memory.

The storage manager implements the following data structures as part of the physical system implementation.

Data Files: Store the database itself.
Data Dictionary: Stores metadata about the structure of the database, in particular the schema of the database.
Indices: Provide fast access to data items.

Query Processor: The query processor simplifies and facilitates access to data. It includes the following components.

DDL Interpreter: Interprets DDL statements and records the definitions in the data dictionary.
DML Compiler: Translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. The DML compiler also performs query optimization, that is, it picks the lowest-cost evaluation plan from among the alternatives.
Query Evaluation Engine: Executes the low-level instructions generated by the DML compiler.
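Some of these components can be observed from the outside in a real system. The sketch below uses SQLite through Python's sqlite3 module as one concrete example. The table and index names are hypothetical, and the catalog table `sqlite_master` and the `EXPLAIN QUERY PLAN` statement are specific to SQLite rather than part of the general architecture described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: their definitions are recorded in SQLite's catalog.
conn.executescript(
    """
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        owner      TEXT NOT NULL,
        balance    INTEGER NOT NULL
    );
    CREATE INDEX idx_account_owner ON account (owner);
    """
)

# Data dictionary: SQLite keeps schema metadata in sqlite_master.
catalog = conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type IN ('table', 'index') ORDER BY name"
).fetchall()


def plan_details(sql, params=()):
    """Return the textual steps of the plan chosen for a DML statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail.
    return [row[-1] for row in rows]


# A filter on the indexed column lets the optimizer choose the index.
indexed_plan = plan_details(
    "SELECT * FROM account WHERE owner = ?", ("Asha",)
)

# A filter on an unindexed column falls back to scanning the table.
scan_plan = plan_details(
    "SELECT * FROM account WHERE balance > ?", (1000,)
)
```

The exact wording of the plan text differs between SQLite versions, so it is best treated as diagnostic output rather than a stable format.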

Normalization
Normalization is the process of decomposing a relation (table) based on functional dependencies and the primary key. The forms covered here are the un-normalized form, first normal form (1NF), second normal form (2NF), third normal form (3NF), and Boyce-Codd normal form (BCNF).

Un-Normalized Form (0NF): An un-normalized relation contains non-atomic values. Each row may contain multiple sets of values for some of the columns. These multiple values in a single row are called non-atomic values.

First Normal Form (1NF): A relation is in 1NF if the values in the domain of each attribute of the relation are atomic. Each cell of the table must hold a single value, and no two rows in the table may be identical.

Second Normal Form (2NF): A relation R is in 2NF if it is in 1NF and has no partial dependency. Every non-key attribute depends on the whole key; no non-key attribute depends on only a part of the key. Any 1NF relation whose key is a single attribute is therefore in 2NF.

Third Normal Form (3NF): A relation R is in 3NF if it is in 2NF and has no transitive dependency. All the non-key attributes depend on the key alone; there is no dependency among the non-key attributes.

Boyce-Codd Normal Form (BCNF): A relation R is in BCNF if every determinant is a candidate key.

Problem with BCNF: Given a relation R and a set of functional dependencies F, a decomposition of R into BCNF may or may not preserve all the given functional dependencies.
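The BCNF condition can be tested with attribute closures: a functional dependency violates BCNF when it is non-trivial and its left-hand side does not determine every attribute of the relation. The sketch below is a minimal illustration. The relation with attributes S (student), C (course), and I (instructor) is a hypothetical textbook-style example, not one taken from this summary.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose determinant is not a super key."""
    all_attrs = frozenset(all_attrs)
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        if not trivial and closure(lhs, fds) != all_attrs:
            violations.append(("".join(sorted(lhs)), "".join(sorted(rhs))))
    return violations


# Hypothetical relation R(S, C, I): a student takes a course from one
# instructor (SC -> I), and each instructor teaches one course (I -> C).
teaching_fds = [
    (frozenset("SC"), frozenset("I")),
    (frozenset("I"), frozenset("C")),
]
teaching_violations = bcnf_violations("SCI", teaching_fds)

# The relation R(A, B, C, D, E) with A -> BCDE and BC -> ADE.
abcde_fds = [
    (frozenset("A"), frozenset("BCDE")),
    (frozenset("BC"), frozenset("ADE")),
]
abcde_violations = bcnf_violations("ABCDE", abcde_fds)
```

In the hypothetical teaching relation, I -> C is reported because I alone is not a super key. Splitting the relation into (I, C) and (S, I) removes the violation, but the dependency SC -> I can then no longer be checked within a single table, which is the dependency-preservation problem noted above.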

Transaction
The term transaction refers to a collection of operations that forms a single logical unit of work. A transaction T is a logical unit of database processing that includes one or more database access operations.

Transaction Properties: The ACID properties are atomicity, consistency, isolation, and durability.

Atomicity: A transaction must be atomic. That is, either all operations of the transaction are reflected properly in the database, or none are.

Consistency: If the database is in a consistent state before the execution of the transaction, the database remains consistent after the execution of the transaction.

Example: Transaction T1 transfers $100 from account A to account B. Account A and account B each contain $500 before the transaction.

Transaction T1:
Read(A)
A = A - 100
Write(A)
Read(B)
B = B + 100
Write(B)

Consistency constraint before transaction execution:
Sum = A + B
Sum = 500 + 500
Sum = 1000

Consistency constraint after transaction execution:
Sum = A + B
Sum = 400 + 600
Sum = 1000

The sum must be the same before and after the execution of the transaction.

Isolation: When multiple transactions execute concurrently, each transaction is unaware of the other transactions executing concurrently in the system. That is, the execution of one transaction must not interfere with another.

Durability: Changes applied to a database by a committed transaction must be made permanent, even if the system fails.
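The transfer example can be sketched with Python's built-in sqlite3 module. The table layout and the `transfer` helper are assumptions made for illustration, and the failure is simulated by raising an exception between the two updates. Using the connection as a context manager commits the transaction on success and rolls it back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO account VALUES ('A', 500)")
conn.execute("INSERT INTO account VALUES ('B', 500)")
conn.commit()


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


def total():
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        with conn:  # commit on success, roll back on exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure after the debit but before the credit.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


sum_before = total()

# The failed transfer leaves no trace: the debit is rolled back.
failed_ok = transfer("A", "B", 100, fail_midway=True)
after_failure = balances()

# The successful transfer applies both updates together.
success_ok = transfer("A", "B", 100)
after_success = balances()
sum_after = total()
```

After the simulated failure both accounts still hold 500, and after the successful transfer the balances are 400 and 600, so the sum stays at 1000 in every committed state.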
