Databases
Objectives
In this chapter we will:
1. Scrutinize the characteristics of a Database
2. Study the features of a Database Management System
3. Look at the architectures of Database Management Systems
4. Examine the evolution of Database Technology
Learning outcomes:
At the end of the chapter, students will be able to:
1. Identify the characteristics of a Database
2. Describe the features of a Database Management System
3. Explain the various architectures of a Database Management System
4. Discuss the evolution of Database Technology
9.1
Introduction
student is called a record of that student. Since there are six students there
are six records. Thus, we can define record as a collection of logically related
fields. We can now say that a database is a collection of logically related
records.
          Name         Date of birth   Sex   Address   Courses
9721001   Maryam       21.05.1980      M
9721002   Aditya       12.06.1981      M
9732012   Rahul Jain   03.01.1979      F
9724004   Ahmad Ali    23.11.1979      M
9715023   C. Suresh    07.09.1980      M
9.1
Database Characteristics
Figure 9.1.1 describes how multiple users can share the same
database.
A database system contains not only data but also the complete
definition and description of that data [5]. A database contains metadata,
which describes the data itself: the structure, the type and the format of all
data and, additionally, the relationships between the data. Metadata is
sometimes known as "data about data".
Structured Data:
Data is called structured if it can be subdivided systematically and linked.
Let us look at an example of how data can be structured. Table 9.1.2 has
four columns.
First column = Prename, second column = Name, third column = Postcode,
fourth column = City
It is known that an entry in the first column must be a prename (coded as
string) and an entry in the third column must be a postcode (coded as
number).
Prename [string]   Familyname [string]   Postcode [number]   City [string]
Rohit              Gupta                 14000               Srinagar
Hanif              Salam                 46350               Klang
...                ...                   ...                 ...
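The structure of Table 9.1.2 can be written down as a typed table definition. The following is a minimal sketch only, assuming Python's built-in sqlite3 module and an in-memory database; the table name person and the column names are our own choices for illustration.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Each column gets a name and a type, as in Table 9.1.2.
# The table name "person" is an assumption, not taken from the chapter.
conn.execute(
    """
    CREATE TABLE person (
        prename    TEXT    NOT NULL,
        familyname TEXT    NOT NULL,
        postcode   INTEGER NOT NULL,
        city       TEXT    NOT NULL
    )
    """
)
rows = [
    ("Rohit", "Gupta", 14000, "Srinagar"),
    ("Hanif", "Salam", 46350, "Klang"),
]
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", rows)

# Because the structure is known, a single column can be addressed by name.
postcodes = [r[0] for r in conn.execute("SELECT postcode FROM person ORDER BY postcode")]
print(postcodes)
```

Because every entry is subdivided into named, typed columns, a query can pick out just the postcodes without having to parse free text.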
9.1.3
When using a database, the application software does not need to know
about the physical data storage like encoding, format, storage place, etc. It
only communicates with the management system of a database (DBMS) via
a standardized interface with the help of a standardized language like SQL.
The access to the data and the metadata is entirely done by the DBMS. In
this way all the applications can be totally separated from the data.
Therefore, internal reorganization of the database or improvements in efficiency do
not have any influence on the application software. Figure 9.1.3 describes
how this can be done.
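The separation between application and data can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module; the function city_for_postcode, the table person and the index name are hypothetical names chosen for this example. The application function talks to the DBMS only through SQL, so it does not change when the internal storage is reorganized (here, by adding an index).

```python
import sqlite3


def city_for_postcode(conn, postcode):
    """Application code: it uses only the SQL interface, never the storage layout."""
    row = conn.execute(
        "SELECT city FROM person WHERE postcode = ?", (postcode,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (prename TEXT, familyname TEXT, postcode INTEGER, city TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [("Rohit", "Gupta", 14000, "Srinagar"), ("Hanif", "Salam", 46350, "Klang")],
)

before = city_for_postcode(conn, 46350)

# Internal reorganization carried out inside the DBMS: a new access path is added.
conn.execute("CREATE INDEX idx_person_postcode ON person (postcode)")

# The application function is called again, completely unchanged.
after = city_for_postcode(conn, 46350)
print(before, after)
```

The function body is identical before and after the index is created, and it returns the same answer both times.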
done correctly. A DBMS should help ensure that only correct and
consistent data enters the database. Additionally, correct transactions ensure
that consistency is maintained during the operation of the system. An
example of inconsistency would be contradictory statements saved
in the same database.
Student Record in the Library
Name            Address
Haziq Hamidi    No.11, Yellow Road, Ipoh, Malaysia
Rami Mayan      No.22 Oxfam Road, Klang, Malaysia
Sunny Darwish   No. 134, Silk Road, Singapore
...
Address
No.11, Yellow Road, Ipoh, Malaysia
No.22 Oxfam Road, Klang, Malaysia
No. 74, Lime Tree Road, Norwich, UK
...
9.1.6
Data Views
extensive database is used by several people all with different needs and
rights. A student can view only his own data: name, address, contact
number, courses, grades and fees paid. A lecturer can view only the
matric number, names and grades of those students that he/she teaches. A
Dean can see the details of students, lecturers and staff only if they are in his
Faculty.
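A data view of this kind can be sketched with an SQL view. The example below is a minimal illustration, assuming Python's built-in sqlite3 module. The table names, column names, lecturer codes, grades and addresses are all made up for the example. Note also that SQLite has no user accounts, so the view only shows how a restricted view is defined; a full DBMS would additionally grant each user access to the appropriate view only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical base tables; all names and values are illustrative.
    CREATE TABLE student (
        matric_no TEXT PRIMARY KEY,
        name      TEXT,
        address   TEXT,
        fees_paid REAL
    );
    CREATE TABLE grade (
        matric_no TEXT,
        course    TEXT,
        lecturer  TEXT,
        grade     TEXT
    );
    INSERT INTO student VALUES ('9721001', 'Maryam', 'address 1', 1000.0);
    INSERT INTO student VALUES ('9721002', 'Aditya', 'address 2', 1200.0);
    INSERT INTO grade VALUES ('9721001', 'Databases', 'L1', 'A');
    INSERT INTO grade VALUES ('9721002', 'Networks',  'L2', 'B');

    -- The lecturer's view leaves out address and fees entirely.
    CREATE VIEW lecturer_view AS
        SELECT s.matric_no, s.name, g.course, g.lecturer, g.grade
        FROM student AS s
        JOIN grade AS g ON g.matric_no = s.matric_no;
    """
)

# Lecturer L1 sees only the students he/she teaches.
rows_for_l1 = conn.execute(
    "SELECT matric_no, name, grade FROM lecturer_view "
    "WHERE lecturer = ? ORDER BY matric_no",
    ("L1",),
).fetchall()

# Column names that the view exposes.
view_columns = [d[0] for d in conn.execute("SELECT * FROM lecturer_view").description]
print(rows_for_l1, view_columns)
```

The view is not stored as a separate copy of the data; it is derived from the base tables each time it is queried.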
9.2
The next question is: How do we create and manage our databases? Data
management involves creating, modifying, deleting and adding data in files,
and using this data to generate reports or answer queries. The software that
allows us to perform these functions easily is called a Database
Management System (DBMS) [9], [10]. Using a DBMS, files can be retrieved
easily and effectively.
There are many DBMS packages available on the market. Some of them are:
MySQL,
PostgreSQL,
Microsoft Access,
SQL Server,
FileMaker,
Oracle,
dBASE,
Clipper, and
FoxPro.
Accessing desired records from a large relation using a scan on the relation can be
very expensive. Indices are data structures that permit more efficient access of
records. An index is built on one or more attributes of a relation; such attributes
constitute the search key. Given a value for each of the search-key attributes, the
index structure can be used to retrieve records with the specified search-key values
quickly. Indices may also support other operations, such as fetching all records
whose search-key values fall in a specified range of values.
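The effect of an index can be observed in a small experiment. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module, a made-up student table with generated rows, and an index name of our own choosing. SQLite's EXPLAIN QUERY PLAN statement reports whether a query will use the index; the exact wording of its output differs between SQLite versions, so we only check that the index name appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, birth_year INTEGER)")

# 1000 generated rows; birth years cycle through 1970..1989.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"student{i}", 1970 + i % 20) for i in range(1000)],
)

# The search key of this index is the single attribute birth_year.
conn.execute("CREATE INDEX idx_student_year ON student (birth_year)")

# Ask SQLite how it would evaluate a lookup on the search key.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM student WHERE birth_year = ?", (1980,)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the description

# Range queries on the search key are supported as well.
in_range = conn.execute(
    "SELECT COUNT(*) FROM student WHERE birth_year BETWEEN 1979 AND 1981"
).fetchone()[0]
print(plan_text, in_range)
```

Without the index the DBMS would have to scan all 1000 rows for each lookup; with it, only the matching entries are visited.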
A database schema is specified by a set of definitions expressed by a data-definition
language. The result of execution of data-definition language statements is a set of
information stored in a special file called a data dictionary. The data dictionary
contains metadata, that is, data about data. This file is consulted before actual data
are read or modified in the database system. The data-definition language is also
used to specify storage structures and access methods.
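As an illustration, the sketch below runs one data-definition statement and then reads back the metadata that the system recorded. It assumes Python's built-in sqlite3 module. In SQLite the role of the data dictionary is played by the sqlite_master catalog table and the table_info pragma; other systems organize their dictionaries differently. The account table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A data-definition statement: it defines structure, it does not store user data.
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL NOT NULL)"
)

# The catalog now holds "data about data": the table name and its definition.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]
print(catalog[0][0], columns)
```

No row of account data has been inserted here, yet the dictionary already describes the table completely.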
Data manipulation is the retrieval, insertion, deletion, and modification of
information stored in the database. A data-manipulation language enables users to
access or manipulate data as organized by the appropriate data model. There are
basically two types of data-manipulation languages: Procedural data-manipulation
languages require a user to specify what data are needed and how to get those
data; nonprocedural data-manipulation languages require a user to specify what
data are needed without specifying how to get those data.
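The difference between the two styles can be sketched as follows. The account data and the function names are invented for this example, and Python's built-in sqlite3 module stands in for a DBMS. The first function spells out how to find the data step by step; the second states only what is wanted and leaves the how to the system.

```python
import sqlite3

# Made-up sample data: (account number, balance).
accounts = [("A-101", 500.0), ("A-215", 700.0), ("A-305", 350.0)]


def procedural_query(records, threshold):
    """Procedural style: we say WHAT we want and HOW to get it (scan, test, collect)."""
    result = []
    for account_number, balance in records:
        if balance > threshold:
            result.append(account_number)
    return sorted(result)


def nonprocedural_query(conn, threshold):
    """Nonprocedural style: we say only WHAT we want; the DBMS decides how."""
    rows = conn.execute(
        "SELECT account_number FROM account WHERE balance > ? ORDER BY account_number",
        (threshold,),
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)

by_loop = procedural_query(accounts, 400.0)
by_sql = nonprocedural_query(conn, 400.0)
print(by_loop, by_sql)
```

Both functions return the same accounts, but only the first one fixes the access strategy in the program text.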
A query is a statement requesting the retrieval of information. The portion of a data-manipulation language that involves information retrieval is called a query
language. Although technically incorrect, it is common practice to use the terms
query language and data-manipulation language synonymously.
Database languages support both data-definition and data-manipulation functions.
Although many database languages have been proposed and implemented, SQL has
become a standard language supported by most relational database systems.
Databases based on the object-oriented model also support declarative query
languages that are similar to SQL. SQL provides a complete data-definition
language, including the ability to create relations with specified attribute types, and
the ability to define integrity constraints on the data.
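A short sketch of attribute types and integrity constraints in SQL is given below. It assumes Python's built-in sqlite3 module; the account table and the rule that a balance may not be negative are illustrative assumptions of ours. The DBMS itself rejects the row that violates the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Attribute types plus integrity constraints, all stated in the definition.
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")

# This row violates the CHECK constraint, so the DBMS refuses it.
try:
    conn.execute("INSERT INTO account VALUES ('A-102', -50.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, row_count)
```

The application does not need its own check for negative balances; the rule is enforced once, inside the database.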
Query By Example (QBE) is a graphical language for specifying queries. It is widely
used in personal database systems, since it is much simpler than SQL for nonexpert users.
Forms interfaces present a screen view that looks like a form, with fields to be filled
in by users. Some of the fields may be filled automatically by the forms system.
Report writers permit report formats to be defined, along with queries to fetch data
from the database; the results of the queries are shown formatted in the report.
These tools in effect provide a new language for building database interfaces and
are often referred to as fourth-generation languages (4GLs).
Often, several operations on the database form a single logical unit of work, called a
transaction. An example of a transaction is the transfer of funds from one account
to another. Transactions in databases mirror the corresponding transactions in the
commercial world.
Traditionally, database systems have been designed to support
commercial data, consisting mainly of structured alphanumeric data. In recent
years, database systems have added support for a number of nontraditional data
types such as text documents, images, and maps and other spatial data. The goal is
to make databases universal servers, which can store all types of data. Rather than
add support for all such data types into the core database, vendors offer add-on
packages that integrate with the database to provide such functionality.
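Returning to the funds-transfer transaction described above, the following sketch shows why the two updates must form a single unit of work. It assumes Python's built-in sqlite3 module; the transfer function, the account numbers and the rule that balances may not go negative are hypothetical. The destination is credited first on purpose, so that a failed debit leaves partial work behind that has to be undone.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move funds as one unit of work: either both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "account_number TEXT PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 500.0), ("A-215", 700.0)]
)
conn.commit()

ok_small = transfer(conn, "A-101", "A-215", 200.0)   # enough funds
ok_large = transfer(conn, "A-101", "A-215", 1000.0)  # debit would go negative

balances = dict(conn.execute("SELECT account_number, balance FROM account"))
print(ok_small, ok_large, balances)
```

In the second call the credit to A-215 has already been applied when the debit fails, and the rollback removes it again, so no money is created or lost.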
9.3
The database architecture is the set of specifications, rules, and processes that
dictate how data is stored in a database and how data is accessed by components
of a system [11], [12]. It includes data types, relationships, and naming
conventions. The database architecture describes the organization of all database
objects and how they work together. It affects integrity, reliability, scalability, and
performance. The database architecture involves anything that defines the nature
of the data, the structure of the data, or how the data flows.
The overall structure of the database is called the database schema. The schema
specifies data, data relationships, data semantics, and consistency constraints on
the data. The entity-relationship data model is based on a collection of basic
objects, called entities, and of relationships among these objects. An entity is a
thing or object in the real world that is distinguishable from other objects. For
example, each person is an entity, and bank accounts can be considered entities.
Entities are described in a database by a set of attributes. For example, the
attributes account-number and balance describe one particular account in a bank. A
relationship is an association among several entities. For example, a depositor
relationship associates a customer with each of her accounts. The set of all entities
of the same type and the set of all relationships of the same type are termed an
entity set and a relationship set, respectively.
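The terms above can be made concrete with a small sketch in Python. It is only an illustration of the vocabulary, not a database design tool; the class names, attribute names and sample values are our own assumptions. Entities are objects described by attributes, an entity set is a set of such objects, and the depositor relationship set is a set of (customer, account) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """An entity described by the attributes customer_id and name."""
    customer_id: str
    name: str


@dataclass(frozen=True)
class Account:
    """An entity described by the attributes account_number and balance."""
    account_number: str
    balance: float


# Made-up sample entities.
maryam = Customer("C-1", "Maryam")
aditya = Customer("C-2", "Aditya")
a101 = Account("A-101", 500.0)
a215 = Account("A-215", 700.0)

# Entity sets: all entities of the same type.
customers = {maryam, aditya}
accounts = {a101, a215}

# Relationship set: each pair associates a customer with one of her accounts.
depositor = {(maryam, a101), (maryam, a215), (aditya, a215)}


def accounts_of(customer):
    """Follow the depositor relationship from a customer to her accounts."""
    return sorted(a.account_number for c, a in depositor if c == customer)


print(accounts_of(maryam))
```

Note that account A-215 takes part in two relationships, which models an account held jointly by two customers.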
Like the entity-relationship model, the object-oriented model is based on a
collection of objects. An object contains values stored in instance variables within
the object. An object also contains bodies of code that operate on the object. These
bodies of code are called methods. The only way in which one object can access the
data of another object is by invoking a method of that other object. This action is
called sending a message to the object. Thus, the call interface of the methods of
an object defines that object's externally visible part. The internal part of the object
(the instance variables and method code) is not visible externally. The result is
two levels of data abstraction, which are important to abstract away (hide) internal
details of objects. Object-oriented data models also provide object references which
can be used to identify (refer to) objects.
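The ideas of methods, messages and object references can be sketched in a few lines of Python. The classes and names below are invented for illustration. Python marks internal instance variables with a leading underscore by convention only and does not enforce the hiding, so the sketch shows the intent of the model rather than a guarantee.

```python
class Account:
    def __init__(self, account_number, balance):
        # Instance variables: the internal, hidden part of the object.
        self._account_number = account_number
        self._balance = balance

    # Methods: the externally visible call interface.
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def balance(self):
        return self._balance


class Customer:
    def __init__(self, name, account):
        self._name = name
        self._account = account  # an object reference to another object

    def pay_in(self, amount):
        # "Sending a message": the customer object does not touch the
        # account's data directly, it invokes a method of the account.
        self._account.deposit(amount)


acct = Account("A-101", 500.0)
cust = Customer("Maryam", acct)
cust.pay_in(50.0)
print(acct.balance())
```

The Customer object never reads or writes the balance itself; all access goes through the methods of Account.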
In record-based models, the database is structured in fixed-format records of
several types. Each record has a fixed set of fields. The three most widely accepted
record-based data models are the relational, network, and hierarchical models. The
latter two were widely used once, but are of declining importance. The relational
model is very widely used. Databases based on the relational model are called
relational databases.
The relational model uses a collection of tables (called relations) to represent both
data and the relationships among those data. Each table has multiple columns, and
each column has a unique name. Each row of the table is called a tuple, and each
column represents the value of an attribute of the tuple.
9.4
1976: P. Chen proposed the Entity-Relationship (ER) model for database design
giving yet another important insight into conceptual data models. Such higher level
modeling allows the designer to concentrate on the use of data instead of logical
table structure.
Early 1980's: Commercialization of relational systems begins as a boom in
computer purchasing fuels DB market for business.
Mid-1980's: SQL (Structured Query Language) becomes "intergalactic standard".
DB2 becomes IBM's flagship product. Network and hierarchical models fade into the
background, with essentially no development of these systems today but some
legacy systems are still in use. Development of the IBM PC gives rise to many DB
companies and products such as RIM, RBASE 5000, PARADOX, OS/2 Database
Manager, Dbase III, IV (later Foxbase, even later Visual FoxPro), Watcom SQL.
Early 1990's: An industry shakeout begins with fewer surviving companies offering
increasingly complex products at higher prices. Much development during this
period centers on client tools for application development such as PowerBuilder
(Sybase), Oracle Developer, VB (Microsoft), etc. Client-server model for computing
becomes the norm for future business decisions. Development of personal
productivity tools such as Excel/Access (MS) and ODBC. This also marks the
beginning of Object Database Management Systems (ODBMS) prototypes.
Mid-1990's: The usable Internet/WWW appears. A mad scramble ensues to allow
remote access to computer systems with legacy data. Client-server frenzy reaches
the desktop of average users with little patience for complexity while Web/DB grows
exponentially.
Late-1990's: The large investment in Internet companies fuels a tools market boom
for Web/Internet/DB connectors. Active Server Pages, FrontPage, Java Servlets,
JDBC, Enterprise JavaBeans, ColdFusion, Dreamweaver, Oracle Developer 2000,
etc. are examples of such offerings. Open source solutions come online with
widespread use of gcc, CGI, Apache, MySQL, etc. Online transaction processing
(OLTP) and online analytic processing (OLAP) come of age with many merchants
using point-of-sale (POS) technology on a daily basis.
Early 21st century: Decline of the Internet industry as a whole but solid growth of
DB applications continues. More interactive applications appear with use of PDAs,
POS transactions, consolidation of vendors, etc. Three main (western) companies
predominate in the large DB market: IBM (buys Informix), Microsoft, and Oracle.
Future trends [16], [17]: Huge (terabyte) systems are appearing and will require
novel means of handling and analyzing data. Examples are large science databases
holding genome project, geological, national security, and space exploration data. Data
mining, data warehousing, and data marts are commonly used techniques today, and more
of this can be expected in the future. Smart/personalized shopping uses purchase
history, time of day, etc.
Successors to SQL (and perhaps RDBMS) will be emerging in the future. Most
attempts to standardize SQL successors have not been successful. SQL92, SQL2,
SQL3 are still underpowered and more extensions are hard to agree upon. Most
likely this will be overtaken by XML and other emerging techniques. XML with Java
for databases is the current poster child of the "next great thing".
Mobile database use is a product now coming to market in various ways. Distributed
transaction processing is becoming the norm for business planning in many arenas.
Probably there will be a continuing shakeout in the RDBMS market. Linux with
Apache supporting MySQL (or even Oracle) on relatively cheap hardware is a major
threat to high-cost legacy systems of Oracle and DB2.
Object Oriented Everything, including databases, seems to be always on the verge
of sweeping everything before it. Object Database Management Group (ODMG)
standards are proposed and accepted and maybe something comes from that.
Ethical/security/use issues tend to be diminished at times but always come back.
Should you be able to consult a database of the medical records/genetic makeup of
a prospective employee? Should you be able to screen a prospective partner/lover
for genetic diseases? Should amazon.com keep track of your book purchasing?
Should there be a national database of convicted sex offenders/violent
Summary
The software that allows us to perform these functions easily is called a Database
Management System (DBMS). Using a DBMS, files can be retrieved easily
and effectively.
Exercise:
True or False
6. A data view can consist of a subset of the stored data or of data derived from
the stored data (not explicitly stored).
Answers:
1. False
2. True
3. False
4. False
5. True
6. True
7. True
8. False
9. False
10. True
[1] www.library.dal.ca/Files/How_do_I/Tutorials/Key_Points/Databases.pdf
[2] sigma.wsb-nlu.edu.pl/~szyszkin/bd-zim/en/lab-01-intro.doc
[3] www.chessbase.com/workshop2.asp?id=1862
[4] www.progressivetech.org/Resources/PDF/14%20Characteristics%20of%20Healthy%20DB%20Creation...
[5] coral.lili.uni-bielefeld.de/VM-HyprLex/techdok-31-95/node4.html
[6] asbbs.org/files/2008/PDF/W/WangJ.pdf
[7] web.mit.edu/tdqm/www/tdqmpub/IEEEDEApr93.pdf
[8] edocs.bea.com/liquiddata/docs81/querybld/dataview.html
[9] www.management-hub.com/database-management.html
[10] en.wikipedia.org/wiki/Database_management_system
[11] www.cit.iit.bas.bg/CIT_04_en/v4-1/103-109.pdf
[12] www.sice.umkc.edu/~kumarv/cs570/Introduction.pdf
[13] www.almaden.ibm.com/u/mohan/Evolution_of_Database_Technology_Mohan_Talk_IM_Event_Bangalore_11-2006.ppt
[14] www.cs.ualberta.ca/~zaiane/courses/cmput690/slides/Chapter1/sld009.htm
[15] fria.fri.uniza.sk/~kmat/dbs/oodbs/OODBS1a.htm
[16] citeseer.ist.psu.edu/62680.html
[17] portal.acm.org/citation.cfm?id=627359