
CHAPTER 5: MANAGING DATA RESOURCES

5.1 Organizing Data in a Traditional File Environment


5.2 The Database Environment
5.3 Designing Databases
5.4 Discussion Questions
5.1 ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT
Information is becoming as important a business resource as money, material,
and people. Businesses are realizing the competitive advantage they can gain
over their competition through useful information, not just data.
Why should you know about organizing data? Because it is almost inevitable that
some day you will be establishing or at least working with a database of some
kind. As with anything else, understanding the lingo is the first step to
understanding the whole concept of managing and maintaining information.
File Organization Terms and Concepts
The first few terms that we need to define are field, record, file, and database.
A field is a group of characters that records a single fact, such as a name or a
date. A group of related fields makes up a record, a group of records of the same
type makes up a file, and a group of related files makes up a database.
An entity is the person, place, thing, or event about which we maintain
information. Each characteristic or quality describing an entity is called an
attribute. Each record requires a key field, or unique identifier. The best
example of this is your National Identity Number (ID): there is only one per
person. That explains in part why so many companies and organizations ask for
your National Identity Number when you do business with them.
Accessing Records
The description of file access here is based on magnetic tape and disk storage
for computer data. To understand how information is accessed from these media,
think about the difference between a music cassette tape and a music CD. If you
want to get to a particular song on a cassette tape, you must pass by all the other
songs sequentially. If you want to get to a song on a CD, you can go directly to
that song without worrying about any of the others. That is the difference between
sequential and direct access organization for database records.
Sequential file organization, in conjunction with magnetic tape, is typically used
for processing the same information on all records at the same time. It is also
good for processing many records at once, commonly called batch processing.
Direct or random file organization is used with magnetic disks. Because of
increased speed and improved technological methods of recording data on disks,
many companies now use disks instead of tapes. The other advantage that disks
have over tapes is that disks don't physically deteriorate as fast as tapes do.
There is less danger of damaging the surface of the disks than there is of
breaking a tape.
Indexed Sequential Access Method
To explain the indexed sequential access method (ISAM), let's go back to the
example of the cassette tape. A cassette tape label has a printed list of the songs
contained on it, which gives you a general idea of where to go on the tape to find
a particular tune. ISAM works the same way for computer records stored in
key-field sequence: an index of key fields gives the computer a fairly accurate
idea of where a particular record is located. That's why it is so important to have
a unique ID as the key field.
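The idea can be sketched in a few lines of Python. This is only an illustration, assuming a small in-memory list stands in for the sequential file and a sparse index holds the first key of each block. The names `records`, `BLOCK_SIZE`, `build_index` and `isam_lookup`, and the sample data, are hypothetical and not taken from any real ISAM implementation.

```python
from bisect import bisect_right

# Records sorted by key field, as they would be in a sequential file.
# (Sample data is illustrative only.)
records = [(84, "John Dube"), (100, "Rudo Masuku"), (182, "John Dube"),
           (219, "Mark Ruvende"), (305, "Hypothetical Student")]
BLOCK_SIZE = 2  # assumed number of records per block


def build_index(recs, block_size):
    """Return the first key of every block (a sparse index)."""
    return [recs[i][0] for i in range(0, len(recs), block_size)]


def isam_lookup(recs, index, block_size, key):
    """Use the index to pick a block, then scan that block sequentially."""
    block = bisect_right(index, key) - 1
    if block < 0:
        return None  # key is smaller than every key in the file
    start = block * block_size
    for k, value in recs[start:start + block_size]:
        if k == key:
            return value
    return None


index = build_index(records, BLOCK_SIZE)
print(isam_lookup(records, index, BLOCK_SIZE, 182))
```

The index plays the role of the cassette label: it does not hold the record itself, only enough information to jump close to it before a short sequential scan.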
Direct File Access Method
This access method also uses key fields, in combination with mathematical
calculations, to determine the location of a record. If you order something by
phone from a mail order catalogue, the person taking your order does not have to
wait for the computer to search through every record to find yours; using the
direct file access method, the computer can find you very quickly.
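One common form of such a calculation is to divide the key by the number of storage slots and use the remainder as the address. The sketch below assumes that technique, with simple forward probing when two keys land on the same slot; the text does not say which calculation a given system uses, and `NUM_SLOTS`, `slot_for`, `store` and `fetch` are hypothetical names.

```python
NUM_SLOTS = 7  # assumed size of the file, in record slots


def slot_for(key: int) -> int:
    """Transform a key into a slot address (division-remainder method)."""
    return key % NUM_SLOTS


def store(file_slots, key, value):
    """Place a record at its computed slot, probing forward on collision."""
    slot = slot_for(key)
    for step in range(NUM_SLOTS):
        probe = (slot + step) % NUM_SLOTS
        if file_slots[probe] is None or file_slots[probe][0] == key:
            file_slots[probe] = (key, value)
            return probe
    raise RuntimeError("file is full")


def fetch(file_slots, key):
    """Go straight to the computed slot; probe forward only if needed."""
    slot = slot_for(key)
    for step in range(NUM_SLOTS):
        probe = (slot + step) % NUM_SLOTS
        entry = file_slots[probe]
        if entry is None:
            return None
        if entry[0] == key:
            return entry[1]
    return None


slots = [None] * NUM_SLOTS
for customer_id, name in [(84, "John Dube"), (100, "Rudo Masuku"), (182, "John Dube")]:
    store(slots, customer_id, name)
print(fetch(slots, 100))
```

Unlike the sequential case, the number of records examined does not grow with the position of the record in the file; it depends only on how many keys collide.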
Limitations of the File-based Approach
1. File processing systems store groups of records in separate files.
2. Separation and isolation of data
Each program maintains its own set of data.
Users of one program may be unaware of potentially useful data held by other
programs.
3. Duplication of data (redundancy)
The same data is held by different programs.
This wastes space and allows different values and/or different formats for the
same item, a data integrity problem that produces inconsistent results.
4. Data dependence/application program dependency
File structures, formats and records are defined in the program/application code,
so changing a file's structure means changing every program that uses it, a
time-consuming and error-prone task.
5. Incompatible file formats
Programs are written in different languages and so cannot easily access each
other's files; files created by programs in different languages cannot readily
be combined or compared.
6. Fixed queries/proliferation of application programs
Programs are written to satisfy particular functions. Any new requirement needs a
new program.
5.2 THE DATABASE ENVIRONMENT
Database Approach
Integrated data: All the application data is stored in a database. The programmer
is not responsible for co-ordinating files; the DBMS does it.
Less duplication of data: Data is stored in only one place, so there are fewer
data integrity problems.
Program/data independence: Record formats are stored in the database itself and
are accessed by the DBMS, not by application programs. This minimises the impact
of data format changes on application programs and makes it easier to represent
the user's view of data.
Controlled access to the database.
Benefits Of The Database Approach

Minimal data redundancy and improved data consistency - the concept of
normalisation ensures that there is reduced data redundancy in a database.
Ease of access to data/improved data accessibility and responsiveness - data in
a database is interrelated and is in the same format. This facilitates better data
retrieval for general use.
Increased development productivity/ease of application development, reduced
program maintenance and reduced application development time - new application
programs to manipulate data can be written with ease because the data is
integrated and is in the same format. Designing and implementing a new database
from scratch may, however, take more time.
Improved data sharing - database systems allow multiple access and update of
data in a consistent manner. They also allow different views of the same data.
Enforcement of standards/improved security and integrity.
Improved data quality and availability of up-to-date information.
Flexibility/program-data independence - data is independent from applications
and shared by multiple users and applications. It should be possible to effect
changes to an application program that accesses the data without having to
change the structure of the data itself. Similarly, it should be possible to change
the structure of the data without affecting the application program that operates
on it.
Persistence - it is possible to maintain data over long periods of time,
independent of any program that accesses it.
Resilience - the ability of data to survive hardware and software failures
without sustaining loss or becoming inconsistent can be provided for in a
database environment.

Having seen the advantages of the database approach, it is important to look at
the Database Management System (DBMS), which is defined as a software
system that:

Enables users to define, create, and maintain the database and provides
controlled access to this database.
Allows the storage, retrieval, and manipulation of information in a
prescribed format.
Interfaces with application programs that access the database data.
Allows users to deal with the data in abstract terms, rather than as the
computer stores the data.
Links the physical database, the computer and the operating system on the one
hand with the users on the other.

Examples of DBMS:
Microsoft Works
SQL Server
INNOPAC
Oracle
CDS/ISIS
Dbase I,II,III,IV
Microsoft Access
Lotus Approach
Paradox
Components Of A DBMS Used By Systems Personnel
Data Dictionary: contains the names and descriptions of every data element in a
database. It also contains a description of how data elements relate to one
another. Through the use of a data dictionary the DBMS stores the data in a
consistent manner, thus reducing redundancy.
Data Languages: To place data in the database, a special language is used to
describe the characteristics of the data elements. This language is called the
Data Definition Language (DDL). The Data Manipulation Language (DML) is used
to retrieve and process data from a database.
Teleprocessing Monitor: Communication software that manages
communication between the database and remote terminals.
Application Development: This is a set of programs designed to help
programmers develop application programs that use the database.
Security Software: This provides a variety of tools to shield the database from
unauthorised access.
Archiving and recovery system: Provides the database manager with tools to
make copies of the database, which can be used in case the original database
records are damaged. Restart and recovery systems are tools used to restart the
database and to recover lost data in the event of a failure.
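The difference between the two data languages is easiest to see side by side. The sketch below uses SQL through Python's built-in `sqlite3` module purely as an illustration. SQLite is not one of the products listed above, and the `customer` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): describe the data elements.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " city TEXT)"
)

# Data manipulation language (DML): add, change, and retrieve data.
conn.execute("INSERT INTO customer VALUES (1, 'Hypothetical Customer', 'Harare')")
conn.execute("UPDATE customer SET city = 'Bulawayo' WHERE customer_id = 1")
rows = conn.execute("SELECT name, city FROM customer").fetchall()
print(rows)

# The DBMS keeps its own description of the table, in the spirit of a
# data dictionary; SQLite exposes it through PRAGMA table_info.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(columns)
```

The application never states where or how the rows are stored; it only names the data elements that were defined once through the DDL.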

Components Of A DBMS Used By Managers And Other Users
Report Writers: Allow managers and other users to design an output report
without writing an application in a programming language such as Java.
Query languages: This is a set of commands for creating, updating and
accessing data from a database. It allows managers and other users to ask ad
hoc questions of the database interactively without the aid of programmers.
The Three-Level Architecture of the DBMS
External level/schema/view
Conceptual level/schema
Internal level/schema
Schema: The view at each of these levels is described by a schema. A schema
is an outline or a plan that describes the records and relationships existing in the
view. A database schema is a description of the database; it is specified
during database design and is not expected to change frequently.
The three-level/schema database architecture allows a clear separation of the
information meaning (conceptual view) from the external data representation and
from the physical data structure layout. A database system that is able to
separate the three different views of data is likely to be flexible and adaptable.
Managers and workers must know and understand how databases are
constructed so they know how to use the information resource to their
advantage. Managers must guard against problems inherent with islands of
information and understand that sometimes resolution of short-term problems is
far costlier in the long term.
The key to establishing an effective, efficient database is to involve the entire
organization as much as possible, even those people who seemingly will not be
connected to it or be users of it.
Database Management Systems
You've heard the old saying, "Don't put all your eggs in one basket." When it
comes to data, just the opposite is true. You want to put all your corporate data in
one system that will serve the organization as a whole.
A Database Management System (DBMS) is a software system that permits an
organization to centralize data, manage them efficiently, and provide access to
the stored data by application programs. A DBMS has three components, all of
them important for the long-term success of the system.
Data definition language. Marketing looks at customer addresses differently
from Shipping, so you must make sure that all users of the database are
speaking the same language. Think of it this way: Marketing is speaking French,
Production is speaking German, and Human Resources is speaking Japanese.
They are all saying the same thing, but it's very difficult for them to understand
each other. Defining the data sometimes gets shortchanged, so it is critical to
involve users in the development of the data definition.
Data manipulation language. This is a formal language used by programmers
to manipulate the data in the database and make sure they are formulated into
useful information. The goal of this language should be to make it easy for users.
The basic idea is to establish a single data element that can serve multiple users
in different departments depending on the situation. Otherwise, you'll be tying up
programmers to get information from the database that users should be able to
get on their own.
Data dictionary. Each data element or field should be carefully analysed to
determine what it will be used for, who will be the primary user, and how it fits into
the overall scheme of things. Then write it all down and make it easily available
to all users. This is one of the most important steps in creating a database.
Logical And Physical Views Of Data
Physical views of items are often different from the logical views of the same
items when they are actually being used.
The physical view of data cares about where the data are actually stored in the
record or in a file. The physical view is important to programmers who must
manipulate the data as they are physically stored in the database.
Does it really matter to the user that the customer address is physically stored on
the disk before the customer name? Probably not. However, when users create a
report of customers located in Harare, they generally will list the customer name
first and then the address. So it's more important to the end user to bring the data
from their physical location on the storage device to a logical view in the output
device, whether screen or paper.
Users' view of the database (external level): Describes that part of the database
that is relevant to a particular user. This is the level at which users interact with
the system via application programs, a host language or a data sublanguage.
Within the records held in the database, the user may need access to only a few
selected fields in order to perform the specified tasks. The external schema
supplies the user with this limited window on the conceptual schema. Different
views may have different representations of the same data, e.g. user1 views
dates as (day, month, year) whereas user2 may view them as (year, month, day).
Some views may include derived or calculated data, that is, data not actually
stored in the database as such. For example, ages of employees may be included
in a view on an employee relation but are unlikely to be stored. Instead, their
dates of birth would be stored and their ages calculated from them by the DBMS.
The external schema also contains the method of deriving the objects in the
external view from the objects in the conceptual view. The objects include
entities, attributes and relationships.
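The two date formats and the derived age can be illustrated with a small Python sketch. It is an assumption-laden illustration only: a real DBMS would express these as views defined in its own language, and `Employee`, `age_on`, `view_user1` and `view_user2` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    """Conceptual-level record: only the date of birth is stored."""
    name: str
    date_of_birth: date


def age_on(emp: Employee, today: date) -> int:
    """Derived attribute: calculated on demand, never stored."""
    born = emp.date_of_birth
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


def view_user1(emp: Employee, today: date) -> dict:
    """External view for user1: dates as (day, month, year), plus age."""
    d = emp.date_of_birth
    return {"name": emp.name, "born": (d.day, d.month, d.year),
            "age": age_on(emp, today)}


def view_user2(emp: Employee) -> dict:
    """External view for user2: dates as (year, month, day), no age."""
    d = emp.date_of_birth
    return {"name": emp.name, "born": (d.year, d.month, d.day)}


emp = Employee("Hypothetical Employee", date(1990, 6, 15))
print(view_user1(emp, date(2024, 6, 14)))
print(view_user2(emp))
```

Both views read the same stored record; neither user needs to know that the other representation exists, and the age stays correct without ever being updated.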
Conceptual Level
The conceptual view is a representation of the entire information content of the
database. This level describes what data is stored in the database and the
relationships among the data. This level contains the logical structure of the
entire database as seen by the database administrator (DBA). The conceptual
schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
This level mainly represents: all entities, their attributes and their relationships,
security and integrity information.
This level must not contain any storage-dependent details (e.g., storage structure
and access technique).
The schema can be regarded as derived from a model of the organization and
should be designed with care, as it is usual for its structure to remain relatively
unchanged in the life of the database.
Internal Level
The internal view is a low-level representation of the entire database. This level
describes how the data is stored in the database and the access paths for the
database. The internal view is described by means of the internal schema which
defines the various stored record types, how stored fields are represented, what
indexes exist, what physical sequence the stored records are in, and so on. It is
concerned with storage details that are not part of a logical view of the database.
DBMS Users/ Roles in the DBMS
The classification of DBMS users can be done depending on their degree of
expertise or the mode of their interactions with the DBMS.
Naive users: These are users who need not be aware of the presence of a
database system or of any other system supporting their usage, e.g. a user of an
automatic teller machine.
Online users: These are aware of the presence of the database system and may be
communicating with the database directly via an online terminal or indirectly via
a user interface or application program.
Application programmers: These are responsible for developing application
programs or user interfaces used by the naive and online users.
DBA (Database Administrator)
The database will be able to meet the demands of various users in the
organization effectively only if it is maintained and managed properly. Usually a
person (or a group of persons) centrally located, with an overall view of the
database, is needed to keep the database running smoothly. The DBA is the
custodian of the data and controls the database structure; he administers the
three levels of the database.
The DBA would normally have a large number of tasks related to maintaining and
managing the database. These tasks would include the following:
Deciding and Loading the Database Contents - The DBA in consultation with
senior management is normally responsible for defining the conceptual schema
of the database. The DBA would also be responsible for making changes to the
conceptual schema of the database if and when necessary.
Assisting and Approving Applications and Access - The DBA would normally
provide assistance to end-users interested in writing application programs to
access the database. The DBA would also approve or disapprove access to the
various parts of the database by different users.
Deciding Data Structures - Once the database contents have been decided, the
DBA would normally make decisions regarding how data is to be stored and what
indexes need to be maintained. In addition, a DBA normally monitors the
performance of the DBMS and makes changes to data structures if the
performance justifies them. In some cases, radical changes to the data structures
may be called for.
Backup and Recovery - Since the database is such a valuable asset, the DBA
must make all the efforts possible to ensure that the asset is not damaged or lost.
This normally requires a DBA to ensure that regular backups of a database are
carried out and in case of failure (or some other disaster like fire or flood),
suitable recovery procedures are used to bring the database up with as little
down time as possible.
Monitor Actual Usage - The DBA monitors actual usage to ensure that policies
laid down regarding use of the database are being followed. The usage
information is also used for performance tuning.
5.3 DESIGNING DATABASES

Requirements collection and analysis: During this step designers interview
prospective database users to understand and document their data
requirements. Functional requirements of the application should also be specified
in parallel. Data flow diagrams (DFDs) are normally used for specifying functional
requirements.
Conceptual database design: Creation of the conceptual schema, which is a
concise description of the data requirements of the users and includes detailed
descriptions of the data types, relationships and constraints. This is done using a
high-level data model. The concepts used do not include implementation details;
they are easier to understand and can be used to communicate with
non-technical users. This approach enables the database designers to
concentrate on specifying the properties of the data, without being concerned
with storage details.
Choice of a DBMS and physical database design
Once a DBMS has been chosen, physical database design is the process of
choosing specific storage structures and access paths for the database files to
achieve good performance. Each DBMS offers a variety of options for file
organization and access paths, which include various types of indexing, clustering
of related records on disk blocks, linking related records via pointers and various
types of hashing.
Concepts on database design
The attributes of an entity are the data items that describe the properties of the
entity, e.g. StudentName and StudentAddress. Only record attributes that are of
significance to the organisation.
Four types of attributes:
Single valued: maximum cardinality is 1 (like a single variable)
Multiple valued: maximum cardinality > 1 (like arrays or lists)
Composite: like records (e.g. Address = {Street, City, State, Code})
Derived: calculated from other attributes (e.g. age)
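Key attributes:
Primary key: the attribute (or set of attributes) chosen to identify each record
uniquely
Foreign key: an attribute in one table that refers to the primary key of another
table
Candidate key: any attribute (or minimal set of attributes) that could serve as the
primary key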
A relationship is a link or an association between two entities, which is
meaningful for the organisation, e.g. a Customer places an Order; an Employee
manages a Department.
Relationships usually arise because of:
Association - a Customer places an Order
Structure - an Order consists of Order-Lines
Relationships trace the access from one entity to another, e.g. finding the orders
placed by a customer.
Anomalies
There are basically three tuple-level database operations, i.e. add, delete and
update. These operations are the ones that can result in anomalies in a
database, so we have update, insertion/addition and deletion anomalies.
Update anomalies
Multiple copies of the same fact may lead to update anomalies or inconsistencies
when an update is made and only some of the multiple copies are updated. Thus
any change to a fact should be applied to all tuples that hold that fact.
Insertion anomalies
To insert a new record, the primary key field cannot be left null, but for other
fields null values can be accepted. E.g. a new employee can be assigned a new
employee number before being assigned to a department, so the department
attribute can be left null. An insertion anomaly arises when a fact cannot be
recorded because part of the key is not yet known.
Deletion anomalies
Deleting a record can unintentionally remove other facts stored in it, and this can
result in the violation of associations/links.
Normalization
One of the principal objectives of a relational database is to ensure that each
item of data is held only once within the database.
The purpose of normalisation is:
To put data into a format that conforms to relational principles, e.g. single-valued
columns, each relation represents one entity.
To avoid redundancy by storing each fact within the database only once.
To put the data into a form that is more able to accommodate change.
To avoid certain difficulties in updating (anomalies).
To facilitate the enforcement of constraints on data.
Normalization may have the effect of duplicating data (key values) within the
database and often results in the creation of additional tables. (While
normalization tends to increase the duplication of data, it does not introduce
redundancy, which is unnecessary duplication.)
The definition of 1st normal form (1NF)
Remove all repeating groups.
Define all the key attributes.
or
Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.
The definition of 2nd normal form (2NF)
It is in 1st normal form.
It includes no partial dependencies (where an attribute is dependent on only a
part of a primary key); that is, all its non-key attributes are dependent on the
whole key. This is based on the concept of full functional dependency.
or
Create separate tables for sets of values that apply to multiple records.
Relate these tables with a foreign key.
The definition of 3rd normal form (3NF)
It is in 2nd normal form.
It contains no transitive dependencies (where a non-key attribute is dependent
on another non-key attribute); that is, no nonprime attribute is functionally
dependent on another nonprime attribute.
Transitive dependency: if A->B and B->C, then A->C.
Third normal form goes one large step further than 2NF:
Remove columns that are not dependent upon the primary key.
3NF does not allow partial dependencies or transitive dependencies.
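Partial and transitive dependencies are both statements about functional dependencies, so it can help to test a suspected dependency against sample rows. The following Python sketch assumes rows are held as dictionaries; `fd_holds` is a hypothetical helper, and the rows are borrowed from the student activities example used later in this chapter. A check like this can only show that the sample does not contradict a dependency. Whether the dependency really holds is a matter of business rules.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        if seen.setdefault(left, right) != right:
            return False  # same left-hand value, different right-hand value
    return True


# Composite key is (ID, Activity); Cost is a non-key attribute.
rows = [
    {"ID": "084", "Activity": "Tennis", "Cost": 3600},
    {"ID": "084", "Activity": "Swimming", "Cost": 1700},
    {"ID": "182", "Activity": "Tennis", "Cost": 3600},
    {"ID": "100", "Activity": "Squash", "Cost": 4000},
]

print(fd_holds(rows, ("ID", "Activity"), ("Cost",)))  # whole key determines Cost
print(fd_holds(rows, ("Activity",), ("Cost",)))       # part of the key also does
print(fd_holds(rows, ("ID",), ("Cost",)))             # the other part does not
```

Because Cost is determined by Activity alone, which is only part of the key, this table has a partial dependency and so is not in 2nd normal form.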
Example of Database Design Process
The following is a database to keep track of students' sporting activities. The
database is designed to track each activity a student takes and the fee per
semester to do that activity at NUST.
Step 1: Create an Activities table containing all the fields: student's name,
activity and cost. Because some students take more than one activity, a second
set of activity and cost fields is included to allow for that. The structure of the
table is Student, Activity1, Cost1, Activity2, Cost2, as shown in Table 1.
Step 2: Test the table with some sample data. Populate the table with sample
data. Nothing prevents the user from entering the same name for different
students, or different fees for the same activity. It is essential to ask questions
about the data and get answers back (essentially querying the data and
producing reports). For example, how can students taking tennis be identified or
found?
Student        Activity1   Cost1   Activity2   Cost2
John Dube      Tennis      $3600   Swimming    $1700
Rudo Masuku    Squash      $4000   Swimming    $1700
John Dube      Tennis      $3600
Mark Ruvende   Swimming    $1500   Golf        $4700
Table 1 Activity
Step 3: Analyse the data. In Table 1 above there are two John Dubes, and
there's no way to differentiate them. Hence the need for a unique identifier.
Uniquely identify records
Step 4: Modify the design. Each student is identified uniquely by giving each
one a unique ID (primary key). This field can be used to retrieve any specific
record.
The table structure is now: ID, Activity1, Cost1, Activity2, Cost2.
While it's easy for the computer to keep track of ID codes, they are not so useful
for humans. Therefore there is a need to introduce a second table that lists each
ID and the student it belongs to. Using a database program, both table structures
can be linked by the common field, ID. The initial flat-file design has been
converted into a relational database: a database containing multiple tables linked
together by key fields.
Step 5: Test the table with sample data.
Students table
Student        ID*
John Dube      084
Rudo Masuku    100
John Dube      182
Mark Ruvende   219

Activities table
ID*   Activity1   Cost1   Activity2   Cost2
084   Tennis      $3600   Swimming    $1700
100   Squash      $4000   Swimming    $1700
182   Tennis      $3600
219   Swimming    $1500   Golf        $4700
Figure 1
Step 6: Analyse the data.
There's still a lot wrong with the Activities table:
Wasted space. Some students don't take a second activity, so space is wasted
when the data is stored. It doesn't seem much of a bother in this sample, but what
if millions of records are involved? The waste will then be significant.
Addition anomalies. What if number 219 (Mark Ruvende) wants to do a third
activity? University rules allow it, but there's no space in this structure for another
activity. Another record cannot simply be added for Mark, as that would violate the
unique key field ID, and it would also make it difficult to see all his information at
once.
Redundant data entry. If the tennis fees go up to $3900, the database designer
has to go through every record containing tennis and modify the cost.
Querying difficulties. It's difficult to find all people doing swimming: a search has
to be made through both activities (Activity1 and Activity2) to make sure all are
caught.
Redundant information. If 50 students take swimming, then there is a need to type
in both the activity and its cost each time.
Inconsistent data. Note that there are conflicting prices for swimming. Should it
be $1500 or $1700? This happens when one record is updated and another is not.
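The querying difficulty and the conflicting swimming fee can both be reproduced with a short Python sketch over the Activities table of Figure 1. The tuple layout and the function names `students_taking` and `prices_by_activity` are illustrative assumptions, not features of any DBMS.

```python
# Figure 1 layout: (ID, Activity1, Cost1, Activity2, Cost2); None = empty field.
activities = [
    ("084", "Tennis", 3600, "Swimming", 1700),
    ("100", "Squash", 4000, "Swimming", 1700),
    ("182", "Tennis", 3600, None, None),
    ("219", "Swimming", 1500, "Golf", 4700),
]


def students_taking(table, activity):
    """Both activity columns must be checked for every record."""
    return [row[0] for row in table if activity in (row[1], row[3])]


def prices_by_activity(table):
    """Collect every price recorded for each activity."""
    prices = {}
    for _, act1, cost1, act2, cost2 in table:
        for act, cost in ((act1, cost1), (act2, cost2)):
            if act is not None:
                prices.setdefault(act, set()).add(cost)
    return prices


print(students_taking(activities, "Swimming"))
inconsistent = {a: p for a, p in prices_by_activity(activities).items() if len(p) > 1}
print(inconsistent)
```

Every query has to know how many activity columns exist, and nothing in the structure prevents the two different swimming fees from coexisting.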
Eliminate recurring fields
The Students table is fine, but there are still many errors to be corrected in the
Activities table.
Step 7: Modify the design.
The first four database design errors can be fixed by creating a separate record
for each activity a student takes, instead of one record for all the activities a
student takes.
Eliminate the Activity2 and Cost2 fields. The table structure must be adjusted to
accommodate multiple records for each student. The key is refined so that it
consists of two fields, ID and Activity. As each student can only take an activity
once, this combination gives us a unique key for each record.
The Activities table has now been simplified to: ID, Activity, Cost. Note how the
new structure lets students take any number of activities; they are no longer
limited to two.
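The restructuring in Step 7 amounts to turning each two-activity record into one record per activity. A minimal Python sketch, assuming the Figure 1 rows are held as tuples (`one_record_per_activity` and `add_activity` are hypothetical helpers):

```python
# Figure 1 layout: (ID, Activity1, Cost1, Activity2, Cost2); None = empty field.
wide = [
    ("084", "Tennis", 3600, "Swimming", 1700),
    ("100", "Squash", 4000, "Swimming", 1700),
    ("182", "Tennis", 3600, None, None),
    ("219", "Swimming", 1500, "Golf", 4700),
]


def one_record_per_activity(rows):
    """Split each wide record into (ID, Activity, Cost) records."""
    result = []
    for student_id, act1, cost1, act2, cost2 in rows:
        for activity, cost in ((act1, cost1), (act2, cost2)):
            if activity is not None:  # empty second activity needs no record
                result.append((student_id, activity, cost))
    return result


def add_activity(table, student_id, activity, cost):
    """Enforce the composite key (ID, Activity), then append the record."""
    if any(sid == student_id and act == activity for sid, act, _ in table):
        raise ValueError("student already takes this activity")
    table.append((student_id, activity, cost))


activities = one_record_per_activity(wide)
add_activity(activities, "219", "Squash", 4000)  # a third activity now fits
print(len(activities))
```

No space is reserved for activities a student does not take, and a third activity for student 219 is simply one more record.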
Step 8: Test sample data.
Student Table
Student        ID*
John Dube      084
Rudo Masuku    100
John Dube      182
Mark Ruvende   219

Activity Table
ID*   Activity   Cost
084   Swimming   $1700
084   Tennis     $3600
100   Squash     $4000
100   Swimming   $1700
182   Tennis     $3600
219   Golf       $4700
219   Swimming   $1500
219   Squash     $4000
Figure 2

Step 9: Analyse the data.
There is still a problem of redundancy (activity fees repeated) and inconsistent
data (what's the correct fee for swimming?). These problems are related to
editing or modifying records.
Eliminate data entry anomalies
Check that other data entry processes, such as adding or deleting records, will
function correctly too. There are potential problems when adding or deleting
records:
Insertion anomalies. What if the college introduces a new activity, such as Chess,
at $5000? Where will this information be stored? With the current design it is not
possible unless a student signs up for the activity.
Deletion anomalies. If Mark Ruvende (number 219) transfers to another college,
all the information about golf disappears from the system, as he was the only
student taking this activity.
Step 10: Modify the design.
The cause of all the remaining problems is that there is a non-key field (Cost)
which is dependent on only part of the key (Activity). The reader is encouraged to
try solving this before reading on. The cost of each activity is not dependent on
the student's ID, which is part of the composite key (ID + Activity). The cost of
tennis, for example, is $3600 for each and every student who takes the sport, so
the student's ID has no bearing on the value contained in this field. The cost of
an activity is purely dependent on the activity itself. By checking the table
structures and ensuring that every non-key field is dependent on the whole key,
the designer can eliminate the rest of the problems.


The final design will thus contain three tables: the Students table (Student, ID), a
Participants table (ID, Activity), and a modified Activities table (Activity, Cost).
It can be observed that each non-key value depends on the whole key: the
student name is entirely dependent on the ID; the activity cost is entirely
dependent on the activity. The new Participants table essentially forms a link
between the other two tables, and each of its fields is part of the key. The tables
are linked by key fields: ID in the Students table corresponds to ID in the
Participants table; Activity in the Activities table corresponds to Activity in the
Participants table.
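As a rough illustration of the three-table design, the sketch below builds it with Python's built-in `sqlite3` module and loads the sample data. SQLite and the lower-case table and column names are assumptions made for the example; the text does not prescribe a particular DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE students (
    id      TEXT PRIMARY KEY,
    student TEXT NOT NULL
);
CREATE TABLE activities (
    activity TEXT PRIMARY KEY,
    cost     INTEGER NOT NULL
);
CREATE TABLE participants (
    id       TEXT NOT NULL REFERENCES students(id),
    activity TEXT NOT NULL REFERENCES activities(activity),
    PRIMARY KEY (id, activity)
);
""")
conn.executemany("INSERT INTO students VALUES (?, ?)", [
    ("084", "John Dube"), ("100", "Rudo Masuku"),
    ("182", "John Dube"), ("219", "Mark Ruvende"),
])
conn.executemany("INSERT INTO activities VALUES (?, ?)", [
    ("Golf", 4700), ("Chess", 5000), ("Squash", 4000),
    ("Swimming", 1500), ("Tennis", 3600),
])
conn.executemany("INSERT INTO participants VALUES (?, ?)", [
    ("084", "Swimming"), ("084", "Tennis"), ("100", "Squash"),
    ("100", "Swimming"), ("182", "Tennis"), ("219", "Golf"),
    ("219", "Swimming"), ("219", "Squash"),
])

# Join the three tables on their key fields to total each student's fees.
fees = conn.execute("""
    SELECT s.id, s.student, SUM(a.cost)
    FROM students AS s
    JOIN participants AS p ON p.id = s.id
    JOIN activities AS a ON a.activity = p.activity
    GROUP BY s.id, s.student
    ORDER BY s.id
""").fetchall()
print(fees)
```

The foreign keys make the links between the tables explicit: a participant record cannot refer to a student or an activity that does not exist.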
Step 11: Test sample data.
Student Table
Student        ID*
John Dube      084
Rudo Masuku    100
John Dube      182
Mark Ruvende   219

Participant Table
ID*   Activity*
084   Swimming
084   Tennis
100   Squash
100   Swimming
182   Tennis
219   Golf
219   Swimming
219   Squash

Activity Table
Activity*   Cost
Golf        $4700
Chess       $5000
Squash      $4000
Swimming    $1500
Tennis      $3600
Figure 3
Step 12: Analyse the results.
No redundant information.
No inconsistent data. There's only one place where the user can enter the price
of each activity, so there's no chance of creating inconsistent data. Also, if there
is a fee rise, all that is necessary is to update the cost in one place.
No insertion anomalies. A new activity can be added to the Activities table without
a student signing up for it.
No deletion anomalies. If Mark Ruvende (number 219) leaves, details about the
golf activity are retained.
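These claims can be checked directly against a small copy of the final design. The sketch below uses Python's built-in `sqlite3` module as an assumed stand-in for whatever DBMS is chosen; the Students table is left out for brevity, and only some of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (activity TEXT PRIMARY KEY, cost INTEGER NOT NULL);
CREATE TABLE participants (id TEXT, activity TEXT, PRIMARY KEY (id, activity));
INSERT INTO activities VALUES ('Golf', 4700), ('Squash', 4000),
                              ('Swimming', 1500), ('Tennis', 3600);
INSERT INTO participants VALUES ('084', 'Swimming'), ('084', 'Tennis'),
                                ('219', 'Golf'), ('219', 'Swimming');
""")

# No insertion anomaly: Chess can exist before anyone signs up for it.
conn.execute("INSERT INTO activities VALUES ('Chess', 5000)")

# No inconsistent data: a fee rise is a change to a single row.
conn.execute("UPDATE activities SET cost = 3900 WHERE activity = 'Tennis'")

# No deletion anomaly: student 219 leaves, yet Golf and its fee remain.
conn.execute("DELETE FROM participants WHERE id = '219'")

golf_cost = conn.execute(
    "SELECT cost FROM activities WHERE activity = 'Golf'").fetchone()[0]
tennis_cost = conn.execute(
    "SELECT cost FROM activities WHERE activity = 'Tennis'").fetchone()[0]
chess_takers = conn.execute(
    "SELECT COUNT(*) FROM participants WHERE activity = 'Chess'").fetchone()[0]
print(golf_cost, tennis_cost, chess_takers)
```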
It should be noted that, in order to simplify the process and focus on the
relational aspects of designing the database structure, the student's name was
kept in a single field. This is not what normally happens; the name is usually
divided into first name, surname (and initials) fields. Similarly, other fields that
you would normally store in a student table, such as date of birth, address,
parents' names and so on, were excluded.
Although the ultimate design will depend on the complexity of data, the following
steps are important:
Break composite fields down into constituent parts. Example: Name becomes
surname and first name.
Create a key field, which uniquely identifies each record, or use a composite key.
Eliminate repeating groups of fields.
Eliminate record modification problems (such as redundant or inconsistent data)
and record deletion and addition problems by ensuring each non-key field
depends on the entire key.
Create a separate table for any information that is used in multiple records, and
then use a key to link these tables to one another.
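The steps above can be sketched as table definitions. This is one possible layout, not the only one, assuming SQLite through Python's sqlite3 module; the column names FirstName and Surname are illustrative assumptions, since the worked example keeps the name in a single field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Students (
    ID        TEXT PRIMARY KEY,   -- key field: uniquely identifies each record
    FirstName TEXT NOT NULL,      -- composite field "name" broken into its parts
    Surname   TEXT NOT NULL
);
CREATE TABLE Activities (
    Activity TEXT PRIMARY KEY,    -- shared information kept in its own table
    Cost     INTEGER NOT NULL
);
CREATE TABLE Participants (
    ID       TEXT NOT NULL REFERENCES Students(ID),
    Activity TEXT NOT NULL REFERENCES Activities(Activity),
    PRIMARY KEY (ID, Activity)    -- composite key: both fields together
);
""")

conn.execute("INSERT INTO Students VALUES ('084', 'John', 'Dube')")
conn.execute("INSERT INTO Activities VALUES ('Tennis', 3600)")
conn.execute("INSERT INTO Participants VALUES ('084', 'Tennis')")


def try_insert(sql):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Signing the same student up for the same activity twice breaks the composite key.
duplicate_ok = try_insert("INSERT INTO Participants VALUES ('084', 'Tennis')")
# 'Polo' is not in the Activities table, so the link between tables rejects it.
unknown_ok = try_insert("INSERT INTO Participants VALUES ('084', 'Polo')")
print(duplicate_ok, unknown_ok)  # False False
```

Declaring the keys in the table definitions lets the DBMS itself refuse rows that would reintroduce redundant or inconsistent data.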
SUMMARY ON CREATING A DATABASE
Gather the requirements and identify the entities.
How is the information currently organized, stored, and used? How could this
information be organized better and used more easily throughout the
organization? What parts of the current system are you going to get rid of and
what would you add? Involve as many users in this planning stage as possible.
They are the ones who will prosper or suffer because of the decisions you make
at this point.
Determine the relationships between each data element that you currently have
(entity-relationship diagram). The data don't necessarily have to be in a computer
for you to consider the impact. Determine which data elements work best together
and how you will organize them in tables. Break your groups of data into as small
a unit as possible (normalization). Even when you say it is as small as it can get,
go back again. Avoid redundancy between tables. Decide what the key identifier
will be for each record.
Give it your best shot in the beginning: it costs a lot of time, money, and
frustration to go back and make changes or corrections. However, that is still
better than living with a poorly designed database.
NB: You can use a commercial DBMS such as Microsoft Access and still end up
with what is effectively a file system.
Query
A view of data showing information from one or more tables. For instance, using
the sample database used when describing normalisation, a query could be
made to the Students database asking "Show the first and last names of the
students who take both Tennis and Golf and have Dube as their surname." Such a
query displays information from the Students table (firstname, lastname),
the Participants table and the Activities table.
SQL: Structured Query Language (pronounced "sequel" in the US; "ess-queue-ell"
elsewhere). A computer language designed to organise and simplify the process
of getting information out of a database in a usable form, and also used to
reorganise data within databases. SQL is most often used on larger databases
on minicomputers, mainframes and corporate servers.
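The query described above could be written in SQL roughly as follows. This is a sketch under stated assumptions: it runs on SQLite via Python's sqlite3 module, it uses the sample data from the normalisation example, and because that example stores the name in a single Student field, the surname is matched with a LIKE pattern. The function name students_taking_both is made up for illustration. The Activities table is not needed here because the question does not ask for costs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID TEXT PRIMARY KEY, Student TEXT);
CREATE TABLE Participants (ID TEXT, Activity TEXT, PRIMARY KEY (ID, Activity));
INSERT INTO Students VALUES
    ('084', 'John Dube'), ('100', 'Rudo Masuku'),
    ('182', 'John Dube'), ('219', 'Mark Ruvende');
INSERT INTO Participants VALUES
    ('084', 'Swimming'), ('084', 'Tennis'), ('100', 'Squash'),
    ('100', 'Swimming'), ('182', 'Tennis'), ('219', 'Golf'),
    ('219', 'Swimming'), ('219', 'Squash');
""")

# Join Participants twice: once for each activity the student must take.
BOTH_ACTIVITIES = """
SELECT s.ID, s.Student
FROM Students AS s
JOIN Participants AS p1 ON p1.ID = s.ID AND p1.Activity = ?
JOIN Participants AS p2 ON p2.ID = s.ID AND p2.Activity = ?
WHERE s.Student LIKE ?
ORDER BY s.ID
"""


def students_taking_both(first, second, surname):
    """Return (ID, name) rows for students with this surname who take both activities."""
    pattern = "% " + surname  # surname is the last word of the single name field
    return conn.execute(BOTH_ACTIVITIES, (first, second, pattern)).fetchall()


print(students_taking_both("Tennis", "Golf", "Dube"))      # []
print(students_taking_both("Swimming", "Tennis", "Dube"))  # [('084', 'John Dube')]
```

With the sample data nobody takes both Tennis and Golf, so the first call returns an empty result; the second call shows the same query shape returning a row.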
The whole point of using a database is to turn data into information. Data are
facts that have no inherent meaning; information is data put into context to
convey meaning. Think of a student database containing information such as
student names, addresses, ID numbers and telephone numbers. Put a question
to the database such as "What percentage of students take Computer Science
Fundamentals?" The resulting answer is useful, meaningful information.
HIERARCHICAL DATABASES
The hierarchical data model presents data to users in a treelike structure.
Think of a mother and her children. A child only has one mother and inherits
some of her characteristics, such as eye colour or hair colour. A mother might
have one or more children to whom she passes some of her characteristics but
usually not exact ones. The child then goes on to develop its own characteristics
separate from the mother.
In a hierarchical database, characteristics from the parent are passed to the child
by a pointer just as a human mother will have a genetic connection to each
human child.
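The parent and child pointers can be pictured with a small tree. This is only an illustrative sketch in Python, not how any particular hierarchical DBMS is built; the class name Segment and the records used are assumptions for the example.

```python
class Segment:
    """One record in a hierarchy: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # pointer up to the single parent (None for the root)
        self.children = []        # pointers down to the children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow the parent pointers up to the root and return the names top-down."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


school = Segment("School")
student = Segment("Student 084", parent=school)
tennis = Segment("Tennis", parent=student)
swimming = Segment("Swimming", parent=student)

print(tennis.path())                       # ['School', 'Student 084', 'Tennis']
print([c.name for c in student.children])  # ['Tennis', 'Swimming']
```

Every record is reached by walking down from the root, which is fast when the questions follow the tree and awkward when they do not.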
NETWORK DATABASE
A network data model is a variation of the hierarchical model in which a child
can have more than one parent. As with hierarchical structures, each relationship
in a network database must have a pointer from all the parents to all the children
and back.
These two types of databases, the hierarchical and the network, work well
together since they can easily pass data back and forth. But because these
database structures use pointers, which are actually additional data elements,
the size of the database can grow very quickly and cause maintenance and
operation problems.
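To see how the pointers add up, here is a rough sketch in Python that treats the sign-ups from the normalisation example as a network: each activity can have several student "parents", and every link is stored in both directions. The class name Record and the function link are assumptions for illustration.

```python
class Record:
    """A record that may have many parents and many children."""

    def __init__(self, name):
        self.name = name
        self.parents = []    # pointers back to every parent
        self.children = []   # pointers forward to every child


def link(parent, child):
    """Store the relationship in both directions, as the network model requires."""
    parent.children.append(child)
    child.parents.append(parent)


SIGN_UPS = [("084", "Swimming"), ("084", "Tennis"), ("100", "Squash"),
            ("100", "Swimming"), ("182", "Tennis"), ("219", "Golf"),
            ("219", "Swimming"), ("219", "Squash")]

students = {sid: Record(sid) for sid in ("084", "100", "182", "219")}
activities = {name: Record(name) for name in ("Golf", "Squash", "Swimming", "Tennis")}

for student_id, activity in SIGN_UPS:
    link(students[student_id], activities[activity])

# Swimming has three parents, which a strict hierarchy would not allow.
swimmers = sorted(p.name for p in activities["Swimming"].parents)
print(swimmers)  # ['084', '100', '219']

# Every sign-up costs two pointers: one down and one back.
pointer_count = (sum(len(r.children) for r in students.values())
                 + sum(len(r.parents) for r in activities.values()))
print(pointer_count)  # 16
```

Eight relationships already need sixteen pointers, which hints at why the pointer overhead grows quickly in larger databases.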

RELATIONAL DATA MODEL


A relational data model stores data in tables, from which data can be extracted
and combined in different ways. The tables are sometimes called files,
although that is actually a misnomer, since you can have multiple tables in one
file.
In a relational database, each table contains a primary key, a unique identifier for
each record. To make sure the tables relate to each other, the primary key from
one table is stored in a related table as a foreign key (sometimes called a
secondary key). For instance, in the Customer table the primary key is the unique
Customer ID. That primary key is then stored in the Order table as the foreign key
so that the two tables have a direct relationship.
Use these three basic operations to develop relational databases:
Select: create a subset of records meeting the stated criteria
Join: combine related tables to provide more information than individual tables
Project: create a new table containing only selected fields (columns) of an existing table
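The three operations can be sketched in a few lines of Python, treating each table as a list of records. This is a simplified illustration rather than how a DBMS implements them (for example, this project does not remove duplicate rows), and the Customer and Order rows are hypothetical.

```python
def select(rows, predicate):
    """Select: keep only the records that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """Join: combine records from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(rows, columns):
    """Project: keep only the named fields of each record."""
    return [{c: row[c] for c in columns} for row in rows]


# Hypothetical sample rows; CustomerID is the key that links the two tables.
customers = [
    {"CustomerID": 1, "Name": "A. Moyo"},
    {"CustomerID": 2, "Name": "B. Ncube"},
]
orders = [
    {"OrderID": 10, "CustomerID": 1, "Total": 250},
    {"OrderID": 11, "CustomerID": 2, "Total": 90},
    {"OrderID": 12, "CustomerID": 1, "Total": 40},
]

joined = join(customers, orders, "CustomerID")       # one row per order, with the name
large = select(joined, lambda row: row["Total"] >= 100)
result = project(large, ["Name", "Total"])
print(result)  # [{'Name': 'A. Moyo', 'Total': 250}]
```

Chaining the operations answers a question ("which customers placed orders of 100 or more?") that neither table could answer on its own.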
The biggest problem with these databases is the misconception that every data
element should be stored in the same table. In fact, each data element should be
analysed in relation to other data elements with the goal of making the tables as
small in size as possible. The ideal relational database will have many small
tables, not one big one. On the surface that may seem like extra work and effort,
but by keeping the tables small, they can serve a wider audience because they
are more flexible. This set-up is especially helpful in reducing redundancy and
increasing the usefulness of data.
Advantages and Disadvantages
Hierarchical and network databases can be very efficient as long as you plan
ahead. But as you know, needs change, and neither one of these databases
offers a lot of flexibility to change with business needs. It is sort of like parents
and children; once you establish the tie, it is pretty hard to amend.
Relational database management systems are more flexible, especially if you
keep the tables small. It is much easier for non-techies to write queries
in a relational system. It's also easier to add new data elements,
although if you do, you'll have to go back and fill in the missing information for the
old records or just forget them altogether.
Comparison of Database Alternatives
Type of        Processing            Flexibility   User            Programming
database       efficiency                          friendliness    complexity
Hierarchical   High                  Low           Low             High
Network        Medium-high           Low-medium    Low-moderate    High
Relational     Lower but improving   High          High            Low

What you should remember is that none of these databases is very good if you
do not keep the end user in mind. If you're not careful, you'll wind up with lots of
information that no one can use.
There are three types of databases: hierarchical, network, and relational.
Relational databases are becoming the most popular of the three because they
are easier to work with, easier to change, and can serve a wider range of needs
throughout the organization.
5.4 DISCUSSION QUESTIONS
Why do relational database management systems appear to be better than
hierarchical or network database management systems?
What should managers focus on when building a database?
