An RDBMS is a database management system based on the relational model defined by E. F. Codd. Data is stored in
the form of rows and columns, and the relationships among tables are themselves stored as tables. An RDBMS is used to
manage a relational database: an organized collection of tables from which data can be accessed
easily. It consists of a number of tables, and each table has its own primary key.
Features:
- Stores data in tables
- Persists data in the form of rows and columns
- Provides primary keys to uniquely identify the rows
- Creates indexes for quicker data retrieval
- Provides views: virtual tables that can hide sensitive data and simplify queries
- Relates two or more tables through a common column (primary key and foreign key)
- Provides multi-user access that can be controlled for individual users
Keys:
A key is an attribute (column), or a set of attributes, of a database table whose values are used to identify each
record in the table. For example, if a table has id, name and address as its column names and no two rows share
the same id, then id can serve as a key for that table. The following are the various types of keys available in
a DBMS.
1. Candidate key
A candidate key is a minimal set of one or more attributes that can uniquely identify a row in a given table.
Let's explain candidate key in detail using an example. Consider a table employee as given below.
2. Super key
Any superset of a candidate key is a super key.
Let's explain super key in detail using the same employee table mentioned above. Here we can have many super keys,
such as {Employee_ID, FirstName}, {Employee_ID, LastName}, {Employee_ID, FirstName, LastName} and
{Email, FirstName}, because any superset of a candidate key is a super key.
3. Primary key
The database designer chooses one key from the list of available candidate keys to uniquely identify rows in the
given table. This key is known as the primary key.
To select primary key from all candidate keys,
Let's explain primary key in detail using the same employee table mentioned above. Here we have two candidate
keys: Employee_ID and Email. Remember our selection criteria for primary keys. Usually preference is given to
numeric column(s). Hence we can select Employee_ID as the primary key of the table "employee".
4. Foreign key
A foreign key is a set of one or more attributes whose values are required to match values of a candidate key in
the same or another table.
Let's explain foreign key in detail using an example. Consider an employee table and a department table as given
below:
Here we can say that Department_ID of the employee table is a foreign key referring to Department_Number, which is a
candidate key of the department table.
A table which has a foreign key referring to its own candidate key is known as a self-referencing table.
The constraint that values of a given foreign key must match the values of the corresponding candidate key is
known as a referential constraint.
5. Non-Key Attributes
The attributes other than the candidate key attributes in a table/relation are called non-key attributes. In other
words, the attributes which do not participate in any of the candidate keys are called non-key attributes.
6. Composite Key
A composite key is a key that contains more than one attribute.
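The key definitions above can be checked mechanically on sample data. The following is a minimal sketch in Python, assuming a small hypothetical employee table: the rows and the helper names `is_superkey` and `candidate_keys` are illustrative and do not come from the original example. Sample rows can only rule a column set out as a key; whether a set really is a key is a design decision about all possible data.

```python
from itertools import combinations

# Hypothetical sample rows (illustrative only).
EMPLOYEES = [
    {"Employee_ID": 1, "FirstName": "Asha", "LastName": "Rao", "Email": "asha@example.com"},
    {"Employee_ID": 2, "FirstName": "Ravi", "LastName": "Rao", "Email": "ravi@example.com"},
    {"Employee_ID": 3, "FirstName": "Asha", "LastName": "Nair", "Email": "asha.n@example.com"},
    {"Employee_ID": 4, "FirstName": "Asha", "LastName": "Rao", "Email": "asha.rao2@example.com"},
]

def is_superkey(rows, columns):
    """True if no two rows share the same values for `columns`."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Return the minimal superkeys (candidate keys) of the sample rows."""
    names = sorted(rows[0])
    keys = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # A superset of a key already found is a superkey but not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys
```

On this sample, `candidate_keys(EMPLOYEES)` finds Email and Employee_ID, while {Employee_ID, FirstName} passes `is_superkey` without being minimal.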
# Explain Normalization with proper example.
Database normalization is a technique for organizing the data in a database. It is a systematic
approach of decomposing tables to eliminate data redundancy and undesirable characteristics such as insertion,
update and deletion anomalies. It is a multi-step process that puts data into tabular form by removing duplicated
data from the relation tables. Normalization is the part of database design which makes sure that there will
not be any redundant data.
Normalization is used for mainly two purposes:
Student Table:
Student   Age   Subject
Ram       15    Biology, Maths
Shyam     14    Maths
Hari      17    Maths
First Normal Form (1NF):
In the First Normal Form, no row may have a column in which more than one value is saved, for example values
separated with commas. Instead, we must separate such data into multiple rows.
Student Table following 1NF will be:
Student   Age   Subject
Ram       15    Biology
Ram       15    Maths
Shyam     14    Maths
Hari      17    Maths
Using the First Normal Form, data redundancy increases, as the same values are repeated in some columns across
multiple rows, but each row as a whole will be unique.
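The conversion to First Normal Form can be sketched in a few lines of Python. This is only an illustration: the unnormalized rows are written out from the tables in this section, and the function name `to_1nf` is an assumption.

```python
# Unnormalized rows: the Subject column may hold several comma-separated values.
UNNORMALIZED = [
    ("Ram", 15, "Biology, Maths"),
    ("Shyam", 14, "Maths"),
    ("Hari", 17, "Maths"),
]

def to_1nf(rows):
    """Emit one row per (student, subject) so every column holds a single value."""
    result = []
    for student, age, subjects in rows:
        for subject in subjects.split(","):
            result.append((student, age, subject.strip()))
    return result
```

Applied to the rows above, `to_1nf` yields two rows for Ram (Biology and Maths) and one row each for Shyam and Hari.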
Second Normal Form (2NF):
As per the Second Normal Form, there must not be any partial dependency of any column on the primary key. This means
that, for a table with a concatenated (composite) primary key, each column that is not part of the primary key must
depend upon the entire concatenated key for its existence. If any column depends on only one part of the
concatenated key, the table fails Second Normal Form.
In the First Normal Form example there are two rows for Ram, to include the multiple subjects that he has opted for.
While this is searchable and follows First Normal Form, it is an inefficient use of space. Also, in the above table
the candidate key is {Student, Subject}, yet the Age of a student depends only on the Student column,
which violates Second Normal Form. To achieve Second Normal Form, it helps to split the
subjects out into an independent table and match them up using the student names as foreign keys.
New Student Table following 2NF will be:
Student   Age
Ram       15
Shyam     14
Hari      17
In the Student table, the candidate key will be the Student column, because the only other column, Age, is dependent on it.
New Subject Table introduced for 2NF will be:
Student   Subject
Ram       Biology
Ram       Maths
Shyam     Maths
Hari      Maths
In the Subject table, the candidate key will be {Student, Subject}. Now both of the above tables qualify for
Second Normal Form and will not suffer from the update anomalies described above. There are, however, a few complex
cases in which a table in Second Normal Form still suffers update anomalies, and Third Normal Form exists to handle
those scenarios.
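The 2NF split described above can be sketched as follows. This is a minimal Python illustration, assuming the 1NF rows from the example; the function name `decompose_to_2nf` and the choice of a dict plus a list are not from the original text.

```python
# Rows of the 1NF table from the example: (Student, Age, Subject).
ROWS_1NF = [
    ("Ram", 15, "Biology"),
    ("Ram", 15, "Maths"),
    ("Shyam", 14, "Maths"),
    ("Hari", 17, "Maths"),
]

def decompose_to_2nf(rows):
    """Split (Student, Age, Subject) rows into a Student table and a Subject table."""
    student = {}   # Student -> Age, because Age depends on Student alone
    subject = []   # (Student, Subject) pairs, i.e. the full composite key
    for name, age, subj in rows:
        # If Age depends only on Student, every row of one student must agree.
        if student.setdefault(name, age) != age:
            raise ValueError(f"conflicting ages for {name}")
        subject.append((name, subj))
    return student, subject

student_table, subject_table = decompose_to_2nf(ROWS_1NF)
```

After the split, Ram's age is stored once, so changing it touches a single entry instead of one row per subject.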
Third Normal Form (3NF):
Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key: any
transitive functional dependency must be removed from the table. The table must also be in Second Normal Form.
For example, consider a table with the following fields.
Student_Detail Table:
Student_id
Student_name
DOB
Street
City
State
Zip
In the above table, Student_id is the primary key, but Street, City and State depend upon Zip. The dependency
of these fields on Student_id through Zip is called a transitive dependency. Hence, to apply 3NF, we need to move
Street, City and State to a new table, with Zip as its primary key.
New Student_Detail Table:
Student_id
Student_name
DOB
Zip
Address Table:
Zip
Street
City
State
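The two 3NF tables can be tried out with Python's built-in sqlite3 module. The sketch below follows the table and column names of the example; the sample rows are hypothetical, and SQLite is used only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (
        zip    TEXT PRIMARY KEY,
        street TEXT,
        city   TEXT,
        state  TEXT
    );
    CREATE TABLE student_detail (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT,
        dob          TEXT,
        zip          TEXT REFERENCES address(zip)
    );
""")

# Hypothetical sample data: two students sharing one Zip.
conn.execute("INSERT INTO address VALUES ('560001', 'MG Road', 'Bengaluru', 'Karnataka')")
conn.executemany(
    "INSERT INTO student_detail VALUES (?, ?, ?, ?)",
    [(1, "Ram", "2009-04-01", "560001"), (2, "Shyam", "2010-06-15", "560001")],
)

# The full address is recovered with a join on Zip.
rows = conn.execute("""
    SELECT s.student_name, a.city, a.state
    FROM student_detail AS s
    JOIN address AS a ON a.zip = s.zip
    ORDER BY s.student_id
""").fetchall()
```

Because City and State are stored once per Zip, correcting them requires a single update in the address table, however many students share that Zip.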
Atomicity states that database modifications must follow an all or nothing rule. Each transaction is said
to be atomic. If one part of the transaction fails, the entire transaction fails. It is critical that the database
management system maintain the atomic nature of transactions in spite of any DBMS, operating system or
hardware failure.
Consistency states that only valid data will be written to the database. If, for some reason, a transaction is
executed that violates the database's consistency rules, the entire transaction will be rolled back and the database
will be restored to a state consistent with those rules. On the other hand, if a transaction successfully executes, it
will take the database from one state that is consistent with the rules to another state that is also consistent with
the rules.
Isolation requires that multiple transactions occurring at the same time not impact each other's execution.
For example, if Joe issues a transaction against a database at the same time that Mary issues a different
transaction, both transactions should operate on the database in an isolated manner. The database should either
perform Joe's entire transaction before executing Mary's, or vice versa. This prevents Joe's transaction from
reading intermediate data produced as a side effect of part of Mary's transaction that will not eventually be
committed to the database. Note that the isolation property does not ensure which transaction will execute first,
merely that they will not interfere with each other.
Durability ensures that any transaction committed to the database will not be lost. Durability is ensured
through the use of database backups and transaction logs that facilitate the restoration of committed transactions
in spite of any subsequent software or hardware failures.
Take a few minutes to review these characteristics and commit them to memory. If you spend any significant
portion of your career working with databases, you'll see them again and again. They provide the basic building
blocks of any database transaction model.
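Atomicity and consistency can be observed with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module; the account table, the CHECK rule and the `transfer` helper are assumptions made for illustration, and other database systems expose transactions through different APIs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (assumed for this sketch): a balance may never go negative.
conn.execute(
    "CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Joe", 100), ("Mary", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a dict of owner -> balance."""
    return dict(conn.execute("SELECT owner, balance FROM account"))
```

A transfer that would overdraw the source account fails on the second update, and the rollback also undoes the credit already applied by the first update, so the balances stay exactly as they were.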
Data processing
What are the steps that may occur in systematic data processing?
Any data processing system may use all or a subset of the activities given below:
1. Collection of data
2. Recording of data
3. Sorting of data
4. Data classification
5. Calculation
6. Retrieval of data
7. Summarizing
8. Communicating
Batch processing:
Real-time processing:
Real-time processing runs in parallel with an ongoing activity, and the information it produces is used to
control that activity.
In the traditional approach, information is stored in flat files which are maintained by the file system
under the operating system's control.
Application programs go through the file system in order to access these flat files.
How data is stored in flat files
Records consist of various fields which are delimited by a space, comma, pipe or other special
character.
The end of each record and the end of the file are marked using a predetermined character set or special
characters so that they can be identified.
Example: Storing employee data in flat files
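The sketch below shows one way an application program could read such a file. It assumes a hypothetical pipe-delimited layout with one record per line; the sample values, the field list and the helper `read_records` are illustrative only.

```python
import csv
import io

# Hypothetical flat-file contents: fields separated by '|', records by newlines.
FLAT_FILE = "101|Asha|Rao|45000\n102|Ravi|Nair|9000\n"

# The program itself must know the field order; the file does not describe it.
FIELDS = ["employee_id", "firstname", "lastname", "salary_amount"]

def read_records(text, delimiter="|"):
    """Parse delimited text into a list of dicts keyed by FIELDS."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(FIELDS, row)) for row in reader if row]

records = read_records(FLAT_FILE)
```

Both the delimiter and the field order are hard-coded in the program, so any change to the file layout forces a change to the code that reads it.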
1. Data Security
The data stored in flat files is easily accessible and hence not secure.
Example: Consider an online banking application where we store the account related information of all
customers in flat files. A customer should have access only to his or her own account details. However, with flat
files it is difficult to enforce such constraints. This is a big security issue.
2. Data Redundancy
In this storage model, the same information may get duplicated in two or more files. This may lead to
higher storage and access costs. It may also lead to data inconsistency.
For example, assume the same data is repeated in two or more files. If a change is made to data stored in
one file, the other files also need to be changed accordingly.
Example: Assume employee details such as firstname, lastname and emailid are stored in both the employee_details
file and the employee_salary file. If a change needs to be made to emailid, both the employee_details file and the
employee_salary file need to be updated; otherwise the data will become inconsistent.
However, it is possible to design file systems with minimal redundancy. Also note that data redundancy is
sometimes preferred.
Example: Assume employee details such as firstname, lastname and emailid are stored only in the
employee_details file and not in the employee_salary file. If we need to access an employee's salary along with
the firstname of the employee, we have to retrieve details from two files. This would mean an increased
overhead.
3. Data Isolation
Data isolation means that the related data is not all available in one file. Usually the data is scattered across
various files having different formats. Hence writing new application programs to retrieve the appropriate
data is difficult.
4. Program/Data Dependence
In the traditional file approach, application programs are closely dependent on the files in which data is
stored. If we make any change in the physical format of the file(s), such as adding a data field, all
application programs need to be changed accordingly. Consequently, for each of the application
programs that a programmer writes or maintains, the programmer must be concerned with data
management. There is no centralized execution of the data management functions. Data management is
scattered among all the application programs.
Example: Consider the banking system. An employee_salary file exists which has details about the salary
of employees. An employee_salary record is described by
employee_id
firstname
lastname
salary_amount
An application program is available to display all the details about the salary of all employees. Assume a
new data field, the date_of_joining is added to the employee_salary file. Since the application program
depends on the file, it also needs to be altered.
If the physical format of the employee_salary file (for example the field delimiter or the record delimiter) is
changed, the application program which depends on it must also be altered.
5. Lack of Flexibility
Traditional systems are able to retrieve information only for predetermined requests for data. If we need
unanticipated data, a huge programming effort is needed to make the information available, provided the
information is there in the files. By the time the information is made available, it may no longer be
required or useful.
Example: Consider a software application which is able to generate an employee salary report. Assume that
all the data is stored in flat files. Suppose we now have a requirement to retrieve the details of all employees
whose salary is greater than Rs. 10000. It is not easy to generate such on-demand reports, and a lot of time is
needed for application developers to modify the application to meet such requirements.
6.
Many traditional systems allow multiple users to access and update the same piece of data
simultaneously. However, these concurrent updates may result in inconsistent data. To guard against this
possibility, the system must maintain some form of supervision. But supervision is difficult because data
may be accessed by many different application programs, and these application programs may not have
been coordinated previously.
Example: Consider a personal information system which has the data of all employees. An employee may be
updating his address details in the system while, at the same time, an administrator is
taking a report containing the data of all employees. This is called concurrent access. Since the employee's
address is being updated at the same time, there is a possibility of the administrator reading an incorrect
address.
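For contrast with the flat-file limitations listed above, an on-demand request such as the salary report in the "Lack of Flexibility" example becomes a single declarative query once the data is held in a DBMS. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents and the function name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_salary (
        employee_id   INTEGER PRIMARY KEY,
        firstname     TEXT,
        lastname      TEXT,
        salary_amount INTEGER
    )
""")
# Hypothetical sample rows.
conn.executemany("INSERT INTO employee_salary VALUES (?, ?, ?, ?)", [
    (101, "Asha", "Rao", 45000),
    (102, "Ravi", "Nair", 9000),
    (103, "Mira", "Das", 12000),
])

def employees_earning_more_than(conn, threshold):
    """Return (employee_id, firstname) for salaries above `threshold`."""
    cur = conn.execute(
        "SELECT employee_id, firstname FROM employee_salary "
        "WHERE salary_amount > ? ORDER BY employee_id",
        (threshold,),
    )
    return cur.fetchall()
```

A new reporting requirement changes only the query text; the stored data and the code that loads it stay the same.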
What is a Database?
A database is a computer-based record keeping system which is used to record, maintain and retrieve data.
It is an organized collection of interrelated (persistent) data.
For interacting with the DBMS we use a query language called Structured Query Language (SQL).
General Block Diagram
Types of Databases
1. Centralized Database
In a centralized database system, all data is stored at a single site. This offers great control over accessing and
updating data. However, the chance of failure is high because the system depends on the availability of
resources at the central site.
Example: Think about a banking application which uses a centralized database. In this case the data is
stored in a common place. Applications running in various banks may communicate with the common
database over a network to access or insert/update/delete information.
2. Distributed Database
In a distributed database system, the database is stored on several computers. Computers in a distributed
system may communicate with one another through the internet, an intranet, telephone lines, etc. Most
distributed systems are geographically separated and separately managed. Distributed databases can also
be separately administered.
Example: Think about a banking application which uses a distributed database. The bank's head office may be
in India whereas branch offices may be in the United States and the United Kingdom. In this case the bank
database can be distributed across the branch offices and the head office, while the individual offices are
connected through a network.
1. Data management
2. Data definition
3. Transaction support
4. Concurrency control
5. Recovery
8. User management
9. Backup
10. Performance analysis
11. Logging
12. Audit
1. Database Administrator (DBA)
As the name suggests, the DBA takes care of the administrative tasks of the DBMS. The DBA's major responsibilities
are given below.
Management of information
Advantages of a DBMS
1. Data independence
3. Increased security
4. Better flexibility
An entity is anything that has an independent existence and about which we collect data. It is also known as an
entity type.
In ER modeling, notation for entity is given below.
Entity instance
Entity instance is a particular member of the entity type.
Example for entity instance : A particular employee
Regular Entity
An entity which has its own key attribute is a regular entity.
Example for regular entity : Employee.
Weak entity
An entity which depends on another entity for its existence and doesn't have any key attribute of its own is a
weak entity.
Example for a weak entity: In a parent/child relationship, the parent is considered a strong entity and
the child is a weak entity.
In ER modeling, notation for weak entity is given below.
Attributes
Properties/characteristics which describe entities are called attributes.
In ER modeling, notation for attribute is given below.
Domain of Attributes
The set of possible values that an attribute can take is called the domain of the attribute. For example, the
attribute day may take any value from the set {Monday, Tuesday ... Friday}. Hence this set can be termed
as the domain of the attribute day.
Key attribute
The attribute (or combination of attributes) which is unique for every entity instance is called key
attribute.
E.g. the employee_id of an employee, the pan_card_number of a person, etc. If the key attribute consists of two
or more attributes in combination, it is called a composite key.
In ER modeling, notation for key attribute is given below.
Simple attribute
If an attribute cannot be divided into simpler components, it is a simple attribute.
Example for simple attribute : employee_id of an employee.
Composite attribute
If an attribute can be split into components, it is called a composite attribute.
Example for composite attribute : Name of the employee which can be split into First_name,
Middle_name, and Last_name.
Single valued Attributes
If an attribute can take only a single value for each entity instance, it is a single valued attribute.
Example for single valued attribute: age of a student. It can take only one value for a particular student.
Multi-valued Attributes
If an attribute can take more than one value for each entity instance, it is a multi-valued attribute.
Example for multi-valued attribute: telephone number of an employee. A particular employee may have
multiple telephone numbers.
In ER modeling, notation for multi-valued attribute is given below.
Stored Attribute
An attribute which needs to be stored permanently is a stored attribute.
Example for stored attribute : name of a student
Derived Attribute
An attribute which can be calculated or derived based on other attributes is a derived attribute.
Example for derived attribute : age of employee which can be calculated from date of birth and current
date.
In ER modeling, notation for derived attribute is given below.
Relationships
Associations between entities are called relationships.
Example: An employee works for an organization. Here "works for" is a relationship between the entities
employee and organization.
In ER modeling, notation for relationship is given below.
However, in ER modeling, to connect a weak entity with others, you should use the weak relationship
notation as given below.
Degree of a Relationship
Degree of a relationship is the number of entity types involved. The n-ary relationship is the general form
for degree n. Special cases are unary, binary and ternary, where the degree is 1, 2 and 3, respectively.
Example for unary relationship: An employee is a manager of another employee.
Example for binary relationship: An employee works for a department.
Example for ternary relationship: A customer purchases an item from a shopkeeper.
Cardinality of a Relationship
Relationship cardinalities specify how many instances of each entity type are allowed to take part in a
relationship. Relationships can have four possible connectivities as given below.
One employee is assigned with only one parking space and one parking space is assigned to only one
employee. Hence it is a 1:1 relationship and cardinality is One-To-One (1:1)
In ER modeling, this can be mentioned using notations as given below
One organization can have many employees, but one employee works in only one organization. Hence it
is a 1:N relationship and cardinality is One-To-Many (1:N)
In ER modeling, this can be mentioned using notations as given below
One employee works in only one organization, but one organization can have many employees. Hence it is
a M:1 relationship and cardinality is Many-to-One (M:1)
In ER modeling, this can be mentioned using notations as given below.
One student can enroll for many courses and one course can be enrolled by many students. Hence it is a
M:N relationship and cardinality is Many-to-Many (M:N)
In ER modeling, this can be mentioned using notations as given below
Relationship Participation
1. Total
In total participation, every entity instance will be connected through the relationship to another instance
of the other participating entity types
2. Partial
Example for relationship participation
Consider the relationship - Employee is head of the department.
Here not every employee will be the head of the department. Only one employee will be the head of the
department. In other words, only a few instances of the employee entity participate in the above relationship.
So the employee entity's participation is partial in the said relationship.
However, each department will be headed by some employee. So the department entity's participation is total
in the said relationship.
ER modeling is simple and easily understandable. It is represented in the business user's language and
can be understood by non-technical specialists.
Physical design derived from E-R Model may have some amount of ambiguities or inconsistency.
1. Department
2. Course
3. Instructor
4. Student
Step 2: Identify the relationships
1. One department offers many courses, but one particular course can be offered by only one
department. Hence the cardinality between department and course is One to Many (1:N)
2. One department has multiple instructors, but an instructor belongs to only one department. Hence
the cardinality between department and instructor is One to Many (1:N)
3. One department has only one head, and one head can be the head of only one department. Hence
the cardinality is One to One (1:1)
4. One course can be enrolled in by many students, and one student can enroll for many courses. Hence
the cardinality between course and student is Many to Many (M:N)
5. One course is taught by only one instructor, but one instructor teaches many courses. Hence the
cardinality between course and instructor is Many to One (N:1)
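The cardinalities identified in this step translate fairly directly into table constraints. The following is a rough sketch using Python's built-in sqlite3 module, not the design from the original exercise: all column names are hypothetical, and it assumes that the head of a department is one of the instructors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT);
    -- 1:N department -> instructor: the "many" side holds the foreign key
    CREATE TABLE instructor (
        instructor_id INTEGER PRIMARY KEY,
        name          TEXT,
        department_id INTEGER NOT NULL REFERENCES department(department_id)
    );
    -- 1:1 department <-> head: each side may appear at most once
    -- (assumption: the head is one of the instructors)
    CREATE TABLE department_head (
        department_id INTEGER PRIMARY KEY REFERENCES department(department_id),
        instructor_id INTEGER NOT NULL UNIQUE REFERENCES instructor(instructor_id)
    );
    -- 1:N department -> course, and N:1 course -> instructor
    CREATE TABLE course (
        course_id     INTEGER PRIMARY KEY,
        title         TEXT,
        department_id INTEGER NOT NULL REFERENCES department(department_id),
        instructor_id INTEGER REFERENCES instructor(instructor_id)
    );
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    -- M:N course <-> student: a separate table with a composite key
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
""")

def try_insert(sql, params=()):
    """Run one INSERT; return False if a key or uniqueness constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

The one-to-many relationships need only a foreign key on the "many" side, the one-to-one relationship adds uniqueness on both sides, and the many-to-many relationship needs its own table. A second head for the same department, or a duplicate enrollment, is rejected by the key constraints.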
Step 3: Identify the key attributes
Entity
An entity is a real-world thing, either animate or inanimate, that can be easily identified and distinguished. For example, in a school database, students,
teachers, classes and courses offered can be considered as entities. All entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities whose attributes share similar values. For example, a
Students set may contain all the students of a school; likewise a Teachers set may contain all the teachers of the school from all faculties. Entity
sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have
name, class, age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name cannot be a numeric value. It has
to be alphabetic. A student's age cannot be negative, etc.
TYPES OF ATTRIBUTES:
Simple attribute:
Simple attributes are atomic values, which cannot be divided further. For example, student's phone-number is an atomic value of 10 digits.
Composite attribute:
Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and
last_name.
Derived attribute:
Derived attributes are attributes which do not exist physically in the database; their values are derived from other attributes present in
the database. For example, average_salary in a department should not be saved in the database; instead it can be derived. As another example,
age can be derived from date_of_birth.
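A derived attribute is typically computed at the moment it is needed. The following is a minimal Python sketch of deriving age from a stored date of birth; the function name `age_on` and its parameters are illustrative.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in whole years from the stored date_of_birth attribute."""
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the current year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Storing only date_of_birth keeps the data correct over time, whereas a stored age would have to be updated on every birthday.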
Single-valued attribute:
Multi-valued attribute:
A multi-valued attribute may contain more than one value. For example, a person can have more than one phone number, email address,
etc.
These attribute types can come together in a way like:
Super Key: A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key: A minimal super key is called a candidate key, that is, a super key for which no proper subset is a super key. An entity
set may have more than one candidate key.
Primary Key: This is one of the candidate keys, chosen by the database designer to uniquely identify entities in the entity set.
Relationship
The association among entities is called a relationship. For example, the employee entity has the relationship works_at with department. Another example
is a student who enrolls in some course. Here, Works_at and Enrolls are called relationships.
RELATIONSHIP SET:
Relationships of a similar type form a relationship set. Like entities, a relationship too can have attributes. These attributes are called
descriptive attributes.
DEGREE OF RELATIONSHIP
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
MAPPING CARDINALITIES:
Cardinality defines the number of entities in one entity set that can be associated with entities of the other set via a relationship set.
One-to-one: One entity from entity set A can be associated with at most one entity of entity set B, and vice versa.
One-to-many: One entity from entity set A can be associated with more than one entity of entity set B, but one entity from entity set B
can be associated with at most one entity of entity set A.
Many-to-one: More than one entity from entity set A can be associated with the same entity of entity set B, but one entity from
entity set A can be associated with at most one entity of entity set B.
Many-to-many: One entity from A can be associated with more than one entity from B, and vice versa.
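The four mapping cardinalities can be illustrated by classifying a set of associated pairs. The following is a small Python sketch; the function name and the sample pairs are hypothetical. A cardinality is really a constraint on all permitted data, so the function reports only the tightest category consistent with the sample it is given.

```python
from collections import defaultdict

def classify_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs of associated entities."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())   # some A linked to several B
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())  # some B linked to several A
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"
```

For instance, pairs in which one student appears with two courses and one course appears with two students are classified as many-to-many.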
Entity-Relationship Model
The Entity-Relationship model is based on the notion of real-world entities and the relationships among them. When a real-world scenario is formulated
as a database model, the ER Model creates entity sets, relationship sets, general attributes and constraints.
The ER Model is best used for the conceptual design of a database.
ER Model is based on:
[Image: ER Model]
Entity
An entity in the ER Model is a real-world entity which has some properties called attributes. Every attribute is defined by its set of values,
called its domain.
For example, in a school database, a student is considered as an entity. A student has various attributes like name, age and class.
Relationship
The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities
define the number of associations between two entities.
Mapping cardinalities:
one to one
one to many
many to one
many to many
Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model than the others. This model is based on first-order
predicate logic and defines a table as an n-ary relation.
ER Diagram Representation
Now we shall learn how the ER Model is represented by means of an ER diagram. Every object, such as an entity, the attributes of an entity, a relationship set,
and the attributes of a relationship set, can be represented with the tools of the ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
Attributes
Attributes are properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly
connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is then connected to its attribute. That is,
composite attributes are represented by ellipses that are connected to an ellipse.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written in the diamond box. All entities (rectangles)
participating in the relationship are connected to it by a line.
One-to-one
When only one instance of an entity is associated with the relationship, it is marked as '1'. The image below shows that only one instance of each
entity should be associated with the relationship. It depicts a one-to-one relationship.
[Image: One-to-one]
One-to-many
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. The image below shows that only one instance of the
entity on the left and more than one instance of the entity on the right can be associated with the relationship. It depicts a one-to-many relationship.
[Image: One-to-many]
Many-to-one
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. The image below shows that more than one
instance of the entity on the left and only one instance of the entity on the right can be associated with the relationship. It depicts a many-to-one
relationship.
[Image: Many-to-one]
Many-to-many
The image below shows that more than one instance of the entity on the left and more than one instance of the entity on the right can be associated
with the relationship. It depicts a many-to-many relationship.
[Image: Many-to-many]
PARTICIPATION CONSTRAINTS
Total participation: Each entity in the entity set is involved in the relationship. Total participation is represented by double lines.
Partial participation: Not all entities are involved in the relationship. Partial participation is represented by a single line.
Generalization Aggregation
The ER Model can express database entities in a conceptual hierarchy: going up the hierarchy generalizes the view of the entities, and
going deeper into the hierarchy gives the detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a more generalized view. For example, a
particular student named Mira can be generalized along with all the students; the entity is then student, and further, a student is a person. The
reverse is called specialization, where a person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the properties of all the entities it generalizes,
is called Generalization. In generalization, a number of entities are brought together into one generalized entity based on their similar
characteristics. For example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
[Image: Generalization]
Specialization
Specialization is the opposite process to generalization, as mentioned above. In specialization, a group of entities is divided into
subgroups based on their characteristics. Take a group Person, for example. A person has a name, date of birth, gender, etc. These properties are
common to all persons. But in a company, a person can be identified as an employee, employer, customer or vendor based on
the role they play in the company.
[Image: Specialization]
Similarly, in a school database, a person can be specialized as a teacher, student or staff member, based on the role they play in the school.
Inheritance
All of the above features of the ER model are used to create classes of objects in object-oriented programming. This makes it easier for the
programmer to concentrate on what she is programming. Details of entities are generally hidden from the user; this process is known as
abstraction.
One of the important features of generalization and specialization is inheritance, that is, the attributes of higher-level entities are inherited by
the lower-level entities.
[Image: Inheritance]
For example, attributes of a person such as name, age, and gender can be inherited by lower-level entities such as student and teacher.
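The inheritance idea above can be sketched in Python. This is only an illustration, not part of the ER notation itself: the class names follow the example in the text, while the extra attributes `roll_number` and `subject` and the sample values are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Attributes of the higher-level entity
    name: str
    age: int
    gender: str


@dataclass
class Student(Person):
    # Hypothetical attribute that only the lower-level entity has
    roll_number: int


@dataclass
class Teacher(Person):
    # Hypothetical attribute that only the lower-level entity has
    subject: str


# Student inherits name, age and gender from Person
mira = Student(name="Mira", age=20, gender="F", roll_number=7)
print(mira.name, isinstance(mira, Person))
```

Here `Student` and `Teacher` never declare `name`, `age` or `gender` themselves; they receive them from `Person`, which mirrors how lower-level entities inherit the attributes of higher-level ones.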
Concepts
Tables: In the relational data model, relations are saved in the format of tables. This format stores the relation among entities. A table has rows
and columns, where rows represent records and columns represent attributes.
Tuple: A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance: A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have
duplicate tuples.
Relation schema: This describes the relation name (table name), its attributes and their names.
Relation key: One or more attributes that can identify a row in the relation (table) uniquely are called the relation key.
Attribute domain: Every attribute has a pre-defined scope of values, known as the attribute domain.
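The terms above can be mapped onto a small Python sketch. The relation name `Book`, its attributes and the sample rows are invented for illustration; the sketch assumes a relation instance is modelled as a Python set of named tuples.

```python
from typing import NamedTuple


class Book(NamedTuple):
    # Relation schema: the relation name "Book" plus attribute names.
    # The type hints play the role of attribute domains.
    title: str
    subject: str
    price: int


# Relation instance: a finite set of tuples. A set cannot hold duplicates,
# so the repeated tuple below is stored only once.
books = {
    Book("DB Basics", "database", 450),
    Book("Intro to OS", "operating systems", 300),
    Book("DB Basics", "database", 450),
}

print(len(books))     # number of distinct tuples
print(Book._fields)   # attribute names of the schema
```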
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are called relational integrity constraints.
There are three main integrity constraints:
Key constraints
Domain constraints
Referential integrity constraints
KEY CONSTRAINTS:
There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. This minimal subset of attributes is
called the key for that relation. If there is more than one such minimal subset, they are called candidate keys.
The key constraint enforces that:
in a relation with a key attribute, no two tuples can have identical values for the key attributes.
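As a rough illustration of the key constraint, the following sketch checks whether a chosen set of attributes identifies every row uniquely. The function name `satisfies_key` and the sample `employees` data are assumptions for the example, and rows are modelled as dictionaries.

```python
def satisfies_key(rows, key_attrs):
    """Return True if no two rows agree on all attributes in key_attrs."""
    seen = set()
    for row in rows:
        key_value = tuple(row[a] for a in key_attrs)
        if key_value in seen:
            return False  # two tuples share the same key value
        seen.add(key_value)
    return True


# Hypothetical sample relation
employees = [
    {"id": 1, "name": "Asha", "dept": "HR"},
    {"id": 2, "name": "Ravi", "dept": "HR"},
    {"id": 3, "name": "Asha", "dept": "IT"},
]

print(satisfies_key(employees, ["id"]))            # id is unique here
print(satisfies_key(employees, ["name"]))          # "Asha" appears twice
print(satisfies_key(employees, ["name", "dept"]))  # the pair is unique here
```

Note that a check like this only shows that a key holds for the current data. Whether an attribute set is really a key is a design decision about all valid instances of the relation.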
DOMAIN CONSTRAINTS
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The same constraints are
applied to the attributes of a relation. Every attribute is bound to a specific range of values. For example, age cannot be less than zero
and a telephone number cannot contain characters outside 0-9.
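The two domain rules mentioned above could be checked with code along these lines. This is a minimal sketch: the function names are hypothetical, and it assumes ages are stored as integers and telephone numbers as strings of digits.

```python
def valid_age(age):
    # Domain of age: integers that are not less than zero
    # (bool is a subclass of int in Python, so it is excluded explicitly)
    return isinstance(age, int) and not isinstance(age, bool) and age >= 0


def valid_phone(phone):
    # Domain of phone: non-empty strings made only of the digits 0-9
    return (
        isinstance(phone, str)
        and phone != ""
        and all(ch in "0123456789" for ch in phone)
    )


print(valid_age(30), valid_age(-1))
print(valid_phone("5551234"), valid_phone("555-1234"))
```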
Relational Algebra
Relational database systems are expected to come with a query language that helps users query the database instances and retrieve
the results they need. There are two kinds of formal query languages: relational algebra and
relational calculus.
Relational algebra
Relational algebra is a procedural query language that takes instances of relations as input and yields instances of relations as output. It
uses operators to perform queries. An operator can be either unary or binary. Operators accept relations as their input and yield relations as their
output. Relational algebra operations are applied recursively on a relation, and intermediate results are also considered relations.
Fundamental operations of Relational algebra:
Select
Project
Union
Set difference
Cartesian product
Rename
These are defined briefly as follows:
Select Operation (σ)
Selects tuples that satisfy the given predicate from a relation.
Notation: σ p (r)
Where p stands for the selection predicate and r stands for the relation. p is a propositional logic formula which may use connectors like and, or and
not. These terms may use relational operators like: =, ≠, ≥, <, >, ≤.
For example:
σ subject="database" (Books)
Output: Selects tuples from Books where subject is 'database'.
σ subject="database" and price="450" (Books)
Output: Selects tuples from Books where subject is 'database' and price is 450.
σ subject="database" and price="450" or year>"2010" (Books)
Output: Selects tuples from Books where subject is 'database' and price is 450, or where the publication year is greater than 2010, that is, published
after 2010.
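The select operation can be imitated in Python by filtering a list of rows with a predicate. This is a sketch under assumptions: the helper name `select`, the dictionary representation of tuples and the sample `books` rows are all made up for illustration.

```python
def select(relation, predicate):
    """Keep the tuples of `relation` for which `predicate` is true."""
    return [t for t in relation if predicate(t)]


# Hypothetical sample relation
books = [
    {"title": "A", "subject": "database", "price": 450, "year": 2008},
    {"title": "B", "subject": "database", "price": 600, "year": 2012},
    {"title": "C", "subject": "networks", "price": 450, "year": 2015},
]

# subject = "database"
db_books = select(books, lambda t: t["subject"] == "database")

# subject = "database" and price = 450
cheap_db_books = select(
    books, lambda t: t["subject"] == "database" and t["price"] == 450
)

# subject = "database" and price = 450, or year > 2010
# (in Python, `and` binds tighter than `or`, matching the reading in the text)
third = select(
    books,
    lambda t: t["subject"] == "database" and t["price"] == 450 or t["year"] > 2010,
)

print([t["title"] for t in db_books])
print([t["title"] for t in cheap_db_books])
print([t["title"] for t in third])
```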
Project Operation (∏)
Projects the listed column(s) from a relation.
Notation: ∏ A1, A2, ..., An (r)
Where A1, A2, ..., An are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
For example:
∏ subject, author (Books)
Selects and projects the columns named subject and author from the relation Books.
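A possible Python sketch of projection follows. The helper name `project` and the sample rows are assumptions; the point it illustrates is that keeping only some columns can create duplicate rows, which are then eliminated.

```python
def project(relation, attrs):
    """Keep only the attributes in `attrs`, dropping duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # a relation is a set, so skip repeats
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result


# Hypothetical sample relation
books = [
    {"title": "A", "subject": "database", "author": "X"},
    {"title": "B", "subject": "database", "author": "X"},
    {"title": "C", "subject": "networks", "author": "Y"},
]

subject_author = project(books, ["subject", "author"])
print(subject_author)
```

Books A and B differ only in their title, so after projecting onto subject and author they collapse into a single row.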
Union Operation (∪)
The union operation performs a binary union between two given relations and is defined as:
r ∪ s = { t | t ∈ r or t ∈ s }
Notation: r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold:
r and s must have the same number of attributes.
The attribute domains must be compatible.
For example:
∏ author (Books) ∪ ∏ author (Articles)
Output: Projects the names of authors who have written a book, an article, or both.
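The union of two author projections can be sketched with Python sets. The function name `union`, the arity check and the sample author names are assumptions for the example; relations are modelled as sets of equal-length tuples.

```python
def union(r, s):
    """Union of two relations given as sets of tuples."""
    # Relations with different numbers of attributes cannot be combined
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations do not have the same number of attributes")
    return r | s


# Hypothetical results of projecting `author` from Books and from Articles
book_authors = {("Asha",), ("Ravi",)}
article_authors = {("Ravi",), ("Meera",)}

either = union(book_authors, article_authors)
print(sorted(either))
```

"Ravi" wrote both a book and an article but appears only once in the result, because the result is again a set.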
Set Difference (−)
The result of the set difference operation is the tuples that are present in one relation but not in the second relation.
Notation: r − s
Finds all tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output: Returns the names of authors who have written books but not articles.
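Set difference maps directly onto Python sets as well. In this sketch the function name `difference` and the sample author names are invented for illustration.

```python
def difference(r, s):
    """Tuples that are present in r but not in s."""
    return {t for t in r if t not in s}


# Hypothetical results of projecting `author` from Books and from Articles
book_authors = {("Asha",), ("Ravi",)}
article_authors = {("Ravi",), ("Meera",)}

books_only = difference(book_authors, article_authors)
print(books_only)
```

Unlike union, the operation is not symmetric: `difference(article_authors, book_authors)` would instead give the authors who wrote articles but no books.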
Cartesian Product (×)
Combines information from two different relations into one.
Notation: r × s
Where r and s are relations, and their output is defined as:
r × s = { q t | q ∈ r and t ∈ s }
σ author = 'tutorialspoint' (Books × Articles)
Output: Yields a relation that shows all books and articles written by tutorialspoint.
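A minimal sketch of the Cartesian product followed by a selection is given below. The sample rows and the helper name `cartesian_product` are assumptions. Since both relations have an author column, the sketch takes one possible reading of the example and requires both the book author and the article author to be 'tutorialspoint'.

```python
from itertools import product


def cartesian_product(r, s):
    """Every tuple of r concatenated with every tuple of s."""
    return {q + t for q, t in product(r, s)}


# Hypothetical relations with columns (title, author)
books = {("DB Basics", "tutorialspoint"), ("OS Notes", "someone")}
articles = {("Indexing", "tutorialspoint"), ("Caching", "other")}

# Each result tuple is (book_title, book_author, article_title, article_author)
pairs = cartesian_product(books, articles)

# Selection applied on top of the product
by_tp = {p for p in pairs if p[1] == "tutorialspoint" and p[3] == "tutorialspoint"}

print(len(pairs))
print(by_tp)
```

The product has 2 × 2 = 4 tuples, and the selection then keeps only the combinations that satisfy the predicate.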
Rename Operation (ρ)
Results of relational algebra are also relations, but without any name. The rename operation allows us to name the output relation. The rename
operation is denoted by the small Greek letter rho (ρ).
Notation: ρ x (E)
Where the result of expression E is saved with the name x.
Additional operations are:
Set intersection
Assignment
Natural join
Relational Calculus
In contrast with relational algebra, relational calculus is a non-procedural query language; that is, it tells what to do but never explains
how to do it.
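Relational calculus exists in two forms:
Tuple relational calculus (TRC)
Domain relational calculus (DRC)
In TRC, the query variable ranges over tuples. For example:
{ T.name | Author(T) AND T.article = 'database' }
Output: Returns tuples with 'name' from Author who has written an article on 'database'.
TRC can also be quantified. We can use existential (∃) and universal (∀) quantifiers.
For example:
{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
Output: The query yields the same result as the previous one.
In DRC, the query variable ranges over attribute domains. For example:
{ <article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database' }
Output: Yields article, page and subject from the relation TutorialsPoint where subject is 'database'.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also involves relational operators.
The expressive power of tuple relational calculus and domain relational calculus is equivalent to that of relational algebra.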
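A Python set comprehension is a convenient analogy for the declarative style of relational calculus: it states which values are wanted, not the steps to fetch them. The sketch below expresses the query described above (names of authors who have written an article on 'database'); the sample `authors` rows are invented for illustration.

```python
# Hypothetical Author relation with attributes name and article
authors = [
    {"name": "Asha", "article": "database"},
    {"name": "Ravi", "article": "networks"},
    {"name": "Meera", "article": "database"},
]

# Tuple-style query: the variable t ranges over whole tuples
names_trc = {t["name"] for t in authors if t["article"] == "database"}

# Domain-style query: the variables name and article range over attribute values
names_drc = {
    name
    for name, article in ((t["name"], t["article"]) for t in authors)
    if article == "database"
}

print(sorted(names_trc))
print(names_trc == names_drc)
```

Both comprehensions describe the same result set, which matches the statement that the two forms of relational calculus have the same expressive power.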
Functional Dependency
Functional dependency is the starting point for the process of normalization. A functional dependency exists when the
value of one attribute (or set of attributes) uniquely determines the value of another attribute. If X is known,
and as a result you are able to uniquely identify Y, then Y is functionally dependent on X. Combined with keys, normal forms
are defined for relations.
Examples
Bear Number determines Student Name:
BearNum ---> StuName
Department Number and Job Rank determine Security Clearance:
(DeptNum, JRank) ---> SecClear
Social Security Number determines Employee Name and Salary:
SSN ---> (EmpName, Salary)
Additionally, the above can be read as:
SSN ---> EmpName and SSN ---> Salary
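Whether a functional dependency holds in a given set of rows can be tested mechanically. The sketch below is an illustration only: the function name `holds` and the sample `employees` rows (including the placeholder SSN values) are assumptions, and the attribute names are taken from the examples above.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first time x is seen, remember its y; later rows must agree
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows
employees = [
    {"SSN": "111", "EmpName": "Asha", "Salary": 50000},
    {"SSN": "222", "EmpName": "Ravi", "Salary": 50000},
    {"SSN": "111", "EmpName": "Asha", "Salary": 50000},
]

print(holds(employees, ["SSN"], ["EmpName", "Salary"]))  # SSN -> (EmpName, Salary)
print(holds(employees, ["Salary"], ["SSN"]))             # same salary, different SSN
```

A check like this can only refute a dependency from the data. That a dependency holds for every valid instance is a statement about the meaning of the attributes, not about one sample.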
Armstrongs Axioms
William W. Armstrong established a set of rules which can be used to infer the functional dependencies in a relational
database (from umbc.edu - no external linking, Google Database Design UMBC):
Reflexivity rule: If A is a set of attributes, and B is a set of attributes that is completely contained in A,
then A implies B.
Augmentation rule: If A implies B, and C is a set of attributes, then AC implies BC.
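In practice, inference rules like these are usually applied through the attribute closure algorithm, which is a standard technique that the text above does not describe. The sketch below is an assumption-laden illustration: the function name `closure` is made up, the dependencies reuse the earlier examples, and the loop also relies on the transitivity rule (if A implies B and B implies C, then A implies C), which is normally listed as the third of Armstrong's axioms.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, the right side is too
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Dependencies taken from the examples in the text
fds = [
    (("SSN",), ("EmpName", "Salary")),
    (("DeptNum", "JRank"), ("SecClear",)),
]

print(sorted(closure({"SSN"}, fds)))
print(sorted(closure({"DeptNum"}, fds)))
print(sorted(closure({"DeptNum", "JRank"}, fds)))
```

If the closure of a set of attributes contains every attribute of the relation, that set is a superkey, which is how closures connect functional dependencies to the keys used in normalization.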
Normalization
Normalization, as previously mentioned, makes use of functional dependencies that exist in relations and the primary key
or candidate keys when analyzing tables. Multivalued Dependencies are also part of the normalization process, at levels
higher than Third Normal Form.
These rights include:
Privacy is the right of individuals to be left in peace.
Technology and information systems threaten the privacy of individuals by making its invasion cheap, efficient and effective.
Due process requires the existence of a set of rules or laws that clearly define how information about individuals is treated,
and that appeal mechanisms are available.
2) Property rights: How do the classical concepts of patents and intellectual property carry over to
digital technology? What are these rights, and how can they be protected? Information technology has
made property harder to protect, because it is very easy to copy or distribute computer
information over networks. Intellectual property is subject to various protections under three legal forms:
Trade secrets: Any intellectual work product used for business purposes may be classified as secret.
Copyright: A concession granted by law to protect creators of intellectual property against
copying by others for any purpose, for a period of 28 years.
Patents: A patent gives the holder, for 17 years, an exclusive monopoly on the ideas behind an
invention.
3) Responsibility and control: Who is responsible for, and who controls, the use and abuse of
information about people? New information technologies are challenging existing liability laws
and social practices for holding individuals and institutions accountable for their
actions.
4) Quality of systems: What standards of data and information processing should be required
to ensure the protection of individual rights and society? Individuals and
organizations can be held responsible for avoidable and foreseeable consequences that they had an obligation to see and
correct.
5) Quality of life: What values should be preserved and protected in a society based on
information and knowledge? Which institutions should provide protection, and which should be protected? The
negative social costs of introducing information technologies and systems are growing along with
the power of technology. Computers and information technologies can destroy valuable elements
of culture and society, even while providing benefits.
These five dimensions are good guidelines for the ethical questions a company should ask and
answer when introducing a new technology.