Beruflich Dokumente
Kultur Dokumente
BAXENDALE, Editor
A Relational Model of Data for The relational view (or model) of data described in
Section 1 appears to be superior in several respects to the
Large Shared Data Banks graph or network model [3, 4] presently in vogue for non-
inferential systems. It provides a means of describing data
with its natural structure only--that is, without superim-
E. F. CODD posing any additional structure for machine representation
I B M Research Laboratory, San Jose, California purposes. Accordingly, it provides a basis for a high level
data language which will yield maximal independence be-
tween programs on the one hand and machine representa-
Future users of large data banks must be protected from tion and organization of data on the other.
having to know how the data is organized in the machine (the A further advantage of the relational view is that it
internal representation). A prompting service which supplies forms a sound basis for treating derivability, redundancy,
such information is not a satisfactory solution. Activities of users and consistency of relations--these are discussed in Section
at terminals and most application programs should remain 2. The network model, on the other hand, has spawned a
unaffected when the internal representation of data is changed number of confusions, not the least of which is mistaking
and even when some aspects of the external representation the derivation of connections for the derivation of rela-
are changed. Changes in data representation will often be tions (see remarks in Section 2 on the "connection trap").
needed as a result of changes in query, update, and report Finally, the relational view permits a clearer evaluation
traffic and natural growth in the types of stored information. of the scope and logical limitations of present formatted
Existing noninferential, formatted data systems provide users data systems, and also the relative merits (from a logical
with tree-structured files or slightly more general network standpoint) of competing representations of data within a
models of the data. In Section 1, inadequacies of these models single system. Examples of this clearer perspective are
are discussed. A model based on n-ary relations, a normal cited in various parts of this paper. Implementations of
form for data base relations, and the concept of a universal systems to support the relational model are not discussed.
data sublanguage are introduced. In Section 2, certain opera-
1.2. DATA DEPENDENCIES IN PRESENT SYSTEMS
tions on relations (other than logical inference) are discussed
The provision of data description tables in recently de-
and applied to the problems of redundancy and consistency
veloped information systems represents a major advance
in the user's model.
toward the goal of data independence [5, 6, 7]. Such tables
KEY WORDS AND PHRASES: data bank, data base, data structure, data facilitate changing certain characteristics of the data repre-
organization, hierarchies of data, networks of data, relations, derivability, sentation stored in a data bank. However, the variety of
redundancy, consistency, composition, join, retrieval language, predicate
data representation characteristics which can be changed
calculus, security, data integrity
CR CATEGORIES: 3.70, 3.73, 3.75, 4.20, 4.22, 4.29 without logically impairing some application programs is
still quite limited. Further, the model of data with which
users interact is still cluttered with representational prop-
erties, particularly in regard to the representation of col-
lections of data (as opposed to individual items). Three of
the principal kinds of data dependencies which still need
1. Relational Model and Normal Form to be removed are: ordering dependence, indexing depend-
ence, and access path dependence. In some systems these
1.1. INTRODUCTION dependencies are not clearly separable from one another.
This paper is concerned with the application of ele- 1.2.1. Ordering Dependence. Elements of data in a
mentary relation theory to systems which provide shared data bank may be stored in a variety of ways, some involv-
access to large banks of formatted data. Except for a paper ing no concern for ordering, some permitting each element
by Childs [1], the principal application of relations to data to participate in one ordering only, others permitting each
systems has been to deductive question-answering systems. element to participate in several orderings. Let us consider
Levein and Maron [2] provide numerous references to work those existing systems which either require or permit data
in this area. elements to be stored in at least one total ordering which is
In contrast, the problems treated here are those of data closely associated with the hardware-determined ordering
independence--the independence of application programs of addresses. For example, the records of a file concerning
and terminal activities from growth in data types and parts might be stored in ascending order by part serial
changes in data representation--and certain kinds of data number. Such systems normally permit application pro-
inconsistency which are expected to become troublesome grams to assume that the order of presentation of records
even in nondeductive systems. from such a file is identical to (or is a subordering of) the
then we m a y form a three-way join of R, S, T; t h a t is, a Extension of the notions of linear and cyclic 3-join and
ternary relation such t h a t their natural counterparts to the joining of n binary rela-
~(v) = R, ~(U) = S, ~(V) = T. tions (where n > 3) is obvious. A few words m a y be ap-
propriate, however, regarding the joining of relations which
Such a join will be called a cyclic 3-join to distinguish it are not necessarily binary. Consider the case of two rela-
from a linear 3-join which would be a quaternary relation tions R (degree r), S (degree s) which are to be joined on
V such t h a t p of their domains (p < r, p < s). For simplicity, sup-
pose these p domains are the last p of the r domains of R,
~r12(V) = R, ~r~3(V) = Z, 7r34(V) = T.
and the first p of the s domains of S. I f this were not so, we
While it is possible for more than one cyclic 3-join to exist could always apply appropriate permutations to m a k e it
(see Figures 8, 9, for an example), the circumstances under so. Now, take the Cartesian product of the first r-p do-
which this can occur entail much more severe constraints mains of R, and call this new domain A. T a k e the Car-
tesian product of the last p domains of R, and call this B.
R (8 p) s (p i) T ~ 8) T a k e the Cartesian product of the last s-p domains of S
la ad dl and call this C.
2a ae d2 We can t r e a t R as if it were a binary relation on the
2b bd e2 domains A, B. Similarly, we can t r e a t S as if it were a bi-
be e2 n a r y relation on the domains B, C. The notions of linear
FIG. 8. B i n a r y r e l a t i o n s w i t h a p l u r a l i t y o f c y c l i c 3 ~ o i n s and cyclic 3-join are now directly applicable. A similar ap-
proach can be t a k e n with the linear and cyclic n-joins of n
U (8 p j) U' (8 p i) relations of assorted degrees.
2.1.4. Composition. T h e reader is probably familiar
lad lad
2ae 2ad with the notion of composition applied to functions. We
2bd 2ae shall discuss a generalization of t h a t concept and apply it
2be 2bd first to binary relations. Our definitions of composition
2be and composability are based very directly on the definitions
FIG. 9. Two cyclic 3-joins of the relations in Figure 8 of join and joinability given above.
Suppose we are given two relations R, S. T is a com-
t h a n those for a plurality of 2-joins. To be specific, the re- position of R with S if there exists a join U of R with S such
lations R, S, T m u s t possess points of ambiguity with t h a t T = 7913(U). Thus, two relations are composable if
respect to joining R with S (say point x), S with T (say and only if they are j oinable. However, the existence of
more t h a n one join of R with S does not imply the existence
8 A function is a binary relation, which is one-one or many-one, of more t h a n one composition of R with S.
but not one-many. Corresponding to the natural join of R with S is the
[]