SQL Server chapter on optimization

1268 Chapter 30 Microsoft SQL Server
considers for cost-based optimization of subqueries. Additional information on
the self-tuning aspects of SQL Server is discussed by Chaudhuri et al. [1999].
Chaudhuri and Shim [1994] and Yan and Larson [1995] discuss reordering of
aggregation operations.
Chatziantoniou and Ross [1997] and Galindo-Legaria and Joshi [2001] pro-
posed the alternative used by SQL Server for SQL queries requiring a self-join.
Under this scheme, the optimizer detects the pattern and considers per-segment
execution. Pellenkoft et al. [1997] discusses the optimization scheme for generat-
ing the complete search space using a set of transformations that are complete,
local and nonredundant. Graefe et al. [1998] offers discussion concerning hash
operations that support basic aggregation and join, with a number of optimiza-
tions, extensions, and dynamic tuning for data skew. Graefe et al. [1998] presents
the idea of joining indices for the sole purpose of assembling a row with the set of
columns needed on a query. It argues that this sometimes is faster than scanning
a base table.
Blakeley [1996] and Blakeley and Pizzo [2001] offer discussions concerning
communication with the storage engine through OLE-DB. Blakeley et al. [2005] de-
tails the implementation of the distributed and heterogeneous query capabilities
of SQL Server. Acheson et al. [2004] provides details on the integration of the .NET
CLR inside the SQL Server process.
Blakeley et al. [2008] describes the contracts for UDTs, UDAggs, and UDFs
in more detail. Blakeley et al. [2006] describes the ADO.NET Entity Framework.
Melnik et al. [2007] describes the mapping technology behind the ADO.NET En-
tity Framework. Adya et al. [2007] provides an overview of the ADO.NET Entity
Framework architecture. The SQL:2003 standard is defined in SQL/XML [2004].
Rys [2001] provides more details on the SQL Server 2000 XML functionality. Rys
[2004] provides an overview of the extensions to the for xml aggregation. For
information on XML capabilities that can be used on the client side or inside
CLR, refer to the collection of white papers at http://msdn.microsoft.com/XML/Building-
XML/XMLandDatabase/default.aspx. The XQuery 1.0/XPath 2.0 data model is defined
in Walsh et al. [2007]. Rys [2003] provides an overview of implementation tech-
niques for XQuery in the context of relational databases. The OrdPath numbering
scheme is described in O’Neil et al. [2004]; Pal et al. [2004] and Baras et al. [2005]
provide more information on XML indexing and XQuery algebraization and opti-
mization in SQL Server 2005.
PART 10
APPENDICES
Appendix A presents the full details of the university database that we have used
as our running example, including an E-R diagram, SQL DDL, and sample data that
we have used throughout the book. (The DDL and sample data are also available
on the Web site of the book, db-book.com, for use in laboratory exercises.)
The remaining appendices are not part of the printed book, but are available online
on the Web site of the book, db-book.com. These include:
• Appendix B (Advanced Relational Database Design) first covers the theory
of multivalued dependencies; recall that multivalued dependencies were
introduced in Chapter 8. The project-join normal form, which is based on a
type of constraint called a join dependency, is presented next; join dependencies
are a generalization of multivalued dependencies. The appendix concludes
with another normal form called the domain-key normal form.
• Appendix C (Other Relational Query Languages) first presents the relational
query language Query-by-Example (QBE), which was designed to be used by
non-programmers. In QBE, queries look like a collection of tables containing
an example of data to be retrieved. The graphical query language of Microsoft
Access, which is based on QBE, is presented next, followed by the Datalog
language, which has a syntax modeled after the logic-programming language
Prolog.
• Appendix D (Network Model) and Appendix E (Hierarchical Model) cover
the network and hierarchical data models. Both these data models predate
the relational model, and provide a level of abstraction that is lower than the
relational model. They abstract away some, but not all, details of the actual
data structures used to store data on disks. These models are only used in a
few legacy applications.
For Appendices B through E, we illustrate our concepts using a bank enterprise
with the schema shown in Figure 2.15.
APPENDIX A
Detailed University Schema
In this appendix, we present the full details of our running-example university
database. In Section A.1 we present the full schema as used in the text and the E-R
diagram that corresponds to that schema. In Section A.2 we present a relatively
complete SQL data definition for our running university example. Besides listing
a datatype for each attribute, we include a substantial number of constraints.
Finally, in Section A.3 we present sample data that correspond to our schema.
SQL scripts to create all the relations in the schema, and to populate them with
sample data, are available on the Web site of the book, db-book.com.
A.1 Full Schema

The full schema of the University database as used in the text is shown in
Figure A.1. The E-R diagram that corresponds to that schema, and that is used
throughout the text, is shown in Figure A.2.
classroom(building, room_number, capacity)
department(dept_name, building, budget)
course(course_id, title, dept_name, credits)
instructor(ID, name, dept_name, salary)
section(course_id, sec_id, semester, year, building, room_number, time_slot_id)
teaches(ID, course_id, sec_id, semester, year)
student(ID, name, dept_name, tot_cred)
takes(ID, course_id, sec_id, semester, year, grade)
advisor(s_ID, i_ID)
time_slot(time_slot_id, day, start_time, end_time)
prereq(course_id, prereq_id)
Figure A.1 Schema of the University database.
[E-R diagram: entity sets department, instructor, student, course, section,
time_slot, and classroom, connected by the relationship sets course_dept,
inst_dept, stud_dept, advisor, teaches, takes (with attribute grade), prereq,
sec_course, sec_time_slot, and sec_class.]
Figure A.2 E-R diagram for a university enterprise.
A.2 DDL
In this section, we present a relatively complete SQL data definition for our exam-
ple. Besides listing a datatype for each attribute, we include a substantial number
of constraints.
create table classroom
    (building varchar (15),
     room_number varchar (7),
     capacity numeric (4,0),
     primary key (building, room_number));
create table department
    (dept_name varchar (20),
     building varchar (15),
     budget numeric (12,2) check (budget > 0),
     primary key (dept_name));
create table course
    (course_id varchar (8),
     title varchar (50),
     dept_name varchar (20),
     credits numeric (2,0) check (credits > 0),
     primary key (course_id),
     foreign key (dept_name) references department
        on delete set null);
create table instructor
    (ID varchar (5),
     name varchar (20) not null,
     dept_name varchar (20),
     salary numeric (8,2) check (salary > 29000),
     primary key (ID),
     foreign key (dept_name) references department
        on delete set null);
create table section
    (course_id varchar (8),
     sec_id varchar (8),
     semester varchar (6) check (semester in
        ('Fall', 'Winter', 'Spring', 'Summer')),
     year numeric (4,0) check (year > 1701 and year < 2100),
     building varchar (15),
     room_number varchar (7),
     time_slot_id varchar (4),
     primary key (course_id, sec_id, semester, year),
     foreign key (course_id) references course
        on delete cascade,
     foreign key (building, room_number) references classroom
        on delete set null);
In the above DDL we add the on delete cascade specification to a foreign
key constraint if the existence of the tuple depends on the referenced tuple. For
example, we add the on delete cascade specification to the foreign key constraint
from section (which was generated from the weak entity set section) to course (its
identifying entity set). In other foreign key constraints we either specify
on delete set null, which allows deletion of a referenced tuple by setting the
referencing value to null, or do not add any specification, which prevents the
deletion of any referenced tuple. For example, if a department is deleted, we
would not wish to delete associated instructors; the foreign key constraint from
instructor to department instead sets the dept_name attribute to null. On the other
hand, the foreign key constraint for the prereq relation, shown later, prevents the
deletion of a course that is required as a prerequisite for another course. For the
advisor relation, shown later, we allow i_ID to be set to null if an instructor is
deleted, but delete an advisor tuple if the referenced student is deleted.
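The three delete behaviors described above can be tried out on a small scale. The following is a minimal sketch, not part of the book's scripts: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in database, trims each table down to its key columns, and uses a hypothetical helper name, demo_delete_actions. SQLite enforces foreign keys only after pragma foreign_keys = on is issued, so the sketch turns that on first.

```python
import sqlite3

# Cut-down versions of the appendix tables (an assumption for brevity):
# only the keys and the foreign keys needed to show each behavior are kept.
SCHEMA = """
create table department (dept_name varchar(20) primary key);
create table instructor (
    ID varchar(5) primary key,
    dept_name varchar(20),
    foreign key (dept_name) references department on delete set null);
create table student (ID varchar(5) primary key);
create table advisor (
    s_ID varchar(5) primary key,
    i_ID varchar(5),
    foreign key (i_ID) references instructor (ID) on delete set null,
    foreign key (s_ID) references student (ID) on delete cascade);
create table course (course_id varchar(8) primary key);
create table prereq (
    course_id varchar(8),
    prereq_id varchar(8),
    primary key (course_id, prereq_id),
    foreign key (course_id) references course on delete cascade,
    foreign key (prereq_id) references course);
insert into department values ('Physics');
insert into instructor values ('22222', 'Physics');
insert into student values ('44553');
insert into advisor values ('44553', '22222');
insert into course values ('PHY-101');
insert into course values ('EE-181');
insert into prereq values ('EE-181', 'PHY-101');
"""


def demo_delete_actions():
    """Run one delete per foreign-key behavior and report what happened."""
    conn = sqlite3.connect(":memory:")
    conn.execute("pragma foreign_keys = on")  # off by default in SQLite
    conn.executescript(SCHEMA)
    result = {}

    # on delete set null: the instructor survives, dept_name becomes null.
    conn.execute("delete from department where dept_name = 'Physics'")
    result["instructor_dept"] = conn.execute(
        "select dept_name from instructor where ID = '22222'").fetchone()[0]

    # No specification on prereq_id: deleting a prerequisite course is rejected.
    try:
        conn.execute("delete from course where course_id = 'PHY-101'")
        result["prereq_delete_blocked"] = False
    except sqlite3.IntegrityError:
        result["prereq_delete_blocked"] = True

    # on delete set null: the advisor tuple survives with i_ID set to null.
    conn.execute("delete from instructor where ID = '22222'")
    result["advisor_i_ID"] = conn.execute(
        "select i_ID from advisor where s_ID = '44553'").fetchone()[0]

    # on delete cascade: deleting the student deletes the advisor tuple.
    conn.execute("delete from student where ID = '44553'")
    result["advisor_rows"] = conn.execute(
        "select count(*) from advisor").fetchone()[0]

    conn.close()
    return result
```

Calling demo_delete_actions() returns a dictionary with one entry per case; the rejected delete of PHY-101 shows up as an sqlite3.IntegrityError that the sketch catches and records.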
create table teaches
    (ID varchar (5),
     course_id varchar (8),
     sec_id varchar (8),
     semester varchar (6),
     year numeric (4,0),
     primary key (ID, course_id, sec_id, semester, year),
     foreign key (course_id, sec_id, semester, year) references section
        on delete cascade,
     foreign key (ID) references instructor
        on delete cascade);
create table student
    (ID varchar (5),
     name varchar (20) not null,
     dept_name varchar (20),
     tot_cred numeric (3,0) check (tot_cred >= 0),
     primary key (ID),
     foreign key (dept_name) references department
        on delete set null);
create table takes
    (ID varchar (5),
     course_id varchar (8),
     sec_id varchar (8),
     semester varchar (6),
     year numeric (4,0),
     grade varchar (2),
     primary key (ID, course_id, sec_id, semester, year),
     foreign key (course_id, sec_id, semester, year) references section
        on delete cascade,
     foreign key (ID) references student
        on delete cascade);
create table advisor
    (s_ID varchar (5),
     i_ID varchar (5),
     primary key (s_ID),
     foreign key (i_ID) references instructor (ID)
        on delete set null,
     foreign key (s_ID) references student (ID)
        on delete cascade);
create table prereq
    (course_id varchar (8),
     prereq_id varchar (8),
     primary key (course_id, prereq_id),
     foreign key (course_id) references course
        on delete cascade,
     foreign key (prereq_id) references course);
The following create table statement for the table time_slot can be run on most
database systems, but does not work on Oracle (at least as of Oracle version 11),
since Oracle does not support the SQL standard type time.
create table time_slot
    (time_slot_id varchar (4),
     day varchar (1) check (day in ('M', 'T', 'W', 'R', 'F', 'S', 'U')),
     start_time time,
     end_time time,
     primary key (time_slot_id, day, start_time));
The syntax for specifying time in SQL is illustrated by these examples: '08:30',
'13:55', and '5:30 PM'. Since Oracle does not support the time type, for Oracle we
use the following schema instead:
create table time_slot
    (time_slot_id varchar (4),
     day varchar (1),
     start_hr numeric (2) check (start_hr >= 0 and start_hr < 24),
     start_min numeric (2) check (start_min >= 0 and start_min < 60),
     end_hr numeric (2) check (end_hr >= 0 and end_hr < 24),
     end_min numeric (2) check (end_min >= 0 and end_min < 60),
     primary key (time_slot_id, day, start_hr, start_min));
The difference is that start_time has been replaced by two attributes start_hr
and start_min, and similarly end_time has been replaced by attributes end_hr and
end_min. These attributes also have constraints that ensure that only numbers
representing valid time values appear in those attributes. This version of the
schema for time_slot works on all databases, including Oracle. Note that although
Oracle supports the datetime datatype, datetime includes a specific day, month,
and year as well as a time, and is not appropriate here since we want only a
time. There are two alternatives to splitting the time attributes into an hour and
a minute component, but neither is desirable. The first alternative is to use a
varchar type, but that makes it hard to enforce validity constraints on the string
as well as to perform comparison on time. The second alternative is to encode
time as an integer representing a number of minutes (or seconds) from midnight,
but this alternative requires extra code with each query to convert values between
the standard time representation and the integer encoding. We therefore chose
the two-part solution.
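To see how the two-part representation behaves in practice, the following minimal sketch creates the hour-and-minute version of time_slot and runs a few checks against it. It is an illustration under assumptions rather than part of the book's scripts: SQLite (through Python's standard sqlite3 module) stands in for the database, the function names are hypothetical, and only three rows of the sample data are loaded.

```python
import sqlite3


def build_time_slot():
    """Create the hour/minute version of time_slot with three sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        create table time_slot
            (time_slot_id varchar (4),
             day varchar (1),
             start_hr numeric (2) check (start_hr >= 0 and start_hr < 24),
             start_min numeric (2) check (start_min >= 0 and start_min < 60),
             end_hr numeric (2) check (end_hr >= 0 and end_hr < 24),
             end_min numeric (2) check (end_min >= 0 and end_min < 60),
             primary key (time_slot_id, day, start_hr, start_min))
    """)
    conn.executemany(
        "insert into time_slot values (?, ?, ?, ?, ?, ?)",
        [("A", "M", 8, 0, 8, 50),
         ("E", "T", 10, 30, 11, 45),
         ("H", "W", 10, 0, 12, 30)])
    return conn


def rejects_invalid_hour(conn):
    """Return True if the check constraints refuse an hour value of 24."""
    try:
        conn.execute("insert into time_slot values ('Z', 'M', 24, 0, 24, 50)")
    except sqlite3.IntegrityError:
        return True
    return False


def slots_starting_before(conn, hr, minute):
    """Compare times directly on the (hour, minute) pair."""
    rows = conn.execute(
        "select time_slot_id from time_slot"
        " where start_hr < ? or (start_hr = ? and start_min < ?)"
        " order by time_slot_id",
        (hr, hr, minute))
    return [row[0] for row in rows]


def duration_minutes(conn, slot_id, day):
    """Length of a slot, computed with plain arithmetic on the four columns."""
    row = conn.execute(
        "select (end_hr * 60 + end_min) - (start_hr * 60 + start_min)"
        " from time_slot where time_slot_id = ? and day = ?",
        (slot_id, day)).fetchone()
    return row[0]
```

The comparison and the duration are both computed from the numeric columns without any string parsing, and the check constraints reject an out-of-range hour at insert time, which is the validity guarantee that a varchar encoding would make hard to enforce.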
A.3 Sample Data

In this section we provide sample data for each of the relations defined in the
previous section.
building room_number capacity
Packard 101 500
Painter 514 10
Taylor 3128 70
Watson 100 30
Watson 120 50
Figure A.3 The classroom relation.
dept_name   building  budget
Biology     Watson    90000
Comp. Sci.  Taylor    100000
Elec. Eng.  Taylor    85000
Finance     Painter   120000
History     Painter   50000
Music       Packard   80000
Physics     Watson    70000
Figure A.4 The department relation.

course_id  title                       dept_name   credits
BIO-101    Intro. to Biology           Biology     4
BIO-301    Genetics                    Biology     4
BIO-399    Computational Biology       Biology     3
CS-101     Intro. to Computer Science  Comp. Sci.  4
CS-190     Game Design                 Comp. Sci.  4
CS-315     Robotics                    Comp. Sci.  3
CS-319     Image Processing            Comp. Sci.  3
CS-347     Database System Concepts    Comp. Sci.  3
EE-181     Intro. to Digital Systems   Elec. Eng.  3
FIN-201    Investment Banking          Finance     3
HIS-351    World History               History     3
MU-199     Music Video Production      Music       3
PHY-101    Physical Principles         Physics     4
Figure A.5 The course relation.
ID     name        dept_name   salary
10101  Srinivasan  Comp. Sci.  65000
12121  Wu          Finance     90000
15151  Mozart      Music       40000
22222  Einstein    Physics     95000
32343  El Said     History     60000
33456  Gold        Physics     87000
45565  Katz        Comp. Sci.  75000
58583  Califieri   History     62000
76543  Singh       Finance     80000
76766  Crick       Biology     72000
83821  Brandt      Comp. Sci.  92000
98345  Kim         Elec. Eng.  80000
Figure A.6 The instructor relation.

course_id sec_id semester year building room_number time_slot_id
BIO-101 1 Summer 2009 Painter 514 B
BIO-301 1 Summer 2010 Painter 514 A
CS-101 1 Fall 2009 Packard 101 H
CS-101 1 Spring 2010 Packard 101 F
CS-190 1 Spring 2009 Taylor 3128 E
CS-190 2 Spring 2009 Taylor 3128 A
CS-315 1 Spring 2010 Watson 120 D
CS-319 1 Spring 2010 Watson 100 B
CS-319 2 Spring 2010 Taylor 3128 C
CS-347 1 Fall 2009 Taylor 3128 A
EE-181 1 Spring 2009 Taylor 3128 C
FIN-201 1 Spring 2010 Packard 101 B
HIS-351 1 Spring 2010 Painter 514 C
MU-199 1 Spring 2010 Packard 101 D
PHY-101 1 Fall 2009 Watson 100 A
Figure A.7 The section relation.
ID course_id sec_id semester year
10101 CS-101 1 Fall 2009
10101 CS-315 1 Spring 2010
10101 CS-347 1 Fall 2009
12121 FIN-201 1 Spring 2010
15151 MU-199 1 Spring 2010
22222 PHY-101 1 Fall 2009
32343 HIS-351 1 Spring 2010
45565 CS-101 1 Spring 2010
45565 CS-319 1 Spring 2010
76766 BIO-101 1 Summer 2009
76766 BIO-301 1 Summer 2010
83821 CS-190 1 Spring 2009
83821 CS-190 2 Spring 2009
83821 CS-319 2 Spring 2010
98345 EE-181 1 Spring 2009
Figure A.8 The teaches relation.

ID     name      dept_name   tot_cred
00128  Zhang     Comp. Sci.  102
12345  Shankar   Comp. Sci.  32
19991  Brandt    History     80
23121  Chavez    Finance     110
44553  Peltier   Physics     56
45678  Levy      Physics     46
54321  Williams  Comp. Sci.  54
55739  Sanchez   Music       38
70557  Snow      Physics     0
76543  Brown     Comp. Sci.  58
76653  Aoi       Elec. Eng.  60
98765  Bourikas  Elec. Eng.  98
98988  Tanaka    Biology     120
Figure A.9 The student relation.
ID course_id sec_id semester year grade
00128 CS-101 1 Fall 2009 A
00128 CS-347 1 Fall 2009 A-
12345 CS-101 1 Fall 2009 C
12345 CS-190 2 Spring 2009 A
12345 CS-315 1 Spring 2010 A
12345 CS-347 1 Fall 2009 A
19991 HIS-351 1 Spring 2010 B
23121 FIN-201 1 Spring 2010 C+
44553 PHY-101 1 Fall 2009 B-
45678 CS-101 1 Fall 2009 F
45678 CS-101 1 Spring 2010 B+
45678 CS-319 1 Spring 2010 B
54321 CS-101 1 Fall 2009 A-
54321 CS-190 2 Spring 2009 B+
55739 MU-199 1 Spring 2010 A-
76543 CS-101 1 Fall 2009 A
76543 CS-319 2 Spring 2010 A
76653 EE-181 1 Spring 2009 C
98765 CS-101 1 Fall 2009 C-
98765 CS-315 1 Spring 2010 B
98988 BIO-101 1 Summer 2009 A
98988 BIO-301 1 Summer 2010 null
Figure A.10 The takes relation.

s_id i_id
00128 45565
12345 10101
23121 76543
44553 22222
45678 22222
76543 45565
76653 98345
98765 98345
98988 76766
Figure A.11 The advisor relation.
time_slot_id day start_time end_time
A M 8:00 8:50
A W 8:00 8:50
A F 8:00 8:50
B M 9:00 9:50
B W 9:00 9:50
B F 9:00 9:50
C M 11:00 11:50
C W 11:00 11:50
C F 11:00 11:50
D M 13:00 13:50
D W 13:00 13:50
D F 13:00 13:50
E T 10:30 11:45
E R 10:30 11:45
F T 14:30 15:45
F R 14:30 15:45
G M 16:00 16:50
G W 16:00 16:50
G F 16:00 16:50
H W 10:00 12:30
Figure A.12 The time_slot relation.

course_id prereq_id
BIO-301 BIO-101
BIO-399 BIO-101
CS-190 CS-101
CS-315 CS-101
CS-319 CS-101
CS-347 CS-101
EE-181 PHY-101
Figure A.13 The prereq relation.
time_slot_id day start_hr start_min end_hr end_min
A M 8 0 8 50
A W 8 0 8 50
A F 8 0 8 50
B M 9 0 9 50
B W 9 0 9 50
B F 9 0 9 50
C M 11 0 11 50
C W 11 0 11 50
C F 11 0 11 50
D M 13 0 13 50
D W 13 0 13 50
D F 13 0 13 50
E T 10 30 11 45
E R 10 30 11 45
F T 14 30 15 45
F R 14 30 15 45
G M 16 0 16 50
G W 16 0 16 50
G F 16 0 16 50
H W 10 0 12 30
Figure A.14 The time_slot relation with start and end time separated into hour and minute.
Bibliography
[Abadi 2009] D. Abadi, “Data Management in the Cloud: Limitations and Oppor-
tunities”, Data Engineering Bulletin, Volume 32, Number 1 (2009), pages 3–12.
[Abadi et al. 2008] D. J. Abadi, S. Madden, and N. Hachem, “Column-stores vs.
row-stores: how different are they really?”, In Proc. of the ACM SIGMOD Conf. on
Management of Data (2008), pages 967–980.
[Abiteboul et al. 1995] S. Abiteboul, R. Hull, and V. Vianu, Foundations of Databases,
Addison Wesley (1995).
[Abiteboul et al. 2003] S. Abiteboul, R. Agrawal, P. A. Bernstein, M. J. Carey, et al.
“The Lowell Database Research Self Assessment” (2003).
[Acheson et al. 2004] A. Acheson, M. Bendixen, J. A. Blakeley, I. P. Carlin, E. Er-
san, J. Fang, X. Jiang, C. Kleinerman, B. Rathakrishnan, G. Schaller, B. Sezgin,
R. Venkatesh, and H. Zhang, “Hosting the .NET Runtime in Microsoft SQL Server”,
In Proc. of the ACM SIGMOD Conf. on Management of Data (2004), pages 860–865.
[Adali et al. 1996] S. Adali, K. S. Candan, Y. Papakonstantinou, and V. S. Subrah-
manian, “Query Caching and Optimization in Distributed Mediator Systems”, In
Proc. of the ACM SIGMOD Conf. on Management of Data (1996), pages 137–148.
[Adya et al. 2007] A. Adya, J. A. Blakeley, S. Melnik, and S. Muralidhar, “Anatomy
of the ADO.NET entity framework”, In Proc. of the ACM SIGMOD Conf. on Man-
agement of Data (2007), pages 877–888.
[Agarwal et al. 1996] S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F.
Naughton, R. Ramakrishnan, and S. Sarawagi, “On the Computation of Multi-
dimensional Attributes”, In Proc. of the International Conf. on Very Large Databases
(1996), pages 506–521.
[Agrawal and Srikant 1994] R. Agrawal and R. Srikant, “Fast Algorithms for Min-
ing Association Rules in Large Databases”, In Proc. of the International Conf. on Very
Large Databases (1994), pages 487–499.
[Agrawal et al. 1992] R. Agrawal, S. P. Ghosh, T. Imielinski, B. R. Iyer, and A. N.
Swami, “An Interval Classifier for Database Mining Applications”, In Proc. of the
International Conf. on Very Large Databases (1992), pages 560–573.
[Agrawal et al. 1993a] R. Agrawal, T. Imielinski, and A. Swami, “Mining Associa-
tion Rules between Sets of Items in Large Databases”, In Proc. of the ACM SIGMOD
Conf. on Management of Data (1993).
[Agrawal et al. 1993b] R. Agrawal, T. Imielinski, and A. N. Swami, “Database Min-
ing: A Performance Perspective”, IEEE Transactions on Knowledge and Data Engineer-
ing, Volume 5, Number 6 (1993), pages 914–925.
[Agrawal et al. 2000] S. Agrawal, S. Chaudhuri, and V. R. Narasayya, “Automated
Selection of Materialized Views and Indexes in SQL Databases”, In Proc. of the
[Agrawal et al. 2002] S. Agrawal, S. Chaudhuri, and G. Das, “DBXplorer: A System
for Keyword-Based Search over Relational Databases”, In Proc. of the International
Conf. on Data Engineering (2002).
[Agrawal et al. 2004] S. Agrawal, S. Chaudhuri, L. Kollar, A. Marathe,
V. Narasayya, and M. Syamala, “Database Tuning Advisor for Microsoft SQL Server
2005”, In Proc. of the International Conf. on Very Large Databases (2004).
[Agrawal et al. 2009] R. Agrawal, A. Ailamaki, P. A. Bernstein, E. A. Brewer, M. J.
Carey, S. Chaudhuri, A. Doan, D. Florescu, M. J. Franklin, H. Garcia-Molina,
J. Gehrke, L. Gruenwald, L. M. Haas, A. Y. Halevy, J. M. Hellerstein, Y. E. Ioannidis,
H. F. Korth, D. Kossmann, S. Madden, R. Magoulas, B. C. Ooi, T. O’Reilly,
R. Ramakrishnan, S. Sarawagi, M. Stonebraker, A. S. Szalay, and G. Weikum,
“The Claremont Report on Database Research”, Communications of the ACM,
Volume 52, Number 6 (2009), pages 56–65.
[Ahmed et al. 2006] R. Ahmed, A. Lee, A. Witkowski, D. Das, H. Su, M. Zaït, and
T. Cruanes, “Cost-Based Query Transformation in Oracle”, In Proc. of the Interna-
tional Conf. on Very Large Databases (2006), pages 1026–1036.
[Aho et al. 1979a] A. V. Aho, C. Beeri, and J. D. Ullman, “The Theory of Joins in
Relational Databases”, ACM Transactions on Database Systems, Volume 4, Number 3
(1979), pages 297–314.
[Aho et al. 1979b] A. V. Aho, Y. Sagiv, and J. D. Ullman, “Equivalences among
Relational Expressions”, SIAM Journal of Computing, Volume 8, Number 2 (1979),
pages 218–246.
[Ailamaki et al. 2001] A. Ailamaki, D. J. DeWitt, M. D. Hill, and M. Skounakis,
“Weaving Relations for Cache Performance”, In Proc. of the International Conf. on
Very Large Databases (2001), pages 169–180.
[Alonso and Korth 1993] R. Alonso and H. F. Korth, “Database System Issues in
Nomadic Computing”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(1993), pages 388–392.
[Amer-Yahia et al. 2004] S. Amer-Yahia, C. Botev, and J. Shanmugasundaram,
“TeXQuery: A Full-Text Search Extension to XQuery”, In Proc. of the International
World Wide Web Conf. (2004).
[Anderson et al. 1992] D. P. Anderson, Y. Osawa, and R. Govindan, “A File System
for Continuous Media”, ACM Transactions on Database Systems, Volume 10, Number
4 (1992), pages 311–337.
[Anderson et al. 1998] T. Anderson, Y. Breitbart, H. F. Korth, and A. Wool, “Repli-
cation, Consistency and Practicality: Are These Mutually Exclusive?”, In Proc. of
the ACM SIGMOD Conf. on Management of Data (1998).
[ANSI 1986] American National Standard for Information Systems: Database Language
SQL. American National Standards Institute (1986).
[ANSI 1989] Database Language SQL with Integrity Enhancement, ANSI X3, 135–1989.
American National Standards Institute, New York (1989).
[ANSI 1992] Database Language SQL, ANSI X3,135–1992. American National Stan-
dards Institute, New York (1992).
[Antoshenkov 1995] G. Antoshenkov, “Byte-aligned Bitmap Compression (poster
abstract)”, In IEEE Data Compression Conf. (1995).
[Appelt and Israel 1999] D. E. Appelt and D. J. Israel, “Introduction to Information
Extraction Technology”, In Proc. of the International Joint Conferences on Artificial
Intelligence (1999).
[Apt and Pugin 1987] K. R. Apt and J. M. Pugin, “Maintenance of Stratified
Database Viewed as a Belief Revision System”, In Proc. of the ACM Symposium
on Principles of Database Systems (1987), pages 136–145.
[Armstrong 1974] W. W. Armstrong, “Dependency Structures of Data Base Rela-
tionships”, In Proc. of the 1974 IFIP Congress (1974), pages 580–583.
[Astrahan et al. 1976] M. M. Astrahan, M. W. Blasgen, D. D. Chamberlin, K. P.
Eswaran, J. N. Gray, P. P. Griffiths, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl,
G. R. Putzolu, I. L. Traiger, B. W. Wade, and V. Watson, “System R, A Relational Ap-
proach to Data Base Management”, ACM Transactions on Database Systems, Volume
1, Number 2 (1976), pages 97–137.
[Atreya et al. 2002] M. Atreya, B. Hammond, S. Paine, P. Starrett, and S. Wu, Digital
Signatures, RSA Press (2002).
[Atzeni and Antonellis 1993] P. Atzeni and V. D. Antonellis, Relational Database
Theory, Benjamin Cummings (1993).
[Baeza-Yates and Ribeiro-Neto 1999] R. Baeza-Yates and B. Ribeiro-Neto, Modern
Information Retrieval, Addison Wesley (1999).
[Bancilhon et al. 1989] F. Bancilhon, S. Cluet, and C. Delobel, “A Query Language
for the O2 Object-Oriented Database”, In Proc. of the Second Workshop on Database
Programming Languages (1989).
[Baras et al. 2005] A. Baras, D. Churin, I. Cseri, T. Grabs, E. Kogan, S. Pal, M. Rys,
and O. Seeliger. “Implementing XQuery in a Relational Database System” (2005).
[Baru et al. 1995] C. Baru et al., “DB2 Parallel Edition”, IBM Systems Journal, Volume
34, Number 2 (1995), pages 292–322.
[Bassiouni 1988] M. Bassiouni, “Single-site and Distributed Optimistic Protocols
for Concurrency Control”, IEEE Transactions on Software Engineering, Volume SE-14,
Number 8 (1988), pages 1071–1080.
[Batini et al. 1992] C. Batini, S. Ceri, and S. Navathe, Database Design: An Entity-
Relationship Approach, Benjamin Cummings (1992).
[Bayer 1972] R. Bayer, “Symmetric Binary B-trees: Data Structure and Maintenance
Algorithms”, Acta Informatica, Volume 1, Number 4 (1972), pages 290–306.
[Bayer and McCreight 1972] R. Bayer and E. M. McCreight, “Organization and
Maintenance of Large Ordered Indices”, Acta Informatica, Volume 1, Number 3
(1972), pages 173–189.
[Bayer and Schkolnick 1977] R. Bayer and M. Schkolnick, “Concurrency of Oper-
ating on B-trees”, Acta Informatica, Volume 9, Number 1 (1977), pages 1–21.
[Bayer and Unterauer 1977] R. Bayer and K. Unterauer, “Prefix B-trees”, ACM
Transactions on Database Systems, Volume 2, Number 1 (1977), pages 11–26.
[Bayer et al. 1978] R. Bayer, R. M. Graham, and G. Seegmuller, editors, Operating
Systems: An Advanced Course, Springer Verlag (1978).
[Beckmann et al. 1990] N. Beckmann, H. P. Kriegel, R. Schneider, and B. Seeger,
“The R*-tree: An Efficient and Robust Access Method for Points and Rectangles”,
In Proc. of the ACM SIGMOD Conf. on Management of Data (1990), pages 322–331.
[Beeri et al. 1977] C. Beeri, R. Fagin, and J. H. Howard, “A Complete Axiomatiza-
tion for Functional and Multivalued Dependencies”, In Proc. of the ACM SIGMOD
Conf. on Management of Data (1977), pages 47–61.
[Bentley 1975] J. L. Bentley, “Multidimensional Binary Search Trees Used for Asso-
ciative Searching”, Communications of the ACM, Volume 18, Number 9 (1975), pages
509–517.
[Berenson et al. 1995] H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O’Neil, and
P. O’Neil, “A Critique of ANSI SQL Isolation Levels”, In Proc. of the ACM SIGMOD
[Bernstein and Goodman 1981] P. A. Bernstein and N. Goodman, “Concurrency
Control in Distributed Database Systems”, ACM Computing Survey, Volume 13,
Number 2 (1981), pages 185–221.
[Bernstein and Newcomer 1997] P. A. Bernstein and E. Newcomer, Principles of
Transaction Processing, Morgan Kaufmann (1997).
[Bernstein et al. 1998] P. Bernstein, M. Brodie, S. Ceri, D. DeWitt, M. Franklin,
H. Garcia-Molina, J. Gray, J. Held, J. Hellerstein, H. V. Jagadish, M. Lesk, D. Maier,
J. Naughton, H. Pirahesh, M. Stonebraker, and J. Ullman, “The Asilomar Report on
Database Research”, ACM SIGMOD Record, Volume 27, Number 4 (1998).
[Berson et al. 1995] S. Berson, L. Golubchik, and R. R. Muntz, “Fault Tolerant De-
sign of Multimedia Servers”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (1995), pages 364–375.
[Bhalotia et al. 2002] G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Su-
darshan, “Keyword Searching and Browsing in Databases using BANKS”, In Proc.
of the International Conf. on Data Engineering (2002).
[Bharat and Henzinger 1998] K. Bharat and M. R. Henzinger, “Improved Algo-
rithms for Topic Distillation in a Hyperlinked Environment”, In Proc. of the ACM
SIGIR Conf. on Research and Development in Information Retrieval (1998), pages 104–
111.
[Bhattacharjee et al. 2003] B. Bhattacharjee, S. Padmanabhan, T. Malkemus, T. Lai,
L. Cranston, and M. Huras, “Efficient Query Processing for Multi-Dimensionally
Clustered Tables in DB2”, In Proc. of the International Conf. on Very Large Databases
(2003), pages 963–974.
[Biskup et al. 1979] J. Biskup, U. Dayal, and P. A. Bernstein, “Synthesizing Inde-
pendent Database Schemas”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (1979), pages 143–152.
[Bitton et al. 1983] D. Bitton, D. J. DeWitt, and C. Turbyfill, “Benchmarking
Database Systems: A Systematic Approach”, In Proc. of the International Conf. on
Very Large Databases (1983).
[Blakeley 1996] J. A. Blakeley, “Data Access for the Masses through OLE DB”, In
[Blakeley and Pizzo 2001] J. A. Blakeley and M. Pizzo, “Enabling Component
Databases with OLE DB”, In K. R. Dittrich and A. Geppert, editors, Component
Database Systems, Morgan Kaufmann Publishers (2001), pages 139–173.
[Blakeley et al. 1986] J. A. Blakeley, P. Larson, and F. W. Tompa, “Efficiently Up-
dating Materialized Views”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (1986), pages 61–71.
[Blakeley et al. 2005] J. A. Blakeley, C. Cunningham, N. Ellis, B. Rathakrishnan,
and M.-C. Wu, “Distributed/Heterogeneous Query Processing in Microsoft SQL
Server”, In Proc. of the International Conf. on Data Engineering (2005).
[Blakeley et al. 2006] J. A. Blakeley, D. Campbell, S. Muralidhar, and A. Nori, “The
ADO.NET entity framework: making the conceptual level real”, SIGMOD Record,
[Blakeley et al. 2008] J. A. Blakeley, V. Rao, I. Kunen, A. Prout, M. Henaire, and
C. Kleinerman, “.NET database programmability and extensibility in Microsoft
SQL server”, In Proc. of the ACM SIGMOD Conf. on Management of Data (2008),
pages 1087–1098.
[Blasgen and Eswaran 1976] M. W. Blasgen and K. P. Eswaran, “On the Evaluation
of Queries in a Relational Database System”, IBM Systems Journal, Volume 16,
(1976), pages 363–377.
[Boyce et al. 1975] R. Boyce, D. D. Chamberlin, W. F. King, and M. Hammer, “Spec-
ifying Queries as Relational Expressions”, Communications of the ACM, Volume 18,
Number 11 (1975), pages 621–628.
[Brantner et al. 2008] M. Brantner, D. Florescu, D. Graf, D. Kossmann, and
T. Kraska, “Building a Database on S3”, In Proc. of the ACM SIGMOD Conf. on
[Breese et al. 1998] J. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of
Predictive Algorithms for Collaborative Filtering”, In Procs. Conf. on Uncertainty in
Artificial Intelligence, Morgan Kaufmann (1998).
[Breitbart et al. 1999a] Y. Breitbart, R. Komondoor, R. Rastogi, S. Seshadri, and
A. Silberschatz, “Update Propagation Protocols For Replicated Databases”, In Proc.
of the ACM SIGMOD Conf. on Management of Data (1999), pages 97–108.
[Breitbart et al. 1999b] Y. Breitbart, H. Korth, A. Silberschatz, and S. Sudarshan,
“Distributed Databases”, In Encyclopedia of Electrical and Electronics Engineering,
John Wiley and Sons (1999).
[Brewer 2000] E. A. Brewer, “Towards robust distributed systems (abstract)”, In
Proc. of the ACM Symposium on Principles of Distributed Computing (2000), page 7.
[Brin and Page 1998] S. Brin and L. Page, “The Anatomy of a Large-Scale Hyper-
textual Web Search Engine”, In Proc. of the International World Wide Web Conf. (1998).
[Brinkhoff et al. 1993] T. Brinkhoff, H.-P. Kriegel, and B. Seeger, “Efficient Process-
ing of Spatial Joins Using R-trees”, In Proc. of the ACM SIGMOD Conf. on Management
of Data (1993), pages 237–246.
[Bruno et al. 2002] N. Bruno, S. Chaudhuri, and L. Gravano, “Top-k Selection
Queries Over Relational Databases: Mapping Strategies and Performance Eval-
uation”, ACM Transactions on Database Systems, Volume 27, Number 2 (2002), pages
153–187.
[Buckley and Silberschatz 1983] G. Buckley and A. Silberschatz, “Obtaining Pro-
gressive Protocols for a Simple Multiversion Database Model”, In Proc. of the Inter-
national Conf. on Very Large Databases (1983), pages 74–81.
[Buckley and Silberschatz 1984] G. Buckley and A. Silberschatz, “Concurrency
Control in Graph Protocols by Using Edge Locks”, In Proc. of the ACM SIGMOD
[Buckley and Silberschatz 1985] G. Buckley and A. Silberschatz, “Beyond Two-
Phase Locking”, Journal of the ACM, Volume 32, Number 2 (1985), pages 314–326.
[Bulmer 1979] M. G. Bulmer, Principles of Statistics, Dover Publications (1979).
[Burkhard 1976] W. A. Burkhard, “Hashing and Trie Algorithms for Partial Match
Retrieval”, ACM Transactions on Database Systems, Volume 1, Number 2 (1976),
pages 175–187.
[Burkhard 1979] W. A. Burkhard, “Partial-match Hash Coding: Benefits of Redun-
dancy”, ACM Transactions on Database Systems, Volume 4, Number 2 (1979), pages
228–239.
[Cannan and Otten 1993] S. Cannan and G. Otten, SQL — The Standard Handbook,
McGraw Hill (1993).
[Carey 1983] M. J. Carey, “Granularity Hierarchies in Concurrency Control”, In
[Carey and Kossmann 1998] M. J. Carey and D. Kossmann, “Reducing the Braking
Distance of an SQL Query Engine”, In Proc. of the International Conf. on Very Large
Databases (1998), pages 158–169.
[Carey et al. 1991] M. Carey, M. Franklin, M. Livny, and E. Shekita, “Data Caching
Tradeoffs in Client-Server DBMS Architectures”, In Proc. of the ACM SIGMOD Conf.
on Management of Data (1991).
[Carey et al. 1993] M. J. Carey, D. DeWitt, and J. Naughton, “The OO7 Benchmark”,
In Proc. of the ACM SIGMOD Conf. on Management of Data (1993).
[Carey et al. 1999] M. J. Carey, D. D. Chamberlin, S. Narayanan, B. Vance, D. Doole,
S. Rielau, R. Swagerman, and N. Mattos, “O-O, What Have They Done to DB2?”,
In Proc. of the International Conf. on Very Large Databases (1999), pages 542–553.
[Cattell 2000] R. Cattell, editor, The Object Database Standard: ODMG 3.0, Morgan
Kaufmann (2000).
[Cattell and Skeen 1992] R. Cattell and J. Skeen, “Object Operations Benchmark”,
ACM Transactions on Database Systems, Volume 17, Number 1 (1992).
[Chakrabarti 1999] S. Chakrabarti, “Recent Results in Automatic Web Resource
Discovery”, ACM Computing Surveys, Volume 31, Number 4 (1999).
[Chakrabarti 2000] S. Chakrabarti, “Data Mining for Hypertext: A Tutorial Sur-
vey”, SIGKDD Explorations, Volume 1, Number 2 (2000), pages 1–11.
[Chakrabarti 2002] S. Chakrabarti, Mining the Web: Discovering Knowledge from Hy-
perText Data, Morgan Kaufmann (2002).
[Chakrabarti et al. 1998] S. Chakrabarti, S. Sarawagi, and B. Dom, “Mining Sur-
prising Patterns Using Temporal Description Length”, In Proc. of the International
Conf. on Very Large Databases (1998), pages 606–617.
[Chakrabarti et al. 1999] S. Chakrabarti, M. van den Berg, and B. Dom, “Focused
Crawling: A New Approach to Topic Specific Web Resource Discovery”, In Proc. of
the International World Wide Web Conf. (1999).
[Chamberlin 1996] D. Chamberlin, Using the New DB2: IBM’s Object-Relational
Database System, Morgan Kaufmann (1996).
[Chamberlin 1998] D. D. Chamberlin, A Complete Guide to DB2 Universal Database,
Morgan Kaufmann (1998).
[Chamberlin and Boyce 1974] D. D. Chamberlin and R. F. Boyce, “SEQUEL: A
Structured English Query Language”, In ACM SIGMOD Workshop on Data Descrip-
tion, Access, and Control (1974), pages 249–264.
[Chamberlin et al. 1976] D. D. Chamberlin, M. M. Astrahan, K. P. Eswaran, P. P.
Griffiths, R. A. Lorie, J. W. Mehl, P. Reisner, and B. W. Wade, “SEQUEL 2: A Unified
Approach to Data Definition, Manipulation, and Control”, IBM Journal of Research
and Development, Volume 20, Number 6 (1976), pages 560–575.
[Chamberlin et al. 1981] D. D. Chamberlin, M. M. Astrahan, M. W. Blasgen, J. N.
Gray, W. F. King, B. G. Lindsay, R. A. Lorie, J. W. Mehl, T. G. Price, P. G. Selinger,
M. Schkolnick, D. R. Slutz, I. L. Traiger, B. W. Wade, and R. A. Yost, “A History
and Evaluation of System R”, Communications of the ACM, Volume 24, Number 10
(1981), pages 632–646.
[Chamberlin et al. 2000] D. D. Chamberlin, J. Robie, and D. Florescu, “Quilt: An
XML Query Language for Heterogeneous Data Sources”, In Proc. of the International
Workshop on the Web and Databases (WebDB) (2000), pages 53–62.
[Chan and Ioannidis 1998] C.-Y. Chan and Y. E. Ioannidis, “Bitmap Index Design
and Evaluation”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1998).
[Chan and Ioannidis 1999] C.-Y. Chan and Y. E. Ioannidis, “An Efficient Bitmap
Encoding Scheme for Selection Queries”, In Proc. of the ACM SIGMOD Conf. on
Management of Data (1999).
[Chandra and Harel 1982] A. K. Chandra and D. Harel, “Structure and Complexity
of Relational Queries”, Journal of Computer and System Sciences, Volume 25, Number
1 (1982), pages 99–128.
[Chandrasekaran et al. 2003] S. Chandrasekaran, O. Cooper, A. Deshpande, M. J.
Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman,
F. Reiss, and M. Shah, “TelegraphCQ: Continuous Dataflow Processing for an
Uncertain World”, In First Biennial Conference on Innovative Data Systems Research
(2003).
[Chang et al. 2008] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach,
M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: A Distributed Storage
System for Structured Data”, ACM Trans. Comput. Syst., Volume 26, Number 2
(2008).
[Chatziantoniou and Ross 1997] D. Chatziantoniou and K. A. Ross, “Groupwise
Processing of Relational Queries”, In Proc. of the International Conf. on Very Large
[Chaudhuri and Narasayya 1997] S. Chaudhuri and V. Narasayya, “An Efficient
Cost-Driven Index Selection Tool for Microsoft SQL Server”, In Proc. of the
International Conf. on Very Large Databases (1997).
[Chaudhuri and Shim 1994] S. Chaudhuri and K. Shim, “Including Group-By in
Query Optimization”, In Proc. of the International Conf. on Very Large Databases (1994).
[Chaudhuri et al. 1995] S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and
K. Shim, “Optimizing Queries with Materialized Views”, In Proc. of the Interna-
tional Conf. on Data Engineering (1995).
[Chaudhuri et al. 1998] S. Chaudhuri, R. Motwani, and V. Narasayya, “Random
sampling for histogram construction: how much is enough?”, In Proc. of the ACM
SIGMOD Conf. on Management of Data (1998), pages 436–447.
[Chaudhuri et al. 1999] S. Chaudhuri, E. Christensen, G. Graefe, V. Narasayya,
and M. Zwilling, “Self Tuning Technology in Microsoft SQL Server”, IEEE Data
Engineering Bulletin, Volume 22, Number 2 (1999).
[Chaudhuri et al. 2003] S. Chaudhuri, K. Ganjam, V. Ganti, and R. Motwani, “Ro-
bust and Efficient Fuzzy Match for Online Data Cleaning”, In Proc. of the ACM
SIGMOD Conf. on Management of Data (2003).
[Chen 1976] P. P. Chen, “The Entity-Relationship Model: Toward a Unified View
of Data”, ACM Transactions on Database Systems, Volume 1, Number 1 (1976), pages
9–36.
[Chen et al. 1994] P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Pat-
terson, “RAID: High-Performance, Reliable Secondary Storage”, ACM Computing
Surveys, Volume 26, Number 2 (1994).
[Chen et al. 2007] S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry, “Improving
hash join performance through prefetching”, ACM Transactions on Database Systems,
Volume 32, Number 3 (2007).
[Chomicki 1995] J. Chomicki, “Efficient Checking of Temporal Integrity Con-
straints Using Bounded History Encoding”, ACM Transactions on Database Systems,
[Chou and Dewitt 1985] H. T. Chou and D. J. Dewitt, “An Evaluation of Buffer
Management Strategies for Relational Database Systems”, In Proc. of the Interna-
tional Conf. on Very Large Databases (1985), pages 127–141.
[Cieslewicz et al. 2009] J. Cieslewicz, W. Mee, and K. A. Ross, “Cache-Conscious
Buffering for Database Operators with State”, In Proc. Fifth International Workshop
on Data Management on New Hardware (DaMoN 2009) (2009).
[Cochrane et al. 1996] R. Cochrane, H. Pirahesh, and N. M. Mattos, “Integrating
Triggers and Declarative Constraints in SQL Database Systems”, In Proc. of the
[Codd 1970] E. F. Codd, “A Relational Model for Large Shared Data Banks”, Com-
munications of the ACM, Volume 13, Number 6 (1970), pages 377–387.
[Codd 1972] E. F. Codd. “Further Normalization of the Data Base Relational
Model”, In Rustin [1972], pages 33–64 (1972).
[Codd 1979] E. F. Codd, “Extending the Database Relational Model to Capture
More Meaning”, ACM Transactions on Database Systems, Volume 4, Number 4 (1979),
pages 397–434.
[Codd 1982] E. F. Codd, “The 1981 ACM Turing Award Lecture: Relational
Database: A Practical Foundation for Productivity”, Communications of the ACM,
[Codd 1990] E. F. Codd, The Relational Model for Database Management: Version 2,
Addison Wesley (1990).
[Comer 1979] D. Comer, “The Ubiquitous B-tree”, ACM Computing Surveys, Volume
11, Number 2 (1979), pages 121–137.
[Comer 2009] D. E. Comer, Computer Networks and Internets, 5th edition, Prentice
Hall (2009).
[Cook 1996] M. A. Cook, Building Enterprise Information Architecture: Reengineering
Information Systems, Prentice Hall (1996).
[Cooper et al. 2008] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein,
P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni, “PNUTS: Yahoo!’s
hosted data serving platform”, Proceedings of the VLDB Endowment, Volume 1,
Number 2 (2008), pages 1277–1288.
[Cormen et al. 1990] T. Cormen, C. Leiserson, and R. Rivest, Introduction to Algo-
rithms, MIT Press (1990).
[Cortes and Vapnik 1995] C. Cortes and V. Vapnik, “Support-Vector Networks”,
Machine Learning, Volume 20, Number 3 (1995), pages 273–297.
[Cristianini and Shawe-Taylor 2000] N. Cristianini and J. Shawe-Taylor, An Intro-
duction to Support Vector Machines and other Kernel-Based Learning Methods, Cam-
bridge University Press (2000).
[Dageville and Zaït 2002] B. Dageville and M. Zaït, “SQL Memory Management
in Oracle9i”, In Proc. of the International Conf. on Very Large Databases (2002), pages
962–973.
[Dageville et al. 2004] B. Dageville, D. Das, K. Dias, K. Yagoub, M. Zaït, and
M. Ziauddin, “Automatic SQL Tuning in Oracle 10g”, In Proc. of the International Conf. on
[Dalvi et al. 2009] N. Dalvi, R. Kumar, B. Pang, R. Ramakrishnan, A. Tomkins,
P. Bohannon, S. Keerthi, and S. Merugu, “A Web of Concepts”, In Proc. of the ACM
Symposium on Principles of Database Systems (2009).
[Daniels et al. 1982] D. Daniels, P. G. Selinger, L. M. Haas, B. G. Lindsay, C. Mohan,
A. Walker, and P. F. Wilms. “An Introduction to Distributed Query Compilation in
R*”, In Schneider [1982] (1982).
[Dashti et al. 2003] A. Dashti, S. H. Kim, C. Shahabi, and R. Zimmermann, Stream-
ing Media Server Design, Prentice Hall (2003).
[Date 1983] C. J. Date, “The Outer Join”, In Proc. of the International Conference on
Databases, John Wiley and Sons (1983), pages 76–106.
[Date 1989] C. Date, A Guide to DB2, Addison Wesley (1989).
[Date 1993] C. J. Date, “How SQL Missed the Boat”, Database Programming and
Design, Volume 6, Number 9 (1993).
[Date 2003] C. J. Date, An Introduction to Database Systems, 8th edition, Addison
Wesley (2003).
[Date and Darwen 1997] C. J. Date and G. Darwen, A Guide to the SQL Standard,
4th edition, Addison Wesley (1997).
[Davis et al. 1983] C. Davis, S. Jajodia, P. A. Ng, and R. Yeh, editors, Entity-
Relationship Approach to Software Engineering, North Holland (1983).
[Davison and Graefe 1994] D. L. Davison and G. Graefe, “Memory-Contention
Responsive Hash Joins”, In Proc. of the International Conf. on Very Large Databases
(1994).
[Dayal 1987] U. Dayal, “Of Nests and Trees: A Unified Approach to Processing
Queries that Contain Nested Subqueries, Aggregates and Quantifiers”, In Proc. of
the International Conf. on Very Large Databases (1987), pages 197–208.
[Deutsch et al. 1999] A. Deutsch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu,
“A Query Language for XML”, In Proc. of the International World Wide Web Conf.
(1999).
[DeWitt 1990] D. DeWitt, “The Gamma Database Machine Project”, IEEE Transac-
tions on Knowledge and Data Engineering, Volume 2, Number 1 (1990).
[DeWitt and Gray 1992] D. DeWitt and J. Gray, “Parallel Database Systems: The
Future of High Performance Database Systems”, Communications of the ACM, Vol-
ume 35, Number 6 (1992), pages 85–98.
[DeWitt et al. 1992] D. DeWitt, J. Naughton, D. Schneider, and S. Seshadri, “Practi-
cal Skew Handling in Parallel Joins”, In Proc. of the International Conf. on Very Large
Databases (1992).
[Dias et al. 1989] D. Dias, B. Iyer, J. Robinson, and P. Yu, “Integrated Concurrency-
Coherency Controls for Multisystem Data Sharing”, Software Engineering, Volume
15, Number 4 (1989), pages 437–448.
[Donahoo and Speegle 2005] M. J. Donahoo and G. D. Speegle, SQL: Practical Guide
for Developers, Morgan Kaufmann (2005).
[Douglas and Douglas 2003] K. Douglas and S. Douglas, PostgreSQL, Sam’s Pub-
lishing (2003).
[Dubois and Thakkar 1992] M. Dubois and S. Thakkar, editors, Scalable Shared
Memory Multiprocessors, Kluwer Academic Publishers (1992).
[Duncan 1990] R. Duncan, “A Survey of Parallel Computer Architectures”, IEEE
Computer, Volume 23, Number 2 (1990), pages 5–16.
[Eisenberg and Melton 1999] A. Eisenberg and J. Melton, “SQL:1999, formerly
known as SQL3”, ACM SIGMOD Record, Volume 28, Number 1 (1999).
[Eisenberg and Melton 2004a] A. Eisenberg and J. Melton, “Advancements in
SQL/XML”, ACM SIGMOD Record, Volume 33, Number 3 (2004), pages 79–86.
[Eisenberg and Melton 2004b] A. Eisenberg and J. Melton, “An Early Look at
XQuery API for Java (XQJ)”, ACM SIGMOD Record, Volume 33, Number 2 (2004),
pages 105–111.
[Eisenberg et al. 2004] A. Eisenberg, J. Melton, K. G. Kulkarni, J.-E. Michels, and
F. Zemke, “SQL:2003 Has Been Published”, ACM SIGMOD Record, Volume 33,
Number 1 (2004), pages 119–126.
[Elhemali et al. 2007] M. Elhemali, C. A. Galindo-Legaria, T. Grabs, and M. Joshi,
“Execution strategies for SQL subqueries”, In Proc. of the ACM SIGMOD Conf. on
[Ellis 1987] C. S. Ellis, “Concurrency in Linear Hashing”, ACM Transactions on
Database Systems, Volume 12, Number 2 (1987), pages 195–217.
[Elmasri and Navathe 2006] R. Elmasri and S. B. Navathe, Fundamentals of Database
Systems, 5th edition, Addison Wesley (2006).
[Epstein et al. 1978] R. Epstein, M. R. Stonebraker, and E. Wong, “Distributed
Query Processing in a Relational Database System”, In Proc. of the ACM SIGMOD
[Escobar-Molano et al. 1993] M. Escobar-Molano, R. Hull, and D. Jacobs, “Safety
and Translation of Calculus Queries with Scalar Functions”, In Proc. of the ACM
[Eswaran et al. 1976] K. P. Eswaran, J. N. Gray, R. A. Lorie, and I. L. Traiger, “The
Notions of Consistency and Predicate Locks in a Database System”, Communications
of the ACM, Volume 19, Number 11 (1976), pages 624–633.
[Fagin 1977] R. Fagin, “Multivalued Dependencies and a New Normal Form for
Relational Databases”, ACM Transactions on Database Systems, Volume 2, Number 3
(1977), pages 262–278.
[Fagin 1979] R. Fagin, “Normal Forms and Relational Database Operators”, In Proc.
[Fagin 1981] R. Fagin, “A Normal Form for Relational Databases That Is Based on
Domains and Keys”, ACM Transactions on Database Systems, Volume 6, Number 3
(1981), pages 387–415.
[Fagin et al. 1979] R. Fagin, J. Nievergelt, N. Pippenger, and H. R. Strong, “Ex-
tendible Hashing — A Fast Access Method for Dynamic Files”, ACM Transactions
on Database Systems, Volume 4, Number 3 (1979), pages 315–344.
[Faloutsos and Lin 1995] C. Faloutsos and K.-I. Lin, “Fast Map: A Fast Algo-
rithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia
Datasets”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1995), pages
163–174.
[Fayyad et al. 1995] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy,
Advances in Knowledge Discovery and Data Mining, MIT Press (1995).
[Fekete et al. 2005] A. Fekete, D. Liarokapis, E. O’Neil, P. O’Neil, and D. Shasha,
“Making Snapshot Isolation Serializable”, ACM Transactions on Database Systems,
[Finkel and Bentley 1974] R. A. Finkel and J. L. Bentley, “Quad Trees: A Data Struc-
ture for Retrieval on Composite Keys”, Acta Informatica, Volume 4, (1974), pages
1–9.
[Fischer 2006] L. Fischer, editor, Workflow Handbook 2001, Future Strategies (2006).
[Florescu and Kossmann 1999] D. Florescu and D. Kossmann, “Storing and Query-
ing XML Data Using an RDBMS”, IEEE Data Engineering Bulletin (Special Issue on
XML) (1999), pages 27–35.
[Florescu et al. 2000] D. Florescu, D. Kossmann, and I. Manolescu, “Integrating
Keyword Search into XML Query Processing”, In Proc. of the International World
Wide Web Conf. (2000), pages 119–135. Also appears in Computer Networks, Vol. 33,
pages 119–135.
[Fredkin 1960] E. Fredkin, “Trie Memory”, Communications of the ACM, Volume 3,
Number 9 (1960), pages 490–499.
[Freedman and DeWitt 1995] C. S. Freedman and D. J. DeWitt, “The SPIFFI Scal-
able Video-on-Demand Server”, In Proc. of the ACM SIGMOD Conf. on Management
of Data (1995), pages 352–363.
[Funderburk et al. 2002a] J. E. Funderburk, G. Kiernan, J. Shanmugasundaram,
E. Shekita, and C. Wei, “XTABLES: Bridging Relational Technology and XML”,
IBM Systems Journal, Volume 41, Number 4 (2002), pages 616–641.
[Funderburk et al. 2002b] J. E. Funderburk, S. Malaika, and B. Reinwald, “XML
Programming with SQL/XML and XQuery”, IBM Systems Journal, Volume 41,
Number 4 (2002), pages 642–665.
[Galindo-Legaria 1994] C. Galindo-Legaria, “Outerjoins as Disjunctions”, In Proc.
of the ACM SIGMOD Conf. on Management of Data (1994).
[Galindo-Legaria and Joshi 2001] C. A. Galindo-Legaria and M. M. Joshi, “Orthogonal
Optimization of Subqueries and Aggregation”, In Proc. of the ACM SIGMOD
[Galindo-Legaria and Rosenthal 1992] C. Galindo-Legaria and A. Rosenthal,
“How to Extend a Conventional Optimizer to Handle One- and Two-Sided Outer-
join”, In Proc. of the International Conf. on Data Engineering (1992), pages 402–409.
[Galindo-Legaria et al. 2004] C. Galindo-Legaria, S. Stefani, and F. Waas, “Query
Processing for SQL Updates”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (2004), pages 844–849.
[Ganguly 1998] S. Ganguly, “Design and Analysis of Parametric Query Optimiza-
tion Algorithms”, In Proc. of the International Conf. on Very Large Databases (1998).
[Ganguly et al. 1992] S. Ganguly, W. Hasan, and R. Krishnamurthy, “Query Opti-
mization for Parallel Execution”, In Proc. of the ACM SIGMOD Conf. on Management
of Data (1992).
[Ganguly et al. 1996] S. Ganguly, P. Gibbons, Y. Matias, and A. Silberschatz, “A
Sampling Algorithm for Estimating Join Size”, In Proc. of the ACM SIGMOD Conf.
[Ganski and Wong 1987] R. A. Ganski and H. K. T. Wong, “Optimization of Nested
SQL Queries Revisited”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(1987).
[Garcia and Korth 2005] P. Garcia and H. F. Korth, “Multithreaded Architectures
and the Sort Benchmark”, In Proc. of the First International Workshop on Data
Management on Modern Hardware (DaMoN) (2005).
[Garcia-Molina 1982] H. Garcia-Molina, “Elections in Distributed Computing Sys-
tems”, IEEE Transactions on Computers, Volume C-31, Number 1 (1982), pages 48–59.
[Garcia-Molina and Salem 1987] H. Garcia-Molina and K. Salem, “Sagas”, In Proc.
[Garcia-Molina and Salem 1992] H. Garcia-Molina and K. Salem, “Main Memory
Database Systems: An Overview”, IEEE Transactions on Knowledge and Data Engi-
neering, Volume 4, Number 6 (1992), pages 509–516.
[Garcia-Molina et al. 2008] H. Garcia-Molina, J. D. Ullman, and J. D. Widom,
Database Systems: The Complete Book, 2nd edition, Prentice Hall (2008).
[Georgakopoulos et al. 1994] D. Georgakopoulos, M. Rusinkiewicz, and A. Sheth,
“Using Tickets to Enforce the Serializability of Multidatabase Transactions”, IEEE
Transactions on Knowledge and Data Engineering, Volume 6, Number 1 (1994), pages
166–180.
[Gilbert and Lynch 2002] S. Gilbert and N. Lynch, “Brewer’s conjecture and the
feasibility of consistent, available, partition-tolerant web services”, SIGACT News,
[Graefe 1990] G. Graefe, “Encapsulation of Parallelism in the Volcano Query
Processing System”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1990),
pages 102–111.
[Graefe 1995] G. Graefe, “The Cascades Framework for Query Optimization”, Data
Engineering Bulletin, Volume 18, Number 3 (1995), pages 19–29.
[Graefe 2008] G. Graefe, “The Five-Minute Rule 20 Years Later: and How Flash
Memory Changes the Rules”, ACM Queue, Volume 6, Number 4 (2008), pages
40–52.
[Graefe and McKenna 1993a] G. Graefe and W. McKenna, “The Volcano Optimizer
Generator”, In Proc. of the International Conf. on Data Engineering (1993), pages 209–
218.
[Graefe and McKenna 1993b] G. Graefe and W. J. McKenna, “Extensibility and
Search Efficiency in the Volcano Optimizer Generator”, In Proc. of the International
[Graefe et al. 1998] G. Graefe, R. Bunker, and S. Cooper, “Hash Joins and Hash
Teams in Microsoft SQL Server”, In Proc. of the International Conf. on Very Large
[Gray 1978] J. Gray. “Notes on Data Base Operating System”, In Bayer et al. [1978],
pages 393–481 (1978).
[Gray 1981] J. Gray, “The Transaction Concept: Virtues and Limitations”, In Proc.
of the International Conf. on Very Large Databases (1981), pages 144–154.
[Gray 1991] J. Gray, The Benchmark Handbook for Database and Transaction Processing
Systems, 2nd edition, Morgan Kaufmann (1991).
[Gray and Graefe 1997] J. Gray and G. Graefe, “The Five-Minute Rule Ten Years
Later, and Other Computer Storage Rules of Thumb”, SIGMOD Record, Volume 26,
Number 4 (1997), pages 63–68.
[Gray and Reuter 1993] J. Gray and A. Reuter, Transaction Processing: Concepts and
Techniques, Morgan Kaufmann (1993).
[Gray et al. 1975] J. Gray, R. A. Lorie, and G. R. Putzolu, “Granularity of Locks and
Degrees of Consistency in a Shared Data Base”, In Proc. of the International Conf. on
[Gray et al. 1976] J. Gray, R. A. Lorie, G. R. Putzolu, and I. L. Traiger, Granularity of
Locks and Degrees of Consistency in a Shared Data Base, Nijssen (1976).
[Gray et al. 1981] J. Gray, P. R. McJones, and M. Blasgen, “The Recovery Manager
of the System R Database Manager”, ACM Computing Surveys, Volume 13, Number
2 (1981), pages 223–242.
[Gray et al. 1995] J. Gray, A. Bosworth, A. Layman, and H. Pirahesh, “Data Cube:
A Relational Aggregation Operator Generalizing Group-By, Cross-Tab and Sub-
Totals”, Technical report, Microsoft Research (1995).
[Gray et al. 1996] J. Gray, P. Helland, and P. O’Neil, “The Dangers of Replication
and a Solution”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1996),
pages 173–182.
[Gray et al. 1997] J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart,
M. Venkatrao, F. Pellow, and H. Pirahesh, “Data Cube: A Relational Aggregation
Operator Generalizing Group-by, Cross-Tab, and Sub Totals”, Data Mining and
Knowledge Discovery, Volume 1, Number 1 (1997), pages 29–53.
[Gregersen and Jensen 1999] H. Gregersen and C. S. Jensen, “Temporal Entity-
Relationship Models-A Survey”, IEEE Transactions on Knowledge and Data Engineer-
[Grossman and Frieder 2004] D. A. Grossman and O. Frieder, Information Retrieval:
Algorithms and Heuristics, 2nd edition, Springer Verlag (2004).
[Gunning 2008] P. K. Gunning, DB2 9 for Developers, MC Press (2008).
[Guo et al. 2003] L. Guo, F. Shao, C. Botev, and J. Shanmugasundaram, “XRANK:
Ranked Keyword Search over XML Documents”, In Proc. of the ACM SIGMOD
[Guttman 1984] A. Guttman, “R-Trees: A Dynamic Index Structure for Spatial
Searching”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1984), pages
47–57.
[Haas et al. 1989] L. M. Haas, J. C. Freytag, G. M. Lohman, and H. Pirahesh, “Ex-
tensible Query Processing in Starburst”, In Proc. of the ACM SIGMOD Conf. on
[Haas et al. 1990] L. M. Haas, W. Chang, G. M. Lohman, J. McPherson, P. F. Wilms,
G. Lapis, B. G. Lindsay, H. Pirahesh, M. J. Carey, and E. J. Shekita, “Starburst Mid-
Flight: As the Dust Clears”, IEEE Transactions on Knowledge and Data Engineering,
[Haerder and Reuter 1983] T. Haerder and A. Reuter, “Principles of Transaction-
Oriented Database Recovery”, ACM Computing Surveys, Volume 15, Number 4
(1983), pages 287–318.
[Haerder and Rothermel 1987] T. Haerder and K. Rothermel, “Concepts for Trans-
action Recovery in Nested Transactions”, In Proc. of the ACM SIGMOD Conf. on
[Halsall 2006] F. Halsall, Computer Networking and the Internet: With Internet and
Multimedia Applications, Addison Wesley (2006).
[Han and Kamber 2000] J. Han and M. Kamber, Data Mining: Concepts and Tech-
niques, Morgan Kaufmann (2000).
[Harinarayan et al. 1996] V. Harinarayan, J. D. Ullman, and A. Rajaraman, “Imple-
menting Data Cubes Efficiently”, In Proc. of the ACM SIGMOD Conf. on Management
of Data (1996).
[Haritsa et al. 1990] J. Haritsa, M. Carey, and M. Livny, “On Being Optimistic about
Real-Time Constraints”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(1990).
[Harizopoulos and Ailamaki 2004] S. Harizopoulos and A. Ailamaki, “STEPS to-
wards Cache-resident Transaction Processing”, In Proc. of the International Conf. on
[Hellerstein and Stonebraker 2005] J. M. Hellerstein and M. Stonebraker, editors,
Readings in Database Systems, 4th edition, Morgan Kaufmann (2005).
[Hellerstein et al. 1995] J. M. Hellerstein, J. F. Naughton, and A. Pfeffer, “General-
ized Search Trees for Database Systems”, In Proc. of the International Conf. on Very
[Hennessy et al. 2006] J. L. Hennessy, D. A. Patterson, and D. Goldberg, Computer
Architecture: A Quantitative Approach, 4th edition, Morgan Kaufmann (2006).
[Hevner and Yao 1979] A. R. Hevner and S. B. Yao, “Query Processing in Dis-
tributed Database Systems”, IEEE Transactions on Software Engineering, Volume
SE-5, Number 3 (1979), pages 177–187.
[Heywood et al. 2002] I. Heywood, S. Cornelius, and S. Carver, An Introduction to
Geographical Information Systems, 2nd edition, Prentice Hall (2002).
[Hong et al. 1993] D. Hong, T. Johnson, and S. Chakravarthy, “Real-Time Transac-
tion Scheduling: A Cost Conscious Approach”, In Proc. of the ACM SIGMOD Conf.
[Howes et al. 1999] T. A. Howes, M. C. Smith, and G. S. Good, Understanding and
Deploying LDAP Directory Services, Macmillan Publishing (1999).
[Hristidis and Papakonstantinou 2002] V. Hristidis and Y. Papakonstantinou,
“DISCOVER: Keyword Search in Relational Databases”, In Proc. of the International
Conf. on Very Large Databases (2002).
[Huang and Garcia-Molina 2001] Y. Huang and H. Garcia-Molina, “Exactly-once
Semantics in a Replicated Messaging System”, In Proc. of the International Conf. on
Data Engineering (2001), pages 3–12.
[Hulgeri and Sudarshan 2003] A. Hulgeri and S. Sudarshan, “AniPQO: Almost
Non-Intrusive Parametric Query Optimization for Non-Linear Cost Functions”, In
Proc. of the International Conf. on Very Large Databases (2003).
[IBM 1987] IBM, “Systems Application Architecture: Common Programming In-
terface, Database Reference”, Technical report, IBM Corporation, IBM Form Num-
ber SC26–4348–0 (1987).
[Ilyas et al. 2008] I. Ilyas, G. Beskales, and M. A. Soliman, “A Survey of top-k query
processing techniques in relational database systems”, ACM Computing Surveys,
[Imielinski and Badrinath 1994] T. Imielinski and B. R. Badrinath, “Mobile
Computing — Solutions and Challenges”, Communications of the ACM, Volume 37,
Number 10 (1994).
[Imielinski and Korth 1996] T. Imielinski and H. F. Korth, editors, Mobile Comput-
ing, Kluwer Academic Publishers (1996).
[Ioannidis and Christodoulakis 1993] Y. Ioannidis and S. Christodoulakis, “Opti-
mal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Re-
sults”, ACM Transactions on Database Systems, Volume 18, Number 4 (1993), pages
709–748.
[Ioannidis and Poosala 1995] Y. E. Ioannidis and V. Poosala, “Balancing Histogram
Optimality and Practicality for Query Result Size Estimation”, In Proc. of the ACM
[Ioannidis et al. 1992] Y. E. Ioannidis, R. T. Ng, K. Shim, and T. K. Sellis, “Parametric
Query Optimization”, In Proc. of the International Conf. on Very Large Databases (1992),
pages 103–114.
[Jackson and Moulinier 2002] P. Jackson and I. Moulinier, Natural Language Pro-
cessing for Online Applications: Text Retrieval, Extraction, and Categorization, John
Benjamin (2002).
[Jagadish et al. 1993] H. V. Jagadish, A. Silberschatz, and S. Sudarshan, “Recov-
ering from Main-Memory Lapses”, In Proc. of the International Conf. on Very Large
Databases (1993).
[Jagadish et al. 1994] H. Jagadish, D. Lieuwen, R. Rastogi, A. Silberschatz, and
S. Sudarshan, “Dali: A High Performance Main Memory Storage Manager”, In
Proc. of the International Conf. on Very Large Databases (1994).
[Jain and Dubes 1988] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data,
Prentice Hall (1988).
[Jensen et al. 1994] C. S. Jensen et al., “A Consensus Glossary of Temporal Database
Concepts”, ACM SIGMOD Record, Volume 23, Number 1 (1994), pages 52–64.
[Jensen et al. 1996] C. S. Jensen, R. T. Snodgrass, and M. Soo, “Extending Existing
Dependency Theory to Temporal Databases”, IEEE Transactions on Knowledge and
Data Engineering, Volume 8, Number 4 (1996), pages 563–582.
[Johnson 1999] T. Johnson, “Performance Measurements of Compressed Bitmap
Indices”, In Proc. of the International Conf. on Very Large Databases (1999).
[Johnson and Shasha 1993] T. Johnson and D. Shasha, “The Performance of Con-
current B-Tree Algorithms”, ACM Transactions on Database Systems, Volume 18,
Number 1 (1993).
[Jones and Willet 1997] K. S. Jones and P. Willet, editors, Readings in Information
Retrieval, Morgan Kaufmann (1997).
[Jordan and Russell 2003] D. Jordan and C. Russell, Java Data Objects, O’Reilly
(2003).
[Jorwekar et al. 2007] S. Jorwekar, A. Fekete, K. Ramamritham, and S. Sudarshan,
“Automating the Detection of Snapshot Isolation Anomalies”, In Proc. of the Inter-
national Conf. on Very Large Databases (2007), pages 1263–1274.
[Joshi 1991] A. Joshi, “Adaptive Locking Strategies in a Multi-Node Shared Data
Model Environment”, In Proc. of the International Conf. on Very Large Databases (1991).
[Kanne and Moerkotte 2000] C.-C. Kanne and G. Moerkotte, “Efficient Storage of
XML Data”, In Proc. of the International Conf. on Data Engineering (2000), page 198.
[Katz et al. 2004] H. Katz, D. Chamberlin, D. Draper, M. Fernandez, M. Kay, J. Ro-
bie, M. Rys, J. Simeon, J. Tivy, and P. Wadler, XQuery from the Experts: A Guide to the
W3C XML Query Language, Addison Wesley (2004).
[Kaushik et al. 2004] R. Kaushik, R. Krishnamurthy, J. F. Naughton, and R. Ra-
makrishnan, “On the Integration of Structure Indexes and Inverted Lists”, In Proc.
[Kedem and Silberschatz 1979] Z. M. Kedem and A. Silberschatz, “Controlling
Concurrency Using Locking Protocols”, In Proc. of the Annual IEEE Symposium on
Foundations of Computer Science (1979), pages 275–285.
[Kedem and Silberschatz 1983] Z. M. Kedem and A. Silberschatz, “Locking Pro-
tocols: From Exclusive to Shared Locks”, Journal of the ACM, Volume 30, Number 4
(1983), pages 787–804.
[Kifer et al. 2005] M. Kifer, A. Bernstein, and P. Lewis, Database Systems: An Appli-
cation Oriented Approach, Complete Version, 2nd edition, Addison Wesley (2005).
[Kim 1982] W. Kim, “On Optimizing an SQL-like Nested Query”, ACM Transactions
[Kim 1995] W. Kim, editor, Modern Database Systems, ACM Press (1995).
[King et al. 1991] R. P. King, N. Halim, H. Garcia-Molina, and C. Polyzois, “Man-
agement of a Remote Backup Copy for Disaster Recovery”, ACM Transactions on
[Kitsuregawa and Ogawa 1990] M. Kitsuregawa and Y. Ogawa, “Bucket Spreading
Parallel Hash: A New, Robust, Parallel Hash Join Method for Skew in the Super
Database Computer”, In Proc. of the International Conf. on Very Large Databases (1990),
pages 210–221.
[Kleinberg 1999] J. M. Kleinberg, “Authoritative Sources in a Hyperlinked Envi-
ronment”, Journal of the ACM, Volume 46, Number 5 (1999), pages 604–632.
[Kleinrock 1975] L. Kleinrock, Queueing Systems, Wiley-Interscience (1975).
[Klug 1982] A. Klug, “Equivalence of Relational Algebra and Relational Calculus
Query Languages Having Aggregate Functions”, Journal of the ACM, Volume 29,
Number 3 (1982), pages 699–717.
[Knapp 1987] E. Knapp, “Deadlock Detection in Distributed Databases”, ACM
Computing Surveys, Volume 19, Number 4 (1987).
[Knuth 1973] D. E. Knuth, The Art of Computer Programming, Volume 3: Sorting
and Searching, Addison Wesley (1973).
[Kohavi and Provost 2001] R. Kohavi and F. Provost, editors, Applications of Data
Mining to Electronic Commerce, Kluwer Academic Publishers (2001).
[Konstan et al. 1997] J. A. Konstan, B. N. Miller, D. Maltz, J. L. Herlocker, L. R.
Gordon, and J. Riedl, “GroupLens: Applying Collaborative Filtering to Usenet
News”, Communications of the ACM, Volume 40, Number 3 (1997), pages 77–87.
[Korth 1982] H. F. Korth, “Deadlock Freedom Using Edge Locks”, ACM Transactions
[Korth 1983] H. F. Korth, “Locking Primitives in a Database System”, Journal of the
ACM, Volume 30, Number 1 (1983), pages 55–79.
[Korth and Speegle 1990] H. F. Korth and G. Speegle, “Long Duration Transactions
in Software Design Projects”, In Proc. of the International Conf. on Data Engineering
(1990), pages 568–575.
[Korth and Speegle 1994] H. F. Korth and G. Speegle, “Formal Aspects of Concur-
rency Control in Long Duration Transaction Systems Using the NT/PV Model”,
ACM Transactions on Database Systems, Volume 19, Number 3 (1994), pages 492–535.
[Krishnaprasad et al. 2004] M. Krishnaprasad, Z. Liu, A. Manikutty, J. W. Warner,
V. Arora, and S. Kotsovolos, “Query Rewrite for XML in Oracle XML DB”, In Proc.
[Kung and Lehman 1980] H. T. Kung and P. L. Lehman, “Concurrent Manipulation
of Binary Search Trees”, ACM Transactions on Database Systems, Volume 5, Number
3 (1980), pages 339–353.
[Kung and Robinson 1981] H. T. Kung and J. T. Robinson, “Optimistic Concur-
rency Control”, ACM Transactions on Database Systems, Volume 6, Number 2 (1981),
pages 312–326.
[Kurose and Ross 2005] J. Kurose and K. Ross, Computer Networking — A Top-Down
Approach Featuring the Internet, 3rd edition, Addison Wesley (2005).
[Lahiri et al. 2001] T. Lahiri, A. Ganesh, R. Weiss, and A. Joshi, “Fast-Start: Quick
Fault Recovery in Oracle”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (2001).
[Lam and Kuo 2001] K.-Y. Lam and T.-W. Kuo, editors, Real-Time Database Systems,
Kluwer Academic Publishers (2001).
[Lamb et al. 1991] C. Lamb, G. Landis, J. Orenstein, and D. Weinreb, “The Object-
Store Database System”, Communications of the ACM, Volume 34, Number 10 (1991),
pages 51–63.
[Lamport 1978] L. Lamport, “Time, Clocks, and the Ordering of Events in a Dis-
tributed System”, Communications of the ACM, Volume 21, Number 7 (1978), pages
558–565.
[Lampson and Sturgis 1976] B. Lampson and H. Sturgis, “Crash Recovery in a
Distributed Data Storage System”, Technical report, Computer Science Laboratory,
Xerox Palo Alto Research Center, Palo Alto (1976).
[Lecluse et al. 1988] C. Lecluse, P. Richard, and F. Velez, “O2: An Object-Oriented
Data Model”, In Proc. of the International Conf. on Very Large Databases (1988), pages
424–433.
[Lehman and Yao 1981] P. L. Lehman and S. B. Yao, “Efficient Locking for Con-
current Operations on B-trees”, ACM Transactions on Database Systems, Volume 6,
Number 4 (1981), pages 650–670.
[Lehner et al. 2000] W. Lehner, R. Sidle, H. Pirahesh, and R. Cochrane, “Main-
tenance of Automatic Summary Tables”, In Proc. of the ACM SIGMOD Conf. on
[Lindsay et al. 1980] B. G. Lindsay, P. G. Selinger, C. Galtieri, J. N. Gray, R. A. Lorie,
T. G. Price, G. R. Putzolu, I. L. Traiger, and B. W. Wade. “Notes on Distributed
Databases”, In Draffen and Poole, editors, Distributed Data Bases, pages 247–284.
Cambridge University Press (1980).
[Litwin 1978] W. Litwin, “Virtual Hashing: A Dynamically Changing Hashing”, In
Proc. of the International Conf. on Very Large Databases (1978), pages 517–523.
[Litwin 1980] W. Litwin, “Linear Hashing: A New Tool for File and Table Address-
ing”, In Proc. of the International Conf. on Very Large Databases (1980), pages 212–223.
[Litwin 1981] W. Litwin, “Trie Hashing”, In Proc. of the ACM SIGMOD Conf. on
[Lo and Ravishankar 1996] M.-L. Lo and C. V. Ravishankar, “Spatial Hash-Joins”,
In Proc. of the ACM SIGMOD Conf. on Management of Data (1996).
[Loeb 1998] L. Loeb, Secure Electronic Transactions: Introduction and Technical Refer-
ence, Artech House (1998).
[Lomet 1981] D. G. Lomet, “Digital B-trees”, In Proc. of the International Conf. on Very
[Lomet et al. 2009] D. Lomet, A. Fekete, G. Weikum, and M. Zwilling, “Unbundling
Transaction Services in the Cloud”, In Proc. 4th Biennial Conference on Innovative Data
Systems Research (2009).
[Lu et al. 1991] H. Lu, M. Shan, and K. Tan, “Optimization of Multi-Way Join
Queries for Parallel Execution”, In Proc. of the International Conf. on Very Large
[Lynch and Merritt 1986] N. A. Lynch and M. Merritt, “Introduction to the Theory
of Nested Transactions”, In Proc. of the International Conf. on Database Theory (1986).
[Lynch et al. 1988] N. A. Lynch, M. Merritt, W. Weihl, and A. Fekete, “A Theory of
Atomic Transactions”, In Proc. of the International Conf. on Database Theory (1988),
pages 41–71.
[Maier 1983] D. Maier, The Theory of Relational Databases, Computer Science Press
(1983).
[Manning et al. 2008] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to
Information Retrieval, Cambridge University Press (2008).
[Martin et al. 1989] J. Martin, K. K. Chapman, and J. Leben, DB2, Concepts, Design,
and Programming, Prentice Hall (1989).
[Mattison 1996] R. Mattison, Data Warehousing: Strategies, Technologies, and Tech-
niques, McGraw Hill (1996).
[McHugh and Widom 1999] J. McHugh and J. Widom, “Query Optimization for
XML”, In Proc. of the International Conf. on Very Large Databases (1999), pages 315–326.
[Mehrotra et al. 1991] S. Mehrotra, R. Rastogi, H. F. Korth, and A. Silberschatz,
“Non-Serializable Executions in Heterogeneous Distributed Database Systems”, In
Proc. of the International Conf. on Parallel and Distributed Information Systems (1991).
[Mehrotra et al. 2001] S. Mehrotra, R. Rastogi, Y. Breitbart, H. F. Korth, and
A. Silberschatz, “Overcoming Heterogeneity and Autonomy in Multidatabase
Systems”, Information and Computation, Volume 167, Number 2 (2001), pages 137–172.
[Melnik et al. 2007] S. Melnik, A. Adya, and P. A. Bernstein, “Compiling Mappings
to Bridge Applications and Databases”, In Proc. of the ACM SIGMOD Conf. on
[Melton 2002] J. Melton, Advanced SQL:1999 – Understanding Object-Relational and
Other Advanced Features, Morgan Kaufmann (2002).
[Melton and Eisenberg 2000] J. Melton and A. Eisenberg, Understanding SQL and
Java Together: A Guide to SQLJ, JDBC, and Related Technologies, Morgan Kaufmann
(2000).
[Melton and Simon 1993] J. Melton and A. R. Simon, Understanding The New SQL:
A Complete Guide, Morgan Kaufmann (1993).
[Melton and Simon 2001] J. Melton and A. R. Simon, SQL:1999, Understanding Re-
lational Language Components, Morgan Kaufmann (2001).
[Microsoft 1997] Microsoft, Microsoft ODBC 3.0 Software Development Kit and Pro-
grammer’s Reference, Microsoft Press (1997).
[Mistry et al. 2001] H. Mistry, P. Roy, S. Sudarshan, and K. Ramamritham, “Materi-
alized View Selection and Maintenance Using Multi-Query Optimization”, In Proc.
[Mitchell 1997] T. M. Mitchell, Machine Learning, McGraw Hill (1997).
[Mohan 1990a] C. Mohan, “ARIES/KVL: A Key-Value Locking Method for Concurrency
Control of Multiaction Transactions Operating on B-Tree Indexes”, In
Proc. of the International Conf. on Very Large Databases (1990), pages 392–405.
[Mohan 1990b] C. Mohan, “Commit-LSN: A Novel and Simple Method for Re-
ducing Locking and Latching in Transaction Processing Systems”, In Proc. of the
[Mohan 1993] C. Mohan, “IBM’s Relational Database Products: Features and
Technologies”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1993).
[Mohan and Levine 1992] C. Mohan and F. Levine, “ARIES/IM: An Efficient and
High-Concurrency Index Management Method Using Write-Ahead Logging”, In
Proc. of the ACM SIGMOD Conf. on Management of Data (1992).
[Mohan and Lindsay 1983] C. Mohan and B. Lindsay, “Efficient Commit Protocols
for the Tree of Processes Model of Distributed Transactions”, In Proc. of the ACM
Symposium on Principles of Distributed Computing (1983).
[Mohan and Narang 1992] C. Mohan and I. Narang, “Efficient Locking and
Caching of Data in the Multisystem Shared Disks Transaction Environment”, In
Proc. of the International Conf. on Extending Database Technology (1992).
[Mohan and Narang 1994] C. Mohan and I. Narang, “ARIES/CSA: A Method for
Database Recovery in Client-Server Architectures”, In Proc. of the ACM SIGMOD
[Mohan et al. 1986] C. Mohan, B. Lindsay, and R. Obermarck, “Transaction Man-
agement in the R* Distributed Database Management System”, ACM Transactions
[Mohan et al. 1992] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz,
“ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking
and Partial Rollbacks Using Write-Ahead Logging”, ACM Transactions on Database
Systems, Volume 17, Number 1 (1992).
[Moss 1985] J. E. B. Moss, Nested Transactions: An Approach to Reliable Distributed
Computing, MIT Press (1985).
[Moss 1987] J. E. B. Moss, “Log-Based Recovery for Nested Transactions”, In Proc.
[Murthy and Banerjee 2003] R. Murthy and S. Banerjee, “XML Schemas in Oracle
XML DB”, In Proc. of the International Conf. on Very Large Databases (2003), pages
1009–1018.
[Nakayama et al. 1984] T. Nakayama, M. Hirakawa, and T. Ichikawa, “Architecture
and Algorithm for Parallel Execution of a Join Operation”, In Proc. of the International
[Ng and Han 1994] R. T. Ng and J. Han, “Efficient and Effective Clustering Methods
for Spatial Data Mining”, In Proc. of the International Conf. on Very Large Databases
(1994).
[NIST 1993] NIST, “Integration Definition for Information Modeling (IDEF1X)”,
Technical Report Federal Information Processing Standards Publication
184, National Institute of Standards and Technology (NIST), Available at
www.idef.com/Downloads/pdf/Idef1x.pdf (1993).
[Nyberg et al. 1995] C. Nyberg, T. Barclay, Z. Cvetanovic, J. Gray, and D. B. Lomet,
“AlphaSort: A Cache-Sensitive Parallel External Sort”, VLDB Journal, Volume 4,
Number 4 (1995), pages 603–627.
[O’Neil and O’Neil 2000] P. O’Neil and E. O’Neil, Database: Principles, Program-
ming, Performance, 2nd edition, Morgan Kaufmann (2000).
[O’Neil and Quass 1997] P. O’Neil and D. Quass, “Improved Query Performance
with Variant Indexes”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(1997).
[O’Neil et al. 2004] P. E. O’Neil, E. J. O’Neil, S. Pal, I. Cseri, G. Schaller, and N. West-
bury, “ORDPATHs: Insert-Friendly XML Node Labels”, In Proc. of the ACM SIG-
MOD Conf. on Management of Data (2004), pages 903–908.
[Ooi and S. Parthasarathy 2009] B. C. Ooi and S. Parthasarathy, “Special Issue
on Data Management on Cloud Computing Platforms”, Data Engineering Bulletin,
[Orenstein 1982] J. A. Orenstein, “Multidimensional Tries Used for Associative
Searching”, Information Processing Letters, Volume 14, Number 4 (1982), pages 150–
157.
[Ozcan et al. 1997] F. Ozcan, S. Nural, P. Koksal, C. Evrendilek, and A. Dogac, “Dy-
namic Query Optimization in Multidatabases”, Data Engineering Bulletin, Volume
20, Number 3 (1997), pages 38–45.
[Ozden et al. 1994] B. Ozden, A. Biliris, R. Rastogi, and A. Silberschatz, “A Low-
cost Storage Server for a Movie on Demand Database”, In Proc. of the International
[Ozden et al. 1996a] B. Ozden, R. Rastogi, P. Shenoy, and A. Silberschatz, “Fault-
Tolerant Architectures for Continuous Media Servers”, In Proc. of the ACM SIGMOD
[Ozden et al. 1996b] B. Ozden, R. Rastogi, and A. Silberschatz, “On the Design of a
Low-Cost Video-on-Demand Storage System”, Multimedia Systems Journal, Volume
4, Number 1 (1996), pages 40–54.
[Ozsoyoglu and Snodgrass 1995] G. Ozsoyoglu and R. Snodgrass, “Temporal and
Real-Time Databases: A Survey”, IEEE Transactions on Knowledge and Data Engineer-
[Ozsu and Valduriez 1999] T. Ozsu and P. Valduriez, Principles of Distributed
Database Systems, 2nd edition, Prentice Hall (1999).
[Padmanabhan et al. 2003] S. Padmanabhan, B. Bhattacharjee, T. Malkemus,
L. Cranston, and M. Huras, “Multi-Dimensional Clustering: A New Data Layout
Scheme in DB2”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(2003), pages 637–641.
[Pal et al. 2004] S. Pal, I. Cseri, G. Schaller, O. Seeliger, L. Giakoumakis, and V. Zolo-
tov, “Indexing XML Data Stored in a Relational Database”, In Proc. of the International
[Pang et al. 1995] H.-H. Pang, M. J. Carey, and M. Livny, “Multiclass Scheduling in
Real-Time Database Systems”, IEEE Transactions on Knowledge and Data Engineering,
[Papakonstantinou et al. 1996] Y. Papakonstantinou, A. Gupta, and L. Haas,
“Capabilities-Based Query Rewriting in Mediator Systems”, In Proc. of the Inter-
national Conf. on Parallel and Distributed Information Systems (1996).
[Parker et al. 1983] D. S. Parker, G. J. Popek, G. Rudisin, A. Stoughton, B. J. Walker,
E. Walton, J. M. Chow, D. Edwards, S. Kiser, and C. Kline, “Detection of Mutual
Inconsistency in Distributed Systems”, IEEE Transactions on Software Engineering,
[Patel and DeWitt 1996] J. Patel and D. J. DeWitt, “Partition Based Spatial-Merge
Join”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1996).
[Patterson 2004] D. A. Patterson, “Latency Lags Bandwidth”, Communications of the
ACM, Volume 47, Number 10 (2004), pages 71–75.
[Patterson et al. 1988] D. A. Patterson, G. Gibson, and R. H. Katz, “A Case for
Redundant Arrays of Inexpensive Disks (RAID)”, In Proc. of the ACM SIGMOD
[Pellenkoft et al. 1997] A. Pellenkoft, C. A. Galindo-Legaria, and M. Kersten, “The
Complexity of Transformation-Based Join Enumeration”, In Proc. of the International
[Peterson and Davie 2007] L. L. Peterson and B. S. Davie, Computer Networks: A
Systems Approach, Morgan Kaufmann Publishers Inc. (2007).
[Pless 1998] V. Pless, Introduction to the Theory of Error-Correcting Codes, 3rd edition,
John Wiley and Sons (1998).
[Poe 1995] V. Poe, Building a Data Warehouse for Decision Support, Prentice Hall
(1995).
[Polyzois and Garcia-Molina 1994] C. Polyzois and H. Garcia-Molina, “Evalua-
tion of Remote Backup Algorithms for Transaction-Processing Systems”, ACM
[Poosala et al. 1996] V. Poosala, Y. E. Ioannidis, P. J. Haas, and E. J. Shekita, “Im-
proved Histograms for Selectivity Estimation of Range Predicates”, In Proc. of the
ACM SIGMOD Conf. on Management of Data (1996), pages 294–305.
[Popek et al. 1981] G. J. Popek, B. J. Walker, J. M. Chow, D. Edwards, C. Kline,
G. Rudisin, and G. Thiel, “LOCUS: A Network Transparent, High Reliability
Distributed System”, In Proc. of the Eighth Symposium on Operating System Principles
(1981), pages 169–177.
[Pöss and Potapov 2003] M. Pöss and D. Potapov, “Data Compression in Oracle”,
In Proc. of the International Conf. on Very Large Databases (2003), pages 937–947.
[Rahm 1993] E. Rahm, “Empirical Performance Evaluation of Concurrency and
Coherency Control Protocols for Database Sharing Systems”, ACM Transactions on
Database Systems, Volume 18, Number 2 (1993).
[Ramakrishna and Larson 1989] M. V. Ramakrishna and P. Larson, “File Organi-
zation Using Composite Perfect Hashing”, ACM Transactions on Database Systems,
[Ramakrishnan and Gehrke 2002] R. Ramakrishnan and J. Gehrke, Database Man-
agement Systems, 3rd edition, McGraw Hill (2002).
[Ramakrishnan and Ullman 1995] R. Ramakrishnan and J. D. Ullman, “A Survey
of Deductive Database Systems”, Journal of Logic Programming, Volume 23, Number
2 (1995), pages 125–149.
[Ramakrishnan et al. 1992] R. Ramakrishnan, D. Srivastava, and S. Sudarshan,
Controlling the Search in Bottom-up Evaluation (1992).
[Ramesh et al. 1989] R. Ramesh, A. J. G. Babu, and J. P. Kincaid, “Index Optimiza-
tion: Theory and Experimental Results”, ACM Transactions on Database Systems,
[Rangan et al. 1992] P. V. Rangan, H. M. Vin, and S. Ramanathan, “Designing an
On-Demand Multimedia Service”, Communications Magazine, Volume 1, Number 1
(1992), pages 56–64.
[Rao and Ross 2000] J. Rao and K. A. Ross, “Making B+-Trees Cache Conscious in
Main Memory”, In Proc. of the ACM SIGMOD Conf. on Management of Data (2000),
pages 475–486.
[Rathi et al. 1990] A. Rathi, H. Lu, and G. E. Hedrick, “Performance Comparison of
Extendable Hashing and Linear Hashing Techniques”, In Proc. ACM SIGSmall/PC
Symposium on Small Systems (1990), pages 178–185.
[Reed 1983] D. Reed, “Implementing Atomic Actions on Decentralized Data”,
ACM Transactions on Computer Systems, Volume 1, Number 1 (1983), pages 3–23.
[Revesz 2002] P. Revesz, Introduction to Constraint Databases, Springer Verlag (2002).
[Richardson et al. 1987] J. Richardson, H. Lu, and K. Mikkilineni, “Design and
Evaluation of Parallel Pipelined Join Algorithms”, In Proc. of the ACM SIGMOD
[Rivest 1976] R. L. Rivest, “Partial Match Retrieval Via the Method of Superim-
posed Codes”, SIAM Journal of Computing, Volume 5, Number 1 (1976), pages
19–50.
[Robinson 1981] J. Robinson, “The k-d-B Tree: A Search Structure for Large Mul-
tidimensional Indexes”, In Proc. of the ACM SIGMOD Conf. on Management of Data
(1981), pages 10–18.
[Roos 2002] R. M. Roos, Java Data Objects, Pearson Education (2002).
[Rosch 2003] W. L. Rosch, The Winn L. Rosch Hardware Bible, 6th edition, Que (2003).
[Rosenthal and Reiner 1984] A. Rosenthal and D. Reiner, “Extending the Algebraic
Framework of Query Processing to Handle Outerjoins”, In Proc. of the International
[Ross 1990] K. A. Ross, “Modular Stratification and Magic Sets for DATALOG
Programs with Negation”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (1990).
[Ross 1999] S. M. Ross, Introduction to Probability and Statistics for Engineers and
Scientists, Harcourt/Academic Press (1999).
[Ross and Srivastava 1997] K. A. Ross and D. Srivastava, “Fast Computation of
Sparse Datacubes”, In Proc. of the International Conf. on Very Large Databases (1997),
pages 116–125.
[Ross et al. 1996] K. Ross, D. Srivastava, and S. Sudarshan, “Materialized View
Maintenance and Integrity Constraint Checking: Trading Space for Time”, In Proc.
[Rothermel and Mohan 1989] K. Rothermel and C. Mohan, “ARIES/NT: A Recov-
ery Method Based on Write-Ahead Logging for Nested Transactions”, In Proc. of
[Roy et al. 2000] P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobhe, “Efficient and Ex-
tensible Algorithms for Multi-Query Optimization”, In Proc. of the ACM SIGMOD
[Rusinkiewicz and Sheth 1995] M. Rusinkiewicz and A. Sheth. “Specification and
Execution of Transactional Workflows”, In Kim [1995], pages 592–620 (1995).
[Rustin 1972] R. Rustin, Data Base Systems, Prentice Hall (1972).
[Rys 2001] M. Rys, “Bringing the Internet to Your Database: Using SQL Server 2000
and XML to Build Loosely-Coupled Systems”, In Proc. of the International Conf. on
Data Engineering (2001), pages 465–472.
[Rys 2003] M. Rys. “XQuery and Relational Database Systems”, In H. Katz, editor,
XQuery From the Experts, pages 353–391. Addison Wesley (2003).
[Rys 2004] M. Rys. “What’s New in FOR XML in Microsoft SQL Server 2005”.
http://msdn.microsoft.com/en-us/library/ms345137(SQL.90).aspx (2004).
[Sagiv and Yannakakis 1981] Y. Sagiv and M. Yannakakis, “Equivalence among
Relational Expressions with the Union and Difference Operators”, Proc. of the ACM
SIGMOD Conf. on Management of Data (1981).
[Salton 1989] G. Salton, Automatic Text Processing, Addison Wesley (1989).
[Samet 1990] H. Samet, The Design and Analysis of Spatial Data Structures, Addison
Wesley (1990).
[Samet 1995a] H. Samet, “General Research Issues in Multimedia Database
Systems”, ACM Computing Surveys, Volume 27, Number 4 (1995), pages 630–632.
[Samet 1995b] H. Samet. “Spatial Data Structures”, In Kim [1995], pages 361–385
(1995).
[Samet 2006] H. Samet, Foundations of Multidimensional and Metric Data Structures,
Morgan Kaufmann (2006).
[Samet and Aref 1995] H. Samet and W. Aref. “Spatial Data Models and Query
Processing”, In Kim [1995], pages 338–360 (1995).
[Sanders 1998] R. E. Sanders, ODBC 3.5 Developer’s Guide, McGraw Hill (1998).
[Sarawagi 2000] S. Sarawagi, “User-Adaptive Exploration of Multidimensional
Data”, In Proc. of the International Conf. on Very Large Databases (2000), pages 307–316.
[Sarawagi et al. 2002] S. Sarawagi, A. Bhamidipaty, A. Kirpal, and C. Mouli,
“ALIAS: An Active Learning Led Interactive Deduplication System”, In Proc. of
[Schlageter 1981] G. Schlageter, “Optimistic Methods for Concurrency Control in
Distributed Database Systems”, In Proc. of the International Conf. on Very Large
[Schneider 1982] H. J. Schneider, “Distributed Data Bases”, In Proc. of the Interna-
tional Symposium on Distributed Databases (1982).
[Selinger et al. 1979] P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie,
and T. G. Price, “Access Path Selection in a Relational Database System”, In Proc.
[Sellis 1988] T. K. Sellis, “Multiple Query Optimization”, ACM Transactions on
[Sellis et al. 1987] T. K. Sellis, N. Roussopoulos, and C. Faloutsos, “The R+-Tree: A
Dynamic Index for Multi-Dimensional Objects”, In Proc. of the International Conf. on
[Seshadri et al. 1996] P. Seshadri, H. Pirahesh, and T. Y. C. Leung, “Complex Query
Decorrelation”, In Proc. of the International Conf. on Data Engineering (1996), pages
450–458.
[Shafer et al. 1996] J. C. Shafer, R. Agrawal, and M. Mehta, “SPRINT: A Scalable
Parallel Classifier for Data Mining”, In Proc. of the International Conf. on Very Large
[Shanmugasundaram et al. 1999] J. Shanmugasundaram, G. He, K. Tufte,
C. Zhang, D. DeWitt, and J. Naughton, “Relational Databases for Querying XML
Documents: Limitations and Opportunities”, In Proc. of the International Conf. on
Very Large Databases (1999).
[Shapiro 1986] L. D. Shapiro, “Join Processing in Database Systems with Large
Main Memories”, ACM Transactions on Database Systems, Volume 11, Number 3
(1986), pages 239–264.
[Shasha and Bonnet 2002] D. Shasha and P. Bonnet, Database Tuning: Principles,
Experiments, and Troubleshooting Techniques, Morgan Kaufmann (2002).
[Silberschatz 1982] A. Silberschatz, “A Multi-Version Concurrency Control
Scheme With No Rollbacks”, In Proc. of the ACM Symposium on Principles of Dis-
tributed Computing (1982), pages 216–223.
[Silberschatz and Kedem 1980] A. Silberschatz and Z. Kedem, “Consistency in
Hierarchical Database Systems”, Journal of the ACM, Volume 27, Number 1 (1980),
pages 72–80.
[Silberschatz et al. 1990] A. Silberschatz, M. R. Stonebraker, and J. D. Ullman,
“Database Systems: Achievements and Opportunities”, ACM SIGMOD Record, Vol-
ume 19, Number 4 (1990).
[Silberschatz et al. 1996] A. Silberschatz, M. Stonebraker, and J. Ullman, “Database
Research: Achievements and Opportunities into the 21st Century”, Technical Re-
port CS-TR-96-1563, Department of Computer Science, Stanford University, Stan-
ford (1996).
[Silberschatz et al. 2008] A. Silberschatz, P. B. Galvin, and G. Gagne, Operating
System Concepts, 8th edition, John Wiley and Sons (2008).
[Simmen et al. 1996] D. Simmen, E. Shekita, and T. Malkemus, “Fundamental
Techniques for Order Optimization”, In Proc. of the ACM SIGMOD Conf. on Man-
agement of Data (1996), pages 57–67.
[Skeen 1981] D. Skeen, “Non-blocking Commit Protocols”, In Proc. of the ACM
[Soderland 1999] S. Soderland, “Learning Information Extraction Rules for Semi-
structured and Free Text”, Machine Learning, Volume 34, Number 1–3 (1999), pages
233–272.
[Soo 1991] M. Soo, “Bibliography on Temporal Databases”, ACM SIGMOD Record,
[SQL/XML 2004] SQL/XML. “ISO/IEC 9075-14:2003, Information Technology:
Database languages: SQL. Part 14: XML-Related Specifications (SQL/XML)” (2004).
[Srikant and Agrawal 1996a] R. Srikant and R. Agrawal, “Mining Quantitative As-
sociation Rules in Large Relational Tables”, In Proc. of the ACM SIGMOD Conf. on
Management of Data (1996).
[Srikant and Agrawal 1996b] R. Srikant and R. Agrawal, “Mining Sequential Pat-
terns: Generalizations and Performance Improvements”, In Proc. of the International
Conf. on Extending Database Technology (1996), pages 3–17.
[Stam and Snodgrass 1988] R. Stam and R. Snodgrass, “A Bibliography on Tem-
poral Databases”, IEEE Transactions on Knowledge and Data Engineering, Volume 7,
Number 4 (1988), pages 231–239.
[Stinson 2002] B. Stinson, PostgreSQL Essential Reference, New Riders (2002).
[Stonebraker 1986] M. Stonebraker, “Inclusion of New Types in Relational
Database Systems”, In Proc. of the International Conf. on Data Engineering (1986),
pages 262–269.
[Stonebraker and Rowe 1986] M. Stonebraker and L. Rowe, “The Design of POST-
GRES”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1986).
[Stonebraker et al. 1989] M. Stonebraker, P. Aoki, and M. Seltzer, “Parallelism in
XPRS”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1989).
[Stonebraker et al. 1990] M. Stonebraker, A. Jhingran, J. Goh, and S. Potamianos,
“On Rules, Procedure, Caching and Views in Database Systems”, In Proc. of the
ACM SIGMOD Conf. on Management of Data (1990), pages 281–290.
[Stuart et al. 1984] D. G. Stuart, G. Buckley, and A. Silberschatz, “A Centralized
Deadlock Detection Algorithm”, Technical report, Department of Computer Sci-
ences, University of Texas, Austin (1984).
[Tanenbaum 2002] A. S. Tanenbaum, Computer Networks, 4th edition, Prentice Hall
(2002).
[Tansel et al. 1993] A. Tansel, J. Clifford, S. Gadia, S. Jajodia, A. Segev, and R. Snod-
grass, Temporal Databases: Theory, Design and Implementation, Benjamin Cummings
(1993).
[Teorey et al. 1986] T. J. Teorey, D. Yang, and J. P. Fry, “A Logical Design Method-
ology for Relational Databases Using the Extended Entity-Relationship Model”,
ACM Computing Surveys, Volume 18, Number 2 (1986), pages 197–222.
[Thalheim 2000] B. Thalheim, Entity-Relationship Modeling: Foundations of Database
Technology, Springer Verlag (2000).
[Thomas 1996] S. A. Thomas, IPng and the TCP/IP Protocols: Implementing the Next
Generation Internet, John Wiley and Sons (1996).
[Traiger et al. 1982] I. L. Traiger, J. N. Gray, C. A. Galtieri, and B. G. Lindsay, “Trans-
actions and Consistency in Distributed Database Management Systems”, ACM
[Tyagi et al. 2003] S. Tyagi, M. Vorburger, K. McCammon, and H. Bobzin, Core Java
Data Objects, Prentice Hall (2003).
[Umar 1997] A. Umar, Application (Re)Engineering: Building Web-Based Applications
and Dealing With Legacies, Prentice Hall (1997).
[UniSQL 1991] UniSQL/X Database Management System User’s Manual: Release 1.2.
UniSQL, Inc. (1991).
[Verhofstad 1978] J. S. M. Verhofstad, “Recovery Techniques for Database
Systems”, ACM Computing Surveys, Volume 10, Number 2 (1978), pages 167–195.
[Vista 1998] D. Vista, “Integration of Incremental View Maintenance into Query
Optimizers”, In Proc. of the International Conf. on Extending Database Technology
(1998).
[Vitter 2001] J. S. Vitter, “External Memory Algorithms and Data Structures: Deal-
ing with Massive Data”, ACM Computing Surveys, Volume 33, (2001), pages 209–271.
[Walsh et al. 2007] N. Walsh et al. “XQuery 1.0 and XPath 2.0 Data Model”.
http://www.w3.org/TR/xpath-datamodel. Currently a W3C Recommendation
(2007).
[Walton et al. 1991] C. Walton, A. Dale, and R. Jenevein, “A Taxonomy and Per-
formance Model of Data Skew Effects in Parallel Joins”, In Proc. of the International
[Weikum 1991] G. Weikum, “Principles and Realization Strategies of Multilevel
Transaction Management”, ACM Transactions on Database Systems, Volume 16,
Number 1 (1991).
[Weikum et al. 1990] G. Weikum, C. Hasse, P. Broessler, and P. Muth, “Multi-Level
Recovery”, In Proc. of the ACM SIGMOD Conf. on Management of Data (1990), pages
109–123.
[Wilschut et al. 1995] A. N. Wilschut, J. Flokstra, and P. M. Apers, “Parallel Evaluation
of Multi-Join Queries”, In Proc. of the ACM SIGMOD Conf. on Management of
Data (1995), pages 115–126.
[Witten and Frank 1999] I. H. Witten and E. Frank, Data Mining: Practical Machine
Learning Tools and Techniques with Java Implementations, Morgan Kaufmann (1999).
[Witten et al. 1999] I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes:
Compressing and Indexing Documents and Images, 2nd edition, Morgan Kaufmann
(1999).
[Wolf 1991] J. Wolf, “An Effective Algorithm for Parallelizing Hash Joins in the
Presence of Data Skew”, In Proc. of the International Conf. on Data Engineering (1991).
[Wu and Buchmann 1998] M. Wu and A. Buchmann, “Encoded Bitmap Indexing
for Data Warehouses”, In Proc. of the International Conf. on Data Engineering (1998).
[Wu et al. 2003] Y. Wu, J. M. Patel, and H. V. Jagadish, “Structural Join Order Se-
lection for XML Query Optimization”, In Proc. of the International Conf. on Data
Engineering (2003).
[X/Open 1991] X/Open Snapshot: X/Open DTP: XA Interface. X/Open Company,
Ltd. (1991).
[Yan and Larson 1995] W. P. Yan and P. A. Larson, “Eager Aggregation and Lazy
Aggregation”, In Proc. of the International Conf. on Very Large Databases (1995).
[Yannakakis et al. 1979] M. Yannakakis, C. H. Papadimitriou, and H. T. Kung,
“Locking Protocols: Safety and Freedom from Deadlock”, In Proc. of the IEEE Sym-
posium on the Foundations of Computer Science (1979), pages 286–297.
[Zaharioudakis et al. 2000] M. Zaharioudakis, R. Cochrane, G. Lapis, H. Pirahesh,
and M. Urata, “Answering Complex SQL Queries using Automatic Summary Ta-
bles”, In Proc. of the ACM SIGMOD Conf. on Management of Data (2000), pages
105–116.
[Zeller and Gray 1990] H. Zeller and J. Gray, “An Adaptive Hash Join Algorithm
for Multiuser Environments”, In Proc. of the International Conf. on Very Large
[Zhang et al. 1996] T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: An Efficient
Data Clustering Method for Very Large Databases”, In Proc. of the ACM SIGMOD
[Zhou and Ross 2004] J. Zhou and K. A. Ross, “Buffering Database Operations for
Enhanced Instruction Cache Performance”, In Proc. of the ACM SIGMOD Conf. on
[Zhuge et al. 1995] Y. Zhuge, H. Garcia-Molina, J. Hammer, and J. Widom, “View
Maintenance in a Warehousing Environment”, In Proc. of the ACM SIGMOD Conf.
on Management of Data (1995), pages 316–327.
[Ziauddin et al. 2008] M. Ziauddin, D. Das, H. Su, Y. Zhu, and K. Yagoub, “Optimizer
Plan Change Management: Improved Stability and Performance in Oracle
11g”, Proceedings of the VLDB Endowment, Volume 1, Number 2 (2008), pages 1346–
1355.
[Zikopoulos et al. 2004] P. Zikopoulos, G. Baklarz, D. deRoos, and R. B. Melnyk,
DB2 Version 8: The Official Guide, IBM Press (2004).
[Zikopoulos et al. 2007] P. Zikopoulos, G. Baklarz, L. Katsnelson, and C. Eaton,
IBM DB2 9 New Features, McGraw Hill (2007).
[Zikopoulos et al. 2009] P. Zikopoulos, B. Tassi, G. Baklarz, and C. Eaton, Break
Free with DB2 9.7, McGraw Hill (2009).
[Zilio et al. 2004] D. C. Zilio, J. Rao, S. Lightstone, G. M. Lohman, A. Storm,
C. Garcia-Arellano, and S. Fadden, “DB2 Design Advisor: Integrated Automatic
Physical Database Design”, In Proc. of the International Conf. on Very Large Databases
(2004), pages 1087–1097.
[Zloof 1977] M. M. Zloof, “Query-by-Example: A Data Base Language”, IBM Sys-
tems Journal, Volume 16, Number 4 (1977), pages 324–343.
Index
2PC. See two-phase commit query optimization and, 597 disconnected operation and,
3NF. See third normal form query processing and, 566-567 395-396
3PC. See three-phase commit ranking and, 192-195 encryption and, 411-417
relational algebra and, HyperText Markup Language
abstract data types, 1127 235-239 (HTML) and, 378-380
access paths, 542 representation of, 304 HyperText Transfer Protocol
ACID properties. See view maintenance and, (HTTP) and, 377-381, 383,
atomicity; consistency; 610-611 395, 404-406, 417
durability; isolation windowing, 195-197 Java Server Pages (JSP) and,
Active Server Pages (ASP), 397 Ajax, 390-391, 398, 867 377, 383-391
ADO.NET, 169, 395, 1249, 1253 aliases, 75, 355, 829, 872-873, performance and, 400-402
Advanced Encryption Standard 1229 rapid application
(AES), 412-413 alter table, 63, 129 development and,
agglomerative clustering, alter trigger, 185 396-400
907-908 alter type, 140 security and, 402-417
aggregate functions American National Standards servlets and, 383-391
basic, 85-86 Institute (ANSI), 57, 1051 three-layer architecture and,
Boolean values and, 89-90 analysis pass, 753 318
SQL, 84 analytic workspaces, 1161 TP-monitors and, 1095-1096
fusion, 960 uniform resource locators
and operation, 66, 83-84, 1174
with grouping, 86-88 (URLs), 377-378
any keyword, 92n8
having clause, 88-89
Apache, 386, 399, 426, 980 user interfaces and, 375-377
null values and, 89-90
Apple Macintosh OS X, 1124 World Wide Web and, 377-382
aggregation
advanced features of, 192-197 application design, 418 application development
alternative notations for, application architectures and, performance benchmarks
304-310 391-396 and, 1045-1048
entity-relationship (E-R) authentication and, 405-407 performance tuning and,
model and, 301-302, 304 business-logic layer and, 1029-1045
IBM DB2 and, 1209-1210 391-392 set orientation and, 1030-1031
intraoperation parallelism client-server architecture and, standardization and,
and, 811 376-377 1051-1056
.NET Common Language common gateway interface testing applications and,
Runtime (CLR) and, (CGI), 380-381 1048-1051
1257-1258 cookies and, 382-385 updates and, 1030-1033
OLAP and, 197-209 data access layer and, 391, application migration,
PostgreSQL and, 1153 393, 395 1050-1051
application program interface two-phase commit protocol distributed transactions and,

(API) (2PC), 786-788 830-832
ADO.NET, 169, 1054 two-tier, 24-25 isolation and, 646-648
application design and, wide-area networks (WANs), log records and, 726-728,
383-386, 395 788, 790-791 730-734
C++, 1054 ARIES, 1146 recoverable schedules and,
customized maps and, 1068, analysis pass, 753 647
1070 compensation log records recovery systems and, 726-735
DOM, 1020 (CLRs), 751-752, 754 storage structure and, 632-633
IBM DB2, 1196 data structure, 751-753 workflows and, 1099-1100
Java, 158-166, 213, 383-386, dirty page table, 750-755 attribute inheritance, 298-299
1018, 1030 fine-grained locking, 756 attributes
LDAP, 874 fuzzy checkpoints, 750-752 atomic domains and, 327-329
Microsoft SQL Server, 1229, log sequence number (LSN), classifiers and, 896-897
1245, 1248-1250, 750-755 closure of attribute sets and,
1253-1255, 1265, 1267 nested top actions, 755-756 340-342
ODBC, 166-169, 1053 optimization and, 756 complex, 277-278, 284-285
PostgreSQL, 1125 physiological redo and, 750 composite, 267
SAX, 1020 recovery and, 751, 753, 756 continuous valued, 898
standards for, 1051-1056 redo pass, 754 decomposition and, 329-338,
system architectures and, 772 savepoints and, 756 348-360
XML, 985, 1008-1009 transaction rollback and, derived, 268
apply operator, 1230-1231 754-755 design issues and, 290-291
architectures, 767 undo pass, 754-755 domain, 267
business logic and, 25, 1158, Arpanet, 790 entity-relationship diagrams
1221-1222, 1228, 1232, array types, 956-961 and, 277-278
1253, 1263-1267 asc expression, 77-78 entity-relationship (E-R)
centralized, 769-771 as clause, 75-76 model and, 263, 267-269
client-server, 771-772 ASP.NET, 387 entity sets and, 283-286,
cloud-based, 777 assertions, 11, 135-136 290-291
data server, 773, 775-777 assignment operation, 176, 217, multiple-key access and,
data warehouse, 889-891 232, 1052 506-509
distributed systems, 784-788 associations, 17, 43 multivalued, 267-268, 327-329
hierarchical, 781, 784 entity sets and, 290 (see also naming of, 362-363
hypercube, 781 entity sets) nesting, 958-961
mesh, 780-781 relationship sets and, 264-267, null values and, 268-269
network types and, 788-791 308-309, 314 partitioned, 896-897
parallel databases, 797-820 rules for, 904-907 placement of, 294-295
parallel systems, 777-784 associative property, 584-585 search key and, 476
process monitor, 774 Aster Data, 816 simple, 267, 283-284
server system, 772-777 asymmetric-key encryption, single-valued, 267-268
shared-disk, 781, 783, 789 412 Unified Modeling Language
shared memory, 781-783 atomic domains, 42 (UML) and, 308-310
shared-nothing, 781, 783-784, first normal form and, 327-329 uniquifiers and, 498-499
803 object-based databases and, unnesting, 958-961
shared server, 1185 947 value set of, 267
single-user system, 770-771 relational database design xmlattributes and, 1015
source-driven, 890 and, 327-329 XML types and, 990-998
storage-area network (SAN), atomicity, 4-5, 22-23, 104, 625 attribute-value skew, 800-801
789 cascadeless schedules and, audit trails, 409-410
thread pooling and, 1246 647-648 augmentation rule, 339
three-tier, 25 commit protocols and, authentication
TP-monitor, 1092-1095 832-838 challenge-response system
transaction-server, 773-775 defined, 628 and, 415
digital certificates and, IBM DB2 and, 1215, 1218 See also specific
416-417 Microsoft SQL Server and, operation
digital signatures and, 1223, 1227-1228, 1245, bottlenecks
416-417 1262 application design and, 402,
encryption and, 415-417 Oracle and, 1165, 1169, 1029, 1033-1035, 1039
security and, 405-407 (see also 1181-1183, 1187-1190 file structure and, 468
security) parallel databases and, 816 system architectures and,
single sign-on system and, recovery systems and, 726-738 782-783, 800, 816-819,
406-407 (see also recovery 829, 839-840
smart cards and, 415-416 systems) transactions and, 1106-1107,
two-factor, 405-407 remote systems for, 756-759, 1116
authorization, 11, 21, 58 850, 1095-1096 Boyce-Codd normal form
administrators and, 143 system architectures and, 770 (BCNF)
application-level, 407-409 transactions and, 632, 1115 decomposition algorithms
check clause, 148 backup coordinator, 851 and, 349-356
database design and, 312 balanced range-partitioning, dependency preservation
end-user information and, 801 and, 334-336
407-408 balanced tree, 486 relational database design
granting/revoking privileges, BASE properties, 853 and, 333-336
29, 143-145, 149-150 batch scaleup, 779 broadcast data, 1082-1083
lack of fine-grained, 408-409 batch update, 1030-1031 B-trees
roles and, 145-146 Bayesian classifiers, 900-901 application development and,
schema and, 147-148 1039
Bayes’ theorem, 900
Security Assertion Markup IBM DB2 and, 1205
BCNF. See Boyce-Codd normal
Language (SAML) and, indices and, 504-506, 530, 1039
form
407 Oracle and, 1159, 1164-1169,
begin atomic...end, 128, 176,
sql security invoker, 147 1173
181, 183-185
transfer of privileges, 148-149 PostgreSQL and, 1135,
best splits, 897-899
update and, 147, 148 1148-1150
views and, 146-147 biased protocol, 841
spatial data and, 1064,
autonomy, 858 big-bang approach, 1050-1051
1071-1072, 1076, 1086
availability Bigtable, 862-867 B+-trees, 1075
CAP theorem and, 852-853 binary splits, 898 balanced trees and, 486
consistency and, 852-853 bit-interleaved parity bitmap indices and, 528
coordinator selection and, organization, 445 bulk loading of indices and,
850-852 bit-level striping, 442-444 503-504
majority-based approach and, bitmap indices, 507, 509, 531, deletion time and, 491,
848-849 536 495-497, 499-501
read one, write all approach, B+-trees and, 528 extensions and, 500-506
849-850 efficient operations of, 527-528 fanout and, 487
remote backup and, 850 existence, 526-527 file organization and, 500-502
robustness and, 847 Oracle and, 1166-1167 flash memory and, 506
site reintegration and, 850 scans, 1153 indexing strings and, 502-503
average response time, 636 sequential records and, insertion time and, 491-495,
average seek time, 435, 540n2 524-525 499-501
avg expression, 204 structure of, 525-527 on multiple keys, 508
aggregate functions and, blind writes, 687 nonleaf nodes of, 487
84-88 blobs, 138, 166, 457, 502, 1013, nonunique search keys and,
query processing and, 566-567 1198-1199, 1259 497-499
relational algebra and, 236 block-interleaved parity performance and, 485-486
organization, 445-446 pointers and, 486
backup, 186-187 block-level striping, 443-4 queries on, 488-491, 538, 544
application design and, 415 block nested-loop join, 551-552 record relocation and, 502
distributed databases and, Boolean operations, 83, 89-90, secondary indices and, 502
839, 877 94, 161, 176, 873, 1256. structure of, 486-488
updates and, 491-500 application design and, 387, distributed databases and,
buffer manager, 21, 464-466 397-398 830, 848
buffer pools, 1201-1202, 1220 Microsoft SQL Server and, e-catalogs and, 1103-1104
buffers 1228, 1253 IBM DB2 and, 1195, 1220
database buffering and, ODBC and, 167-168 indices and, 476 (see also
739-741 Unified Modeling Language indices)
file structure and, 437-438 (UML) and, 308 information retrieval and,
force/no-force policy and, C++, 7n1, 14 915, 935
739-740 advanced SQL and, 169, 173 Microsoft SQL Server and,
force output and, 725-726 Microsoft SQL Server and, 1235-1236, 1250, 1256,
IBM DB2 and, 1200-1203 1253 1266
log-record buffering and, object-based databases and, PostgreSQL and, 1151, 1154
738-739 945 query optimization and,
main-memory databases and, persistent systems and, 590-596, 1151
1105-1108 968-971 SQL and, 142-143, 165,
management of, 738-743 standards for, 1054 168-169
multiple pool, 1220 Unified Modeling Language system, 462, 468, 801, 1132
operating system role in, (UML) and, 308 transaction processing and,
741-742 caching, 429 1116
recovery systems and, 738-743 application design and, XML and, 1017
replacement policies and, 400-401 categories, 935-937
465-468 coherency and, 776 centralized architectures,
steal/no-steal policy and, 740 data servers and, 776 769-771
transaction servers and, 774 locks and, 776 centralized deadlock detection,
write-ahead logging (WAL) multithreading and, 817-818 845-846
rule and, 739-741 Oracle and, 1184 challenge-response system, 415
bugs, 174n4 query plans and, 605, 775 change isolation level, 649
application design and, 404, shared-memory architecture check clause, 130, 134
1048-1050 and, 783 assertion and, 135-136
recovery systems and, 721-722 call, 175 authorization and, 148
system architectures and, callable statements, 164 data constraints and, 310
787-788 call back, 776 dependency preservation
transactions and, 1093, 1102 Call Level Interface (CLI) and, 334-336
workflows and, 1101 standards, 1053 user-defined types and, 140
build input, 558 candidate keys, 45-46 check constraints, 134, 148, 310,
bulk export, 1032 canonical cover, 342-345 317, 334, 628, 1130
bulk insert, 1032 CAP theorem, 852-853 checkpoint log record, 752
bulk loads, 503-504, 1031-1033 Cartesian products, 68 checkpoints
bully algorithm, 851, 852 equivalence and, 584 fuzzy, 750-752
business logic, 25, 173 join expressions and, 229-232 Microsoft SQL Server and,
application design and, 376, queries and, 573, 584, 589, 1246
383, 391-393, 396, 410, 418 595-596, 606, 616 Oracle and, 1185
IBM DB2 and, 1221-1222 relational algebra and, 50-51, recovery systems and,
Microsoft SQL Server and, 217, 222-232 734-735, 742-743
1228, 1232, 1253, SQL and, 68-75, 120, 209 transaction servers and, 774
1263-1267 cascade, 133, 150, 186 checksums, 434
Oracle and, 1158 cascadeless schedules, 647-648 classification hierarchy, 935-937
business-logic layer, 39, 391 cascading stylesheet (CSS) classifiers
bus system, 780 standard, 380 Bayesian, 900-901, 1191, 1266
BYNET, 806 case construct, 102-103 best splits and, 897-899
byte-code, 389 CASE tools, 1194-1195 decision-tree, 895-900
cast, 136, 139-140 entropy measure and, 897
C, 7, 14, catalogs Gini measure and, 897
advanced SQL and, 157, 169, application development and, information gain and, 897-898
173, 180 1053 kernel functions and, 901-902
neural-net, 900 commit work, 127 locks and, 661-686, 839-842

partitions and, 896-897 common gateway interface logical undo and, 749-750
prediction and, 894-904 (CGI) standard, 380-381 long-duration transactions
purity and, 897 Common Language Runtime and, 1111-1112
regression and, 902-903 (CLR), 180 Microsoft SQL Server and,
Support Vector Machine, Common Object Request 1241-1246
900-902 Broker Architecture multiple granularity and,
training instances and, 895 (CORBA), 1054-1055 679-682
validating, 903-904 common subexpression multiversion schemes and,
client-server systems, 23, 32, elimination, 614 689-692, 1137-1146
204 commutative property, 584-585 Oracle and, 1180-1183
application design and, compatibility function, 662 PostgreSQL and, 1137-1145
376-377, 1031, 1053, 1056 compensation log records predicate reads and, 697-701
Microsoft SQL Server and, (CLRs), 751-752, 754 recovery systems and, 729-730
1228 complex data types, 28, 31, 138, rollbacks and, 667, 670,
recovery systems and, 756 1061 674-679, 685, 689, 691, 709
system architecture and, entity-relationship (E-R) serializability and, 650, 662,
769-772, 777, 788, 791 model and, 946-947 666-667, 671, 673, 681-690,
transaction processing and, keywords and, 947-949 693-697, 701-704, 708
1092-1096 normal forms and, 947 snapshot isolation and,
client-side scripting, 389-391 object-based databases and, 692-697, 729-730
clobs, 138, 166, 457, 502, 945-949, 963, 970-974 timestamp-based protocol
1010-1013, 1196-1199 system architecture and, 864 and, 682-686
cloud computing, 777 component object model updates and, 867-868
challenges with, 868-870 (COM), 1253 user interactions and, 702-704
data representation and, compression, 1077-1078
validation and, 686-689,
863-865 application development and,
729-730
partitions and, 865-866 1041
Web crawlers and, 930-931
replication and, 866-868 data warehouses and, 893
condition-defined entity sets,
retrieval and, 865-866 IBM DB2 and, 1194
299
storage and, 862-863 Microsoft SQL Server and,
confidence, 893, 905
traditional databases and, 868 1236
Oracle and, 1165-1167 conformance levels, 168-169
transactions and, 866-868
clusters, 781 prefix, 503, 1166 conjunction, 545-546, 594
agglomerative, 907-908 computer-aided design (CAD), connect to, 170
cloud computing and, 867 312, 1061, 1064-1068 consistency, 11, 22, 104
data mining and, 894, 907-908 conceptual design, 15-16, 260 availability and, 852-853
divisive, 907-908 concurrency control, 631, 636, CAP theorem and, 852-853
hierarchical, 907-908 639, 709 concurrency control and, 695,
IBM DB2 and, 1203-1207 access anomalies and, 5 701-704, 710
multidimensional, 1203-1207 blind writes and, 687 cursor stability and, 702
Oracle and, 1173, 1186 consistency and, 695, 701-704, deadlocks and, 665-666
Real Application Clusters 710 degree-two, 701-702
(RAC) and, 1186 deadlock handling and, distributed transactions and,
coalescence, 491, 706 674-679 830-832
code breaking. See encryption delete operations and, 697-701 logical operations and, 746
ColdFusion Markup Language distributed databases and, mobile networks and,
(CFML), 387 835-836, 839-847 1083-1085
collect function, 959 DML commands and, replication with weak,
collection volumes, 957, 957-958 1138-1139 843-844
column-oriented storage, false cycles and, 846-847 requirement of, 630
892-893 IBM DB2 and, 1200-1203, transactions and, 627-631,
combinatorics, 639 1217-1218 635-636, 640, 648-650, 655
commit protocols, 832-838 index structures and, 704-708 user interactions and, 702-704
commit time, 758 insert operations and, 697-701 weak levels of, 701-704
constraints create recursive view, 192 766-767, 769-772, 777,

condition-defined, 299 create role, 146 788, 791
decomposition and, 354-355 create schema, 143 complexity of, 260
dependency preservation create snapshot, 843-844 conceptual-design phase and,
and, 334-336 create table, 60-63, 139, 161 15-16, 260
disjoint, 300 with data, 141-142 data constraints and, 310-311
entity-relationship (E-R) extensions for, 141-142 direct design process and,
model and, 269-272, integrity constraints and, 129 259-260
299-301 object-based databases and, encryption and, 411-417
IBM DB2 and, 1199 950, 961-962 entity-relationship (E-R)
integrity, 4 (see also integrity create table...as, 142 model and, 17-18, 249-313
constraints) create type, 139-141 IBM DB2 and, 1194-1195
keys, 271-272 create unique index, 529 incompleteness and, 262
mapping cardinalities and, create view, 121-123, 125, 142, logical-design phase and, 16,
269-270, 276-277 147, 1130-1131 260-261
overlapping, 300 cross-site request forgery Microsoft SQL Server and,
partial, 300 (XSRF), 403-404 1223-1228
participation, 270 cross-site scripting (XSS), normalization and, 18-20
PostgreSQL and, 1130-1131, 403-405 Oracle and, 1157-1158
1153-1154 cross-tabulation, 199-203, 205, overview of process, 259-262
total, 300 210 parallel systems and, 815-817
transactions and, 628, CRUD Web interfaces, 399 phases of, 259-261
1108-1109 Crystal Reports, 399-400 physical-design phase and,
UML and, 309-310 cube by, 1221-1222 16, 261
user-defined, 299 cube construct, 206-210 redundancy and, 261-262
contains, 93 current date, 137 relational, 323-368
context switch, 1092 cursor stability, 702 specification of functional
conversation group, 1262 curve fitting, 902-903 requirements and, 16
cookies, 382-385, 403-405 Cyc project, 925, 927 top-down, 297
cores, 770 universities and, 16-17
correlated subqueries, 93 user needs and, 260
correlations, 906 data abstraction, 6-8 user requirements and,
correlation variables, 605-607 data access layer, 391, 393, 395 311-312
count, 84-86, 89, 566-567 data analysis workflow and, 312-313
count-distinct, 236 data mining and, 893-910 database graph, 671-674
count value, 204 decision-support systems database instance, 42
crashes, 434, 467-468 and, 887-891 databases
actions after, 736-738 information retrieval and, abstraction and, 6-8, 10
algorithms for, 735-738 915-938 architecture and, 28-29, 767
ARIES and, 750-756 warehousing and, 887-891 (see also architectures)
checkpoints and, 734-735 database design, 257, 313-314 buffering and, 739-741 (see
failure classification and, adapting for modern also buffers)
721-722 architectures and, concurrency control and,
recovery systems and, 736-738 818-819 661-710 (see also
(see also recovery alternatives and, 261-262, concurrency control)
systems) 304-310 distributed, 825-878, 1188
transactions and, 628 application, 375-418 dumping and, 743-744
create assertion, 135 authorization requirements file-processing system and,
create distinct type, 141, and, 312 3-6
1194-1195, 1197 automated tuning and, force output and, 725-726
create domain, 140-141 1040-1041 history of, 29-31
create function, 175, 177, 189 bottom-up, 297 indexing and, 475-531 (see
create index, 528-529, 1150-1151 buffers and, 464-468 also indices)
create or replace, 174n4 client-server architecture and, information retrieval and,
create procedure, 175, 179 32, 204, 376-377, 756, 915-937
locks and, 661-679 (see also DATAllegro, 816 data warehouses, 888
locks) Datalog, 37 column-oriented storage and,
main-memory, 1105-1108 data-manipulation language 892-893
mobile, 1079-1085 (DML), 12-14 components of, 889-891
modification and, 98-103, authorization and, 143 deduplication and, 890-891
728-729 compiler and, 21-22 defined, 889
multimedia, 1076-1079 concurrency control and, fact tables and, 891-892
normalization and, 18-20 1138-1139, 1145 householding and, 891
parallel, 797-820 declarative, 10 IBM DB2 and, 1194, 1221-1222
personal, 1079-1085 defined, 10, 32 materialized views and,
recovery systems and, 631, host language and, 15 1171-1172
633, 721-761 (see also Microsoft SQL Server and, merger-purge operation and,
recovery systems) 1231-1233, 1245, 1261 890-891
storage and, 20-22, 427 (see Oracle and, 1161-1162, 1165, Microsoft SQL Server and,
also storage) 1181 1264
time in, 1062-1064 PostgreSQL and, 1137-1138 Oracle and, 1158, 1162,
database administrator (DBA), precompiler and, 15 1169-1172, 1178, 1189
28-29, 149, 1152, procedural/nonprocedural, transformations and, 891
1214-1215, 1243 10 updates and, 891
database schemas. See schemas querying and, 21-22 Data Encryption Standard
Database Task Group, 1052 snapshot isolation and, 1137 (DES), 413
database writer process, 773 (see also snapshot datetime, 201
data cleansing, 890 isolation) DB-Lib, 1249
data cube, 200, 206-210 storage manager and, 20-21 deadlines, 1108-1109
data-definition language triggers and, 1161-1162 deadlocks
(DDL), 9-12, 14, 32 data mediation, 1018-1019 consistency and, 665-666
authorization and, 58 data mining, 25-26, 33, 771-772, distributed databases and,
basic types and, 59-60 887-889 839, 841, 844-847
concurrency control and,
association rules and, 904-907 handling of, 674-679
1144-1145
best splits and, 897-899 IBM DB2 and, 1217-1220
dumping and, 743-744
classification and, 894-904 long-duration transactions
IBM DB2 and, 1194-1197, 1204
clusters and, 894, 907-908 and, 1110-1111
indices and, 58
data-visualization, 909 Microsoft SQL Server and,
integrity and, 58
Microsoft SQL Server and, defined, 893 1243-1244, 1246
1225, 1228-1233, 1245, descriptive patterns and, 894 PostgreSQL and, 1143-1145
1253, 1256, 1261 entropy measure and, 897 prevention of, 675-676
Oracle and, 1162, 1181 Gini measure and, 897 rollback and, 678-679
PostgreSQL and, 1144-1145, information gain and, 897-898 starvation and, 679
1150 Microsoft SQL Server and, victim selection and, 678
querying and, 21-22 1266-1267 wait-for graph and, 676-677,
schema definition and, 28, 58, Oracle and, 1191 676-678
60-63 prediction and, 894-904 decision support, 1047
security and, 58 purity and, 897 decision-support queries, 797
set of relations in, 58-61 rules for, 893-894 decision-support systems,
SQL basics and, 57-63, 104 text, 908 887-891
storage and, 58 data models. See specific model decision-tree classifiers, 895-900
data dictionary, 12, 21, 462-464 data parallelism, 805 declare, 175-178
Data Encryption Standard data server systems, 773, decode, 208
(DES), 413 775-777, 782 decomposition
DataGrid control, 398 data storage and definition algorithms for, 348-355
data guard, 1183 language, 11 Boyce-Codd normal form
data inconsistency. See data striping, 444 and, 333-336, 349-355
consistency data-transfer rate, 435-436 dependency preservation
data isolation. See isolation data types. See types and, 334-336
fourth normal form and, directory information tree greater potential for bugs in,
358-360 (DIT), 872-875 787-788
functional dependencies and, directory systems, 870-875 implementation and, 786-788
329-338, 355-360 dirty blocks, 741 increased processing
higher normal forms and, dirty page table, 750-756, overhead of, 788
337-338 1244-1245 local transactions and, 784
keys and, 330-333 dirty read, 1137, 1181 nodes and, 784
lossless, 345-346 dirty writes, 649 ready state and, 787
lossless-join, 345-346 disable trigger, 185 replication and, 785
lossy, 345-346 disconnected operation, sharing data and, 785
multivalued dependencies 395-396 sites and, 784
and, 355-360 disjoint entity sets, 300 software-development cost
relational database design disjoint specialization, 296-297 and, 787
and, 329-338, 348-360 disjunction, 545-546, 594 two-phase commit protocol
third normal form and, disk-arm-scheduling, 437 (2PC) and, 786-788
336-337, 352-355 disk controller, 434 distributor, 1252
decomposition rule, 339 distinct types, 84-86, 138-141 divisive clustering, 907-908
DEC Rdb, 30 distinguished name (DN), 872 Document Object Model
deduplication, 890-891 distributed databases, 876-878 (DOM), 390
Deep Web crawlers, 931 availability and, 847-853 document type definition
default values, 133, 137, 140, cloud-based, 861-870 (DTD), 990-994
144, 425, 899, 952, domain, 42
commit protocols and,
991-992, 996, 1128 domain constraints, 11
832-838
deferred integrity constraints, domain-key normal form
concurrency control and,
134 (DKNF), 360
835-836, 839-847
domain relational calculus, 249
degree-two consistency, 701-702 deadlock handling and,
example queries, 246-247
deletion, 61, 63, 98-100, 102, 161 844-847
expressive power of
concurrency control and, directory systems and,
languages, 248
697-701 870-875
formal definition, 245
EXEC SQL and, 171 failure and, 831-835 safety of expressions, 247-248
hashing and, 510, 513, 516, 523 fragmentation and, 826-829 double-pipelined hash-join,
integrity constraints and, 133 heterogeneous, 825-826, 571-572
PostgreSQL and, 1130-1131 857-861 drill down, 201
privileges and, 143-145 homogeneous, 825-826 Driver-Manager class, 160
transactions and, 629, 653 joins and, 855-857 drop index, 529
triggers and, 183 locks and, 839-847 drop table, 63, 164
views and, 125 partitions and, 835 drop trigger, 185
delta relation, 186 persistent messaging and, drop type, 140
demand-driven pipelining, 836-838 dumping, 743-744
569-570 query processing and, 854-860 duplicate elimination, 563-564
denormalization, 363-364 recovery and, 835-836 durability, 22-23, 104, 625,
dependency preservation, replication and, 826, 829, 630-631
334-336, 346-348 843-844 defined, 628
desc, 77-78 storage and, 826-830 distributed transactions and,
descriptive attributes. See timestamps and, 842-843 830-832
attributes transparency and, 829-830 one-safe, 758
descriptive patterns, 894 unified view of data and, remote backup systems and,
deviation, 215, 906-907 858-859 758
dicing, 201 distributed-lock manager, 840 storage structure and, 632-633
dictionary attacks, 414 distributed systems two-safe, 758
digital certificates, 416-417 autonomy and, 785 two-very-safe, 758
digital signatures, 416 availability and, 785 dynamic SQL, 58, 158, 175
direct-access storage, 431 example of, 786
directories, 935-937 global transactions and, 784 e-catalogs, 1103
Eclipse, 386 attributes and, 263, 267-269, enterprise resource planning

E-commerce, 1102-1105 290-291, 294-295, 298-299, (ERP) systems, 1101
efficiency, 6-8 327-329 entropy measure, 897-898
election algorithms, 851-852 complex data types and, equi-joins, 549-559, 563, 566,
embedded SQL, 58, 158, 169-173 946-947 571, 807, 819
empty relations, 93-94 constraints and, 269-272 equivalence
encryption design issues and, 290-295 cost analysis and, 601-602
Advanced Encryption enterprise schema and, 262 join ordering and, 588-589
Standard (AES), 412-413 entity sets and, 262-267, relational algebra and,
applications of, 411-417 272-274, 279-286, 290-291, 582-590
asymmetric-key, 412 296-298 transformation examples for,
authentication and, 415-417 extended features, 295-304 586-588
challenge-response system generalization and, 297-304 error-correcting code (ECC)
and, 415 normalization and, 361-362 organization, 444-445
database support and, 414-415 object-oriented data model ERWin, 1194
dictionary attacks and, 414 and, 27 escape, 77
digital certificates and, reduction to relational evaluation primitive, 539
416-417 schemas and, 283-290 every clause, 90
digital signatures and, 416 redundancy and, 272-274 except clause, 82-83, 93, 188
nonrepudiation and, 416 relationship sets and, 264-267, exchange system, 1104
Oracle and, 1165-1166 286-290, 291-295, 296-297 exclusive-mode locks, 661
specialization and, 295-297 EXEC SQL, 169-173
prime numbers and, 413
Unified Modeling Language execute, 147
public-key, 412-414
(UML) and, 308-310 existence bitmap, 526-527
Rijndael algorithm and,
entity sets exists clause, 93
412-413
alternative notations for, extensibility contracts,
techniques of, 412-414
304-310 1256-1258
end-user information, 407-408 attributes and, 263, 284-285,
enterprise information, 1-2 external language routines,
290-291 179-180
Entity Data Model, 395 condition-defined, 299
entity-relationship (E-R) external sort-merge algorithm,
defined, 262-263 548-549
diagram, 17-18 design issues and, 290-292
alternative notations for, disjoint, 299
304-310 extension of, 263 Facebook, 31, 862
basic structure of, 274-275 generalization and, 297-304 factorials, 639
complex attributes, 277-278 identifying relationship and, fact tables, 891-892
entity sets and, 279-281 280 fail-stop assumption, 722
generalization and, 298 overlapping, 299 false cycles, 846-847
identifying relationship and, properties of, 262-264 false drops, 929
280 relationship sets and, 264-267, false negatives, 903, 929-930
mapping cardinality, 276 291-292 false positives, 903, 929-930
nonbinary relationship sets, removing redundancy in, false value, 90, 208
278-279 272-274 fanout, 487
relationship sets, 278-279 role of, 264-265 fetching, 21, 138, 906, 1078, 1097
roles, 278 simple attributes and, 283-284 advanced SQL and, 161,
university enterprise strong, 283-285 166-173, 176, 180, 194
example, 282-283 subclass, 298 application design and,
weak entity sets, 279-281 superclass, 298 389-397, 1030, 1038
entity-relationship (E-R) superclass-subclass IBM DB2 and, 1199, 1202,
model, 9, 17-18, 259, relationship and, 296-297 1209, 1211, 1219
313-314, 963 Unified Modeling Language information retrieval and,
aggregation and, 301-302, 304 (UML) and, 308-310 921, 929, 936 (see also
alternative modeling data user-defined, 299 information retrieval)
notations and, 304-310 weak, 279-281, 285-286 Microsoft SQL Server and,
atomic domains and, 327-329 Entity SQL, 395 1241, 1251
object-based databases and, flow-distinct, 1240-1241 union rule, 339

965, 969-972 FLWOR (for, let, where, order functionally determined
PostgreSQL and, 1137, 1146, by, return) expressions, attributes, 340-342
1151, 1153 1002-1003 function-based indices,
storage and, 437, 439, 444 (see forced output, 465, 725-726 1167-1168
also storage) force policy, 739-740 functions. See also specific
Web crawlers and, 930-931 for each row clause, 181-184 function
Fibre Channel interface, 434, for each statement clause, 68, declaring, 174-175
436 183 external language routines
fifth normal forms, 360 foreign keys, 46, 61-62, 131-133 and, 179-180
file header, 454 fourth normal forms, 356, IBM DB2 and, 1197-1198
file manager, 21 358-360 language constructs for,
file organization, 3-4. See also fragmentation, 827-829 176-179
storage free space control record polymorphic, 1128-1129
B+-trees and, 500-502 (FSCR), 1202 PostgreSQL, 1133-1135
blobs, 138, 166, 457, 502, 1013, from statement state transition, 1134
1198-1199, 1259 aggregate functions and, syntax and, 173-174, 178
block-access time and, 438 84-90 writing in SQL, 173-180
clobs, 138, 166, 457, 502, basic SQL queries and, 63-74 XML and, 1006-1007
1010-1013, 1196-1199 on multiple relations, 66-71 fuzzy checkpoints, 742-743,
fixed-length records and, natural join and, 71-74 750-752
452-454 null values and, 83-84
hashing, 457 rename operation and, 74-75
heap file, 457 set operations and, 79-83 generalization
indexing and, 475 (see also on single relation, 63-66 aggregation and, 301-302
indices) string operations and, 76-79 attribute inheritance and,
journaling systems and, 439 subqueries and, 95-96 298-299
multitable clustering and, 458, full outer joins, 117-120, bottom-up design and, 297
460-462 233-234, 565-566 condition-defined, 299
pointers and, 454 functional dependencies, 18, constraints on, 299-301
security and, 5-6 (see also 129 disjoint, 300
security) attribute set closure and, entity-relationship (E-R)
sequential, 457-459 340-342 model and, 297-304
structured, 451-468 augmentation rule and, 339 overlapping, 300
system structure and, 451-452 BCNF algorithm and, 349-352 partial, 300
variable-length records and, Boyce-Codd normal form representation of, 302-304
454-457 and, 333-336 subclass set and, 298
file scan, 541-544, 550, 552, 570 canonical cover and, 342-345 superclass set and, 298
final/not final expressions, 949, closure of a set, 338-340 top-down design and, 297
953 decomposition rule, 339 total, 300
fine-granularity parallelism, dependency preservation user-defined, 299
771 and, 334-336, 346-348 Generalized Inverted Index
FireWire interface, 434 extraneous, 342 (GIN), 1149
first committer wins, 692-693 higher normal forms and, generalized-projection, 235
first updater wins, 693 337-338 Generalized Search Tree
flash storage, 403 keys and, 330-333 (GiST), 1148-1149
B+-trees and, 506 lossless decomposition and, geographic data, 1061
cost of, 439 345-346 applications of, 1068
erase speed and, 440 multivalued, 355-360 information systems and,
hybrid disk drives and, pseudotransitivity rule, 339 1065
440-441 reflexivity rule and, 339 raster data and, 1069
NAND, 430, 439-440 theory of, 338-348 representations of, 1065-1066,
NOR, 430, 439 third normal form and, 1069-1070
flash translation layer, 440 336-337, 352-355 spatial queries and, 1070-1071
floppy disks, 430 transitivity rule and, 339 vector data and, 1069
getColumnCount method, queries and, 516-522 hybrid hash join, 562

164-165 skew and, 512 hybrid merge join, 557
getConnection method, 160 static, 509-515, 522-523 hybrid OLAP (HOLAP), 204
GET method, 405 updates and, 516-522 hypercube architecture, 781
getString method, 161 hash join, 602 hyperlinks
Gini measure, 897 basics of, 558-559 PageRank and, 922-923, 925
Glassfish, 386 build input and, 558 popularity ranking and,
global class identifier, 1055 cost of, 561-562 920-922
global company identifier, 1055 double-pipelined, 571-572 search engine spamming and,
Global Positioning System hybrid, 562 924-925
(GPS), 1068 overflows and, 560 HyperText Markup Language
global product identifier, 1055 query processing and, 557-562 (HTML), 378-380
global wait-for graph, 845-847 recursive partitioning and, client-side scripting and,
Google, 31 539-540 388-391
application design and, skewed partitioning and, 560 DataGrid and, 398
378-382, 396, 407 hash-table overflow, 560 embedded, 397
distributed databases and, having, 88-89, 96 information retrieval and, 915
862, 866-867 heap file, 457, 523, 1147-1149, (see also information
information retrieval and, 933 1153 retrieval)
(see also information heuristics, 1075-1076 Java Server Pages (JSP) and,
retrieval) data analysis and, 899, 910 387-391
PageRank and, 922-925 distributed databases and, 859 rapid application
grant, 144-150 greedy, 910 development (RAD) and,
granted by current role, 150 IBM DB2 and, 1211 397
graph-based protocols, 671-674 information retrieval and, 934 security and, 402-417
Greenplum, 816 Microsoft SQL Server and, server-side scripting and,
group by, 86-89, 96, 194, 203, 1240 386-388
206-209 Oracle and, 1176 stylesheets and, 380
growing phase, 667-669 parallel databases and, 815 Web application frameworks
query optimization and, and, 398-400
hackers. See security 598-605, 615-616 web sessions and, 380-382
handoff, 1081 Hibernate system, 393-395 XML and, 981
hard disks, 29-30 hierarchical architecture, 781, HyperText Transfer Protocol
hardware RAID, 448 784 (HTTP)
hardware tuning, 1035-1038 hierarchical classification, application design and,
harmonic mean, 1046 935-937 377-381, 383, 395,
hash cluster access, 1173 hierarchical clustering, 907-908 404-406, 417
hash functions, 457-458, 476, hierarchical data model, 9 as connectionless, 381
530-531 high availability, 756 digital certificates and, 417
closed, 513 HIPAA, 1248 man-in-the-middle attacks
data structure and, 515-516 histograms, 195, 591-596, 616, and, 406
deletion and, 510, 513, 516, 801, 901, 1152, 1175, 1211, Representational State Transfer
523-524 1239 (REST) and, 395
dynamic, 515-523 HITS algorithm, 925 Simple Object Access Protocol
extendable, 515 home processor, 803 (SOAP) and, 1017-1018,
indices and, 514-515, 523-524 homonyms, 925-927 1249-1250
insertion and, 513, 516-524 horizontal fragmentation,
insufficient buckets and, 512 827-828 IBM AIX, 1193
joins and, 809-810 horizontal partitioning, 798 IBM DB2, 30, 96, 141, 160n3,
lookup and, 516-518, 522, 524 hot-spare configuration, 758 172, 180, 184-185, 216,
open, 513 hot swapping, 449 1121
Oracle and, 1170 householding, 891 administrative tools,
overflows and, 512-513 HP-UX, 1193 1215-1216
partitioning and, 807 hubs, 924 autonomic features, 1214-1215
PostgreSQL and, 1148 hybrid disk drives, 440-441 buffer pools and, 1201-1202
business intelligence features, Illustra Information materialized views and, 612
1221-1222 Technologies, 1123-1124 Microsoft SQL Server and,
concurrency control and, immediate-modification 1231-1236
1200-1203, 1217-1218 technique, 729 multicolumn, 1149
constraints and, 1199 incompleteness, 262 multilevel, 480-482
Control Center, 1195, 1215 inconsistent state, 630. See also multiple-key access and, 485,
database-design tools and, consistency 506-509
1194-1195 in construct, 91, 92n8 nonclustering, 477
data type support and, incremental view maintenance, operator classes and, 1150
1196-1197 608-611 Oracle and, 1162-1173
Data Warehouse Edition, 1221 independent parallelism, 814 ordered, 475-485, 523-524
development of, 1193 indexed nested-loop join, partial, 1150
distribution, 1220-1221 552-553 partitions and, 1169-1171
external data and, 1220-1221 index entry, 477 performance tuning and,
indexing and, 1199-1205 indexing strings, 502-503 1039-1041
isolation and, 1217-1218 index-organized tables (IOTs), persistent programming
joins and, 1209-1210 1164-1165 languages and, 964-972
large objects and, 1198-1199 index record, 477 pointers and, 546
locks and, 1217-1220 indices, 21, 137-138, 530-531 PostgreSQL and, 1135-1136,
massively parallel processors access time and, 476, 479, 523 1146-1151
(MPP) and, 1193 access types and, 476 primary, 476-477, 542, 544
materialized views and, bitmap, 507, 509, 524-528, 531, query processing and, 541-544
1212-1214 536, 1166-1167 record relocation and, 502
multidimensional clustering block, 1205 search key and, 476
and, 1203-1207 bulk loading of, 503-504 secondary, 477, 483-485, 502,
query processing and, 593, clustering, 476-477, 483-485, 542, 544-545
604, 612, 1207-1216 542 selection operation and,
comparisons and, 544-545 541-544
recovery and, 1200-1203
composite, 545-546 sequential, 485-486
replication, 1220-1221
concurrency control and, sorting and, 547-549
rollback and, 1218
704-708 space overhead and, 476, 479,
set operations and, 1209-1210
construction of, 1150-1151 486, 522
SQL variations and, 1195-1200 covering, 509 sparse, 477-480, 482-483
storage and, 1200-1203 definition in SQL and, 528-529 spatial data and, 1071-1076
system architecture, 1219-1220 deletion time and, 476, 483, support routines and,
System R and, 1193 491, 495-501, 523-524 1135-1136
Universal Database Server, dense, 477-483 trees and, 1148-1149 (see also
1193-1194 domain, 1168-1169 trees)
user-defined functions and, on expressions, 1149 unique, 1149
1197-1198 function-based, 1167-1168 updates and, 482-483
utilities, 1215-1216 Generalized Inverted Index XML, 1160
Web services, 1199-1200 (GIN) and, 1149 information-extraction
XML and, 1195-1196 hashed, 476 (see also hash systems, 932-933
IBM MVS, 1193 functions) information gain, 897-898
IBM OS/400, 1193-1194 IBM DB2 and, 1199-1205 information retrieval, 25-26,
IBM VM, 1193-1194 identifiers and, 546 885, 938
IBM z/OS, 1194 information retrieval and, adjacency test and, 922-923
identifiers, 546 927-929 applications of, 915-917,
global, 1055 insertion time and, 476, 931-935
OrdPath, 1260-1261 482-483, 491-495, 499-501, categories and, 935-937
standards and, 1055-1056 523-524 defined, 915
tags and, 982-985 inverted, 927-929 development of field, 915
identifying relationship, 280 join, 1168 (see also joins) directories and, 935-937
identity declaration, 1043 linear search and, 541-542 false negatives and, 929-930
if clause, 184 logical row-ids and, 1164-1165 false positives and, 929-930
homonyms and, 925-927 instead of triggers, 1161-1162 duplicate elimination and, 811
indexing of documents and, integrated development operation evaluation costs
927-929 environment (IDE), 111, and, 812
information extraction and, 307, 386, 397, 426, 434, 932 parallel external sort-merge
932-933 integrity constraints, 4, 58 and, 806
keywords and, 916-927 add, 129 parallel join and, 806-811
measuring effectiveness of, alter table, 129 parallel sort and, 805-806
929-930 assertion, 135-136 projection and, 811
ontologies and, 925-927 check clause, 130, 134-136 range-partitioning sort and,
PageRank and, 922-923, 925 create table, 129, 130 805
precision and, 929-930 deferred, 134 selection and, 811
query result diversity and, 932 examples of, 128 intraquery parallelism, 803-804
question answering and, foreign key, 131-133 invalidation reports, 1083
933-934 functional dependencies and, inverse document frequency
recall and, 929-930 129 (IDF), 918
relevance ranking using hashing and, 809-810 I/O parallelism
terms, 917-920 not null, 129-130, 133 hashing and, 799-800
relevance using hyperlinks, primary key, 130-131 partitioning techniques and,
920-925 referential, 11, 46-47, 131-136, 798-800
result diversity and, 932 151, 181-182, 628 range scheme and, 800
search engine spamming and, schema diagrams and, 46-47 round-robin scheme and, 799
924-925 on single relation, 129 skew handling and, 800-802
similarity-based, 919-920 unique, 130-131 is not null, 83
stop words and, 918 user-defined types and, 140
is not unknown, 84
structured data queries and, violation during transaction,
is null, 83
934-935 133-134
isolation, 4, 1094
synonyms and, 925-927 XML and, 1003-1004
atomicity and, 646-648
TF-IDF approach and, 917-925 integrity manager, 21
intention-exclusive (IX) mode, cascadeless schedules and,
Web crawlers and, 930-931
680 647-648
Ingres, 30
inheritance, 298-299 intention-shared (IS) mode, 680 concurrency control and, 631,
interconnection networks, 636-637, 639, 650,
overriding method, 952
780-781 1137-1138
SQL and, 949-956
structured types and, 949-952 interesting sort order, 601 defined, 628
tables and, 954-956 Interface Description Language dirty read and, 1137
types and, 952-953 (IDL), 1054-1055 distributed transactions and,
initially deferred integrity interference, 780 830-832
constraints, 134 internal nodes, 487 factorials and, 639
inner joins, 117-120, 601 International Organization for improved throughput and,
inner relation, 550 Standardization (ISO), 635-636
insertion, 61, 100-102 57, 871, 1051 inconsistent state and, 631
concurrency control and, Internet, 31 levels of, 648-653
697-701 direct user access and, 2 locking and, 651
EXEC SQL and, 171 wireless, 1081-1082 multiple versions and,
hashing and, 513, 516-523 interoperation parallelism, 804, 652-653, 1137-1138
lookup and, 705 813-814 nonrepeatable read and, 1137
phantom phenomenon and, interquery parallelism, 802-803 Oracle and, 1181-1182
698-701 intersect, 81-82, 585 phantom read and, 1137-1138
PostgreSQL and, 1130-1131 intersect all, 81-82 PostgreSQL and, 1137-1138,
prepared statements and, intersection, 50 1142
162-164 intersection set, 960 read committed, 649, 1042
privileges and, 143-145 intervals, 1063-1064 read uncommitted, 648, 649
transactions and, 629, 653 intraoperation parallelism recoverable schedules and,
views and, 124-125 aggregation and, 811 647
instances, 8, 904-905 degree of parallelism and, 804 repeatable read, 649
resource allocation and, JDBC (Java Database outer, 115-120, 232-235,
635-636 Connectivity), 380, 1052, 565-566, 597
row versioning and, 1244 1154 outer relation, 550
serializability and, 640, advanced SQL and, 158-159 parallel, 806-811, 857
648-653 blob column, 166 partitioned, 539-540, 807-810
snapshot, 652-653, 692-697, caching and, 400-401 PostgreSQL and, 1153
704, 729-730, 1042, 1242, callable statements and, 164 prediction, 1267
1244 clob column, 166 query processing and,
timestamps and, 651-652 connecting to database, 549-566, 855-857
transactions and, 628, 635-640, 159-161 relational algebra and,
646-653 E-R model and, 269, 275 229-232, 239
utilization and, 636 information protocol of, right outer, 117-120, 233-235,
wait time and, 636 160-161 565-566
is unknown, 84 metadata features and, semijoin strategy and, 856-857
item shipping, 776 164-166 size estimation and, 595-596
iteration, 176, 188-190 prepared statements and, sort-merge-join, 553
162-164 theta, 584-585
J++, 1228, 1253 query result retrieval and, types and, 115-120
Jakarta Project, 386 161-162 view maintenance and, 609
.jar files, 160 shipping SQL statements to, join using, 74, 113-114
Java, 14, 157, 169, 173, 387, 945 161 journaling file systems, 439
DOM API, 1008-1009 updatable result sets and, 166 JPEG (Joint Photographic Experts
JDBC and, 158-166 join dependencies, 360 Group), 1077
metadata and, 164-166 joins jukebox systems, 431
persistent systems and, complex, 563
971-972 conditions and, 114-115 k-d trees, 1071-1072
SQLJ and, 172 cost analysis and, 555-557, kernel functions, 901-902
Unified Modeling Language 599-601 keys, 45-46
(UML) and, 308 distributed processing and, constraints and, 271-272
Java 2 Enterprise Edition 855-857 decomposition and, 354-355
(J2EE), 386, 1157-1158 equi-joins, 549-559, 563, 566, encryption and, 412-418
Java Database Objects (JDO), 571, 807, 819 entity-relationship (E-R)
971 filtering of, 1187 model and, 271-272
JavaScript fragment-and-replicate, equality on, 542
application design and, 808-809 functional dependencies and,
389-391, 398 full outer, 117-120, 233-234, 330-333
Representational State Transfer 565-566 hashing and, 509-519, 524
(REST) and, 395 hash join, 539-540, 557-562, indexing and, 476-508,
security and, 402-411 571-572, 602 476-509, 524, 529
JavaScript Object Notation hybrid merge, 557 multiple access and, 506-509
(JSON), 395, 863-864 IBM DB2 and, 1209-1210 nonunique, 497-499
JavaServer Faces (JSF) inner, 117-120, 601 smart cards and, 415-416
framework, 397 inner relation and, 550 storage and, 457-459
Java Server Pages (JSP) left outer, 116-120, 233-235, USB, 430
application design and, 377, 565-566 uniquifiers and, 498-499
383-391 lossless decomposition and, keywords
client-side scripting and, 345-346 complex data types and,
389-391 merge-join, 553-555 947-949
security and, 405 minimization and, 613 homonyms and, 925-927
server-side scripting and, natural, 71-74, 87, 113 (see indices and, 927-929
386-388 also natural joins) ontologies and, 925-927
servlets and, 383-391 nested-loop, 550-553 (see also PostgreSQL and, 1130-1131
Web application frameworks nested-loop join) query simplification and,
and, 399 Oracle and, 1168, 1187 1237-1238
JBoss, 386, 399 ordering and, 588-589 ranking and, 915-925
search engine spamming and, distributed databases and, compensation log records
924-925 839-847 (CLRs) and, 751-752, 754
stop words and, 918 dynamic, 1243 identifiers and, 727
synonyms and, 925-927 exclusive, 651, 661-662, old/new values and, 727-728
668-669, 672-673, 679, 691, physical, 745
language constructs, 176-179 698-702, 706-710, 729-730, recovery systems and,
Language Integrated Query 740-741, 803, 839, 841 recovery systems and,
(LINQ), 1055, 1249 false cycles and, 846-847 redo and, 729-734
large-object types, 138 fine-grained, 756 steal/no-steal policy and, 740
latent failure, 448 granting of, 666-667 undo, 729-734, 745-746
lazy propagation, 844, 868 growing phase and, 667-669 write-ahead logging (WAL)
lazywriter, 1246 IBM DB2 and, 1217-1220 rule and, 739-741
LDAP Data Interchange implementation of, 670-671 log sequence number (LSN),
Format (LDIF), 872 intention modes and, 680 750-755
least recently used (LRU) logical undo operations and, log writer process, 773-774
scheme, 465-467 744-750 long-duration transactions
left outer join, 116-120, 233-235, long-duration transactions compensation transactions
565-566 and, 1110-1111 and, 1113-1114
legacy systems, 1050-1051 lower/higher level, 745 concurrency control and,
lightweight directory access Microsoft SQL Server and, 1111-1112
protocol (LDAP), 406, 1242-1244, 1246 graph-based protocols and,
871-875 multiple granularity and, 1110
like, 76-77 679-682 implementation issues,
linear regression, 902 multiversion schemes and, 1114-1115
linear search, 541-542 691-692 multilevel, 1111-1112
linear speedup, 778-780 PostgreSQL and, 1143-1145 nesting and, 1111-1112
Linux, 1124, 1193-1194, 1212 recovery systems and, 744-750 nonserializable executions
local-area networks (LANs), request operation and, and, 1110-1111
788-789, 1081 662-671, 675-680, 709 operation logging and, 1115
localtimestamp, 137 shared, 661, 841 performance and, 1110
local wait-for graph, 845 shrinking phase and, 667-669 recoverability and, 1110
location-dependent queries, starvation and, 679 subtasks and, 1109
1080 timestamps and, 682-686 timestamps and, 1110
locking protocols, 666 transaction servers and, two-phase locking and, 1110
biased, 841 773-775 uncommitted data and, 1109
distributed lock manager, 840 true matrix value and, 662 lookup, 600, 1086
graph-based, 671-674 wait-for graph and, 676-678, concurrency control and, 700,
majority, 840-841 845-847 704-708
primary copy, 840 log disk, 438-439
distributed databases and,
quorum consensus, 841-842 logical clock, 843
865, 867, 870
single lock-manager, 839 logical counter, 682
fuzzy, 890, 1266
timestamping, 842-843 logical-design phase, 16,
indices and, 482, 485-500,
two-phase, 667-669 260-261
505-513, 516-518, 522, 524
lock manager, 670-671, 773 logical error, 721
Microsoft SQL Server and,
locks logical logging, 745-746, 1115
1238, 1241, 1266
adaptive granularity and, 776 logical operations
PostgreSQL and, 1148
caching and, 776 consistency and, 746
query processing and, 544,
call back and, 776 early lock release and, 744-750
552-553
compatibility function and, rollback and, 746-749
lossless-join decomposition,
662 undo log records, 745-750
345-346
concurrency control and, logical row-ids, 1164-1165
lossy decomposition, 345-346
661-674 logical undo operation, 745-750
log records lost update, 692
deadlock and, 665-666,
674-679, 839, 841, 844-847, ARIES and, 750-756
1217-1220, 1243-1246 buffering and, 738-739 machine learning, 25-26
magnetic disks, 430 materialized query tables recovery systems and, 724-726
blocks and, 436-439 (MQTs), 1212-1214, 1221 (see also recovery
buffering and, 437-438 materialized views, 123-124, systems)
checksums and, 434 607 redo log buffer and, 1184
crashes and, 434 aggregation and, 610-611 shared pool, 1184
data-transfer rate and, 435-436 IBM DB2 and, 1212-1214 merge-join, 553
disk controller and, 434 index selection and, 612 merge-purge operation, 890-891
failure classification and, 722 join operation and, 609 merging
hybrid, 440-441 maintenance and, 608-611 complex, 1173-1174
log disk and, 438-439 Oracle and, 1171-1172, 1174, duplicate elimination and,
mean time to failure and, 436 1188 563-564
optimization of disk-block performance tuning and, Oracle and, 1173-1174
access and, 436-439 1039-1040 parallel external sort-merge
parallel systems and, 781-782 and, 806
projection and, 609-610
performance measures of, performance tuning and, 1033
query optimization and,
435-436 query processing and,
611-612
physical characteristics of, 547-549, 553-555, 557
replication and, 1251-1253 mesh system, 780-781
432-435 selection and, 609-610
read-ahead and, 437 message delivery process, 838
max, 84, 86, 96, 236, 566-567 metadata, 12, 164-166
read-write heads and, 432-435 mean time to failure (MTTF),
recording density and, Microsoft, 3, 31, 141
436 advanced SQL and, 160n3,
433-434 Media Access Control (MAC), 169, 173, 180, 184, 197, 205
redundant arrays of 1129 application design and, 387,
independent disks mediators, 859-860, 1018-1019 395-401, 406-407
(RAID) and, 441-449
memory. See also storage distributed databases and, 863
scheduling and, 437
buffers and, 1184 (see also parallel databases and, 816
scrubbing and, 448
buffers) query optimization and, 612
sectors and, 432-434
bulk loading of indices and, storage and, 438
seek-time and, 435-436 Microsoft Active Server Pages
503-504
sizes of, 433 (ASP), 397
cache, 429, 817-818 (see also
main-memory database Microsoft Database Tuning
caching)
systems, 724n1 Assistant, 1040
data access and, 724-726
majority protocol, 840-841, Microsoft Distributed
848-849 flash, 403, 430, 439-441, 506
force output and, 725-726 Transaction Coordinator
man-in-the-middle attacks, 406 (MS DTC), 1242
many-server, many-router magnetic-disk, 430, 432-439
main, 429-430 Microsoft Office, 55, 399, 1016
model, 1094 Microsoft SQL Server, 1042,
many-server, single-router main-memory databases and,
1121
model, 1093 1105-1108
business intelligence and,
many-to-many mapping, 270, Microsoft SQL Server and,
1263-1267
276-277 1246-1247
compilation and, 1236-1237
many-to-one mapping, 270, 276 multitasking and, 1092-1095 compression and, 1236
mapping cardinalities, 269-270, .NET Common Language concurrency control and,
276-277 Runtime (CLR) and, 1241-1246
markup languages. See also 1255-1256 data access and, 1248-1250
specific language nonvolatile random-access, database mirroring and,
file processing and, 981-982 438 1245-1246
structure of, 981-990 optical, 430 data mining and, 1266-1267
tags and, 982-985 Oracle structures and, data types and, 1229-1230
transactions and, 983-985 1183-1184 design tools and, 1223-1228
master-slave replication, overflows and, 560 development of, 1223
843-844 persistent programming filegroups and, 1233-1234
master table, 1032-1033 languages and, 964-972 indexing and, 1231-1236
materialization, 567-568 query costs and, 544 locks and, 1242-1244
management tools and, disconnectivity and, multiversion concurrency
1223-1228 1083-1085 control (MVCC)
memory management and, handoff and, 1081 DDL commands and,
1246-1247 invalidation reports and, 1083 1144-1145
page units and, 1233-1234 mobile computer model and, DML commands and,
partitions and, 1235 1080-1082 1138-1139
Query Editor, 1224-1225 queries and, 1082 implementation of, 1139-1143
query processing and, recoverability and, 1083 implications of using,
1223-1231, 1236-1241, routing and, 1082 1143-1144
1250-1251 updates and, 1083-1084 indices and, 1145
read-ahead and, 1235-1236 version-numbering schemes isolation levels and, 1137-1138
recovery and, 1241-1246 and, 1083-1084 locks and, 1145
reordering and, 1238-1239 wireless communications recovery and, 1145-1146
replication and, 1251-1253 and, 1080-1082 schema for, 689-692
routines and, 1231 Model-View-Controller design, multiway splits, 898
security and, 1247-1248 1157-1158 MySQL, 31, 76, 111, 160n3,
server programming in .NET, most recently used (MRU) 1123, 1155
1253-1258 scheme, 467
snapshot isolation and, 1242, most-specific type, 953 Naïve Bayesian classifiers, 901,
1244 MPEG (Moving Picture Experts 1191, 1266
SQL Profiler and, 1225-1227 Group), 1077-1078 naïve users, 27-28
SQL Server Broker and, multicore processors, 817-819 name servers, 829
1261-1263 multidatabase system, 857-861 NAND flash memory, 430,
SQL Server Management multidimensional data, 199 440-441
Studio and, 1223-1224, multimaster replication, 844 natural joins, 49-50, 87, 113
1227-1228 multimedia data, 1062 conditions and, 114-115
SQL variations and, 1228-1233 multimedia databases, full outer, 117-120, 233-234,
storage and, 1233-1236 1076-1079 565-566
system architecture, 1246-1248 multiple granularity inner, 117-120, 601
concurrency control and, left outer, 116-120, 233-235,
tables and, 1234
679-682 565
thread pooling and, 1246
hierarchy definition for, 679 on condition and, 114-115
triggers and, 1232-1233
intention-exclusive (IX) mode outer, 115-120
tuning and, 1224, 1227
and, 680 SQL queries and, 71-74
types and, 1257-1258 intention-shared (IS) mode relational algebra and,
updates and, 1232-1233, 1239 and, 680 229-232
Windows Mobile and, 1223 locking protocol and, 681-682 right outer, 117-120, 233-234,
XML support and, 1258-1261 shared and 565-566
Microsoft Transaction Server, intention-exclusive (SIX) types and, 115-120
1091 mode and, 680 nearest-neighbor query,
Microsoft Windows, 195, 426, tree architecture and, 679-682 1070-1071
1078 multiple-key access, 506-509 negation, 595
IBM DB2 and, 1193-1194, 1212 multiquery optimization, 614 nested-loop join, 1071
PostgreSQL and, 1124, 1155 multiset relational algebra, 238 IBM DB2 and, 1209-1210
SQL Server and, 1223-1224, multiset types, 956-961 Oracle and, 1173
1228, 1242, 1246-1248 multisystem applications, 1096 parallel, 807, 810-811
storage and, 438 multitable cluster file PostgreSQL and, 1152-1153
min, 84, 86, 236, 566-567 organization, 458, query optimization and,
minpctused, 1203 460-462 600n2, 602, 604
minus, 82n7 multitasking, 771, 1092-1095 query processing and,
mirroring, 441-442, 444, multithreading, 817-818, 1093 550-555, 558-560, 565,
1245-1246 multivalued attributes, 267-268, 571, 573
mobility, 1062, 1086 327-329 nested subqueries
broadcast data and, 1082-1083 multivalued dependencies, application development and,
consistency and, 1083-1085 355-360 1031, 1047
duplicate tuples and, 94-95 multiple granularity and, decode and, 208
empty relations and, 93-94 679-682 decorrelation and, 1174
from clause and, 95-96 nonleaf, 487 file organization and, 451-468
optimization of, 605-607 overfitting and, 899-900 integrity constraints and,
scalar, 97-98 splitting of, 491, 706 128-130, 133-134
set operations and, 90-93 updates and, 491-500 left outer join, 233
with clause and, 97 XML and, 998 OLAP and, 202
nesting no-force policy, 739-740 query simplification and,
ARIES and, 755-756 nonacceptable termination 1237-1238
concurrency control and, 679, states, 1099 right outer join, 234-235
709 nonclustering, 477 temporal data and, 364-367
granularities and, 679 nondeclarative actions, 158 user-defined types and, 140
IBM DB2 and, 1197, nonleaf nodes, 487 numeric, 59, 62
1209-1210, 1218 nonprocedural languages, nvarchar, 60
long-duration transactions 47-48 N-way merge, 547
and, 1112-1113 nonrepeatable read, 1137
object-based databases and, nonrepudiation, 416 object-based databases, 975
945, 948, 958-961 nonunique search keys, 497-499 array types and, 956-961
Oracle and, 1159, 1164, 1182 nonvolatile random-access collection volumes and,
queries and, 601-607, memory (NVRAM), 438 957-958
1004-1007, 1013-1014, nonvolatile storage, 432, 632, complex data types and,
1017 722, 724-726, 743-744 946-949
transactions and, 1091, nonvolatile write buffers, 438 correspondence and, 955
1112-1113, 1116, 1218 NOR flash memory, 430, 439 feature implementation and,
XML and, 27, 943, 984-998, normal forms, 18 963-964
1001, 1004-1007, 1010 atomic domains and, 327-329 inheritance and, 949-956
.NET, 169 Boyce-Codd, 333-336, 349-352, mapping and, 973
NetBeans, 386, 397 354-356 multiset types and, 956-961
.NET Common Language complex data types and, 947 nesting and, 945, 948, 958-961
Runtime (CLR) domain-key, 360 object-identity types and,
aggregates and, 1257-1258 fifth, 360 961-963
basic concepts of, 1254 first, 327-329 object-oriented vs.
extensibility contracts and, fourth, 356, 358-360 object-relational
1256-1258 higher, 337-338 approaches and, 973-974
Microsoft SQL Server and, join dependencies and, 360 persistent programming
1253-1258 project-join, 360 languages and, 964-972,
routines and, 1256-1257 second, 361 974
SQL hosting and, 1254-1256 third, 336-337 reference types and, 961-963
table functions and, 1256-1257 normalization, 16, 18-20 relational data model and, 945
types and, 1257-1258 denormalization and, 363-364 structured types and, 949-953
Netezza, 816 entity-relationship (E-R) unnesting and, 958-961
networks model and, 361-362 Object Database Management
data model and, 9, 1080-1082 performance and, 363-364 Group (ODMG),
local area, 788-789, 1081 relational database design 1054-1055
mobility and, 1079-1085 and, 361-362 object-oriented databases, 393
wide-area types and, 788, no-steal policy, 740 object-oriented data model, 27
790-791 not connective, 66 object-relational data model, 27
nextval for, 1043 not exists, 93, 192 object-relational mapping, 393,
nodes. See also storage not in, 90-91, 92n8 946, 973
B+-trees and, 485-506 not null, 61, 83, 129-130, 133, 140 observable external writes,
coalescing, 491, 706 not unique, 95 634-635
distributed systems and, 784 null bitmap, 456 ODBC (Open Database
IBM DB2 and, 1200-1201 null values, 19, 83-84 Connectivity), 380, 1052
mesh architecture and, aggregation with, 89-90 advanced SQL and, 166-169
780-781 attributes and, 268-269 API definition and, 166-167
caching and, 401 analytic workspaces and, 1161 shared server and, 1185
conformance levels and, archiver and, 1185 as Software Development
168-169 caching and, 1179-1180, 1184 Laboratories, 1157
Microsoft SQL Server and, checkpoint and, 1185 SQL basics and, 55, 75n4,
1249 clusters and, 1173, 1186 82n7, 96, 141, 160-161,
PostgreSQL and, 1154 compression and, 1165 172-174, 178, 180,
standards for, 1053-1054 concurrency control and, 184-185, 197, 205
type definition and, 168 1180-1183 SQL Loader and, 1189
OLAP (online analytical database administration tools SQL Plan Management and,
processing), 1046 and, 1189-1191 1177-1178
all attribute and, 201-203, 205 database design and, 355, 386, SQL Tuning Advisor,
applications for, 197-201 396, 408n5, 409, 1157-1158 1176-1177
cross-tabulation and, 199-203, database writer, 1185 SQL variations and, 1158-1162
205, 210 data guard, 1183 subquery flattening and, 1174
data cube and, 200, 206-210 data mining and, 1191 system architecture, 795, 803,
decode function and, 208 data warehousing and, 1158 843, 1183-1188
dicing and, 201 dedicated servers and, system monitor, 1185
drill down and, 201 1183-1185 tables and, 1163-1166,
implementation of, 204 dimensional modeling and, 1172-1173, 1187, 1189
Microsoft SQL Server and, 1160, 1171 transactions and, 649, 653,
1223, 1266 distribution and, 1188-1189 692-693, 697, 710, 718
multidimensional data and, encryption and, 1165-1166 transformations and,
199 Exadata and, 1187-1188 1173-1174
null value and, 202 external data and, 1188-1189 trees and, 1191
Oracle and, 1161 hashing and, 1170 triggers and, 1161-1162
order by clause and, 205 indices and, 1162-1173 updates and, 1179-1180
pivot clause and, 205, 210 isolation levels and, 1181-1182 virtual private database and,
relational tables and, 202-203, joins and, 1168, 1187 1166
205 logical row-ids and, 1164-1165 XML DB and, 1159-1160
rollup and, 201, 206-210 log writer and, 1185 Oracle Application
slicing and, 201 materialized views and, Development
in SQL, 205-209 1171-1172, 1174, 1188 Framework (ADF),
OLE-DB, 1249 memory structures and, 1157-1158
OLTP (online transaction 1183-1184 Oracle Automatic Storage
processing), 1046-1047, optimizer of, 1174-1176 Manager, 1186-1187
1165, 1186, 1264 parallel execution and, 1178 Oracle Automatic Workload
on condition, 114-115 partitions and, 1169-1172, Repository (AWR), 1190
on delete cascade, 133, 185 1176 Oracle Business Intelligence
one-to-many mapping, 269, 276 process monitor and, 1185 Suite (OBI), 1158
one-to-one mapping, 269, 276 process structures and, Oracle Database Resource
ontologies, 925-927 1184-1185 Management, 1190-1191
OOXML (Office Open XML), projection and, 1187 Oracle Designer, 1158
1016 query optimization and, 582, Oracle Enterprise Manager
Open Document Format 593, 603-604, 612, (OEM), 1190
(ODF), 1016 1173-1178 Oracle JDeveloper, 1158
open statement, 170-171 query processing and, Oracle Tuxedo, 1091
operation logging, 1115 1157-1158, 1162-1172 or connective, 66
operator tree, 803-804 Real Application Clusters order by, 77-78, 193
optical storage, 430-431, 449-450 (RAC) and, 1186 organize by dimensions, 1204
optimistic concurrency control recovery and, 1180-1183, 1185 or operation, 83-84
without read validation, replication and, 1188-1189 outer-join, 115-120, 232-235,
704 result caching and, 1179-1180 565-566, 597
Oracle, 3, 30, 216, 1121 security and, 1165-1166 outer relation, 550
access path selection and, segments and, 1163 overfitting, 899-900
1174 serializability and, 1181-1182 overflow avoidance, 560
overflow buckets, 512-514 parallel systems passwords. See also security
overflow resolution, 560 coarse-grain, 777 application design and, 376,
overlapping entity sets, 300 fine-grain, 777 382, 385, 393, 405-407, 415
overlapping specialization, hierarchical, 781 dictionary attacks and, 414
296-297 interconnection networks distributed databases and, 871
overloading, 968 and, 780-781 leakage of, 405
interference and, 780 man-in-the-middle attacks
massively parallel, 777-778 and, 406
P + Q redundancy schema, 446
scaleup and, 778-780 one-time, 406
PageLSN, 751, 753, 754
shared disk, 781 single sign-on system and,
PageRank, 922-925, 928
shared memory, 781-783 406-407
page shipping, 776
SQL and, 142, 160, 168, 170
parallel databases shared nothing, 781
storage and, 463-464
cache memory and, 817-818 skew and, 780
PATA (parallel ATA), 434
cost of, 797 speedup and, 778-780
pctfree, 1203
decision-support queries and, start-up costs and, 780 performance
797 throughput and, 778 access time and, 431-439, 447,
failure rates and, 816 parameter style general, 179 450-451, 476, 479, 523,
increased use of, 797 parametric query optimization, 540-541, 817
interoperation parallelism 615 application design and,
and, 813-814 parity bits, 444-446 400-402
interquery parallelism and, parsing B+-trees and, 485-486
802-803 application design and, 388 caching and, 400-401
intraoperation parallelism bulk loads and, 1031-1033 data-transfer rate and, 435-436
and, 804-812 query processing and, denormalization and, 363-364
intraquery parallelism and, 537-539, 572, 1236-1237 magnetic disk storage and,
803-804 participation constraints, 270 435-436
I/O parallelism and, 798-802 partitioning vector, 798-799 parallel processing and,
massively parallel processors partitions 401-402
(MPP) and, 1193 attributes and, 896-897 response time and, 400 (see
multicore processors and, availability and, 847-853 also response time)
817-819 seek times and, 433, 435-439,
balanced range, 801
multithreading and, 817-818 450-451, 540, 555
classifiers and, 896-897
operator tree and, 803-804 sequential indices and,
cloud computing and, 865-866
Oracle and, 1178-1179 485-486
partitioning techniques and, composite, 1170-1171
condition and, 896-897 transaction time and, 365n8,
798-799 1062
pipelines and, 814-815 distributed databases and,
832, 835 web applications and, 377-382
query optimization and, performance benchmarks
814-817 hash, 798-799, 807, 1170
database-application classes
raw speed and, 817 joins and, 807-810
and, 1046
skew and, 800-801, 805-808, list, 1170 suites of tasks, 1045-1046
812, 814, 819 Microsoft SQL Server and, Transaction Processing
success of, 797 1235 Performance Council
system design and, 815-817 Oracle and, 1169-1172, 1176 (TPC), 1046-1048
parallel external sort-merge, point queries and, 799 performance tuning
806 pruning and, 1176 bottleneck locations and,
parallelism, 442-444 query optimization and, 1033-1035
parallel joins, 806, 857 814-815 bulk loads and, 1031-1033
fragment-and-replicate, range, 798-800, 805, 1170 concurrent transactions and,
808-809 reference, 1171 1041-1044
hash, 809-810 round-robin, 798-801 hardware and, 1035-1038
nested-loop, 810-811 scanning a relation and, 799 indices and, 1039-1041
partitioned, 807-810 Partner Interface Processes materialized views and,
parallel processing, 401-402 (PIPs), 1055 1039-1041
parameter adjustment and, PL/SQL, 173, 178 multiversion concurrency

1029-1030, 1035 pointers. See also indices control (MVCC) and,
physical design and, application design and, 409 1137-1146
1040-1041 child nodes and, 1074 operator classes and, 1150
RAID choice and, 1037-1038 concurrency control and, operator statements and, 1136
of schema, 1038-1039 706-707 parallel databases and,
set orientation and, 1030-1031 IBM DB2 and, 1199, 1202-1203 816-817
simulation and, 1044-1045 information retrieval and, 936 performance tuning and, 1042
updates and, 1030-1033 main-memory databases and, pointers and, 1134, 1147-1148
Perl, 180, 387, 1154 1107 procedural languages and,
persistent messaging, 836-837 multimedia databases and, 1136
persistent programming 1077 query optimization and, 582,
languages, 974 593
Oracle and, 1165
approaches for, 966-967 query processing and,
byte code enhancement and, persistent programming
1151-1154
971 languages and, 967, 969,
recovery and, 718
972
C++, 968-971 rollbacks and, 1142-1144
class extents and, 969, 972 PostgreSQL and, 1134, rules and, 1130-1131
database mapping and, 971 1147-1148 serializability and, 1142-1143
defined, 965 query optimization and, 612 server programming
iterator interface and, 970 query processing and, interface, 1136
Java, 971-972 544-546, 554 sort and, 1153
object-based databases and, recovery systems and, 727, SQL basics and, 140, 160, 173,
964-972, 974 754 180, 184
object identity and, 967 SQL basics and, 166, 179-180 state transition and, 1134
object persistence and, storage and, 439, 452-462 storage and, 1146-1151
966-968 point queries, 799 system architecture, 1154-1155
overloading and, 968 polymorphic types, 1128-1129 system catalogs and, 1132
persistent objects and, 969 popularity ranking, 920-925 transaction management in,
pointers and, 967, 969, 972 PostgreSQL, 31, 1121 649, 653, 1137-1146
reachability and, 971 access methods and, 1153 trees and, 1148-1149
relationships and, 969 aggregation and, 1153 triggers and, 1153-1154
single reference types and, 972 command-line editing and, trusted/untrusted languages
transactions and, 970 1124 and, 1136
updates and, 970 concurrency control and, 692, tuple ID and, 1147-1148
person-in-the-middle attacks, 697, 701, 1137-1145 tuple visibility and, 1139
1105 constraints and, 1130-1131, types, 1126-1129, 1132-1133
phantom phenomenon, 698-701 1153-1154 updates and, 1130, 1141-1144,
phantom read, 1137-1138, 1142, DML commands and, 1147-1148
1217-1218 1138-1139 user interfaces, 1124-1126
PHP, 387-388 vacuum, 1143
extensibility, 1132
physical data independence, 6 precedence graph, 644
functions, 1133-1135
physical-design phase, 16, 261 precision, 903
physiological redo, 750 Generalized Inverted Index predicate reads, 697-701
pinned blocks, 465 (GIN) and, 1149 prediction
pipelining, 539, 568 Generalized Search Tree classifiers and, 894-904
demand-driven, 569-570 (GiST) and, 1148-1149 data mining and, 894-904
double-pipelined hash-join hashing and, 1148 joins and, 1267
and, 571-572 indices and, 1135-1136, prepared statements, 162-164
parallel databases and, 1146-1151 presentation facilities,
813-815 isolation levels and, 1094-1095
producer-driven, 569-571 1137-1138, 1142 presentation layer, 391
pulling data and, 570-571 joins and, 1153 prestige ranking, 920-925,
pivot clause, 205, 210, 1230 locks and, 1143-1145 930-931
plan caching, 605 major releases of, 1123-1124 primary copy, 840
primary keys, 45-46, 60-62 pseudotransitivity rule, 339 ODBC and, 166-169
decomposition and, 354-355 public-key encryption, 412-414 OLAP and, 197-209
entity-relationship (E-R) publishing, 1013, 1251-1253 Oracle and, 1171-1172
model and, 271-272 pulling data, 570-571 PageRank and, 922-923
functional dependencies and, purity, 897 parallel databases and,
330-333 Python, 180, 377, 387, 1123, 797-820
integrity constraints and, 1125, 1136 persistent programming
130-131 languages and, 964-972
primary site, 756 QBE, 37, 245, 770 point, 799
privacy, 402, 410-411, 418, 828, quadratic split, 1075-1076 programming language
869-870, 1104 quadtrees, 1069, 1072-1073 access and, 157-173
privileges queries, 10 range, 799
all, 143-144 ADO.NET and, 169 read only, 804
execute and, 147 availability and, 826-827 recursive, 187-192
granting of, 143-145 B+-trees and, 488-491 result diversity and, 932
public, 144 basic structure of SQL, 63-71
revoking of, 143-145, 149-150 ResultSet object and, 159, 161,
caching and, 400-401
transfer of, 148-149 164-166, 393, 397-398, 490
Cartesian product and, 50-51,
procedural DMLs, 10 retrieving results, 161-162
68-69, 71-75, 120, 209,
procedural languages, 20 217, 222-229, 232, 573, scalar subqueries and, 97-98
advanced SQL and, 157-158, 584, 589, 595-596, 606, 616 security and, 402-417
173, 178 complex data types and, servlets and, 383-391
IBM DB2 and, 1194 946-949 set operations and, 79-83,
Oracle and, 1160, 1191 correlated subqueries and, 93 90-93
PostgreSQL and, 1130, 1133, data-definition language on single relation, 63-66
1136 (DDL) and, 21-22 spatial data and, 1070-1071
relational model and, 47-48 data-manipulation language string operations and, 76-79
procedures (DML) and, 21-22 transaction servers and, 775
declaring, 174-175 decision-support, 797 universal Turing machine
external language routines delete and, 98-100 and, 14
and, 179-180 distributed databases and, user requirements and,
language constructs for, 825-878 (see also 311-312
176-179 distributed databases) views and, 120-128
syntax and, 173-174, 178 hashing and, 475, 516-522 (see XML and, 998-1008
writing in SQL, 173-180 also hash functions) query cost
producer-driven pipeline, indices and, 475 (see also Microsoft SQL Server and,
569-570 indices) 1237-1239
program global area (PGA), information retrieval and, optimization and, 580-581,
1183 915-938 590-602
programming languages. See insert and, 100-101 processing and, 540-541, 544,
also specific language intermediate SQL and, 548, 555-557, 561
accessing SQL from, 157-173 113-151 query evaluation engine, 22
mismatch and, 158 JDBC and, 158-166
query-evaluation plans,
variable operations of, 158 location-dependent, 1080
537-539
projection metadata and, 164-166
intraoperation parallelism choice of, 598-607
multiple-key access and,
and, 811 506-509 expressions and, 567-572
Oracle and, 1187 on multiple relations, 66-71 materialization and, 567-568
queries and, 564, 597 natural joins and, 71-74, 87, optimization and, 579-616
view maintenance and, 113-120 (see also joins) pipelining and, 568-572
609-610 nearest-neighbor, 1070-1071 response time and, 541
project-join normal form nested subqueries, 90-98 set operations and, 564
(PJNF), 360 null values and, 83-84 viewing, 582
project operation, 219 object-based databases and, query-execution engine, 539
PR quadtrees, 1073 945-975 query-execution plan, 539
query languages, 249. See also relational algebra and, PostgreSQL and, 1151-1154
specific language 579-590 projection and, 563-564
accessing from a result caching and, 1179-1180 recursive partitioning and,
programming language, set operations and, 597 539-540
157-173 shared scans and, 614 relational algebra and,
centralized systems and, simplification and, 1237-1238 537-539
770-771 SQL Plan Management and, reordering and, 1238-1239
domain relational calculus 1177-1178 selection operation and,
and, 245-248 SQL Tuning Advisor and, 541-546
expressive power of 1176-1177 set operations and, 564-565
languages, 244, 248 top-K, 613 sorting and, 546-549
formal relational, 217-248 transformations and, 582-590, SQL and, 537-538
nonprocedural, 239-244 1173-1174 standard planner and, 1152
procedural, 217-239 updates and, 613-614 syntax and, 537
relational algebra and, query processing, 21-22, 30, 32 transformation and, 854-855
217-239 aggregation, 566-567 triggers and, 1153-1154
relational model and, 47-48, basic steps of, 537 XML and, 1259-1260
50 binding and, 1236-1237 question answering, 933-934
temporal, 1064 comparisons and, 544-545 queueing systems, 1034-1035
tuple relational calculus and, compilation and, 1236-1237 quorum consensus protocol,
239-244 cost analysis of, 540-541, 544, 841-842
query optimization, 22, 537, 548, 555-557, 561
539, 552-553, 562, 616 CPU speeds and, 540 random access, 437
access path selection and, distributed databases and, random samples, 593
1174-1176 854-857, 859-860 random walk model, 922
aggregation and, 597 distributed heterogeneous, range-partitioning sort, 805
cost analysis and, 580-581, 1250-1251 range-partitioning vector, 801
590-602 duplicate elimination and, range queries, 799
distributed databases and, 563-564 ranking, 192-195
854-855 evaluation of expressions, rapid application development
equivalence and, 582-588 567-572 (RAD)
estimating statistics of executor module and, functions library and, 396
expression results, 1152-1153 report generators and, 399-400
590-598 file scan and, 541-544, 550, user interface building tools
heuristics in, 602-605 552, 570 and, 396-398
IBM DB2 and, 1211-1212 hashing and, 557-562 Web application frameworks
join minimization, 613 IBM DB2 and, 1207-1216 and, 398-399
materialized views and, identifiers and, 546 raster data, 1069
607-612 information retrieval and, Rational Rose, 1194
Microsoft SQL Server and, 915-937 read-ahead, 437
1236-1241 join operation and, 549-566 read committed
multiquery, 614 LINQ and, 1249 application development and,
nested subqueries and, materialization and, 567-568, 1042
605-607 1212-1214 Microsoft SQL Server and,
Oracle and, 1173-1178 Microsoft SQL Server and, 1242
parallel databases and, 1223-1231, 1236-1241, Oracle and, 1181
814-817 1250-1251 PostgreSQL and, 1138,
parametric, 615 mobile, 1082 1141-1142
parallel execution and, operation evaluation and, transaction management and,
1178-1179 538-539 649, 658, 685, 701-702
partial search and, 1240 Oracle and, 1157-1158, read one, write all available
partitions and, 1174-1176 1172-1180 protocol, 849-850
plan choice for, 598-607 parsing and, 537-539, 572-573, read one, write all protocol, 849
PostgreSQL and, 1151-1154 1236-1237 read only queries, 804
process structure and, 1179 pipelining and, 568-572 read quorum, 841-842
read uncommitted, 648 shadow-copy scheme and, referential integrity, 11, 46-47,
read-write contention, 727 131-136, 151, 181-182, 628
1041-1042 snapshot isolation and, referrals, 875
read/write operations, 653-654 729-730 reflexivity rule, 339
real, double precision, 59 steal/no-steal policy and, 740 region quadtrees, 1073
real-time transaction systems, storage and, 722-726, 734-735, regression, 902-903, 1048-1049
1108-1109 743-744 relational algebra, 51-52,
recall, 903 successful completion and, 248-249, 427
recovery interval, 1244-1245 723 aggregate functions, 235-239
recovery manager, 22-23 undo and, 729-738 assignment, 232
recovery systems, 186, 631, workflows and, 1101 avg, 236
760-761, 1083 write-ahead logging (WAL) Cartesian-product, 222-226
actions after crash, 736-738 rule and, 739-741, composition of relational
algorithm for, 735-738 1145-1146 operations and, 219-220
ARIES, 750-756 recovery time, 758 count-distinct, 236
atomicity and, 726-735 recursive partitioning, 539-540 equivalence and, 582-588,
buffer management and, recursive queries, 187 601-602
738-743 iteration and, 188-190 expression transformation
checkpoints and, 734-735, SQL and, 190-192 and, 582-590
742-743 transitive closure and, 188-190 expressive power of
concurrency control and, recursive relationship sets, 265 languages, 244
729-730 redo formal definitions of, 228
actions after crash, 736-738 fundamental operations,
data access and, 724-726
pass, 754 217-228
database mirroring and,
phase, 736-738 generalized-projection, 235
1245-1246
recovery systems and, 729-738 join expressions, 239
database modification and,
redundancy, 4, 261-262, 272-274 max, 236
728-729
redundant arrays of min, 236
disk failure and, 722 independent disks multiset, 238
distributed databases and, (RAID), 435, 759, 1147 natural-join, 229-232
835-836 bit-level striping, 442-444 outer-join, 232-235
early lock release and, 744-750 error-correcting-code (ECC) project operation, 219
fail-stop assumption and, 722 organization and, 444-445 query optimization and,
failure and, 721-723, 743-744 hardware issues, 448-449 579-590
force/no-force policy and, hot swapping and, 449 query processing and, 537-539
739-740 levels, 444-448 rename, 226-228
IBM DB2 and, 1200-1203, mirroring and, 441-442, 444 select operation, 217-219
1217-1218 parallelism and, 442-444 semijoin strategy and, 856-857
logical undo operations and, parity bits and, 444-446 set-difference, 221-222
744-750 performance reliability and, set-intersection, 229
log records and, 726-728, 442-444 SQL and, 219, 239
730-734, 738-739 performance tuning and, sum, 235-236
log sequence number (LSN) 1037-1038 union operation, 220-221
and, 750 recovery systems and, 723 relational database design, 368
long-duration transactions reliability improvement and, atomic domains and, 327-329
and, 1110 441-442 attribute naming, 362-363
Microsoft SQL Server and, scrubbing and, 448 decomposition and, 329-338,
1241-1246 software RAID and, 448 348-360
Oracle and, 1180-1183 striping data and, 442-444 design process and, 361-364
partitions and, 1169-1172 references, 131-133, 148 features of good, 323-327
PostgreSQL and, 1145-1146 referencing new row as, 181-182 first normal form and, 327-329
redo and, 729-738 referencing new table as, 183 fourth normal form and, 356,
remote backup, 723, 756-759, referencing old row as, 182 358-360
850, 1095-1096 referencing old table as, 183 functional dependencies and,
rollback and, 729-734, 736 referencing relation, 46 329-348
larger schemas and, 324-325 redundancy and, 288 multiversion schemes and,
multivalued dependencies representation of, 286-290 691
and, 355-360 schema combination and, snapshot isolation and, 693
normalization and, 361-362 288-290 timestamps and, 682
relationship naming, 362-363 superclass-subclass, 296-297 resource managers, 1095
second normal form and, Unified Modeling Language response time
336n5, 361 (UML) and, 308-310 application design and, 400,
smaller schemas and, 325-327 relative distinguished names 1037, 1046
temporal data modeling and, (RDNs), 872 concurrency control and, 688
364-367 relevance E-R model and, 311
third normal form and, adjacency test and, 922-923 Microsoft SQL Server and,
336-337 hubs and, 924 1261
relational databases PageRank and, 922-923, 925 Oracle and, 1176-1177, 1190
access from application popularity ranking and, query evaluation plans and,
programs and, 14-15 920-922 541
data-definition language and, ranking using TF-IDF, query processing and, 541
14 917-920, 925 storage and, 444, 1106,
data-manipulation language search engine spamming and, 1109-1110
(DML) and, 13-14 924-925 transactions and, 636
storage and, 1010-1014 similarity-based retrieval and, system architecture and, 778,
tables and, 12-13 919-920 798, 800, 802
relational model, 9 TF-IDF approach and, restriction, 149-150, 347
disadvantages of, 30 917-925, 928-929 ResultSet object, 159, 161,
domain and, 42 using hyperlinks and, 3421 164-166, 393, 397-398, 490
keys and, 45-46 Web crawlers and, 930-931 revoke, 145, 149
natural joins and, 49-50 relevance feedback, 919-920 right outer join, 117-120,
operations and, 48-52 remote backup systems, 723, 233-235, 565-566
query languages and, 47-48, 756-759, 850, 1095-1096 Rijndael algorithm, 412-413
50 remote-procedure-call (RPC) robustness, 847
referencing relation and, 46 mechanism, 1096 roles, 264-265
schema for, 42-47, 302-304, rename operation, 75-76, authorization and, 145-146
1012 226-228 entity-relationship diagrams,
structure of, 39-42 repeat, 176 278
tables for, 39-44, 49-51, repeatable read, 649 Unified Modeling Language
202-205 repeat loop, 188, 341, 343, 490 (UML) and, 308-310
tuples and, 40-42, 49-50 replication rollback, 173
relation instance, 42-45, 264 cloud computing and, 866-868 ARIES and, 754-755
relationship sets distributed databases and, cascading, 667
alternative notations for, 843-844 concurrency control and, 667,
304-310 Microsoft SQL Server and, 670, 674-679, 685, 689,
atomic domains and, 327-329 1251-1253 691, 709
attribute placement and, system architectures and, 785, IBM DB2 and, 1218
294-295 826, 829 logical operations and,
binary vs. n-ary, 292-294 report generators, 399-400 746-749
descriptive attributes, 267 Representational State Transfer PostgreSQL and, 1142-1144
design issues and, 291-295 (REST), 395 recovery systems and,
entity-relationship diagrams request forgery, 403-405 729-734, 736
and, 278-279 request operation remote backup systems and,
entity-relationship (E-R) deadlock handling and, 758-759
model and, 264-267, 675-679 transactions and, 736
286-290, 296-297 locks and, 662-671, 675-680, timestamps and, 685-686
entity sets and, 291-292 709 undo and, 729-734
naming of, 362-363 lookup and, 706 rollback work, 127
nonbinary, 278-279 multiple granularity and, rollup, 201, 206-210, 1221-1222
recursive, 265 679-680 RosettaNet, 1055
row triggers, 1161-1162 relational database design man-in-the-middle attacks

R-trees, 1073-1076 and, 323-368 and, 406
Ruby on Rails, 387, 399 relational model and, 42-47 Microsoft SQL Server and,
runstats, 593 relationship sets and, 286-288 1247-1248
shadow-copy, 727 observable external writes
SAS (Serial attached SCSI), 434 smaller, 325-327 and, 634-635
Sarbanes-Oxley Act, 1248 strong entity sets and, 283-285 Oracle and, 1165-1166
SATA (serial ATA), 434, 436 timestamps and, 682-686 passwords and, 142, 160, 168,
savepoints, 756 tuple relational calculus, 170, 376, 382, 385,
scalar subqueries, 97-98 239-244 393, 405-407, 415, 463-464, 871
scaleup, 778-780 version-numbering, 1083-1084 person-in-the-middle attacks
scheduling weak entity sets and, 285-286 and, 1105
Microsoft SQL Server and, XML documents, 990-998 physical data independence
1254-1255 scripting languages, 389 and, 6
PostgreSQL and, 1127 scrubbing, 448 privacy and, 402, 410-411, 418,
query optimization and, search engine spamming, 828, 869-870, 1104
814-815 924-925 remote backup systems and,
storage and, 437 search keys 756-759
transactions and, 641, hashing and, 509-519, 524 request forgery and, 403-405
1099-1100, 1108 indexing and, 476-509, 524, single sign-on system and,
schema definition, 28 529 406-407
schema diagrams, 46-47 nonunique, 497-499 SQL injection and, 402-403
schemas, 8 storage and, 457-459 unique identification and,
alternative notations for uniquifiers and, 498-499 410-411
modeling data, 304-310 secondary site, 756 virtual private database and,
authorization on, 147-148 second normal form, 361 1166
basic SQL query structures Secure Electronic Transaction Security Assertion Markup
and, 63-74 (SET) protocol, 1105 Language (SAML), 407
canonical cover and, 342-345 security, 5, 147 seek times, 433, 435-439,
catalogs and, 142-143 abstraction and, 6-8, 10 450-451, 540, 555
combination of, 288-290 application design and, select, 363
concurrency control and, 402-417
aggregate functions and,
661-710 (see also audit trails and, 409-410
84-90
concurrency control) authentication and, 405-407
attribute specification, 77
data-definition language authorization and, 11, 21,
basic SQL queries and, 63-74
(DDL) and, 58, 60-63 407-409
on multiple relations, 66-71
data mining, 893-910 concurrency control and,
661-710 (see also natural join and, 71-74
data warehouses, 889-893
entity-relationship (E-R) concurrency control) null values and, 83-84
model and, 262-313 cross site scripting and, privileges and, 143-145, 148
functional dependencies and, 403-405 ranking and, 194
329-348 dictionary attacks and, 414 rename operation and, 74-75
generalization and, 297-304 encryption and, 411-417, set membership and, 90-91
larger, 324-325 1165-1166 set operations and, 79-83
locks and, 661-686 end-user information and, on single relation, 63-66
performance tuning of, 407-408 string operations and, 76-79
1038-1039 GET method and, 405 select all, 65
physical-organization integrity manager and, 21 select distinct, 64-65, 84-85, 91,
modification and, 28 isolation and, 628, 635-640, 125
recovery systems and, 721-761 646-653 select-from-where
reduction to relational, keys and, 45-46 delete and, 98-100
283-290 locks and, 661-686 (see also function/procedure writing
redundancy of, 288 locks) and, 174-180
relational algebra and, long-duration transactions inheritance and, 949-956
217-239 and, 1109-1115 insert and, 100-101
join expressions and, 71-74, snapshot isolation and, Sherpa/PNUTS, 866-867

87, 113-120 693-697 shredding, 1013, 1258-1259
natural joins and, 71-74, 87, topological sorting and, similarity-based retrieval,
113-120 644-646 919-920, 1079
nested subqueries and, 90-98 transactions and, 640-646, 648, Simple API for XML (SAX),
transactions and, 651-654 650-653 1009
types handling and, 949-963 view, 687 Simple Object Access Protocol
update and, 101-103 serializable schedules, 640 (SOAP), 1017-1018, 1056,
views and, 120-128 server programming interface 1249-1250
selection (SPI), 1136 single lock-manager, 839-840
comparisons and, 544-545 server-side scripting, 386-388 single-server model, 1092-1093
complex, 545-546 server systems single-valued attributes,
conjunctive, 545-546 categorization of, 772-773 267-268
disjunctive, 545-546 client-server, 771-772 site reintegration, 850
equivalence and, 582-588 cloud-based, 777 skew, 512
file scans and, 541-544, 550, data servers, 773, 775-777 attribute-value, 800-801
552, 570 transaction-server, 773-775 parallel databases and,
identifiers and, 546 servlets 800-801, 805-808, 812,
indices and, 541-544 client-side scripting and, 814, 819
intraoperation parallelism 389-391 parallel systems and, 780
and, 811 example of, 383-384 partitioning and, 560, 800-801
linear search and, 541-542 life cycle and, 385-386 slicing, 201
server-side scripting and, small-computer-system
relational algebra and,
386-388 interconnect (SCSI), 434
217-219
sessions and, 384-385 snapshot isolation, 652-653,
SQL and,
support and, 385-386 704, 1042
view maintenance and,
set clause, 103 Microsoft SQL Server and,
609-610
set default, 133 1244
Semantic Web, 927
set difference, 50, 221-222, 585 recovery systems and, 729-730
semistructured data models, 9, serializability and, 693-697
set-intersection, 229
27 validation and, 692-693
set null, 133
sensitivity, 903 snapshot replication, 1252-1253
set operations, 79, 83
Sequel, 57 IBM DB2 and, 1209-1210 snapshots
sequence associations, 906-907 intersect, 50, 81-82, 585, 960 DML commands and,
sequence counters, 1043 nested subqueries and, 90-93 1138-1139
sequential-access storage, 431, query optimization and, 597 Microsoft SQL Server and,
436 query processing and, 564-565 1242
sequential files, 459 set comparison and, 91-93 multiversion concurrency
sequential scans, 1153 union, 80-81, 220-221, 339, 585 control (MVCC) and,
serializability set role, 150 1137-1146
blind writes and, 687 set transaction isolation level PostgreSQL and, 1137-1146
concurrency control and, 662, serializable, 649 read committed, 1242
666-667, 671, 673, 681-690, shadow-copy scheme, 727 software RAID, 448
693-697, 701-704, 708 shadowing, 441-442 Solaris, 1193
conflict, 641-643 shadow-paging, 727 solid-state drives, 430
distributed databases and, shared and intention-exclusive some, 90, 92, 92n8
860-861 (SIX) mode, 680 sorting, 546
isolation and, 648-653 shared-disk architecture, 781, cost analysis of, 548-549
Oracle and, 1181-1182 783, 789 duplicate elimination and,
order of, 644-646 shared-memory architecture, 563-564
performance tuning and, 1042 781-783 external sort-merge algorithm
PostgreSQL and, 1142-1143 shared-mode locks, 661 and, 547-549
precedence graph and, 644 shared-nothing architecture, parallel external sort-merge
predicate reads and, 701 781, 783-784 and, 806
in the real world, 650 shared scans, 614 PostgreSQL and, 1153
range-partitioning, 805 catalogs, 142-143 Oracle variations and,

topological, 644-646 clobs and, 138, 166, 457, 502, 1158-1162
XML and, 1106 1010-1013, 1196-1199 overview of, 57-58
sort-merge-join, 553 CLR hosting and, 1254-1256 persistent programming
space overhead, 476, 479, 486, create table, 60-63, 141-142 languages and, 964-972
522 database modification and, PostgreSQL and, 31 (see also
spatial data 98-103 PostgreSQL)
computer-aided-design data data-definition language prepared statements and,
and, 1061, 1064-1068 (DDL) and, 57-63, 104 162-164
geographic data and, 1061, data-manipulation language procedure writing and,
1064-1066 (DML) and, 57-58, 104 173-180
indexing of, 1071-1076 data mining and, 26 query processing and, 537-538
queries and, 1070-1071 date/time types in, 136-137 (see also query
representation of geometric decision-support systems processing)
information and, and, 887-889 rapid application
1065-1066 default values and, 137 development (RAD) and,
topographical information delete and, 98-100 397
and, 1070 dumping and, 743-744 relational algebra and, 219,
triangulation and, 1065 dynamic, 58, 158 239
vector data and, 1069 embedded, 58, 158, 169-173, rename operation and, 74-80
specialization 773 report generators and, 399-400
entity-relationship (E-R) Entity, 395 ResultSet object and, 159, 161,
model and, 295-296 environments, 43 164-166, 393, 397-398, 490
partial, 300 function writing and, 173-180 revoking of privileges and,
single entity set and, 298 IBM DB2 and, 1195-1200, 1210 149-150
total, 300 index creation and, 137-138, roles and, 145-146
specialty databases, 943 528-529 schemas and, 47, 58-63,
object-based databases and, inheritance and, 949-956 141-143, 147-148
945-975 injection and, 402-403 security and, 402-403
XML and, 981-1020 insert and, 100-101 select clause and, 77
specification of functional integrity constraints and, 58, set operations and, 79-83
requirements, 16, 260 128-136 as standard relational
specificity, 903 intermediate, 113-151 database language, 57
speedup, 778-780 isolation levels and, 648-653 standards for, 1052-1053
spider traps, 930 JDBC and, 158-166 string operations and, 76-77
SQL (Structured Query join expressions and, 71-120 System R and, 30, 57
Language), 10, 13-14, 57, (see also joins) time specification in,
151, 210, 582 lack of fine-grained 1063-1064
accessing from a authorization and, transactions and, 58, 127-128,
programming language, 408-409 773 (see also transactions)
157-163 large-type objects, 138 transfer of privileges and,
advanced, 157-210 Management of External Data 148-149
aggregate functions, 84-90, (MED) and, 1077 triggers and, 180-187
192-197 Microsoft SQL Server and, tuples and, 77-78 (see also
application-level 1223-1267 tuples)
authorization and, multiset types and, 956-961 under privilege and, 956
407-409 MySQL and, 31, 76, 111, update and, 101-103
application programs and, 160n3, 1123, 1155 user-defined types, 138-141
14-15 nested subqueries and, 90-98 views and, 58, 120-128,
array types and, 956-961 nonstandard syntax and, 178 146-147
authorization and, 58, 143-150 null values and, 83-84 where clause predicates, 78-79
basic types and, 59-60 object-based databases and, SQL*Loader, 1032, 1189
blobs and, 138, 166, 457, 502, 945-975 SQL Access Group, 1053
1013, 1198-1199, 1259 ODBC and, 166-169 SQL/DS, 30
bulk loads and, 1031-1033 OLAP and, 197-209 SQL environment, 143
SQLJ, 172 number of distinct values hard disks and, 29-30

SQL Plan Management, and, 597-598 IBM DB2 and, 1200-1203
1177-1178 query optimization and, indices and, 21 (see also
SQL Profiler, 1225-1227 590-598 indices)
SQL Security Invoker, 147 random samples and, 593 information retrieval and,
SQL Server Analysis Services selection size estimation and, 915-937
(SSAS), 1264, 1266-1267 592-595 integrity manager and, 21
SQL Server Broker, 1261-1263 steal policy, 740 jukebox, 431
SQL Server Integration steps, 1096 magnetic disk, 430, 432-439
Services (SSIS), stop words, 918 main memory and, 429-430
1263-1266 storage, 427 Microsoft SQL Server and,
SQL Server Management archival, 431 1233-1236
Studio, 1223-1224, atomicity and, 632-633 mirroring and, 441-442,
1227-1228 authorization and, 21 1245-1246
SQL Server Reporting Services Automatic Storage Manager native, 1013-1014
(SSRS), 1264, 1267 and, 1186-1187 nonrelational data, 1009-1010
sqlstate, 179 backup, 431, 723, 756-759, 850, nonvolatile, 432, 632, 722,
SQL Transparent Data 1095-1096 724-726, 743-744
Encryption, 1248 bit-level striping, 442-444 optical, 430-431, 449-450
SQL Tuning Advisor, 1176-1177 buffer manager and, 21 (see Oracle and, 1162-1172,
SQL/XML standard, 1014-1015 also buffers) 1186-1188
Standard Generalized Markup byte amount and, 20 parallel systems and, 777-784
Language (SGML), 981 checkpoints and, 734-735, persistent programming
standards 742-743 languages and, 967-968
ANSI, 57, 1051 clob values and, 1010-1011 physical media for, 429-432
anticipatory, 1051 cloud-based, 777, 862-863 PostgreSQL and, 1146-1151
Call Level Interface (CLI), column-oriented, 892-893 publishing/shredding data
1053 content dump and, 743 and, 1013, 1258-1259
cost per bit, 431 punched cards and, 29
database connectivity,
crashes and, 467-468 (see also query processor and, 21-22
1053-1054
crashes) recovery systems and, 722-726
data pump export/import
data access and, 724-726 (see also recovery
and, 1189
data-dictionary, 462-464 systems)
DBTG CODASYL, 1052
data mining and, 25-26, redundant arrays of
ISO, 57, 871, 1051
893-910 independent disks
ODBC, 1053-1055 data-transfer rate and, 435-436 (RAID), 435, 441-449
reactionary, 1051 data warehouses and, 888 relational databases and,
SQL, 1052-1053 decision-support systems and, 1010-1014
Wi-Max, 1081 887-889 remote backup systems and,
XML, 1055-1056 direct-access, 431 723, 756-759, 850,
X/Open XA, 1053-1054 distributed databases and, 1095-1096
Starburst, 1193 826-830 replication and, 826, 829
start-up costs, 780 distributed systems and, scrubbing and, 448
starvation, 679 784-788 seek times and, 433, 435-439,
Statement object, 161-164 dumping and, 743-744 450-451, 540, 555
statement triggers, 1161-1162 durability and, 632-633 segments and, 1163
state transition, 1134 error-correcting-code (ECC) sequential-access, 431, 436
state value, 1134 organization and, 444-445 solid-state drives and, 430
statistics Exadata and, 1187-1188 stable, 632, 722-724
catalog information and, file manager and, 21 striping data and, 442-444
590-592 file organization and, 451-462 tape, 431, 450-451
computing, 593 flash, 403, 430, 439-441, 506 tertiary, 431, 449-451
join size estimation and, flat files and, 1009-1010 transaction manager and, 21
595-596 force output and, 725-726 (see also transactions)
maintaining, 593 fragmentation and, 826-829 transparency and, 829-830
volatile, 431, 632, 722 materialized, 1212-1214 harmonic mean of, 1046
wallets and, 415 Microsoft SQL Server and, improved, 635-636, 655
XML and, 1009-1016 1230, 1234 log records and, 1106
storage area network (SAN), .NET Common Language main memories and, 1116
434-435, 789 Runtime (CLR) and, Microsoft SQL Server and,
storage manager, 20-21 1257-1258 1255
string operations Oracle and, 1163-1166, 1187, Oracle and, 1159, 1184
aggregate, 84 1189 parallel systems and, 778
attribute specification, 77 partitions and, 1169-1172 performance and, 1110
escape, 77 relational model and, 39-44, range partitioning and, 800
JDBC and, 158-166 49-51 storage and, 444, 468
like, 76-77 SQL Server Broker and, 1262 system architectures and, 771,
lower, 76 tablespaces, 1146, 1172-1173 778, 800, 802, 819
query result retrieval and, tag library, 388 transactions and, 635-636, 655
161-162 tag timestamps, 136-137
similar to, 77 application design and, concurrency control and,
trim, 76 378-379, 388, 404 682-686, 703
tuple display order, 77-78 information retrieval and, 916 distributed databases and,
upper function, 76 XML and, 982-986, 989, 994, 842-843
where predicates, 78-79 999, 1004, 1019 logical counter and, 682
striping data, 442-444 tape storage, 431, 450-451 long-duration transactions
structured types, 138-141, Tapestry, 399 and, 1110
949-952 task flow. See workflows multiversion schemes and,
stylesheets, 380 Tcl, 180, 1123-1125, 1136 690-691
sublinear speedup, 778-780 temporal data, 1061 ordering scheme and, 682-685
submultiset, 960 intervals and, 1063-1064 rollback and, 685-686
suffix, 874 query languages and, 1064 temporal data and, 1063-1064
sum, 84, 123, 207, 235-236, relational databases and, Thomas’ write rule and,
566-567, 1134 364-367 685-686
superclass-subclass time in databases and, transactions and, 651-652
relationship, 296-297 1062-1064 with time zone, 1063
superkeys, 45-46, 271-272, timestamps and, 1063-1064 time to completion, 1045
330-333 transaction time and, 1062 time with time zone, 1063
superuser, 143 temporal relation, 1062-1063 timezone, 136-137, 1063
Support Vector Machine Teradata Purpose-Built Tomcat, 386
(SVM), 900-901, 1191 Platform Family, 806 top-down design, 297
swap space, 742 term frequency (TF), 918 top-K optimization, 613
Swing, 399 termination states, 1099 topographic information, 1070
Sybase, 1223 tertiary storage, 431, 449-451 topological sorting, 644-646
symmetric multiprocessors TF-IDF approach, 928-929 training instances, 895
(SMPs), 1193 theta join, 584-585 transactional replication,
synonyms, 925-927 third normal form (3NF) 1252-1253
sysaux, 1172-1173 decomposition algorithms transaction control, 58
system architecture. See and, 352-355 transaction coordinator,
architectures relational databases and, 830-831, 834-835, 850-852
system catalogs, 462-464, 1132 336-337, 352-355 transaction manager, 21, 23,
system change number (SCN), Thomas’ write rule, 685-686 830-831
1180-1181 thread pooling, 1246 transaction-processing
system error, 721 three-phase commit (3PC) monitors, 1091
System R, 30, 57, 1193 protocol, 826 application coordination
three-tier architecture, 25 using, 1095-1096
table inheritance, 954-956 throughput architectures of, 1092-1095
tables, 12-13 application development and, durable queue and, 1094
filtering and, 1187 1037, 1045-1046 many-server, many-router
IBM DB2 and, 1200-1203 defined, 311 model and, 1094
many-server, single-router force/no-force policy and, serializability and, 640-653

model and, 1093 739-740 shadow-copy scheme and,
multitasking and, 1092-1095 global, 784, 830, 860-861 727
presentation facilities and, integrity constraint violation simple model for, 629-631
1094-1095 and, 133-134 SQL Server Broker and,
single-server model and, isolation and, 628, 635-640, 1261-1263
1092-1093 646-653 (see also as SQL statements, 653-654
switching and, 1092 isolation) starved, 666
Transaction Processing killed, 634 states of, 633-635
Performance Council local, 784, 830, 860-861 steal/no-steal policy and, 740
(TPC), 1046-1048 locks and, 661-669, 661-686 storage structure and, 632-633
transactions, 32, 625, 655-656, (see also locks) timestamps and, 682-686
1116 log records and, 726-728, two-phase commit protocol
aborted, 633-634, 647 730-734 (2PC) and, 786-788
actions after crash, 736-738 long-duration, 1109-1115 uncommitted, 648
active, 633 main-memory databases and, as unit of program, 627
advanced processing of, 1105-1108 validation and, 686-689
1091-1116 multidatabases and, 860-861 wait-for graph and, 676-678
association rules and, 904-907 multilevel, 1112-1113 workflows and, 836-838,
atomicity and, 22-23, 628, multitasking and, 1092-1095 1096-1102
633-635, 646-648 (see also multiversion concurrency write-ahead logging (WAL)
atomicity) control (MVCC) and, rule and, 739-740, 739-741
availability and, 847-853 1137-1146 transaction scaleup, 779
multiversion schemes and, transaction-consistent
begin/end operations and,
627 689-692 snapshot, 843-844
object-based databases and, transaction-server systems,
cascadeless schedules and,
945-975 773-775
647-648
observable external writes transactions per second (TPS),
check constraints and, 628
and, 634-635 1046-1047
cloud computing and, 866-868
parallel databases and, transaction time, 365n8, 1062
commit protocols and, 797-820 TransactSQL, 173
832-838 performance tuning and, transfer of control, 757
committed, 127, 633-635, 639, 1041-1044 transfer of prestige, 921-922
647, 692-693, 730, 758, persistent messaging and, transformations
832-838, 1107, 1218 836-837 equivalence rules and,
compensating, 633, 1113-1114 persistent programming 583-586
concept of, 627-629 languages and, 970 examples of, 586-588
concurrency control and, person-in-the-middle attacks join ordering and, 588-589
661-710, 1241-1246 (see and, 1105 query optimization and,
also concurrency control) PostgreSQL and, 1137-1146 582-590
consistency and, 22, 627-631, read/write operations and, relational algebra and,
635-636, 640, 648-650, 655 653-654 582-590
(see also consistency) real-time systems and, XML and, 998-1008
crashes and, 628 1108-1109 transition tables, 183-184
data mining and, 893-910 recoverable schedules and, transition variable, 181
decision-support systems and, 647 transitive closure, 188-190
887-889 recovery manager and, 22-23 transitivity rule, 339-340
defined, 22, 627 recovery systems and, 631, transparency, 829-830, 854-855
distributed databases and, 633 (see also recovery trees, 1086
830-832 systems) B, 504-506, 530, 1039, 1064,
durability and, 22-23, 628, remote backup systems and, 1071-1072, 1076, 1086,
633-635 (see also 756-759 1135, 1148-1150, 1159,
durability) restart of, 634 1164-1169, 1173, 1205
E-commerce and, 1102-1105 rollback and, 127, 736, B+, 1234-1235 (see also
failure of, 633, 721-722 746-749, 754-755 B+-trees)
decision-tree classifiers and, delete and, 98-100 nonstandard, 1129-1130

895-900 domain relational calculus object-based databases and,
directory information (DIT), and, 245-248 949-963
872-875 duplicate, 94-95 object-identity, 961-963
distributed directory, 874-875 eager generation of, 569-570 Oracle and, 1158-1160
Generalized Search Tree insert and, 100-101 performance tuning and, 1043
(GiST) and, 1148-1149 joins and, 550-553 (see also polymorphic, 1128-1129
index-organized tables (IOTs) joins) PostgreSQL, 1126-1129,
and, 1164-1165 lazy generation of, 570-571 1132-1133
k-d, 1071-1072 ordering display of, 77-78 pseudotypes, 1128
multiple granularity and, parallel databases and, reference, 961-963
679-682 797-820 user-defined, 138-141
Oracle and, 1164-1165, 1191 pipelining and, 568-572 single reference, 972
overfitting and, 899-900 PostgreSQL and, 1137-1146 wide-area, 788-791
PostgreSQL and, 1148-1149 query structures and, 68 XML, 990-998, 1006-1007
quadratic split and, 1075-1076 query optimization and,
quadtrees, 1069, 1072-1073 579-616
query optimization and, query processing and, 537-573 UDF. See user-defined
814-815 (see also query ranking and, 192-195 functions
optimization) relational algebra and, Ultra320 SCSI interface, 436
R, 1073-1076 217-239, 582-590 Ultrium format, 451
scheduling and, 814-815 set operations and, 79-83 under privilege, 956
spatial data support and, update and, 101-103 undo
1064-1076 views and, 120-128 concurrency control and,
XML, 998, 1011 windowing and, 195-197 749-750
triggers tuple visibility, 1139 logical operations and,
alter, 185 two-factor authentication, 745-750
disable, 185 405-407 Oracle and, 1163
drop, 185 two-phase commit (2PC) recovery systems and, 729-738
IBM DB2 and, 1210 protocol, 786-788, transaction rollback and,
Microsoft SQL Server and, 832-836 746-749
1232-1233 two-tier architecture, 24-25
undo pass, 754
need for, 180-181 types, 1017, 1159
undo phase, 737
nonstandard syntax and, 184 abstract data, 1127
Oracle and, 1161-1162 Unified Modeling Language
array, 956-961
PostgreSQL and, 1153-1154 (UML), 17-18
base, 1127
recovery and, 186 blob, 138, 166, 457, 502, 1013, associations and, 308-309
in SQL, 181-187 1198-1199, 1259 cardinality constraints and,
transition tables and, 183-184 clob, 138, 166, 457, 502, 309-310
when not to use, 186-187 1010-1013, 1196-1199 components of, 308
true negatives, 903 complex data, 946-949 (see relationship sets and, 308-309
true predicate, 67 also complex data types) uniform resource locators
true relation, 90, 93 composite, 1127 (URLs), 377-378
tuple ID, 1147-1148 document type definition union, 80-81, 585, 220-221
tuple relational calculus, 239, (DTD), 990-994 union all, 80
249 enumerated, 1128 union rule, 339
example queries, 240-242 IBM DB2 and, 1196-1197 unique, 94-95
expressive power of inheritance and, 949-956 decomposition and, 354-355
languages, 244 Microsoft SQL Server and, integrity constraints and,
formal definition, 243 1229-1230, 1257-1258 130-131
safety of expressions, 244 most-specific, 953 uniquifier, 498-499
tuples, 40-42 multiset, 956-961 United States, 17, 45, 263, 267n3,
aggregate functions and, .NET Common Language 411, 788, 858, 869, 922
84-90 Runtime (CLR) and, Universal Coordinated Time
Cartesian product and, 50 1257-1258 (UTC), 1063
Universal Description, EXEC SQL and, 171 HyperText Transfer Protocol

Discovery, and hashing and, 516-522 (HTTP) and, 377-383, 395,
Integration (UDDI), 1018 indices and, 482-483 404-406, 417
Universal Serial Bus (USB) insertion time and, 491-495, IBM DB2, 1195
slots, 430 499-500 mobile, 1079-1085
universal Turing machine, 14 log records and, 726-734 persistent programming
universities, 2 lost, 692 languages and, 970
application design and, 375, Microsoft SQL Server and, PostgreSQL and, 1124-1126
392, 407-409 1232-1233, 1239 presentation layer and, 391
concurrency control and, 698 mobile, 1083-1084 report generators and, 399-400
database design and, 16-17 security and, 402-417
Oracle and, 1179-1180
databases for, 3-8, 11-12, storage and, 434 (see also
performance tuning and,
15-19, 27, 30 storage)
1030-1033, 1043-1044
E-R model and, 261-274, 280, tools for building, 396-398
282, 292, 294-299 persistent programming
Web services and, 395
indexing and, 477, 510, 529 languages and, 970
World Wide Web and, 377-382
query optimization and, 586, PostgreSQL and, 1130,
user requirements, 15-16, 27-28
589, 605 1141-1144, 1147-1148
E-R model and, 260, 298
query processing and, 566 privileges and, 143-145
performance and, 311-312
recovery system and, 724 query optimization and, response time and, 311
relational database design 613-614 throughput and, 311
and, 323-330, 334, 355, shipping SQL statements to using, 114
364-365 database, 161 utilization, 636
relational model and, 41, snapshot isolation and,
43-48 692-697
SQL and, 61-63, 70-72, 75, 99, triggers and, 182, 184 vacuum, 1143
125-134, 145-150, 153, views and, 124-128 validation, 703-704
170, 173, 187, 192-193, XML and, 1259-1260 classifiers and, 903-904
197, 226-227 user-defined entity sets, 299 concurrency control and,
storage and, 452, 458, 460 686-689
user-defined functions (UDFs),
system architecture and, 785, first committer wins and,
1197-1198
828, 872 692-693
user-defined types, 138-141
transactions and, 653 first updater wins and, 693
user interfaces, 27-28 long-duration transactions
University of California, application architectures and,
Berkeley, 30, 1123 and, 1111
391-396 phases of, 688
Unix, 77, 438, 713, 727, 1124, application programs and,
1154, 1193-1194, 1212, recovery systems and, 729-730
375-377 snapshot isolation and,
1223
as back-end component, 376 692-693
unknown, 83, 90
business-logic layer and, view serializability and, 687
unnesting, 958-961
391-392 valid time, 365
updatable result sets, 166
update-anywhere replication, client-server architecture and, varchar, 59-60, 62
844 32, 204, 376-377, 756-772, VBScript, 387
updates, 101-103 777, 788, 791 vector data, 1069
authorization and, 147, 148 client-side scripting and, vector space model, 919-920
B+-trees and, 491-497, 499-500 389-391 version-numbering schemas,
batch, 1030-1031 cloud computing and, 861-870 version-numbering schemes,
complexity of, 499-500 common gateway interface version-vector scheme, 1084
concurrency control and, (CGI), 380-381 vertical fragmentation, 828
867-868 cookies and, 382-385, 403-405 video servers, 1078
data warehouses and, 891 CRUD, 399 view definition, 58
deletion time and, 491, data access layer and, 391, view maintenance, 608-611
495-500 393, 395 views, 120
distributed databases and, disconnected operation and, authorization on, 146-147
826-827 395-396 with check option, 126
complex merging and, transactions and, 651-654 security and, 402-417

1173-1174 while loop, 168, 171, 176 services processing and, 395
create view and, 121-125 wide-area networks (WANs), Simple Object Access Protocol
cube, 1221-1222 788, 790-791 (SOAP) and, 1017-1018
deferred maintenance and, Wi-Max, 1081 three-layer architecture and,
1039-1040 windowing, 195-197 318
definition, 121-122 Windows Mobile, 1223 uniform resource locators
delete, 125 Wireless application protocol (URLs), 377-378
immediate maintenance and, (WAP), 1081-1082 Web application frameworks
1039-1040 wireless communications, and, 398-400
insert into, 124-125 1079-1085 Web servers and, 380-382
maintenance, 124 with check option, 126 XML and, 1017-1018
materialized, 123-124 (see also with clause, 97, 190 World Wide Web Consortium
materialized views) with data, 141-142 (W3C), 927, 1056
performance tuning and, with grant option, 148 wrapping, 1055-1056
1039-1041 with recursive clause, 190 write-ahead logging (WAL),
SQL queries and, 122-123 with timezone, 136 739-741, 1145-1146
update of, 124-128 WordNet, 927 write once, read-many
view serializability, 687 workflows, 312-313, 836-838, (WORM) disks, 431
virtual machines, 777 1017 write quorum, 841-842
virtual processor, 801 acceptable termination states write-write contention, 1042
Virtual Reality Markup and, 1099
Language (VRML), bugs and, 1101
390-391 business-logic layer and, X.500 directory access protocol,
Visual Basic, 169, 180, 397-398, 391-392 871
1228 execution and, 1097-1101 XML (Extensible Markup
VisualWeb, 397 external variables and, Language), 31, 169, 386,
volatile storage, 431, 632, 722 1098-1099 1020
failures and, 1099-1102 application program
wait-for graph, 676-678, 845-847 management systems for, interfaces (APIs) to,
Web crawlers, 930-931 1101-1102 1008-1009
Weblogic, 386 multisystem applications and, applications, 1016-1019
WebObjects, 399 1096 clob values and, 1010-1011
Web servers, 380-382 nonacceptable termination data exchange formats and,
Web services, 395, 1199-1200 states and, 1099 1016-1017
Web Services Description performance and, 1029-1048 data mediation and,
Language (WSDL), 1018 recovery of, 1101 1018-1019
WebSphere, 386 specification and, 1097-1099 data structure, 986-990
when clause, 181, 184 steps and, 1096 document schema, 990-998
where clause, 311 tasks and, 1096 document type definition
aggregate functions and, transactional, 1096-1102 (DTD), 990-994
84-90 workload compression, 1041 as dominant format, 985
basic SQL queries and, 63-74 World Wide Web, 31, 885 file processing and, 981-982
between, 78 application design and, format flexibility of, 985
on multiple relations, 66-71 377-382 HTML and, 981
natural join and, 71-74 cookies and, 382-385, 403-405 IBM DB2 and, 1195-1196
not between, 78 encryption and, 411-417 joins and, 1003-1004
null values and, 83-84 HyperText Markup Language markup concept and, 981-985
query optimization and, (HTML), 378-380 Microsoft SQL Server and,
605-607 HyperText Transfer Protocol 1258-1261
rename operation and, 74-75 (HTTP) and, 377-381, 383, nesting and, 27, 943, 984-990,
security and, 409 395, 404-406, 417 995-998, 1001, 1004-1007,
set operations and, 79-83 information retrieval and, 915 1010
on single relation, 63-65, 63-66 (see also information Oracle XML DB and,
string operations and, 76-79 retrieval) 1159-1160
publishing/shredding data updates and, 1259-1260 XQuery, 31, 998

and, 1013 web services and, 1017-1018 FLWOR expressions and,
queries and, 998-1008, wrapping and, 1055-1056 1002-1003
1259-1260 xmlagg, 1015 functions and, 1006-1007
relational databases and, xmlattributes, 1015 joins and, 1003-1004
1010-1014 xmlconcat, 1015 Microsoft SQL Server and,
relational maps and, 1012 xmlelement, 1015 1260-1261
Simple Object Access Protocol xmlforest, 1015 nested queries and, 1004-1005
(SOAP) and, 1017-1018 Oracle and, 1160
XMLIndex, 1160
sorting and, 1006 sorting of results and, 1006
XML Schema, 994-998
SQL/XML standard and, storage and, 1009-1015
1014-1015 XMLType, 1159
X/Open XA standards, transformations and,
standards for, 1055-1056 1002-1008
storage and, 1009-1016 1053-1054
XOR operation, 413 types and, 1006-1007
tags and, 982-986, 989, 994, XSLT, 1160
999, 1004, 1019 XPath, 1160
textual context and, 986 document schema and, 997
transformation and, 998-1008 queries and, 998-1002 Yahoo, 390, 863
tree model of, 998 storage and, 1009-1015 YUI library, 390

