
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. XXX, NO. XX, XXXXXXX 2001

Hashing Methods for Temporal Data


George Kollios, Member, IEEE, and Vassilis J. Tsotras

Abstract: External dynamic hashing has been used in traditional database systems as a fast method for answering membership queries. Given a dynamic set S of objects, a membership query asks whether an object with identity k is in (the most current state of) S. This paper addresses the more general problem of Temporal Hashing. In this setting, changes to the dynamic set are timestamped and the membership query has a temporal predicate, as in: "Find whether the object with identity k was in set S at time t." We present an efficient solution to this problem that takes an ephemeral hashing scheme and makes it partially persistent. Our solution, also termed partially persistent hashing, uses space linear in the total number of changes in the evolution of set S and has a small (O(log_B(n/B))) query overhead. An experimental comparison of partially persistent hashing with various straightforward approaches (such as external linear hashing, the Multiversion B-Tree, and the R*-tree) shows that it provides the fastest membership query response time. Partially persistent hashing should be seen as an extension of traditional external dynamic hashing to a temporal environment. It is independent of the ephemeral dynamic hashing scheme used; while the paper concentrates on linear hashing, the methodology applies to other dynamic hashing schemes as well.

Index Terms: Hashing, temporal databases, transaction time, access methods, data structures.

1 INTRODUCTION

THERE are many applications that require the maintenance of time-evolving1 data [26], [16]. Examples include accounting, marketing, billing, etc. If past states of the data are also stored, the data volume increases considerably with time. Various temporal access methods (indices) have been proposed to query past data [34]. In particular, research has concentrated on supporting pure-snapshot or range-snapshot queries over past states of a time-evolving set S. Let S(t) denote the state (collection of objects) that S had at time t. Given time t, a pure-snapshot query asks for "all objects in S(t)," while a range-snapshot query retrieves "the objects in S(t) with identities (oids) in a specified range" [24], [37], [2], [39]. Instead, this paper concentrates on the temporal membership query,2 i.e., "given oid k and time t, find whether k was in S(t)."

Temporal membership queries are important for many applications: 1) In admission/passport control and security applications, we are interested in whether a person was in some place at a given date. 2) Traditionally, efficient methods to address membership queries are used to expedite join queries [31], [23]. Before joining two large sets, the membership property partitions them into smaller disjoint subsets and every subset from the first set is joined with only one subset from the second set. This idea can be extended to solve special cases of temporal joins [33], [40], [27]. For example, given two time-evolving sets X and Y, join X(t1) and Y(t2). This temporal join is based on oids and on (possibly different) time instants (i.e., find the employees working in both departments X and Y, at times t1 and t2, respectively). 3) A by-product of partitioning (and, thus, of the membership property) is query parallelism. That is, the above temporal joins can be performed faster if the corresponding set partitions are joined in parallel [22].

Conventional membership queries over a set of objects have been addressed through hashing. Hashing can be applied either as a main-memory scheme (all data fits in main memory [7], [12]) or in database systems (where data is stored on disks [20]). Its latter form is called external hashing [10], [28]. For every object in the set, a hashing function computes the bucket where the object is stored. Each bucket is initially the size of a page. For this discussion, we assume that a page can hold B objects. Ideally, each distinct oid should be mapped to a separate bucket; however, this is unrealistic as the universe of oids is usually much larger than the number of buckets allocated by the hashing scheme. If a bucket cannot accommodate the oids mapped to it in the space assigned to the bucket, an overflow occurs. Overflows are dealt with in various ways, including rehashing (where another bucket is found using another hashing scheme) and/or chaining (creating a chain of pages under the overflowed bucket).

If no overflows are present, finding whether a given oid is in the set is trivial: Simply compute the hashing function for the oid and visit the appropriate bucket. With a perfect hashing scheme, membership queries are answered in O(1) steps (just one I/O to access the page of the bucket). If data is not known in advance, the worst-case query performance of hashing is large (linear in the size of the set, since a bad hashing function could map all oids to the same bucket). Nevertheless, practice has shown that, in the absence of pathological data, good hashing schemes with few overflows and expected O(1) query performance exist (assuming uniform data distribution). This is a major difference between hashing and indexing schemes. With a balanced index (like a B+ tree [5]), answering a membership query takes logarithmic time on the set size. For many applications (for example, in joins [31], [23]), a hashing scheme with expected constant query performance (one or two I/Os) is preferable to the logarithmic worst-case performance (four or more I/Os if the set is large) of a balanced index.

1. In the rest of this paper, the terms time and temporal refer to transaction-time [15].
2. Pure-snapshot, range-snapshot, and temporal membership queries are denoted as "*/-/S," "R/-/S," and "S/-/S" in [36].

G. Kollios is with the Computer Science Department, Boston University, Boston, MA 02115. E-mail: gkollios@cs.bu.edu.
V.J. Tsotras is with the Department of Computer Science and Engineering, University of California, Riverside, CA 92521. E-mail: tsotras@cs.ucr.edu.
Manuscript received 6 Feb. 1998; revised 8 Nov. 1999; accepted 20 Nov. 2000. For information on obtaining reprints of this article, please send e-mail to: tkde@computer.org, and reference IEEECS Log Number 106237.

Static hashing refers to schemes that use a predefined collection of buckets. This is inefficient if the set of objects is allowed to change (by adding or deleting objects from it). Instead, a dynamic hashing scheme has the property of allocating space proportional to the set size. Various external dynamic hashing schemes [19], [11], [9] have been proposed, among which linear hashing [20] appears to be the most commonly used.

Even if the set evolves, traditional dynamic hashing is ephemeral, i.e., it answers membership queries on the most current state of set S. The problem addressed in this paper is more general as it involves membership queries for any state that set S exhibited.

Assume that for every time t when S(t) changes (by adding/deleting objects) we could have a good ephemeral dynamic hashing scheme h(t) that maps efficiently (with few overflows) the oids of S(t) into a collection of buckets b(t). One straightforward solution to the temporal hashing problem would be to separately store each collection of buckets b(t) for each t. Answering a temporal membership query for oid k and time t then requires: 1) identifying h(t) (the hashing function used at t) and 2) applying h(t) on k and accessing the appropriate bucket of b(t). This would provide excellent query performance (as it uses a separate hashing scheme for each t), but the space requirements are prohibitively large. If n denotes the number of changes in S's evolution, storing each b(t) on the disk could easily create O((n/B)^2) space.

Instead, we propose a solution that uses space linear in n but still has good query performance. We call our solution partially persistent hashing as it reduces the original problem into a collection of partially persistent3 subproblems. We apply two approaches for solving these subproblems. The first approach "sees" each subproblem as an evolving subset of set S and is based on the Snapshot Index [37]. The second approach "sees" each subproblem as an evolving sublist. In both cases, the partially persistent hashing scheme "observes" and stores the evolution of the ephemeral hashing h(t). While the paper concentrates on linear hashing, the partial persistence methodology applies to other dynamic hashing schemes as well (for example, extendible hashing [11]).

We compare partially persistent hashing with three other approaches. The first approach uses a traditional dynamic hashing scheme to map all oids ever created during the evolution of S. However, this scheme does not distinguish among subsequent occurrences of the same oid k created by adding and deleting k many times. Because all such occurrences of oid k will be hashed on the same bucket, bucket sizes will increase and will eventually deteriorate performance (note that bucket reorganizations will not solve the problem; this was also observed in [1]). The second approach assumes that a B+ tree is used to index each S(t) and makes this B+ tree partially persistent [2], [39], [24]. The third approach sees each oid-interval combination as a multidimensional object and uses an R*-tree [14], [3] for storing it. Our experiments show that partially persistent hashing outperforms the other three competitors in membership query performance while having a minimal space overhead.

The partially persistent B+ tree [2], [39], [24] is technically the most interesting among the competitor approaches. Conceptually, many versions of an ephemeral B+ tree are embedded into a graph-like structure. This graph has many root nodes, each root providing access to subsequent versions of the ephemeral B+ tree, organized by time. Searching for oid k at time t involves: 1) identifying the root valid at t and 2) searching for k through the nodes of the B+ tree emanating from this root [2]. Part 1 is bounded by O(log_B(n/B)), while part 2 takes O(log_B(m/B)) I/Os, where m corresponds to the number of objects in S(t). In practice, however, the I/Os for part 1 can be eliminated if enough main memory is available. For example, [2] identifies the appropriate root using the root* structure, which is kept in main memory.

It was an open question whether an efficient temporal extension existed for hashing. The work presented here answers this question positively. Searching with the partially persistent hashing scheme also has two parts. The first part is again bounded by O(log_B(n/B)) and identifies the hashing function and the appropriate bucket valid at time t. The second part takes expected O(1) time (i.e., it is independent of m). In practice, with a large enough main memory, the I/Os from the logarithmic overhead can be eliminated. This work supports our conjecture [18] that transaction-time problems can be solved by taking an efficient solution for the nontemporal problem and making it partially persistent. Note that, as with ephemeral hashing, the worst-case query behavior of temporal hashing is linear (O(n/B)), but this is a rather pathological case.

The rest of the paper is organized as follows: Section 2 presents background and previous work. The description of partially persistent linear hashing appears in Section 3. Performance comparisons are presented in Section 4, while conclusions and open problems for further research appear in Section 5. Table 3 contains the main notations and symbols used throughout the paper, while the Appendix discusses the update and space analysis of the evolving-list approach. Table 4 and Table 5 provide further performance comparisons (to be discussed later).

3. A structure is called persistent if it can store and access its past states [8]. It is called partially persistent if the structure evolves by applying changes to its "most current" state.

2 BACKGROUND AND PREVIOUS WORK

A simple model of temporal evolution follows. Assume that time is discrete, described by the succession of nonnegative integers. Consider for simplicity an initially empty set S. When an object is added to S and until (if ever) it is deleted from S, it is called "alive."4 Each object is represented by a record that stores its oid and a semiclosed interval, or lifespan, of the form [start_time, end_time). An object added at t has its interval initiated as [t, now), where now is a variable representing the always increasing current time. If this object is later deleted, its end_time is updated to the object's deletion time. Since an object can be added and deleted many times, records with the same oid may exist, but with nonintersecting lifespan intervals.

4. Clearly, S(t) contains all the alive objects at t.

The performance of a temporal access method is characterized by three costs: space, query time, and update time. Update time is the time needed to update the index for a change that happened on set S. Clearly, an index that solves the range-snapshot query can also solve the pure-snapshot query. However, as indicated in [35], a method designed to address primarily the pure-snapshot query does not order incoming changes according to oid and can thus enjoy faster update time than a method designed for the range-snapshot query. Indeed, the Snapshot Index [37] solves the pure-snapshot query in O(log_B(n/B) + a/B) I/Os, using O(n/B) space and only O(1) update time per change (in the expected amortized sense [6], because a hashing scheme is employed). Here, a corresponds to the number of alive objects in S(t). This is the I/O-optimal solution for the pure-snapshot query [34].

For the range-snapshot query, three efficient methods exist: the TSB tree [24], the MVBT [2], and the MVAS structure [39]. Both the MVBT and the MVAS solve the range-snapshot query in O(log_B(n/B) + a/B) I/Os, using O(n/B) space and O(log_B(m/B)) update per change (in the amortized sense [6]). Here, m denotes the number of "alive" objects when an update takes place and a denotes the number of objects from S(t) with oids in the query-specified range. This is the I/O-optimal solution for the range-snapshot query. The MVAS structure improves the merge/split policies of the MVBT, thus resulting in a smaller constant in the space bound. The TSB tree is another efficient solution to the range-snapshot query. In practice, it is more space efficient than the MVBT (and MVAS), but it can guarantee worst-case query performance only when the set evolution is described by additions of new objects or updates on existing objects [30], [34].

To the best of our knowledge, the temporal membership query is a novel problem [34]. One could, of course, use a temporal access method designed for range-snapshot queries [24], [2], [39] and limit the query range to a single oid. There are, however, two disadvantages with this approach. First, since the range-snapshot access method orders objects by oid, a logarithmic update (O(log_B(m/B))) is needed. Second, because the range-snapshot method answers the membership query by traversing a tree path, the same logarithmic overhead (O(log_B(m/B))) appears in the query time. Nevertheless, in our experiments, we include a range-snapshot method for completeness. Since we assume that object deletions are frequent, we use the MVBT instead of the TSB. We use the MVBT instead of the MVAS since the MVBT disk-based code was readily available to the authors. As explained in the performance analysis section, the MVBT was also compared with a main-memory-based simulation of the MVAS, but there was no drastic performance difference between the two methods for the temporal membership query.

2.1 The Snapshot Index and Linear Hashing

The Snapshot Index [37] uses three basic structures: a balanced tree (time-tree) that indexes data pages by time, a pointer structure (access-forest) among the data pages, and an ephemeral hashing scheme. The time-tree and the access-forest enable fast query response, while the hashing scheme is used for fast updates. When an object is added to set S, its record is stored sequentially in a data page. When an object is deleted from S, its record is located and its end_time is updated. The main advantage of the Snapshot Index is that it clusters alive objects together. To achieve this, the method uses the notion of page usefulness: a page is called useful as long as it contains at least uB alive records (0 < u ≤ 1). When a page becomes nonuseful (due to object deletions), its alive records are copied to another page. The usefulness parameter u is a constant that tunes the performance of the Snapshot Index. Larger u implies better clustering and, thus, less query time, but more copying and, hence, more space. For more details, we refer to [37].

Linear Hashing (LH) is a dynamic hashing scheme that adjusts gracefully to object insertions and deletions. The scheme uses a collection of buckets that grows or shrinks one bucket at a time. Each bucket is initially assigned one page. Overflows are handled by creating a chain of pages under the overflowed bucket. The hashing function changes dynamically and, at any given instant, at most two hashing functions are used. More specifically, let U be the universe of oids and h0 : U -> {0, ..., M-1} be the initial hashing function that is used to load set S into M buckets (for example, h0(oid) = oid mod M). Insertions and deletions of oids are performed using h0 until a bucket overflow happens. When the first overflow occurs (in any bucket), the first bucket in the LH file, bucket 0, is split (rehashed) into two buckets: the original bucket 0 and a new bucket M, which is attached at the end of the LH file. The oids originally mapped into bucket 0 (using function h0) are now distributed between buckets 0 and M, using a new hashing function h1(oid).

Further bucket overflows will cause additional bucket splits in a linear bucket-number order. A variable p indicates which bucket is to be split next (initially p = 0). Conceptually, the value of p denotes which of the two hashing functions that may be enabled at any given time applies to which buckets. After enough overflows, all original M buckets will be split. This marks the end of a splitting round. Variable p is reset to 0 and a new splitting round is started. In general, round i starts with p = 0, buckets {0, ..., 2^i M - 1}, and hashing functions h_i(oid) and h_{i+1}(oid). The round ends when all 2^i M buckets are split. For our purposes, we use h_i(oid) = oid mod 2^i M. Functions h_j, j = 1, 2, ..., are called split functions of h0. A split function h_j has the properties: 1) h_j : U -> {0, ..., 2^j M - 1} and 2) for any oid, either h_j(oid) = h_{j-1}(oid) or h_j(oid) = h_{j-1}(oid) + 2^{j-1} M.

At any time, the hashing scheme is completely identified by the round number i and variable p. Given i and p, searching for oid k is performed using h_i if h_i(k) >= p; otherwise, h_{i+1} is used.

A split performed whenever a bucket overflow occurs is an uncontrolled split. Let l denote the LH file's load factor, i.e., l = |S|/(B|b|), where |S| is the current size (cardinality) of set S and |b| is the current number of buckets in the LH file. The load factor achieved by uncontrolled splits is usually between 50-70 percent, depending on the page size and the oid distribution [20]. In practice, a higher storage utilization is achieved if a split is triggered not by an overflow, but when the load factor l becomes greater than some upper threshold g (g ≤ 1) [20], [10], [13]. This is called a controlled split and can typically achieve 95 percent utilization. (Note that the split is now independent of bucket overflows. Other controlled schemes exist where a split is delayed until both the threshold condition holds and an overflow occurs.) Deletions in set S will cause the LH file to shrink. Buckets that have been split can be recombined (in reverse linear order) if the load factor falls below some lower threshold f (0 < f ≤ g). Practical values for f and g are 0.7 and 0.9, respectively. In our experiments and analysis, we use the controlled version of linear hashing.

3 PARTIALLY PERSISTENT LINEAR HASHING

We will apply the partially persistent methodology to an ephemeral linear hashing scheme. Section 3.1 describes the evolving-set approach, while the evolving-list approach is in Section 3.2.

3.1 The Evolving-Set Approach

Using partial persistence, the temporal hashing problem is reduced to a number of subproblems for which efficient solutions are known. Assume that an ephemeral linear hashing scheme (as the one described in Section 2) is used to map the objects of S(t). As S(t) evolves, the hashing scheme is a function of time, too. Let LH(t) denote the linear hashing file as it is at time t. The two basic parameters that identify LH(t) for each t are now time-dependent, namely, i(t) and p(t).

An interesting property of linear hashing is that buckets are reused; when round i+1 starts, it uses twice as many buckets as round i, but the first half of the bucket sequence is the same since new buckets are appended at the end of the file. Let b_total denote the longest sequence of buckets ever used during the evolution of S(t) and assume it reaches up to bucket 2^q M - 1. Clearly, b(t), the collection of buckets used at time t, is a prefix of b_total and i(t) ≤ q for all t.

Consider bucket b_j from the sequence b_total (0 ≤ j ≤ 2^q M - 1). State b_j(t) is the set of oids stored in this bucket at t. If any state b_j(t) can be reconstructed for each bucket b_j, a temporal membership query for oid k at time t is answered in two steps: 1) find the bucket b_j to which oid k would have been mapped by LH(t) and 2) search through the contents of b_j(t) until k is found.

The first step requires identifying LH(t). This is easily maintained if a record of the form <t, i(t), p(t)> is appended to an array H for those instants t where i(t) and/or p(t) change. Given any t, LH(t) is identified by simply locating t inside the time-ordered H with a logarithmic search (see discussion below).

The second step implies accessing b_j(t). By observing the evolution of bucket b_j, we note that its state changes as an evolving set by adding/deleting oids. Each such change can be timestamped with the time it occurred. At times, the ephemeral linear hashing scheme may apply a rehashing procedure that remaps the current contents of bucket b_j to bucket b_j and some new bucket b_r. Assume that such a rehashing occurred at time t0 and its result is a move of v oids from b_j to b_r. For the evolution of b_j (b_r), this rehashing is viewed as a deletion (respectively, addition) of the v oids at time t0, i.e., all such deletions (additions) are timestamped with the same time t0 for the corresponding object's evolution.

Fig. 1 shows an example of the ephemeral hashing scheme at two different time instants. For simplicity, M = 5 and B = 2. Fig. 2 shows the corresponding evolution of set S and the evolutions of three buckets (a "+"/"-" denotes an addition/deletion, respectively). Controlled splitting with an upper threshold g = 0.9 is assumed. At time t = 21, the addition of oid 8 triggers a split since, with |S| = 10 oids and M = 5, l > g (at the same time, oid 8 causes an overflow on bucket 3). Thus, the contents of bucket 0 are rehashed between bucket 0 and bucket 5. As a result, oid 15 is moved to bucket 5. For bucket 0's evolution, this change is considered a deletion at t = 21, but, for bucket 5, it is an addition of oid 15 at the same t = 21. The records stored in each bucket's history are also shown. For example, at t = 25, oid 10 is deleted from set S. This updates the lifespan of this oid's corresponding record in bucket 0's history from <10, [1, now)> to <10, [1, 25)>. If b_j(t) is available, searching through its contents for oid k is performed by a linear search. This process is lower bounded by O(|b_j(t)|/B) I/Os since at least these many pages are needed for storing b_j(t). (This is similar to traditional hashing, where a query about some oid is translated into searching the pages of a bucket; this search is also linear and continues until the oid is found or all the bucket's pages are searched.) Since every bucket b_j behaves like a set evolving over time, the Snapshot Index [37] can be used to store the evolution of each b_j and reconstruct any b_j(t) with effort proportional to |b_j(t)|/B I/Os.

The query performance of partially persistent linear hashing depends on the ephemeral hashing scheme used for S(t). Since the worst case of ephemeral hashing is linear, the worst case of partially persistent hashing is O(n/B). Nevertheless, a good ephemeral hashing scheme LH(t) would use expected O(1) I/Os for answering a membership query on S(t). This means that, on average, each bucket b_j(t) used for S(t) would be of limited size or, equivalently, |b_j(t)|/B corresponds to just a few (one or two) pages. In perspective, partially persistent linear hashing will reconstruct b_j(t) in O(|b_j(t)|/B) I/Os, which is expected O(1).

The size of array H depends on s, the maximum number of alive objects at any time instant (s = max_t |S(t)|). An entry is added to H for each controlled split, i.e., when the number of alive objects rises above the load factor threshold. If the scheme starts with a single bucket, at worst a new split happens after adding Bl new oids. The number of splits is therefore bounded by O(s/(Bl)) and table H has O(s/(B^2 l)) pages. Even if all changes are new oid additions, searching array H takes O(log_B(n/B)) I/Os.

Using LH(t), the appropriate bucket b_j is pinpointed and time t must be searched in the time-tree associated with this bucket. This search is bounded by O(log_B(n_j/B)), where n_j corresponds to the number of changes recorded in bucket b_j's history (at worst, n_j is O(n)).
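As a sketch of this two-step search (the data layout below is our own simplification; in the actual method, the second step is served by the Snapshot Index rather than a plain list scan), array H holds the <t, i(t), p(t)> records, a logarithmic search over H recovers LH(t), and each bucket keeps timestamped records from which b_j(t) is rebuilt:

```python
import bisect

# Time-ordered array H of <t, i(t), p(t)> records, appended whenever i or p changes
# (the two entries below are hypothetical and mimic the split at t = 21 of Fig. 1).
H = [(1, 0, 0), (21, 0, 1)]

def lh_at(t):
    """Identify LH(t): the (i, p) pair valid at time t, via a logarithmic search over H."""
    pos = bisect.bisect_right(H, (t, float("inf"), float("inf"))) - 1
    _, i, p = H[pos]
    return i, p

def bucket_of(oid, t, M=5):
    """Step 1: the bucket to which LH(t) would have mapped oid."""
    i, p = lh_at(t)
    b = oid % (2 ** i * M)
    return b if b >= p else oid % (2 ** (i + 1) * M)

# Step 2 (simplified): each bucket history is a list of (oid, start, end) records;
# b_j(t) consists of the oids whose lifespan [start, end) contains t.  In the actual
# method this state is rebuilt by the Snapshot Index in O(|b_j(t)|/B) I/Os; a plain
# scan stands in for it here.  end = None plays the role of "now".
bucket_history = {0: [(10, 1, 25), (15, 1, 21)], 5: [(15, 21, None)]}

def member(oid, t):
    j = bucket_of(oid, t)
    for rec_oid, start, end in bucket_history.get(j, []):
        if rec_oid == oid and start <= t < (float("inf") if end is None else end):
            return True
    return False
```

With the small history shown (modeled on the Fig. 1 and Fig. 2 example), member(15, 25) routes oid 15 to bucket 5 and finds it alive, while member(10, 25) reports that oid 10 is not in S(25) because its lifespan [1, 25) excludes t = 25.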
KOLLIOS AND TSOTRAS: HASHING METHODS FOR TEMPORAL DATA 5

In practice, we expect that: 1) s is small compared to n, i.e., array H will use few pages and can thus be stored in main memory, and 2) most of S's history will be recorded in the first 2^i M buckets (for some i); then, n_j behaves as O(n/(2^i M)) and searching b_j's time-tree is rather fast.

Fig. 1. Two instants in the evolution of an ephemeral hashing scheme. (a) Until time t = 20, no split has occurred and p = 0. (b) At t = 21, oid 8 is mapped to bucket 3 and causes a controlled split. Bucket 0 is rehashed using h1 and p = 1.

Fig. 2. The detailed evolution of set S until time t = 25. The addition of oid 8 to S at t = 21 causes the first split. Moving oid 15 from bucket 0 to bucket 5 is seen as a deletion and an addition, respectively.
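To make the per-bucket bookkeeping of Fig. 2 concrete, here is a small sketch in the spirit of (but much simpler than) the Snapshot Index: records carry semiclosed lifespans, a deletion only closes a lifespan, and when a data page drops below uB alive records its alive records are copied to the current acceptor page. Class and field names are our own.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NOW = None  # sentinel for the ever-increasing current time

@dataclass
class Record:
    oid: int
    start: int
    end: Optional[int] = NOW       # semiclosed lifespan [start, end)

@dataclass
class BucketHistory:
    """Simplified per-bucket history with usefulness-based copying (illustrative)."""
    B: int = 25                    # page capacity in records
    u: float = 0.3                 # usefulness parameter
    pages: List[List[Record]] = field(default_factory=lambda: [[]])

    def add(self, oid: int, t: int) -> None:
        if len(self.pages[-1]) == self.B:      # acceptor page full: start a new one
            self.pages.append([])
        self.pages[-1].append(Record(oid, t))

    def delete(self, oid: int, t: int) -> None:
        for page in self.pages:
            for rec in page:
                if rec.oid == oid and rec.end is NOW:
                    rec.end = t                # logical deletion: close the lifespan
                    self._maybe_copy(page, t)
                    return

    def _maybe_copy(self, page: List[Record], t: int) -> None:
        alive = [r for r in page if r.end is NOW]
        if page is not self.pages[-1] and len(alive) < self.u * self.B:
            # page became nonuseful: copy its alive records to the acceptor page;
            # closing the old copies at t avoids double counting in state_at()
            for r in alive:
                r.end = t
                self.add(r.oid, t)

    def state_at(self, t: int) -> set:
        """b_j(t): oids whose lifespan contains t (a plain scan stands in for the index)."""
        return {r.oid for page in self.pages for r in page
                if r.start <= t < (float("inf") if r.end is NOW else r.end)}
```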

3.1.1 Update and Space Analysis

It suffices to show that the partially persistent linear hashing scheme uses O(n/B) space. An O(1) amortized expected update processing per change can then be derived. Assume that the corresponding ephemeral linear hashing scheme LH(t) starts with M initial buckets and uses upper threshold g. Assume, for simplicity, that set S evolves by only adding oids (oid additions create new records and, hence, more copying; deletions do not create splits).

Clearly, array H satisfies the space bound. Next, we show that the space used by the bucket histories is also bounded by O(n/B). Since, for partially persistent linear hashing, each split creates artificial changes (the oids moved to a new bucket are deleted from the previous bucket and subsequently added to a new bucket as new records), it must be shown that the number of oid moves (copies) due to splits is still bounded by the number of real changes n. We first use two lemmas that upper and lower bound such oid moves during the first round of splits.

Lemma 1. For any N in 1 ≤ N ≤ M, N splits occur after at least (M + N - 1)Bg real object additions happen.

Proof. Just before the Nth split, we have exactly M + N - 1 buckets and, hence, the load is l = |S|/(B(M + N - 1)). By definition, for the next controlled split to occur, we must have l > g, which implies that |S| > (M + N - 1)Bg. □

Lemma 2. For any N in 1 ≤ N ≤ M, N splits can create at most (M + N - 1)B + 1 oid copies.

Proof. Assume that all buckets are initially empty. Observe that the number of oid copies made during a single round is at most equal to the number of real object additions. This is because, during one round, an oid is rehashed at most once. The number of real additions before a split increases with g (lower g triggers a split earlier, i.e., with fewer oids). Hence, the most real additions occur when g = 1. The Nth controlled split is then triggered when the load exceeds 1. Since there are already M + N - 1 buckets, this happens at the ((M + N - 1)B + 1)th object addition. □

The basic theorem about space and updating follows:

Theorem 1. Partially Persistent Linear Hashing uses space proportional to the total number of real changes and updating that is amortized expected O(1) per change.

Proof. As splits occur, linear hashing proceeds in rounds. In the first round, variable p starts from bucket 0 and, at the end of the round, it reaches bucket M - 1. At that point, 2M buckets are used and all copies (remappings) from oids of the first round have been created. Since M splits have occurred, Lemma 2 implies that there must have been at most (2M - 1)B + 1 oid moves (copies). By construction, these copies are placed in the last M buckets. For the next round, variable p will again start from bucket 0 and will extend to bucket 2M - 1. When p reaches bucket 2M - 1, there have been 2M new splits. These new splits imply that there must have been at most (4M - 1)B + 1 copies created from these additions. The copied oids from the first round are seen in the second round as regular oid additions. At most, each such oid can be copied once more during the second round (the original oids from which these copies were created in the first round cannot be copied again in the second round as they represent deleted records in their corresponding buckets). Hence, the maximum number of copies after the second round is ((2M - 1)B + 1) + ((4M - 1)B + 1). The total number of copies C_total created in all rounds satisfies:

C_total ≤ [{(2M - 1)B + 1} + {(2M - 1)B + 1 + (4M - 1)B + 1} + ...],

where each {} represents the copies per round. Using M ≥ 1 and B > 1, we get:

C_total ≤ [2MB + {4MB + 2MB} + ...] ≤ MB Σ_{k=0}^{i} Σ_{j=0}^{k} 2^j = MB [2(2^{i+1} - 1) - (i + 1)] < 2^{i+2} MB.   (1)

Lemma 1 implies that, after the first round, we have at least (2M - 1)Bg real oid additions. At the end of the second round, the total number of oid additions is at least (4M - 1)Bg. However, (2M - 1)Bg of these were inserted and counted for the first round. Thus, the second round introduces at least 2MBg new additions. Similarly, the third round introduces at least 4MBg new oid additions, etc. The number of real oid additions A_total, after all rounds, is lower bounded by:

A_total ≥ [{(2M - 1)Bg} + {2MBg} + {4MBg} + ...] ≥ [MBg + 2MBg + 4MBg + ...].

Hence,

A_total ≥ MBg Σ_{k=0}^{i} 2^k = (2^{i+1} - 1)MBg ≥ 2^i MBg.   (2)

From (1) and (2), it can be derived that there exists a positive constant const such that C_total/A_total < const. Since A_total is bounded by the total number of changes n, we have C_total = O(n). To prove that partially persistent linear hashing has O(1) expected amortized updating per change, we note that when a real change occurs, it is directed to the appropriate bucket, where the structures of the Snapshot Index are updated in O(1) expected time. Rehashings must be carefully examined. This is because a rehashing of a bucket is caused by a single real oid addition (the one that created the split), but it results in a "bunch" of copies made to a new bucket (at worst, the whole current contents of the rehashed bucket are sent to the new bucket). However, using the space bound, any sequence of n real changes can create at most O(n) copies (extra work) or, equivalently, O(1) amortized effort per real change. □
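To make the final step explicit (this is our reading of the bounds, not text from the paper), combining (1) and (2) gives one concrete choice for the constant:

```latex
\frac{C_{\mathrm{total}}}{A_{\mathrm{total}}}
  \;<\; \frac{2^{\,i+2}\,M B}{2^{\,i}\,M B\, g}
  \;=\; \frac{4}{g}.
```

For any fixed threshold g this is a constant (e.g., at most 4/g ≈ 4.5 copies per real addition for g = 0.9), which is the O(1) amortized overhead claimed by Theorem 1.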

3.1.2 Optimization Issues

Optimizing the performance of partially persistent linear hashing involves the load factor l of the ephemeral Linear Hashing and the usefulness parameter u of the Snapshot Index. Load l lies between thresholds f and g. Note that l is an average over time of l(t) = |S(t)|/(B|b(t)|), where |S(t)| and |b(t)| denote the size of the evolving set S and the number of buckets used at t (clearly, |b(t)| ≤ |b_total|). A good ephemeral linear hashing scheme will try to distribute the oids equally among buckets for each t. Hence, on average, the size (in oids) of each bucket b_j(t) will satisfy |b_j(t)| ≈ |S(t)|/|b(t)|.

One of the advantages of the Snapshot Index is the ability to tune its performance through the usefulness parameter u. The index will distribute the oids of each b_j(t) among a number of useful pages. Since each useful page (except the acceptor page) contains at least uB alive oids, the oids in b_j(t) will occupy at most |b_j(t)|/(uB) ≈ |S(t)|/(uB|b(t)|) = l(t)/u pages. Ideally, we would like the answer to a snapshot query to be contained in a single page. A good optimization choice is then to maintain l/u < 1. Conceptually, load l gives a measure of the size of a bucket ("alive" oids) at each time. These alive oids are stored in the data pages of the Snapshot Index. Recall that an artificial copy happens if the number of alive oids in a data page falls below uB. At that point, the remaining uB - 1 alive oids of this page are copied to a new page. By keeping l below u, we expect that the alive oids of the split page will be copied into a single page, which minimizes the number of I/Os needed for finding them.

On the other hand, the usefulness parameter u affects the space used by the Snapshot Index and, in turn, the overall space of the persistent hashing scheme. Higher values of u imply frequent time splits, i.e., more page copies and, thus, more space. Hence, it would be advantageous to keep u low, but this implies an even lower l. In turn, lower l would mean that the buckets of the ephemeral hashing are not fully utilized. This is because low l causes set S(t) to be distributed into more buckets, not all of which may be fully occupied.

At first, this requirement seems contradictory. However, for the purposes of partially persistent linear hashing, having low l is still acceptable. Recall that the low l applies to the ephemeral hashing scheme whose history the partially persistent hashing observes and accumulates. Even though at single time instants the b_j(t)'s may not be fully utilized, over the whole time evolution many object oids are mapped to the same bucket. What counts for the partially persistent scheme is the total number of changes accumulated per bucket. Because of bucket reuse, a bucket will gather many changes, creating a large history for the bucket and, thus, justifying its use in the partially persistent scheme. Our findings regarding optimization will be verified through the experimental results that appear in Section 4.

3.2 The Evolving-List Approach

The elements of bucket b_j(t) can also be viewed as an evolving list lb_j(t) of alive oids. Such an observation is consistent with the way buckets are searched in ephemeral hashing, i.e., linearly, as if a bucket's contents belong to a list. Accessing the bucket state b_j(t) is then reduced to reconstructing lb_j(t). Equivalently, the evolving list of oids should be made partially persistent.

We note that [39] presents a notion of persistent lists called C-lists. A C-list is a list structure made up of a collection of pages that contain temporal versions of data records clustered by oid. However, a C-list and a partially persistent list solve different problems: 1) A C-list addresses the query: "Given an oid and a time interval, find all versions of this oid during this interval." Instead, a partially persistent list finds all the oids that its ephemeral list had at a given time instant. 2) In a C-list, the various temporal versions of an oid must be clustered together. This implies that oids are also ordered by key. In contrast, a persistent list does not require ordering its keys, simply because its ephemeral list is not ordered. 3) In a C-list, updates and queries are performed by first using the MVAS structure. Updating/querying the partially persistent list always starts from the top of the list and proceeds through a linear search.

For the evolving-list approach, a list page now has two areas. The first area is used for storing oid records and its size is Br (Br is O(B)). The second area, of size B - Br (B - Br is also O(B)), accommodates an extra structure (array NT), which will be explained shortly. When the first oid k is added to bucket b_j at time t, a record <k, [t, now)> is appended to the first list page. Additional oid insertions will create record insertions in this page. When the Br part of the first page gets full of records, a new page is appended to the list, and so on. If oid k is deleted at t' from the bucket, its record in the list is found by a serial search among the list pages and its end_time is updated.

Good clustering is achieved if each page in lb_j(t) is a useful page. A page is useful if 1) it is full of records and contains at least uBr alive records (0 < u ≤ 1) or 2) it is the last list page. A nonuseful page is taken off the list lb_j(t) and its alive records are copied to the current last list page. If the last page in lb_j(t) does not have enough free space, a new last page is appended.

Reconstructing b_j(t) is equivalent to finding the pages in lb_j(t). We will use two array structures. Array FT_j(t) provides access to the first page in the list for any time t. Entries in FT_j have the form <time, pid>, where pid is the address of the first page. When the first page of the list changes, a new entry is appended to FT_j. This array can be implemented as a multilevel index since entries are added in increasing time order. Each remaining page of lb_j(t) is found by the second structure, which is an array of size B - Br implemented inside every list page. Let NT(A) be the array inside page A. This array is maintained for as long as the page is useful. Entries in NT(A) are also of the form <time, pid>, where pid corresponds to the address of the next list page after page A.

To answer a temporal membership query for oid k at time t, the appropriate bucket b_j, to which oid k would have been mapped by the hashing scheme at t, is found. This part is the same as in the evolving-set approach. To reconstruct bucket b_j(t), the first page in lb_j(t) is retrieved by searching for t in array FT_j. This search is O(log_B(n_j/B)). The remaining pages of lb_j(t) are found by locating t in the NT array of each subsequent list page. Since all pages in the list lb_j(t) are useful, the oids in b_j(t) are found in O(|b_j(t)|/B) I/Os. Hence, the space used by the evolving-list approach is O(n/B), while updating is O(|b_j(t)|/B) per update (for details, see the Appendix).
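The following sketch shows only the query path of the evolving-list approach (the update path that maintains FT and NT is omitted, and the page layout is our own simplification of the Br/NT split described above):

```python
import bisect

class ListPage:
    def __init__(self, Br=20):
        self.Br = Br               # record-area capacity (B_r); the rest of the page holds NT
        self.records = []          # (oid, start, end) tuples; end = None means alive
        self.NT = []               # <time, next_page> entries, kept while the page is useful

    def next_at(self, t):
        """The next list page that followed this one at time t (or None if it was the last)."""
        times = [time for time, _ in self.NT]
        pos = bisect.bisect_right(times, t) - 1
        return self.NT[pos][1] if pos >= 0 else None

class EvolvingListBucket:
    """Per-bucket partially persistent list: FT gives the first page at any time t."""
    def __init__(self):
        self.FT = []               # <time, first_page> entries, appended in time order

    def first_at(self, t):
        times = [time for time, _ in self.FT]
        pos = bisect.bisect_right(times, t) - 1
        return self.FT[pos][1] if pos >= 0 else None

    def state_at(self, t):
        """Reconstruct b_j(t) by walking the pages that formed lb_j(t)."""
        oids, page = set(), self.first_at(t)
        while page is not None:
            for oid, start, end in page.records:
                if start <= t < (end if end is not None else float("inf")):
                    oids.add(oid)
            page = page.next_at(t)
        return oids
```

Because every page on the reconstructed list was useful at time t, the walk touches O(|b_j(t)|/B) pages, mirroring the bound stated above.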

There are two differences between the evolving-list and the evolving-set approaches. First, updating using the Snapshot Index remains constant, while, in the evolving list, the whole current list may have to be searched in order to delete an oid. Second, reconstruction in the evolving list starts from the top of the list pages, while, in the evolving set, reconstruction starts from the last page of the bucket. This may affect the search for a given oid depending on whether it has been placed near the top or near the end of the bucket.

4 PERFORMANCE ANALYSIS

For Partially Persistent Linear Hashing, we implemented the set-evolution (PPLH-s) and the list-evolution (PPLH-l) approaches, using controlled splits. Both are compared against Linear Hashing (in particular, Atemporal linear hashing, which is discussed below), the MVBT, and two R*-tree implementations (Ri, which stores intervals in a 2-dimensional space, and Rp, which stores points in a 3-dimensional space). The partial persistence methodology is applicable to other hashing schemes as well. We implemented Partially Persistent Extendible Hashing using the set-evolution approach (PPEH-s) and uncontrolled splits and compared it with PPLH-s. Implementation details and the experimental setup are described in Section 4.1, the data workloads in Section 4.2, and our findings in Section 4.3.

4.1 Method Implementation - Experimental Setup

We set the size of a page to hold 25 object records (B = 25). In addition to the object's oid and lifespan, the record contains a pointer to the actual object (which may have additional attributes).

We first discuss Atemporal linear hashing (ALH). It should be clarified that ALH is not the ephemeral linear hashing whose evolution the partially persistent linear hashing observes and stores. Rather, it is a conventional linear hashing scheme that treats time as just another attribute. This scheme simply maps objects to buckets using the object oids. Consequently, it "sees" the different lifespans of the same oid as copies of the same oid. We implemented ALH using the scheme originally proposed by Litwin in [20]. For split functions, we used the hashing-by-division functions h_i(oid) = oid mod 2^i M with M = 10. For good space utilization, controlled splits were employed. The lower and upper thresholds (namely, f and g) had values 0.7 and 0.9, respectively.

Another approach for Atemporal hashing would be a scheme which uses a combination of the oid and the start_time or end_time attributes. However, this approach would still have the same problems as ALH for temporal membership queries. For example, hashing on start_time does not help for queries about time instants other than the start_times.

The two Partially Persistent Linear Hashing approaches (PPLH-s and PPLH-l) observe an ephemeral linear hashing LH(t) with controlled splits and load between f = 0.1 and g = 0.2. Array H is kept in main memory (i.e., no I/O cost is counted for accessing H). In our experiments, the size of H varied from 5 to 10 KB (which can easily fit in today's main memories). Unless otherwise noted, PPLH-s was implemented with u = 0.3 (other values for the usefulness parameter u were also examined). Since the entries in the time-tree associated with a bucket have half the object record size, each time-tree page can hold up to 50 entries.

In the PPLH-l implementation, Br holds 20 object records. Here, the usefulness parameter was set to u = 0.25. The remaining space in a list page (five object records) is used for the page's NT array. As with time-tree entries, NT array entries have half the size, i.e., each page can hold 10 NT entries. For the same reason, the pages of an FT_j array can hold up to 50 entries.

The Multiversion B-tree (MVBT) implementation uses buffering during updates; this buffer stores the pages in the path to the last update (an LRU buffer replacement policy is used). Such buffering can be very advantageous since updates are directed toward the most current B-tree, which is a small part of the whole MVBT structure. In our experiments, we set the buffer size to 10 pages. The original MVBT uses buffering for queries, too. For a fair comparison with the partially persistent methods, during queries we allow the MVBT to use a buffer that is as large as array H for that experiment. Recall that, during querying, the MVBT uses a root* structure. Even though root* can increase with time, it is small enough to fit in the above main-memory buffer. Thus, we do not count I/O accesses for searching root*.

As with the Snapshot Index, a page in the MVBT is "alive" as long as it has at least q alive records. If the number of alive records falls below q, this page is merged with a sibling (this is called a weak version underflow in [2]). At the other extreme, if a page already has B records (alive or not) and a new record is added, the page splits (called a page overflow in [2]). Both conditions need special handling. First, a time-split happens (which is like the copying procedure of the Snapshot Index). All alive records in the split page are copied to a new page. Then, the resulting new page should be incorporated in the structure. The MVBT requires that the number of alive records in the new page be between q + e and B - e, where e is a predetermined constant. Constant e works as a threshold that guarantees that the new page can be split or merged only after at least e new changes. Not all values for q, e, and B are possible as they must satisfy some constraints; for details, we refer to [2]. In our implementation, we set q = 5 and e = 4.

Another choice for a range-snapshot method would be the MVAS [39], whose updating policies lead to smaller space requirements than the MVBT. In addition, the MVAS does not use the root* structure. With either method, a temporal membership query is answered by following a single path whose length is proportional to the height of the structure. We had access to a main-memory simulation of the MVAS and compared it with the disk-based MVBT implementation. We were able to perform up to n = 120K updates with the simulated MVAS code, using (a shorter version of) the uniform-10 workloads (see Table 1 in Section 4.3). As expected, the simulated MVAS used less space (6,209 pages against 6,961 pages for the MVBT). However, the difference was not enough to affect the structures' heights. In particular, the simulated MVAS had height six, while the MVBT needed two levels in the root* and each tree underneath had a height between three and four.
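As a small illustration of the MVBT page conditions just described (a simplified decision helper using the values B = 25, q = 5, e = 4 from our setup; it is not the MVBT code itself):

```python
def mvbt_action_on_insert(total_records, alive_records, B=25, q=5, e=4):
    """What the MVBT does when a record is inserted into a page (simplified)."""
    if total_records < B:
        return "insert in place"
    # page overflow: a time-split happens first, copying the alive records to a new page
    if alive_records > B - e:
        return "time-split, then key-split (too many alive records for one page)"
    if alive_records < q + e:
        return "time-split, then merge with a sibling (too few alive records)"
    return "time-split only"

def mvbt_action_on_delete(alive_records, q=5):
    """Weak version underflow check after a (logical) deletion (simplified)."""
    return "merge with a sibling" if alive_records < q else "no structural change"
```

For instance, a full page with 22 alive records (more than B - e = 21) is time-split and then key-split, while a full page with only 7 alive records (fewer than q + e = 9) is time-split and then merged.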

TABLE 1
Parameters of the Uniform Workloads with Varying Average Number of Lifespans per Oid

The Ri implementation assigns to each oid its lifespan interval. One dimension is used for the oid and one for the interval. When a new oid k is added to set S at time t, a record <k, [t, now), ptr> is added to an R*-tree data page. If oid k is deleted at t', the record is updated to <k, [t, t'), ptr>. Note that now is stored in the R*-tree as some large number (larger than the maximum evolution time). Directory pages in the R*-tree include one more attribute per record for representing an oid range. The Rp implementation has a similar format for data pages, but it assigns separate dimensions to the start_time and the end_time of the object's lifespan interval. Hence, a directory page record has seven attributes (two for each of the oids, start_time, and end_time, and one for the pointer). During updating, both R*-tree implementations used a buffer (10 pages) to keep the pages in the path leading to the last update. A buffer as large as the size of array H was again used during the query phase. To further optimize the R*-tree approaches toward temporal membership queries, we forced the trees to cluster data first based on the oids and then on the time attributes.

Another popular external dynamic hashing scheme is Extendible Hashing [11], which differs from linear hashing in two ways. First, a directory structure is used to access the buckets. Second, an overflow is addressed by splitting the bucket that overflowed. Note that traditional extendible hashing uses uncontrolled splitting. To implement Partially Persistent Extendible Hashing, one needs to keep the directory history as well as the history of each bucket. The history of each bucket was kept using the evolving-set approach (PPEH-s). In our experiments, the directory history occupied between 7 and 18 KB, so we kept it in main memory. Since uncontrolled splitting is used, most buckets will be full of records before being split. This implies that the alive records in a bucket are close to one page. Consequently, to get better query performance (at the expense of higher space overhead), the history of such buckets was implemented using Snapshot Indices with a higher usefulness parameter (u = 0.5).

4.2 Workloads

Various workloads were used for the comparisons. Each workload contains an evolution of data set S and temporal membership queries on this evolution. More specifically, a workload is defined by a triplet W = (U, E, Q), where U is the universe of the oids (the set of unique oids that appeared in the evolution of set S; clearly, s ≤ |U|), E is the evolution of set S, and Q = {Q_1, ..., Q_|U|} is a collection of queries, where Q_k is the set of queries that corresponds to oid k.

Each evolution starts at time 1 and finishes at time MAXTIME. Changes in a given evolution were first generated per object oid and then merged. First, for each object with oid k, the number n_k of different lifespans for this object in the evolution was chosen. The choice of n_k was made using a specific random distribution function (namely, Uniform, Exponential, Step, or Normal) whose details are described in the next section. The start_times of the lifespans of oid k were generated by randomly picking n_k different starting points in the set {1, ..., MAXTIME}. The end_time of each lifespan was chosen uniformly between the start_time of this lifespan and the start_time of the next lifespan of oid k (since the lifespans of each oid k are disjoint). Finally, the whole evolution E for set S was created by merging the evolutions of all objects.

For another "mix" of lifespans, we also created an evolution that picks the start_times and the lengths of the lifespans using Poisson distributions; we call it the Poisson evolution.

A temporal membership query in query set Q is specified by a tuple (oid, t). The number of queries |Q_k| for every object with oid k was chosen randomly between 10 and 20; thus, on average, |Q_k| ≈ 15. To form the (k, t) query tuples, the corresponding time instants t were selected using a uniform distribution from the set {1, ..., MAXTIME}. MAXTIME is set to 50,000 for all workloads. The value used in the R-trees for now was 100,000.
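The workload generation just described can be summarized by the following sketch (our own paraphrase of the procedure; the Uniform choice of n_k is shown, and the other distributions only change how n_k is drawn):

```python
import random

MAXTIME = 50_000

def generate_oid_evolution(oid, n_k):
    """Lifespans for one oid: n_k random start_times; each end_time is drawn uniformly
    between the lifespan's start and the next lifespan's start (keeping lifespans disjoint)."""
    starts = sorted(random.sample(range(1, MAXTIME + 1), n_k))
    lifespans = []
    for idx, start in enumerate(starts):
        upper = starts[idx + 1] if idx + 1 < len(starts) else MAXTIME + 1
        end = random.randint(start + 1, upper)
        lifespans.append((oid, start, end))
    return lifespans

def generate_workload(num_oids=8_000, low=20, high=40):
    """Uniform-30-style workload: evolution E (timestamped +/- changes) and query set Q."""
    evolution, queries = [], []
    for oid in range(num_oids):
        n_k = random.randint(low, high)                 # number of lifespans for this oid
        evolution.extend(generate_oid_evolution(oid, n_k))
        for _ in range(random.randint(10, 20)):         # 10-20 temporal membership queries per oid
            queries.append((oid, random.randint(1, MAXTIME)))
    # merge the per-oid changes into one global evolution ordered by time
    changes = sorted([(start, "+", oid) for oid, start, _ in evolution] +
                     [(end, "-", oid) for oid, _, end in evolution])
    return changes, queries
```

Calling generate_workload() with the defaults produces a Uniform-30-style evolution of roughly half a million changes and about 120K queries, comparable to the workload sizes reported below.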

Each workload is described by the distribution used to generate the object lifespans, the number of different oids, the total number of changes in the evolution n (object additions and deletions), the total number of object additions NB, and the total number of queries. We use N̄B to denote the average number of lifespans per oid, i.e., N̄B = NB/|U|.

4.3 Experiments

First, the behavior of all implementations was tested using a basic Uniform workload. The number of lifespans per object follows a uniform distribution between 20 and 40. The total number of distinct oids was |U| = 8,000, the number of real changes was n = 466,854, and there were NB = 237,606 object additions. Hence, the average number of lifespans per oid was N̄B ≈ 30 (we refer to this workload as Uniform-30). The number of queries was 115,878.

Fig. 3a presents the average number of pages accessed per query by all methods. The PPLH methods have the best performance, about two pages per query. The ALH approach uses more query I/O (about 1.5 times more in this example) because of the larger buckets it creates. The MVBT also uses more I/O (about 1.75 times more) than the PPLH approaches since a tree path is traversed per query. The Ri uses more I/Os per query than the MVBT (about 11.5 I/Os), mainly due to tree node overlapping and larger tree height (its height relates to the total number of oid lifespans, while the MVBT's height corresponds to the number of alive oids at the time specified by the query). The Rp tree has the worst query performance (an average of 28.3 I/Os per query). The performance of the R-tree methods has been truncated in Fig. 3a to fit the graph. While using a separate dimension for the two endpoints of a lifespan interval allows for better clustering (see also the space usage in Fig. 3c), it makes it more difficult to check whether an interval contains a query time instant.

Fig. 3b shows the average number of I/Os per update. The best update performance was given by the PPLH-s method. In PPLH-l, the NT array implementation inside each page limits the actual page area assigned for storing oids and, thus, increases the number of pages used per bucket. The MVBT update is longer than PPLH-s's since the MVBT traverses a tree for each update (instead of quickly finding the location of the updated element through hashing). The update cost of Ri follows; it is larger than the MVBT's since the size of the tree traversed is related to all oid lifespans (while the size of the MVBT structure traversed is related to the number of alive oids at the time of the update). The Rp tree uses larger update processing than the Ri because of the overhead of storing an interval as two points. ALH had the worst update processing since all lifespans with the same oid are thrown into the same bucket, creating large buckets that must be searched serially.

The space consumed by each method appears in Fig. 3c. The ALH approach uses the smallest space since it stores a single record per oid lifespan and uses "controlled" splits with high utilization. The PPLH-s method also has very good space utilization, very close to ALH. The R-tree methods follow; Rp uses slightly less space than Ri because paginating intervals (putting them into bounding rectangles) is more demanding than with points. Note that, similarly to ALH, both R* methods use a single record per oid lifespan; the additional space is mainly because the average R-tree page utilization is about 65 percent. PPLH-l uses more space than PPLH-s because the NT array implementation reduces page utilization. The MVBT has the largest space requirements, about twice the space of the ALH and PPLH-s methods.

To consider the effect of the lifespan distribution, all approaches were compared using five additional workloads (called the Exponential, Step, Normal, Poisson, and Uniform-consecutive workloads). These workloads had the same number of distinct oids (|U| = 8,000), the same number of queries (115,878), and similar n (≈0.5M) and N̄B (≈30) parameters. The Exponential workload generated the n_k lifespans per oid using an exponential distribution with probability density function f(x) = λ exp(-λx) and mean 1/λ = 30. The total number of changes was n = 487,774, the total number of object additions was NB = 245,562, and N̄B = 30.7. In the Step workload, the number of lifespans per oid follows a step function. The first 500 oids have four lifespans, the next 500 have eight lifespans, and so on, i.e., for every 500 oids, the number of lifespans advances by four. In this workload, we had n = 540,425, NB = 272,064, and N̄B = 34. The Normal workload used a normal distribution with μ = 30 and σ² = 25. Here, the parameters were n = 470,485, NB = 237,043, and N̄B = 29.6.

For the Poisson workload, the first lifespan for every oid was generated randomly between time instants 1 and 500. The length of a lifespan was generated using a Poisson distribution with mean 1,100. Each next start time for a given oid was also generated by a Poisson distribution with mean value 500. For this workload, we had n = 498,914, NB = 251,404, and N̄B = 31. The main characteristic of the Poisson workload is that the number of alive oids over time can vary from a very small number to a large proportion of |U|, i.e., there are time instants where the number of alive oids is some hundreds and other time instants where almost all distinct oids are alive.

The special characteristic of the Uniform-consecutive workload is that it contains objects with multiple but consecutive lifespans. This scenario occurs when objects are updated frequently during their lifetime. Each update is seen as the deletion of the object followed by the insertion of the updated object at the same time. Since the object retains its oid through updates, this process creates consecutive lifespans for the same object (the end of one lifespan is the start of the next lifespan). This workload was based on the Uniform-30 workload and had n = 468,715, NB = 236,155, and N̄B = 30. An object has a single lifetime which is cut into consecutive lifespans. The start_times of an object's lifespans are chosen uniformly.

Fig. 4 presents the query, update, and space performance under the new workloads. The results resemble those for the Uniform-30 workload. (For brevity, we have excluded the R-tree-based methods from the remaining discussion as they consistently had much worse query performance; the interested reader can find the detailed performance in [17].) As before, the PPLH-s approach has the best overall performance, using slightly more space than the "minimal" space of ALH. PPLH-l has the same query performance as PPLH-s, but uses more updating and space.
Fig. 3. (a) Query, (b) update, and (c) space performance for all implementations on a uniform workload with 8K oids, n ≈ 0.5M, and NB ≈ 30.

Fig. 4. (a) Query, (b) update, and (c) space performance for the ALH, PPLH-s, PPLH-l, and MVBT methods using the exponential, step, normal, and Poisson workloads with 8K oids, n ≈ 0.5M, and NB ≈ 30.
Note that in Fig. 4c the space of the MVBT is truncated (the MVBT used about 26K, 28K, 25K, 35K, and 28K pages for the respective workloads). We also tried the consecutive workloads for the exponential, step, normal, and Poisson distributions; the comparative behavior of all methods in the consecutive workloads was similar and, thus, is not presented.

The effect of the number of lifespans per oid was tested using 10 uniform workloads with varying average number of lifespans NB (from 10 to 1,000). All used |U| = 8,000 different oids and the same number of queries (≈ 115K). The other parameters are shown in Table 1. Fig. 5 shows the results for small to medium NB values. The query performance of atemporal hashing deteriorates as NB increases since buckets become larger (Fig. 5a). The PPLH-s, PPLH-l, and MVBT methods have a query performance that is independent of NB. PPLH-s outperforms all methods in update performance (Fig. 5b). The updating of PPLH-s, PPLH-l, and MVBT is basically independent of NB; since increased NB implies larger bucket sizes, the updating of ALH increases. The space of all methods increases with NB as there are more changes per evolution (Table 1). The ALH has the least space, followed by the PPLH-s; the MVBT has the steepest space increase (for NB values of 80 and 100, the MVBT used 68K and 84.5K pages, respectively).

The methods performed similarly for high NB values, namely, 300, 500, and 1,000 (see Table 4). The query and update performance of PPLH-s, PPLH-l, and MVBT remained basically unaffected by NB. However, ATH was drastically affected, as these workloads had many more occurrences per oid. The MVBT followed a steeper space increase than ATH, PPLH-s, and PPLH-l.

The effect of the number of distinct oids per evolution was examined by considering four uniform workloads. The number of distinct oids |U| was 5,000, 8,000, 12,000, and 16,000, respectively. All workloads had a similar average number of lifespans per distinct oid (NB ≈ 30); the other parameters appear in Table 2. The results are depicted in Fig. 6. The query performance of the hashing methods (PPLH-s, PPLH-l, and ALH) is basically independent of |U|, with PPLH-s and PPLH-l having the lowest query I/O. In contrast, it increases for the MVBT; the increase is because more oids are stored in these tree structures, thus increasing the structure's height. Similar observations hold for the update performance (i.e., the hashing-based methods have updating that is basically independent of |U|, while the updating of the tree-based methods tends to increase with |U|). Finally, the space of all methods increases because n increases (Table 2).

From the above experiments, the PPLH-s method has the most competitive performance among all solutions. Its performance can be further optimized through the usefulness parameter u. Fig. 7 shows the results for the basic Uniform-30 workload (|U| = 8,000, n = 466,854, 237,606 lifespans, and NB ≈ 30), but with different values of u. As expected, the best query performance occurs if u is greater than the maximum load of the observed ephemeral hashing. For these experiments, the maximum load was 0.2. As shown in Fig. 7a, the query time is minimized after u = 0.3. The update cost is similarly minimized (Fig. 7b) for u values above 0.2 since, after that point, the alive oids are compactly kept in a few pages that can be updated more easily. Fig. 7c shows the space of PPLH-s. For u values below the maximum load, the alive oids are distributed among more data pages; hence, when such a page becomes nonuseful it contains fewer alive oids and, thus, fewer copies are made, resulting in smaller space consumption. Using this optimization, the space of PPLH-s can be made similar to that of the ALH at the expense of some increase in query/update performance.
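To make the role of the usefulness parameter more concrete, the sketch below illustrates, under simplified assumptions and with hypothetical names (DataPage, delete_oid, and the constant B are ours, not the paper's), the copying rule discussed above: a data page of capacity B stays useful while it holds at least u·B alive records, and once a deletion drops it below that threshold its alive records are copied to a fresh page. With a smaller u, pages are closed later and carry fewer alive records when they close, which matches the space behavior observed in Fig. 7c.

# A minimal sketch of the usefulness rule (hypothetical names; real page layout,
# buckets, and the Snapshot Index bookkeeping are omitted).

B = 50  # assumed page capacity in records

class DataPage:
    def __init__(self):
        self.alive = set()     # oids whose current record lives on this page
        self.useful = True     # a page is useful while it holds >= u*B alive records

def delete_oid(page, oid, u):
    """Mark oid as deleted; close the page if it drops below the usefulness threshold.

    Returns the new page that receives the copied alive records, or None."""
    page.alive.discard(oid)
    if page.useful and len(page.alive) < u * B:
        page.useful = False                # the page turns nonuseful at this time
        new_page = DataPage()
        new_page.alive = set(page.alive)   # its alive records are copied forward
        return new_page
    return None

# Example: with u = 0.3 and B = 50, a page turns nonuseful once fewer than 15 of
# its records are still alive; with u = 0.1 it keeps serving until fewer than 5
# are alive, so at most 4 records are copied when it closes.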
Finally, we compared the PPLH-s with partially persistent extendible hashing (PPEH-s), a method that also uses the evolving-set approach but with uncontrolled splits. With uncontrolled splits, a bucket bj is split only when it overflows. Thus, the "observed" ephemeral hashing tends to use a smaller number of buckets, each larger in size. As it turns out, this policy has serious consequences. First, for the Snapshot Index that keeps the history of bucket bj, the state bj(t) is a full page. This means that the Snapshot Index will distribute bj(t) into more physical pages in the bucket's history; consequently, to reconstruct bj(t), more pages will be accessed. Second, the Snapshot Index will be copying larger alive states and, thus, occupy more space. We ran the same set of experiments for PPEH-s. In summary, PPEH-s consistently required more query I/Os, more update I/Os, and more space than PPLH-s (Table 5 presents some of our experimental results).
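The two split policies compared above can be summarized with the following sketch (simplified, hypothetical code; PAGE_CAP and the function names are ours): a controlled scheme splits the bucket designated by the round-robin split pointer whenever the overall load factor exceeds a chosen threshold, whereas an uncontrolled scheme splits a bucket only when that particular bucket overflows, so buckets tend to stay full.

# Split triggers only; directory maintenance and record movement are omitted.
PAGE_CAP = 50   # records that fit in one bucket page (assumed)

def should_split_controlled(total_records, num_buckets, max_load=0.9):
    # Controlled splits: split the next bucket in round-robin order as soon as
    # the observed load factor of the whole file exceeds max_load.
    return total_records > max_load * num_buckets * PAGE_CAP

def should_split_uncontrolled(bucket_record_count):
    # Uncontrolled splits (the PPEH-s policy above): split a bucket only when an
    # insertion overflows that bucket. Bucket states then tend to be full pages,
    # so reconstructing bj(t) touches more pages and copies more data.
    return bucket_record_count > PAGE_CAP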
5 CONCLUSIONS AND OPEN PROBLEMS

We addressed the problem of Temporal Hashing or, equivalently, how to support temporal membership queries over a time-evolving set S. An efficient solution, termed partially persistent hashing, was presented. By hashing oids to various buckets over time, partially persistent hashing reduces the temporal hashing problem to reconstructing previous bucket states. The paper applies the methodology to linear hashing with controlled splits. Two flavors of partially persistent linear hashing were presented, one based on an evolving-set abstraction (PPLH-s) and one on an evolving-list abstraction (PPLH-l). They have similar query and comparable space performance, but PPLH-s uses much less updating. Both methods were compared against straightforward approaches, namely, a traditional (atemporal) linear hashing scheme, the Multiversion B-Tree, and two R*-tree implementations. The experiments showed that PPLH-s has the most robust performance. Partially persistent hashing should be seen as an extension of traditional external dynamic hashing in a temporal environment. The methodology is general and can be applied to other ephemeral dynamic hashing schemes, such as extendible hashing.
An interesting extension is to find the history of a given oid that existed at time t. This query can be addressed if each version remembers the previous version of the same oid (through a pointer to that version). Note that this simple solution does not cluster versions of the same oid together, i.e., accessing a previous version may result in a separate I/O. A better approach is to involve the C-lists of [39]; however, this is beyond the scope of this paper.
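As a rough illustration of this extension (the class and function below are hypothetical, not part of the presented structures), each stored version can carry a pointer to the previous version of the same oid, and the history is then retrieved by walking the chain backwards; as noted above, without additional clustering each hop may cost a separate I/O.

# Hypothetical sketch: versions of one oid chained by back-pointers.
class Version:
    def __init__(self, oid, start_time, end_time=None, prev=None):
        self.oid = oid
        self.start_time = start_time
        self.end_time = end_time   # None while the version is still alive
        self.prev = prev           # previous version of the same oid (possibly on another page)

def history_of(latest):
    """Return the lifespans of an oid, newest first, by following back-pointers."""
    chain = []
    v = latest
    while v is not None:
        chain.append((v.start_time, v.end_time))
        v = v.prev                 # each hop may translate to one additional I/O
    return chain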
Traditionally, hashing has been used to speed up join computations. We are currently investigating the use of temporal hashing to speed up joins of the form: given two time-evolving sets X and Y, join X(t1) and Y(t2) (that is, find the objects common to both states).
Fig. 5. (a) Query, (b) update, and (c) space performance for the ALH, PPLH-s, PPLH-l, and MVBT methods using various uniform workloads with varying NB.

TABLE 2
Parameters of the Uniform Workloads with Varying |U|

TABLE 3
Notation

TABLE 4
Performance Comparison for High NB Values
This temporal join is different from the valid-time natural join examined in [33], [40]. There, oids were matched as long as their lifespan intervals intersected (i.e., the join predicate did not include any specific temporal predicate). Also, our environment is based on transaction time, where changes arrive in time order. Note that one can easily adapt the temporal hashing methods to answer interval membership queries ("find all objects that were in set S during interval I"). Thus, temporal hashing also applies to joins where the temporal predicate is generalized to an interval, i.e., find the objects that were both in set X during interval I1 and in set Y during interval I2. Of course, one has to worry about handling artificial oid copies created by the Snapshot Index or the evolving-list approach; however, methodologies have recently been proposed to avoid such copies [4]. It would be an interesting problem to compare indexed temporal joins based on temporal hashing with temporal joins based on the MVBT or MVAS indices.
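As a sketch of how such a membership-based temporal join could be organized (the interface below is assumed for illustration only; ToyTemporalHash, buckets_at, and member are not the paper's API), one state is reconstructed bucket by bucket and each of its oids is probed against the other structure with a temporal membership query.

# In-memory stand-in for a temporal hash structure (illustration only).
class ToyTemporalHash:
    def __init__(self, lifespans, num_buckets=4):
        # lifespans: dict mapping oid -> list of (start, end) intervals
        self.lifespans = lifespans
        self.num_buckets = num_buckets

    def member(self, oid, t):
        # temporal membership query: was oid in the set at time t?
        return any(s <= t < e for (s, e) in self.lifespans.get(oid, []))

    def buckets_at(self, t):
        # reconstruct the state at time t, grouped by hash bucket
        buckets = [[] for _ in range(self.num_buckets)]
        for oid, spans in self.lifespans.items():
            if any(s <= t < e for (s, e) in spans):
                buckets[hash(oid) % self.num_buckets].append(oid)
        return buckets

def temporal_join(X, Y, t1, t2):
    """Oids that were in X at time t1 and in Y at time t2."""
    return [oid
            for bucket in X.buckets_at(t1)
            for oid in bucket
            if Y.member(oid, t2)]

# Example usage:
# X = ToyTemporalHash({5: [(1, 10)], 8: [(2, 4)]})
# Y = ToyTemporalHash({5: [(3, 7)]})
# temporal_join(X, Y, t1=2, t2=4)  -> [5]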
Fig. 6. (a) Query, (b) update, and (c) space performance for the ALH, PPLH-s, PPLH-l, and MVBT methods using various uniform workloads with varying |U|.

Fig. 7. (a) Query, (b) update, and (c) space performance for PPLH-s on a uniform workload with varying values of the usefulness parameter u.
Finally, the discussion in this paper assumes temporal membership queries over a linear transaction-time evolution. It would be interesting to investigate hashing in branched transaction environments [25].
APPENDIX

UPDATE AND SPACE ANALYSIS OF THE EVOLVING-LIST APPROACH

The NT arrays facilitate the efficient location of the next list page. However, if, during the usefulness period of page A, its next page changes often, array NT(A) can become full. Assume this scenario happens at time t and let C be the page before page A. Page A is then artificially replaced in the list by page A'. The Br part of A' is a copy of the Br part of page A (thus, A' is useful), but NT(A') is empty. A new entry is added in NT(C) with the pid of A', and the first entry of NT(A') stores the pid of the page (if any) that was after page A at t. If all list pages up to page A had their NT arrays full, the above process of artificially turning useful pages into nonuseful ones can propagate all the way to the beginning of the list. If the first list page is replaced, array FTj is updated; however, this does not happen often.

Fig. 8 shows an example of how the arrays NT(·) and FTj are maintained. From each page, only the NT array is shown. In this example, B - Br = 4 entries. At time t6, array NT(A) fills up and an artificial copy of page A is created with array NT(A'). Array NT(C) is also updated to record the artificially created page.

The artificial creation of page A' provides faster query processing. If, instead of creating page A', NT(A) were allowed to grow beyond the B - Br area of page A (using additional pages), the last entry of NT(C) would still point to page A. Locating the next page after C at time t would then lead to page A, followed by a serial search among the pages of array NT(A). Clearly, this approach is inefficient if the page following page A changes many times. Using the artificial pages, the next useful list page for any time of interest is found with one I/O! This technique is a generalization of the backward updating technique used in [35].
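The replacement step can be sketched as follows (hypothetical names; page ids, disk layout, and the propagation toward the head of the list are simplified): each list page keeps its Br data part plus an NT array of (time, next page) entries, and when that array fills up the page is replaced by an artificial copy with the same data part and an empty NT array, while the predecessor records the copy.

# Sketch only: NT entries are (time, next_page) pairs; B - Br of them fit per page.
NT_CAPACITY = 4           # B - Br = 4, as in the Fig. 8 example

class ListPage:
    def __init__(self, records):
        self.records = list(records)   # the Br data part
        self.nt = []                   # NT array: [(time, next_page), ...]

def record_next(prev_page, page, new_next, t):
    """Record that, from time t on, new_next follows `page` in the useful list.

    Returns the page that now represents `page` (either `page` itself or its
    artificial copy)."""
    if len(page.nt) < NT_CAPACITY:
        page.nt.append((t, new_next))
        return page
    # NT array is full: replace `page` by an artificial copy A' that inherits the
    # same Br data part (so it is still useful) but has an empty NT array.
    copy = ListPage(page.records)
    copy.nt.append((t, new_next))      # first entry: the page following the copy at t
    if prev_page is not None:
        prev_page.nt.append((t, copy)) # the predecessor learns about the copy; if its
                                       # own NT array is full, the same replacement
                                       # would propagate toward the head of the list
    return copy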
In the evolving-list approach, copying occurs when a page turns nonuseful or when a page is artificially replaced. We first consider the evolving list without the artificial page replacements and show that the copying due to page usefulness is linear in the total number of real oid changes. We use an amortization argument that counts the number of real changes that occur to a page before the page becomes nonuseful.
TABLE 5
PPLH-s (Controlled Splits) Versus PPEH-s (Uncontrolled Splits)
A useful page can become nonuseful only after it has acquired Br records. A page that is not the last list page will turn nonuseful if an oid deletion brings the number of alive oids in this page below u·Br. The last list page can turn nonuseful only if it does not have u·Br alive records when it acquires Br records. However, a page never overflows; if it already has Br records, no new records can be added to this page.

In this environment, a page is appended to the end of the list in two cases: either a real oid insertion occurs and its record cannot fit in the previous last page, or a list page turned nonuseful and some of its (copied) alive records cannot fit in the previous last page. At best, filling a page with real oid insertions accounts for at least Br real changes charged to this page. At worst, a new page is filled only with copies of alive records (from various other pages that turned nonuseful). Copies do not correspond to real oid insertions, but such a page still needs at least Br - u·Br (which is O(Br)) real oid deletions before it becomes nonuseful. Hence, the space used by this copying is O(nj/B), where nj is the total number of real changes recorded in the evolution of bucket bj.
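The counting argument can be condensed into a single back-of-the-envelope bound (a restatement of the reasoning above in the paper's symbols, assuming Br is a fixed fraction of the page capacity B):

\[
\#\{\text{pages appended to } lb_j\} \;\le\; \frac{n_j}{\min(B_r,\; B_r - uB_r)} \;=\; \frac{n_j}{(1-u)\,B_r} \;=\; O\!\left(\frac{n_j}{B}\right),
\]

since a page filled by real insertions absorbs at least Br real changes, while a page filled only with copies still needs at least Br - u·Br real deletions before it can turn nonuseful, and each real change is charged to at most one page.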
Copying can also happen due to artificial page replacements. Note that a page is replaced only when its NT array becomes full. This implies that the page was not the last page in the list and, hence, its Br part is already full. It can be easily verified that the extra space used by the backward updating technique [35] is also O(nj/B). Note that the first replacement page is created only after B - Br (which is O(B)) pages in front of it (i.e., pages that followed it in the list) became nonuseful. Updating in the evolving-list approach is O(|bj(t)|/B) since the whole current list must be searched until a new oid is added or an existing oid is marked as deleted.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous referees for their many careful comments. We also thank B. Seeger for providing the R* and MVBT code and R.M. Verma for the MVAS simulation code. Part of this work was performed while V.J. Tsotras was on a sabbatical visit to UCLA; we thus thank Carlo Zaniolo for his comments and hospitality. This research was partially supported by US National Science Foundation grants IIS-9509527 and IIS-9907477 and by the US Department of Defense.
Fig. 8. (a) An example evolution for the useful pages of list lbj(t). (b) The corresponding FTj and NT arrays.
REFERENCES

[1] I. Ahn and R.T. Snodgrass, "Performance Evaluation of a Temporal Database Management System," Proc. ACM SIGMOD Conf., pp. 96-107, 1986.
[2] B. Becker, S. Gschwind, T. Ohler, B. Seeger, and P. Widmayer, "An Asymptotically Optimal Multiversion B-Tree," Very Large Data Bases J., vol. 5, no. 4, pp. 264-275, 1996.
[3] N. Beckmann, H.P. Kriegel, R. Schneider, and B. Seeger, "The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles," Proc. ACM SIGMOD Conf., pp. 322-331, 1990.
[4] J. Van den Bercken and B. Seeger, "Query Processing Techniques for Multiversion Access Methods," Proc. Very Large Databases Conf., pp. 168-179, 1996.
[5] D. Comer, "The Ubiquitous B-Tree," ACM Computing Surveys, vol. 11, no. 2, pp. 121-137, 1979.
[6] T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms. MIT Press and McGraw-Hill, 1990.
[7] M. Dietzfelbinger, A. Karlin, K. Mehlhorn, F. Meyer auf der Heide, H. Rohnert, and R.E. Tarjan, "Dynamic Perfect Hashing: Upper and Lower Bounds," Proc. 29th IEEE Foundations of Computer Science, pp. 524-531, 1988.
[8] J.R. Driscoll, N. Sarnak, D. Sleator, and R.E. Tarjan, "Making Data Structures Persistent," J. Computer and System Sciences, vol. 38, pp. 86-124, 1989.
[9] R.J. Enbody and H.C. Du, "Dynamic Hashing Schemes," ACM Computing Surveys, vol. 20, no. 2, pp. 85-113, 1988.
[10] R. Elmasri and S. Navathe, Fundamentals of Database Systems, third ed. Benjamin/Cummings, 2000.
[11] R. Fagin, J. Nievergelt, N. Pippenger, and H. Strong, "Extendible Hashing - A Fast Access Method for Dynamic Files," ACM Trans. Database Systems, vol. 4, no. 3, pp. 315-344, 1979.
[12] A. Fiat, M. Naor, J.P. Schmidt, and A. Siegel, "Nonoblivious Hashing," J. ACM, vol. 39, no. 4, pp. 764-782, 1992.
[13] M.J. Folk, B. Zoellick, and G. Riccardi, File Structures. Reading, Mass.: Addison-Wesley, 1998.
[14] A. Guttman, "R-Trees: A Dynamic Index Structure for Spatial Searching," Proc. ACM SIGMOD Conf., pp. 47-57, 1984.
[15] "A Consensus Glossary of Temporal Database Concepts," ACM SIGMOD Record, C.S. Jensen et al., eds., vol. 23, no. 1, pp. 52-64, 1994.
[16] C.S. Jensen and R.T. Snodgrass, "Temporal Data Management," IEEE Trans. Knowledge and Data Eng., vol. 11, no. 1, pp. 36-44, Jan./Feb. 1999.
[17] G. Kollios and V.J. Tsotras, "Hashing Methods for Temporal Data," Technical Report UCR-CS-98-01, Dept. of Computer Science, Univ. of California, Riverside (http://www.cs.ucr.edu/publications/tech_reports/).
[18] A. Kumar, V.J. Tsotras, and C. Faloutsos, "Designing Access Methods for Bitemporal Databases," IEEE Trans. Knowledge and Data Eng., vol. 10, no. 1, pp. 1-20, Jan./Feb. 1998.
[19] P. Larson, "Dynamic Hashing," BIT, vol. 18, pp. 184-201, 1978.
[20] W. Litwin, "Linear Hashing: A New Tool for File and Table Addressing," Proc. Very Large Databases Conf., pp. 212-223, 1980.
[21] W. Litwin, M.A. Neimat, and D.A. Schneider, "LH* - A Scalable, Distributed Data Structure," ACM Trans. Database Systems, vol. 21, no. 4, pp. 480-525, 1996.
[22] T.Y.C. Leung and R.R. Muntz, "Temporal Query Processing and Optimization in Multiprocessor Database Machines," Proc. Very Large Databases Conf., pp. 383-394, 1992.
[23] M.-L. Lo and C.V. Ravishankar, "Spatial Hash-Joins," Proc. ACM SIGMOD Conf., pp. 247-258, 1996.
[24] D. Lomet and B. Salzberg, "Access Methods for Multiversion Data," Proc. ACM SIGMOD Conf., pp. 315-324, 1989.
[25] G.M. Landau, J.P. Schmidt, and V.J. Tsotras, "On Historical Queries Along Multiple Lines of Time Evolution," Very Large Data Bases J., vol. 4, pp. 703-726, 1995.
[26] G. Ozsoyoglu and R.T. Snodgrass, "Temporal and Real-Time Databases: A Survey," IEEE Trans. Knowledge and Data Eng., vol. 7, no. 4, pp. 513-532, Aug. 1995.
[27] D. Pfoser and C.S. Jensen, "Incremental Join of Time-Oriented Data," Proc. 11th Int'l Conf. Scientific and Statistical Database Management, July 1999.
[28] R. Ramakrishnan, Database Management Systems, first ed. McGraw-Hill, 1997.
[29] B. Salzberg, File Structures. Englewood Cliffs, N.J.: Prentice Hall, 1988.
[30] B. Seeger, personal communication.
[31] D.A. Schneider and D.J. DeWitt, "Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines," Proc. Very Large Databases Conf., pp. 469-480, 1990.
[32] B. Salzberg and D. Lomet, "Branched and Temporal Index Structures," Technical Report NU-CCS-95-17, College of Computer Science, Northeastern Univ., 1995.
[33] M.D. Soo, R.T. Snodgrass, and C.S. Jensen, "Efficient Evaluation of the Valid-Time Natural Join," Proc. IEEE Conf. Data Eng., pp. 282-292, 1994.
[34] B. Salzberg and V.J. Tsotras, "A Comparison of Access Methods for Time-Evolving Data," ACM Computing Surveys, vol. 31, no. 2, June 1999.
[35] V.J. Tsotras, B. Gopinath, and G.W. Hart, "Efficient Management of Time-Evolving Databases," IEEE Trans. Knowledge and Data Eng., vol. 7, no. 4, pp. 591-608, Aug. 1995.
[36] V.J. Tsotras, C.S. Jensen, and R.T. Snodgrass, "An Extensible Notation for Spatiotemporal Index Queries," ACM SIGMOD Record, pp. 47-53, Mar. 1998.
[37] V.J. Tsotras and N. Kangelaris, "The Snapshot Index: An I/O-Optimal Access Method for Timeslice Queries," Information Systems, An Int'l J., vol. 20, no. 3, 1995.
[38] V.J. Tsotras and X.S. Wang, "Temporal Databases," Wiley Encyclopedia of Electrical and Electronics Eng., vol. 21, pp. 628-641, 1999.
[39] P.J. Varman and R.M. Verma, "An Efficient Multiversion Access Structure," IEEE Trans. Knowledge and Data Eng., vol. 9, no. 3, pp. 391-409, May/June 1997.
[40] T. Zurek, "Optimization of Partitioned Temporal Joins," PhD thesis, Dept. of Computer Science, Univ. of Edinburgh, Nov. 1997.

George Kollios received the diploma in electrical and computer engineering in 1995 from the National Technical University of Athens, Greece, and the MSc and PhD degrees in computer science from Polytechnic University, New York, in 1998 and 2000, respectively. He is currently an assistant professor in the Computer Science Department at Boston University, Boston, Massachusetts. His research interests include temporal and spatiotemporal indexing, index benchmarking, and data mining. He is a member of the ACM, the IEEE, and the IEEE Computer Society.

Vassilis J. Tsotras received the diploma in electrical engineering (1985) from the National Technical University of Athens, Greece, and the MSc, MPhil, and PhD degrees in electrical engineering from Columbia University in 1986, 1988, and 1991, respectively. He is currently an associate professor in the Department of Computer Science and Engineering at the University of California, Riverside. Before that, he was an associate professor of computer and information science at Polytechnic University, Brooklyn. During the summer of 1997, he was on a sabbatical visit in the Department of Computer Science, University of California, Los Angeles. He received the US National Science Foundation (NSF) Research Initiation Award in 1991. He has served as a program committee member at various database conferences, including SIGMOD, VLDB, ICDE, and EDBT. He was the program committee cochair of the Fifth Multimedia Information Systems (MIS '99) conference and is the general chair of the Seventh Symposium on Spatial and Temporal Databases (SSTD '01). His research has been supported by various grants from the US National Science Foundation, the US Defense Advanced Research Projects Agency, and the US Department of Defense. His interests include access methods, temporal and spatiotemporal databases, semistructured data, and data mining.