
Mnesia - An Industrial DBMS with Transactions, Distribution and a Logical Query Language


Hans Nilsson and Claes Wikström
{hans,klacke}@erix.ericsson.se

Computer Science Laboratory


Ericsson Telecom AB
S-125 26 Stockholm, Sweden

Abstract

Mnesia is a full DBMS built for the needs of the telecommunications
industry. It has distributed transactions, fast real-time lookups,
crash recovery and a logical query language. The DBMS is written in
the functional language Erlang, which is also the intended application
language. It already has real users developing products.

1 Introduction

Mnesia is a multiuser distributed DBMS (Database Management System)
made specially for industrial telecommunications applications written
in the symbolic language Erlang [AVWW95].

In telecommunications applications there are needs that differ from
the features provided by traditional DBMSs. The applications we now
see implemented in Erlang need a mixture of a broad range of features.
The most extreme among those are:

1. fast soft real-time key-value lookup
2. complicated non-real-time queries, mainly for operation and
   maintenance
3. distributed data due to distributed applications
4. high fault tolerance
5. dynamic reconfiguration
6. variable-length records

In the current first version of Mnesia we primarily use well-known
algorithms. In following versions we will begin to experiment with
alternative solutions.

Many telecommunications applications are structured in such a way that
there are two different sets of data: one which is crucial for
traffic, and one which contains maintenance data. The performance
requirements on the traffic data are severe. This means that:

1. The traffic data must be kept in memory at all times.
2. The traffic data must be organized in such a way that when an
   external stimulus arrives at the system, all data necessary to
   process that stimulus can be located very efficiently.

The maintenance data usually contains a copy of the traffic data,
although structured in a different way. Traditionally, the management
of traffic data in telecommunications applications has not been
performed by a general-purpose DBMS, but has rather been tailor-made
for each application. One of the design goals for Mnesia was to
provide a DBMS that meets the requirements on both the traffic data
and the maintenance data.

The rest of this paper is organized as follows: Section 2 is an Erlang
introduction and section 3 is an overview of Mnesia. Some usage of
Mnesia is given in section 4, and some notes on the future and
conclusions are given in sections 5 and 6.

2 Erlang

Erlang [AVWW95] is a symbolic functional language intended for large
real-time applications, mainly in the telecommunications area.

Functions may have many "clauses", and the correct one is selected by
pattern matching. Visibility is controlled by explicit export
declarations in modules.

Processes are lightweight and created explicitly. They communicate by
asynchronous message passing. Messages are received selectively using
pattern matching. When running on UNIX there is only one UNIX process,
namely the Erlang node. This node has its own internal process
handling, so Erlang processes should not be mistaken for UNIX
processes.

Erlang nodes can run on different types of machines and operating
systems, and Erlang processes can communicate across a network with
processes on other nodes. This does not influence how programs are
written, except when the message passing time is critical and the
physical transport medium is slow. No extra protocol specifications
are necessary. The
message passing is competitive with other network communication
methods [Wik94].

There are two kinds of support for error recovery in the language:
exceptions, and special exit messages propagated to linked processes
when a process dies. Those messages can be caught by the other
processes to activate some recovery plan, for instance restarting the
dying process.

Erlang programs communicate with hardware and with programs written in
other languages through ports. Ports behave similarly to processes in
that they can receive messages from processes and send messages back
to them. The difference from processes is that ports are connected
either to non-Erlang software or to hardware.

Example 1 The append function, which returns its two argument lists
appended, is in Erlang:

append([H|T], Z) -> [H|append(T,Z)];
append([], X) -> X.

3 Mnesia

3.1 Features

The DBMS Mnesia provides:

a. storage of full Erlang variable-length terms:
   (i) primary memory for fast access (with optional disk backup)
   (ii) disk-only for large data sets
b. a logic query language
c. a fast program interface for very simple queries
d. distributed transactions and query evaluation
e. crash recovery
f. replication of data on separate nodes
g. location transparency: applications are written without knowledge
   of where and how the tables are located
h. run-time schema alteration

These features are how Mnesia fulfills the requirements given in the
introduction. The mapping from the requirements to the features could
be pictured in a table:

Mnesia Requirement
feature 1 2 3 4 5 6
a(i) x x
a(ii) x
b x
c x
d x
e x
f x x x x
g x
h x
Requirement and feature mapping

The feature list above should be seen as a list of options for the
Mnesia user. For data that is access-time critical but need not
survive a crash, the user selects primary memory storage only. It is
possible to dump the contents of a primary memory table to disk at
regular intervals. Data which must survive a crash and has high
requirements on access time can be configured so that the data is
always kept in RAM and all write operations on the data are logged to
disk. The system will then checkpoint the data and truncate the log at
regular intervals. Data which is not performance critical can be kept
on disk only.

An advanced feature of Mnesia is that system activities such as
checkpointing, dumping tables and schema reconfigurations never stop
the system. For example, while a table is being copied to a remote
node, the table is still available for write operations to all
applications.

3.2 Data model and schema

Data is organized in tables. A table has a set of properties, for
example:

• on which node it is located
• where it is replicated (= a list of zero or more nodes)
• storage (RAM, RAM with Disc-copy or Disc-only)
• if the system should maintain any indices

The table definitions and the table properties are collected in a
schema which may be altered at run time.

Each individual field in a row in a table may be arbitrarily complex
and contain compound Erlang terms including object code, lambda
expressions and regular data. This is one of the features which makes
Mnesia suitable for many telecommunications problems where the single
most important requirement on the DBMS is that an individual lookup is
very fast. This property makes data modeling very flexible, and it
allows for a data model where all the data necessary for, e.g., a call
in a telephony switch is bundled in a single record.

3.3 Concurrency control and crash recovery

Operations on a database are grouped into transactions which are
distributed over the network. All completed transactions are logged,
as are the 'dirty' operations with which a time-critical application
can bypass the transaction system.

The lock manager uses a multitude of traditional techniques. Locking
is dynamic, and each lock is acquired by a transaction when needed.
Regular two-phase locking [EGLT76] is used and deadlock prevention is
traditional wait-die [RSL78]. The time stamps for the wait-die
algorithm are acquired from Lamport clocks [Lam78] maintained by the
transaction manager on each node. When a transaction is restarted, its
Lamport clock is maintained, thus
making Mnesia livelock free as well. The lock manager also implements
multi-granularity locking [GLPT75].

Traditional two-phase commit [Gre78] is used by the transaction
manager when a transaction is finished.

Since Mnesia is running on top of distributed Erlang, the
implementation is greatly simplified. In a distributed application
there are separate Erlang nodes running on (usually) different
machines. Erlang transparently takes care of the communication between
processes, possibly on separate nodes. Processes and nodes can easily
be started, supervised and stopped by processes on other nodes. This
makes many communication implementation problems disappear for Mnesia
as well as for applications.

Mnesia has (at least) one process on each participating node. Those
processes take care of updates, transactions, emergency
reconfiguration at node failure, as well as normal maintenance
reconfigurations.

The user API to the transaction system is straightforward and easy to
use. The programmer presents the transaction system with a lambda
expression and a closure, which is then executed by the transaction
system.

3.4 Queries

The query construction is integrated with the Erlang language by list
comprehensions. This removes the impedance mismatch that a solution
like Embedded SQL or some embedded kind of Datalog [Ull89] would give
with a functional language like Erlang. The use of list comprehensions
in connection with relational DBMSs has been known for a long time
[Bre88].

Example 2 The Erlang list comprehension:

[E.name ||
E <- table(employee),
D <- table(department),
E.name = D.boss,
E.sex = female]

extracts a list of the names of all female bosses in a database. This
is equivalent to the Datalog expression:

employee(Name, _, _, female, _, _),
department(_, _, _, Name)

We think that the straightforward translation between list
comprehensions and Datalog makes a nice and clean connection.

A logical query language was chosen for two reasons. The first is that
one of the authors has a background in logic programming, and the
second is that we will need to make an SQL interface in the future. We
think that with a logical language as target the SQL translation will
be somewhat simpler.

Cursors are available for the users. If the node of the querying user
process is running on a multi-CPU machine, the database query could
sometimes be parallelized and will always run in parallel with the
user process.

3.5 About the Implementation

All of Mnesia is written in Erlang except for a special key-value
dictionary using linear hashing [Lar88]. This dictionary was added to
the Erlang implementation to get fast key lookups.

The majority of our present users' queries are non-recursive.
Therefore we have chosen to separate the recursive and the
non-recursive parts of the queries. In the first version we make the
non-recursive query evaluation efficient, and in a later version the
recursion will be more efficient than today.

Minimal cycles in the query flow graph are collapsed into one node,
giving a non-cyclic graph corresponding to a non-recursive query with
explicit recursion operators. This non-recursive query is optimized
and compiled using standard methods like partial evaluation and goal
reordering guided by statistics about the database. Note that there is
no optimization of recursive parts (yet). They are simply put last in
the query.

The query is evaluated using the operator technique of relational
DBMSs [Gra93]. The recursive parts are evaluated by SLG [WC93, WCS93,
CW93] resolution. This is currently implemented by a naive
straightforward interpreter.

The operator technique maps extremely well onto Erlang processes and
messages; as a spin-off effect, queries are automatically parallelized
when run on a multi-CPU computer.

4 Projects

Mnesia is currently used in several development projects within
Ericsson. Some of these projects are small-scale prototype projects
and some are large-scale product projects. Mnesia is being used for
both traffic data and maintenance data in various ways, and the
applications include pure switch controllers, Intelligent Network
controllers as well as office/telephony/internet applications.

5 The Future

Primarily we will evaluate Mnesia with the applications that are now
being written. We will compare the performance with commercial DBMSs
as well as with modern research DBMSs. Preliminary performance
measurements are promising.

Mnesia has no consistency checking at updates today. We will therefore
add integrity constraint checking in the near future.

In the area of query processing there are two obvious continuations,
namely making an efficient SLG implementation and optimizing recursive
queries.

We will try other transaction and locking algorithms, especially
real-time algorithms.

Some of our users need an SQL interface. This will be provided, and
probably also an ODBC interface. This will enable external PC
applications to access the DBMS.
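As a concrete illustration of the list-comprehension query in Example 2
(section 3.4), the following is a minimal sketch in plain Erlang that
runs the same query over in-memory lists of records instead of Mnesia
tables. The module name boss_query, the function names and sample data,
and all record fields other than name, sex and boss are assumptions made
for illustration only. The record access syntax E#employee.name is that
of standard Erlang rather than the E.name notation used in Example 2.

```erlang
-module(boss_query).
-export([female_bosses/2, demo/0]).

%% Hypothetical record layouts. Only the field positions of name, sex
%% and boss follow the Datalog expression in Example 2 (six employee
%% fields, four department fields); the other field names are made up.
-record(employee, {name, emp_no, salary, sex, phone, room_no}).
-record(department, {id, dept_name, location, boss}).

%% Names of all employees who are female and are the boss of some
%% department. Each generator ranges over a plain list here, where
%% Example 2 ranges over a database table.
female_bosses(Employees, Departments) ->
    [E#employee.name ||
        E <- Employees,
        D <- Departments,
        E#employee.name =:= D#department.boss,
        E#employee.sex =:= female].

%% Small sample data set to exercise the query.
demo() ->
    Employees = [#employee{name = anna, sex = female},
                 #employee{name = bert, sex = male},
                 #employee{name = cara, sex = female}],
    Departments = [#department{dept_name = switching, boss = anna},
                   #department{dept_name = billing, boss = bert}],
    female_bosses(Employees, Departments).
```

Calling boss_query:demo() evaluates the query over the sample data. The
filters play the role of the shared variable Name and the constant
female in the Datalog form: a department whose boss is male contributes
nothing to the result, and neither does a female employee who is not a
boss.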
Another problem which has not been addressed is that of partitioned
networks. If the network between two Mnesia nodes fails but the nodes
themselves continue to operate, we have a problematic situation.

6 Experiences and Conclusions

By combining well-known algorithms with the symbolic high-level
language Erlang, it is possible to write a full industrial DBMS with
only a fraction of the manpower normally required. We have thereby
also obtained a good platform for research and some real-world
applications for benchmarking.

References

[AVWW95] Joe Armstrong, Robert Virding, Claes Wikström, and Mike
Williams. Concurrent Programming in ERLANG. Prentice Hall, second
edition, 1995.

[Bre88] P. T. Breuer. Applicative query languages. Technical report,
Cambridge University Engineering Dept, 1988.

[CW93] A. W. Chen and D. S. Warren. Query evaluation under the
well-founded semantics. In Proc. ACM SIGACT-SIGMOD-SIGART Symp. on
Principles of Database Sys., page 168, Washington, DC, May 1993.

[EGLT76] K. P. Eswaran, J. N. Gray, R. A. Lorie, and I. L. Traiger.
The notions of consistency and predicate locks in a database system.
Communications of the ACM, 19(11):624-633, November 1976.

[GLPT75] J. N. Gray, R. A. Lorie, G. R. Putzolu, and I. L. Traiger.
Granularity of locks and degrees of consistency in a shared database.
Research Report RJ1654, IBM, September 1975.

[Gra93] Goetz Graefe. Query evaluation techniques for large databases.
ACM Computing Surveys, 25(2):73-170, June 1993.

[Gre78] J. N. Gray. Notes on database operating systems. In Operating
Systems: An Advanced Course, Lecture Notes in Computer Science 60,
pages 393-481. Springer-Verlag, Berlin, 1978.

[Lam78] L. Lamport. Time, clocks and the ordering of events in a
distributed system. Communications of the ACM, 21(7):558-565, July
1978.

[Lar88] P.-Å. Larson. Dynamic hash tables. Communications of the ACM,
31(4), 1988.

[RSL78] D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis. System
level concurrency control for distributed databases. ACM Transactions
on Database Systems, 3(2):178-198, June 1978.

[Ull89] Jeffrey D. Ullman. Principles of Database and Knowledge-Base
Systems, volume 2. Computer Science Press, 1989. ISBN 0-7167-8162-X.

[WC93] David S. Warren and Weidong Chen. Towards effective evaluation
of general logic programs. Technical report, Computer Science
Department, SUNY at Stony Brook, 1993.

[WCS93] D. S. Warren, W. Chen, and T. Swift. Efficient computation of
queries under the well-founded semantics. Technical Report 93-CSE-33,
Southern Methodist University, 1993.

[Wik94] Claes Wikström. Distributed computing in Erlang. In First
International Symposium on Parallel Symbolic Computation, Sep 1994.
