The InterBase and Firebird Developer Magazine, Issue 2, 2005

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Contents
Credits
Contents Replicating and synchronizing
by Jonathan Neve
Interbase/FireBird databases using CopyCat Alexey Kovyazin,
Chief Editor
......................................................................... 27
Editor’s note Helen Borrie,
by Dmitri Kouzmenko
Using IBAnalyst
Editor
by Alexey Kovyazin
Rock around the blog
......................................................................... 31 Dmitri Kouzmenko
Editor
................................................................... 3 Readers feedback
Noel Cosgrave,
by Volker Rehn
Comments to “Temporary tables” article
Sub-editor
by Helen E. M. Borrie
Firebird conference
......................................................................... 35 Lev Tashchilin,
................................................................... 4 Designer
Miscellaneous Natalya
Oldest Active ......................................................................... 36
Polyanskaya,
Blog editor
by Helen E. M. Borrie
On Silly Questions and Idiotic Outcomes
................................................................... 5
Server internals
Cover story
by Ann. W. Harrison
Locking, Firebird, and the Lock Table
Subscribe now!
................................................................... 6
To receive future issues
Magazine CD notifications send email to
by Dmitri Kouzmenko and Alexey Kovyazin
Inside BLOBs
subscribe@ibdeveloper.com
................................................................. 11
TestBed
by Vlad Horsun, Alexey Kovyazin
Testing NO SAVEPOINT in InterBase 7.5.1
................................................................... 13
Development area
Best viewed with Acrobat Reader 7
by Vladimir Kotlyarevsky
Object-Oriented Development in RDBMS, Part 1 Donations
Download now!
.................................................................. 22
©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 2

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Editor’s note
Rock around the block
Rock around the blog Editor’s note

By Alexey Kovyazin
Dear readers, Tips&Tricks” by Dmitri Kouzmenko column “Oldest Active” by Helen but it is a considerable expense for
and Alexey Kovyazin. Borrie. No need to say more about its publisher. We must pay for arti-
I am glad to introduce you the author of “The Firebird Book”, just cles, editing and proofreading,
second issue of “The Inter- Object oriented development has read it! In this issue Helen looks at design, hosting, etc. In future we’ll
Base and Firebird Developer been a high-interest topic for many the phenomenon of support lists try to recover costs solely from
Magazine”. We received a lot years, and it is a still hot topic. The and their value to the Firebird com- advertising but, for these initial
of emails with kind words and article “Object-Oriented Develop- munity. She takes a swing at some issues, we need your support.
congratulations which have ment in RDBMS” explores the prac- of the unproductive things that list
helped us in creating this tical use of OOD principles in the posters do in this topic, titled “On See details here
issue. Thank you very much! design of InterBase or Firebird Silly Questions and Idiotic Out-
databases. http://ibdeveloper.com/donations
comes”.
In this issue Replication is a problem which On the Radar
The cover story is the newest faces every database developer Let’s blog again I’d like to introduce the several
Episode from Ann W. Harrison sooner or later. The article “Repli- Now it is a good time to say a few projects which will be launched in
“Locking, Firebird, and the Lock cating and synchronizing Inter- words about our new web-presen- the near future. We really need
Table”. The article covers the lock- Base/Firebird databases using tation. We've started a blog-style your feedback so please do not
ing issue from general principles to CopyCat” introduces the approach interface for our magazine – take a hesitate to place your comments in
specific advice, so everyone will used in the new CopyCat compo- look on www.ibdeveloper.com if blog or send email to us
find a bit that's interesting or inform- nent set and the CopyTiger tool. you haven't already discovered it.. (readers@ibdeveloper.com)
ative. Now you can make comments on
The last article by Dmitri Kouz-
We continue the topic of savepoints menko “Understanding Your Data- any article or message. In future
Magazine CD
internals with an article “Testing base with IBAnalyst” provides a we’ll publish all the materials relat-
ed to the PDF issues, along with spe- We will issue a CD with all 3 issues
NO SAVEPOINT in InterBase guide for better understanding of of “The InterBase and Firebird
7.5.1”. Along with interesting prac- how databases work and how tun- cial bonus articles and materials.
You can look out for previews of Developer Magazine” in Decem-
tical test results you will find a ing their parameters can optimize ber 2005 (yes, a Christmas CD!).
description of UNDO log workings performance and avoid bottle- articles, drafts and behind-the-
in InterBase 7.5.1. necks. scene community discussions.
Along with all issues in PDF and
Please feel welcome to blog with us!
searchable html-format you will find
We publish the small chapter, exclusive articles and bonus materi-
“Inside BLOBs” from the forthcom- “Oldest Active” column
Donations als. The CD will also include free
ing book “1000 InterBase&Firebird I am very glad to introduce the new versions of popular software relat-
The magazine is free for readers,

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Editor’s note
Rock around the block
Firebird conference
ed to InterBase and Firebird and ward for your feedback!

offer very attractive discounts for
dozens of the most popular prod-
ucts.
Invitation
As a footnote, I’d like to invite
Firebird
Conference
The price for CD is USD$9.99 plus all people who are in touch with
shipping. Firebird and InterBase to partici-
For details see pate in the life of the community.
www.ibdeveloper.com/magazine- The importance of a large, active
This issue of our magazine almost lunches on both conference days
cd community to any public project --
coincides with the third annual as well as mid-moring and mid-
and a DBMS with hundreds of thou-
Firebird World Conference, start- afternoon refreshments.
sands of users is certainly public! --
Paper version ing November 13 in Prague,
cannot be emphasised enough. It is Presentations include one on Fire-
of the magazine Czech Republic. This year's con-
the key to survival for such projects. bird's future development from
ference spans three nights, with a
In 2006 we intend to issue the first Read our magazine, ask your ques- the Firebird Project Coordinator,
tight programme of back-to-back
paper version of “The InterBase tions on forums, and leave com- Dmitry Yemanov and another
and parallel presentations over
and Firebird Developer Maga- ments in blog, just rock and roll with from Alex Peshkov, the architect
two days. With speakers in atten-
zine”. The paper version cannot be us! of the security changes coming in
dance from all across Europe and
free because of high production Firebird 2. Jim Starkey will be
coming the Americas presenting topics
costs. there, talking about Firebird Vul-
soon that range from specialised appli-
Sincerely, cation development techniques to can, and members of the SAS
We intend to publish 4 issues per
first-hand accounts of Firebird Institute team will talk about
year with a subscription price of Alexey Kovyazin
internals from the core develop- aspects of SAS's taking on Fire-
around USD$49. The issue volume
ers, this one promises to be the bird Vulcan as the back-end to its
will be approximately 64 pages. Chief Editor
best ever. illustrious statistical software.
In the paper version we plan to editor@ibdeveloper.com Most of the interface develop-
include the best articles from the Registration has been well under ment tools are on the menu,
online issues and, of course, exclu- way for some weeks now, including Oracle-mode Firebird,
sive articles and materials. although the time is now past for PHP, Delphi, Python and Java
early bird discounts on registra- (Jaybird).
To be or not be – this is the question tion fees. At the time of publish-
that only you can answer. If the ing, bookings were still being It is a "don't miss" for any Firebird
idea of subscribing to the paper taken for accommodation in the developer who can make it to
version appeals to you, please conference venue itself, Hotel Prague. Link to details and pro-
place a comment in our blog Olsanka, in downtown Prague. gramme either at
(http://ibdeveloper.com/paper- The hotel offers a variety of room http://firebird.sourceforge.net/i
version) or send email to configurations, including bed and ndex.php?op=konferenz or at the
readers@ibdeveloper.com with breakfast if wanted, at modest IBPhoenix website,
your pro and cons. We look for- tariffs. Registration includes http://www.ibphoenix.com.

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Oldest Active
On Silly Questions
and Idiotic Outcomes
On Silly Questions and Idiotic Outcomes Off-topic questions are another

source of noise and irritation in the
you do at least owe it to yourself
to know what you are talking
Firebird owes its existence to the community's support lists. Though it's impossible, when you post prob- lists. At the Firebird website, we about, and not to present your
all ancient history now, if it hadn't been for the esprit de corps among the lems with no descriptions. You have bent over backwards to problem as an attack on our soft-
"regulars" of the old InterBase support list hosted by mers.com in the waste our time. You waste your make it very clear which lists are ware. Whether intended or not, it
'nineties, Borland's shock-horror decision to kill off InterBase develop- time. We get frustrated because appropriate for which areas of comes over as a troll and you
ment at the end of 1999 would have been the end of it, for the English- you don't provide the facts; you interest. An off-topic posting is come across as a fool. There's a
speaking users at least. get frustrated because we can't easily forgiven if the poster polite- strong chance that your attitude
read your mind. And the list gets ly complies with a request from a will discourage anyone from
Firebird's support lists on Yahoo and Sourceforge were born out of the filled with noise. list regular to move the question to bothering with you at all. It's no-
same cataclysm of fatal fire and regeneration that caused the phoenix, "x" list. When these events instead win, sure, but it's also a reflection
drawn from the fire against extreme adversity, to take flight as Firebird If there's something I've learnt in become flame threads, or when of human nature. You reap what
in July 2000. Existing and new users, both of Firebird and of the rich more than 12 years of on-line the same people persistently reof- you sow.
selection of drivers and tools that spun off from and for it, depend heav- software support, it is that a prob- fend, they are an extreme burden
ily on the lists to get up to speed with the evolution of all of these soft- lem well described is a problem on everyone. In closing this little tirade, I just
ware products. solved. If a person applies time want to say that I hope I've struck
and thought to presenting a good In the "highly tedious" category a chord with those of our reader-
Newbies often observe that there is just nothing like our lists for any description of the problem, we have the kind of question that ship who struggle to get satisfac-
other software, be it open source or not. One of the things that I think chances are the solution will hop goes like this: "Subject: Bug in tion from their list postings. It will
makes the Firebird lists stand out from the crowd is that, once gripped by up and punch him on the nose. Firebird SQL. Body: "This state- be a rare question indeed that has
the power of the software, our users never leave the lists. Instead, they Even if it doesn't, good descrip- ment works in no answer. Those rarities, well-
stick around, learn hard and stay ready and willing to impart to others tions produce the fastest right MSSQL/Access/MySQL/insert presented, become excellent bug
what they have learnt. After nearly six years of this, across a dozen lists, answers from list contributors. any non-standard DBMS name reports. If you're not getting an
our concentrated mix of expertise and fidelity is hard to beat. Furthermore, those good descrip- here. It doesn't work in Firebird. answer, there's a high chance that
tions and their solutions form a Where should I report this bug?" you asked a silly question.
Now, doesn't all this make us all feel warm and fuzzy? For those of us
powerful resource in the archives, To be fair, George Bernard Shaw
in the hot core of this voluntary support infrastructure, the answer has to Helen E. M. Borrie
for others coming along behind. wasn't totally right when he said
be, unfortunately, "Not all the time!" There are some bugbears that can
"Ignorance is a sin." However, helebor@tpg.com.au
make the most patient person see red. I'm using this column to draw
attention to some of the worst.
First and worst is silly questions. You have all seen them. Perhaps
you even posted them! "Subject: Help! Body: Every time I try to
connect to Firebird I get the message 'Connection refused'. What
am I doing wrong?" What follows from that is a frustrating and
tedious game for responders. "What version of Firebird are you
using? Which server model? Which platform? Which tool?
What connection path? ('By connection path, we mean....').."
Everyone's time is valuable. Nobody has the luxury of being
able to sit around all day playing this game. If we weren't will-
ing to help, we wouldn't be there. But you make it hard, or even

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Cover story
News & Events

Firebird Worldwide
Author: Ann. W. Harrison
Conference 2005 One of the best-known facts about base to running transactions. Those A lock sounds like a very solid
Firebird is that it uses multi-version three functions can be implemented object, but in database systems, a aharrison@ibphoenix.com
The Firebird conference concurrency control instead of in various ways. The simplest is to lock anything but solid. “Locking” is
will take place record locking. For databases that serialize transactions – allowing a shorthand description of a proto- the transaction that has the lock, the
at the Hotel Olsanka manage concurrency though each transaction exclusive access to col for reserving resources. Data- identity of the table being locked,
in Prague, record locks, understanding mecha- the database until it finishes. That bases use locks to maintain consis- and a function to call if another
Czech Republic nisms of record locking is critical to solution is neither interesting nor tency while allowing concurrent transaction wants an incompatible
from the evening designing a high performance, high desirable. independent transactions to update lock on the table. The normal table
of Sunday the 13th concurrency application. In gener- distinct parts of the database. Each lock is shared – other transactions
of November al, record locking is irrelevant to A common solution is to allow each transaction reserves the resources – can lock the same table in the same
(opening session) Firebird. However, Firebird uses transaction to lock the data it uses, tables, records, data pages – that it way.
until the evening locks to maintain internal consisten- keeping other transactions from needs to do its work. Typically, the
of Tuesday the 15th cy and for interprocess coordina- reading data it changes and from reservation is made in memory to Before a transaction can delete a
of November tion. Understanding how Firebird changing data it reads. Modern control a resource on disk, so the table, it must get an exclusive lock
(closing session). does (and does not) use locks helps databases generally lock records. cost of reserving the resource is not on the table. An exclusive lock is
with system tuning and anticipating Firebird provides concurrency con- significant compared with the cost incompatible with other locks. If any
how Firebird will behave under trol without record locks by keeping of reading or changing the transaction has a lock on the table,
Request load. multiple versions of records, each resource. the request for an exclusive lock is
for Sponsors marked with the identifier of the denied, the drop table statement
I’m going to start by describing transaction that created it. Concur- fails. Returning an immediate error
locking abstractly, then Firebird rent transactions don’t overwrite Locking example is one way of dealing with conflict-
locking more specifically, then the each other’s changes, because the “Resource” is a very abstract term. ing lock requests. The other is to put
How to Register controls you have over the Firebird system will not allow a transaction Lets start by talking about locking the conflicting request on a list of
lock table and how they can affect to change a record if the most tables. Firebird does lock tables, but unfulfilled requests and let it wait for
your database performance. So, if recent version was created by a normally it locks them only to pre- the resource to become available.
you just want to learn about tuning, concurrent transaction. Readers vent catastrophes like having one
Conference skip to the end of the article. never see uncommitted data transaction drop a table that anoth- In its simplest form, that is how lock-
ing works. All transactions follow
Timetable because the system will not return er transaction is using.
record versions created by concur- the formal protocol of requesting a
Concurrency control Before a Firebird transaction can lock on a resource before using it.
rent transactions. Readers see a
Concurrency control in database access a table, it must get a lock on Firebird maintains a list of locked
Conference consistent version of the database
resources, a list of requests for locks
systems has three basic functions: because the system returns only the the table. The lock prevents other
Papers preventing concurrent transactions transactions from dropping the on resources – satisfied or waiting
data committed when they start –
and Speaker from overwriting each others’ allowing other transactions to cre- table while it is in use. When a – and a list of the owners of lock
changes, preventing readers from ate newer versions. Readers don’t transaction gets a lock on a table, requests. When a transaction
seeing uncommitted changes, and block writers. Firebird makes an entry in its table requests a lock that is incompatible
giving a consistent view of the data- of locks, indicating the identity of with existing locks on a resource,

Firebird either denies the new describing follow a protocol known base file. Most databases have a itself. Firebird coordinates physical
request, or puts it on a list to wait as two-phase locking, which is typi- single server process like Super- access to the database through
until the resource is available. Inter- cal of locks taken by transactions in Server that has exclusive access to locks on database pages.
nal lock requests specify whether database systems. Databases that the database and coordinates
they wait or receive an immediate use record locking for consistency physical access to the file within In general database theory, a trans-
error on a case-by-case basis. control always use two-phase
When a transaction starts, it speci- record locks. In two-phase locking,
fies whether it will wait for locks that a transaction acquires locks as it
it acquires on tables, etc. proceeds and holds the locks until it
ends. Once it releases any lock, it
Lock modes can no longer acquire another. The
two phases are lock acquisition and
For concurrency and read commit- lock release. They cannot overlap.
ted transactions, Firebird locks
tables for shared read or shared When a Firebird transaction reads
write. Either mode says, “I’m using a table, it holds a lock on that table
this table, but you are free to use it until it ends. When a concurrency
too.” Consistency mode transac- transaction has acquired a shared
tions follow different rules. They write lock to update a table, no con-
lock tables for protected read or sistency mode transaction will be
protected write. Those modes say able to get a protected lock on that
“I’m using the table and no one else table until the transaction with the
is allowed to change it until I’m shared write lock ends and releases
done.” Protected read is compati- its locks. Table locking in Firebird is
ble with shared read and other pro- two-phase locking.
tected read transactions. Protected
write is only compatible with share Locks can also be transient, taken
read. and released as necessary during
the running of a transaction. Fire-
The important concept about lock bird uses transient locking exten-
modes is that locks are more subtle sively to manage physical access to
than mutexes – locks allow the database.
resource sharing, as well as protect-
ing resources from incompatible Firebird page locks
use.
One major difference between Fire-
bird and most other databases is
Two-phase locking vs. tran- Firebird’s Classic mode. In Classic
sient locking mode, many separate processes
The table locks that we’ve been share write access to a single data-

action is a set of steps that transform section. In SuperServer, only the state of the lock table to reflect its on a resource that is already locked
the database from on consistent server uses that share memory request. in an incompatible mode, one of
state to another. During that trans- area. In Classic, every database two things happens. Either the
formation, the resources held by the connection maps the shared memo- Conflicting lock requests
requesting transaction gets an
transaction must be protected from ry and every connection can read When a request is made for a lock immediate error, or the request is
incompatible changes by other and change the contents of the
transactions. Two-phase locks are memory.
that protection.
The lock table is a separate piece of
In Firebird, internally, each time a share memory. In SuperServer, the
transaction changes a page, it lock table is mapped into the server
changes that page – and the physi- process. In Classic, each process
cal structure of the database as a maps the lock table. All databases
whole – from one consistent state to on a server computer share the
another. Before a transaction reads same lock table, except those run-
or writes a database page, it locks ning with the embedded server.
the page. When it finishes reading
or writing, it can release the lock The Firebird lock manager
without compromising the physical
consistency of the database file. We often talk about the Firebird
Firebird page level locking is tran- Lock Manager as if it were a sepa-
sient. Transactions acquire and rate process, but it isn’t. The lock
release page locks throughout their management code is part of the
existence. However, to prevent engine, just like the optimizer, pars-
deadlocks, transactions must be er, and expression evaluator. There
able to release all the page locks it is a formal interface to the lock
holds before acquiring a lock on a management code, which is similar Registering for the Conference
new page. to the formal interface to the distrib-
uted lock manager that was part of
Call for papers
The Firebird lock table VAX/VMS and one of the inter-
faces to the Distributed Lock Man-
When all access to a database is ager from IBM. Sponsoring the Firebird Conference
done in a single process – as is the
case with most database systems – The lock manager is code in the
locks are held in the server’s memo- engine. In Classic, each process has
ry and the lock table is largely invis- its own lock manager. When a
ible. The server process extends or Classic process requests or releases
remaps the lock information as a lock, its lock management code
required. Firebird, however, man- acquires a mutex on the shared http://firebird-conference.com/
ages its locks in a shared memory memory section and changes the

put on a list of waiting requests and work the mechanism must be fast they check the index definitions for mode of the request, etc. Lock
the transactions that hold conflicting and reliable. A fast, reliable inter- the table, find the new index defini- blocks describe the resources being
locks on the resource are notified of process communication mechanism tion, and begin maintaining the locked.
the conflicting request. Part of every can be – and is – useful for a num- index.
ber of purposes outside the area To request a lock, the owner finds
lock request is the address of a rou- the lock block, follows the linked list
tine to call when the lock interferes that’s normally considered data- Firebird locking summary
base locking. of requests for that lock, and adds
with another request for a lock on Although Firebird does not lock its request at the end. If other own-
the same object. Depending on the For example, Firebird uses the lock records, it uses locks extensively to ers must be notified of a conflicting
resource, the routine may cause the table to notify running transactions isolate the effects of concurrent request, they are located through
lock to be released or require the of the existence of a new index on a transactions. Locking and the lock the request blocks already in the list.
new request to wait. table. That’s important, since as table are more visible in Firebird Each owner block also has a list of
soon as an index becomes active, than in other databases because the its own requests. The performance
Transient locks like the locks on every transaction must help main- lock table is a central communica- critical part of locking is finding lock
database pages are released tain it – making new entries when it tion channel between the separate blocks. For that purpose, the lock
immediately. When a transaction stores or modifies data, removing processes that access the database table includes a hash table for
requests a page lock and that page entries when it modifies or deletes in Classic mode. In addition to con- access to lock blocks based on the
is already locked in an incompati- data. trolling access to database objects name of the resource being locked.
ble mode, the transaction or trans- like tables and data pages, the Fire-
actions that hold the lock are noti- When a transaction first references bird lock manager allows different A quick refresher on hashing
fied and must complete what they a table, it gets a lock on the exis- transactions and processes to notify
are doing and release their locks tence of indexes for the table. A hash table is an array with linked
each other of changes to the state of lists of duplicates and collisions
immediately. Two-phase locks like When another transaction wants to the database, new indexes, etc.
table locks are held until the trans- create a new index on that table, it hanging from it. The names of lock-
action that owns the lock completes. must get an exclusive lock on the Lock table specifics able objects are transformed by a
When the conflicting lock is existence of indexes for the table. function called the hash function
released, and the new lock is grant- Its request conflicts with existing The Firebird lock table is an in-mem- into the offset of one of the elements
ed, then transaction that had been locks, and the owners of those locks ory data area that contains of four of the array. When two names
waiting can proceed. are notified of the conflict. When primary types of blocks. The lock transform to the same offset, the
those transactions are in a state header block describes the lock result is a collision. When two locks
where they can accept a new index, table as a whole and contains have the same name, they are
Locks as interprocess com- pointers to lists of other blocks and duplicates and always collide.
they release their locks, and imme-
munication free blocks. Owner blocks describe
diately request new shared locks on In the Firebird lock table, the array
Lock management requires a high the existence of indexes for the the owners of lock requests – gen-
erally lock owners are transactions, of the hash table contains the
speed, completely reliable commu- table. The transaction that wants to
connections, or the SuperServer. address of a hash block. Hash
nication mechanism between trans- create the index gets its exclusive
Request blocks describe the rela- blocks contain the original name, a
actions, including transactions in lock, creates the index, and com-
tionship between an owner and a collision pointer, a duplicate point-
different processes. The actual mits, releasing its exclusive lock on
lockable resource – whether the er, and the address of the lock block
mechanism varies from platform to the existence of indexing. As other
request is granted or pending, the that corresponds to the name. The
platform, but for the database to transactions get their new locks, collision pointer contains the

address of a hash block whose ber of hash slots if the load is high. allowed to update the lock table at The change will not take effect until
News & Events name hashed to the same value. The symptom of an overloaded any instant. When updating the lock all connections to all databases on
The duplicate pointer contains the hash table is sluggish performance table, a process holds the table’s the server machine shut down.
PHP Server
address of a hash block that has under load. mutex. A non-zero mutex wait indi-
exactly the same name. cates that processes are blocked by If you increase the number of hash
One of the more interest-
The tool for checking the lock table the mutex and forced to wait for slots, you should also increase the
ing recent developments
A hash table is fast when there are is fb_lock_print, which is a com- access to the lock table. In turn, that lock table size. The second line of
in information technolo-
relatively few collisions. With no mand line utility in the bin directory indicates a performance problem the lock print
Version:114, Active owner 0,
gy has been the rise of
collisions, finding a lock block of the Firebird installation tree. The
Length: 262144, Used: 85740
browser based applica- inside the lock table, typically
involved hashing the name, index- full lock print describes the entire because looking up a lock is slow.
tions, often referred to
ing into the array, and reading the state of the lock table and is of limit-
by the acronym "LAMP".
pointer from the first hash block. ed interest. When your system is If the hash lengths are more than tells you how close you are to run-
One key hurdle for Each collision adds another pointer under load and behaving badly, min 5, avg 10, or max 30, you ning out of space in the lock table.
broad use of the LAMP to follow and name to check. The invoke the utility with no options or need to increase the number of The Version and Active owner are
technology for mid-mar- ratio of the size of the array to the switches, directing the output to a hash slots. The hash function used in uninteresting. The length is the max-
ket solutions was that it number of locks determines the file. Open the file with an editor. Firebird is quick but not terribly effi- imum size of the lock table. Used is
was never easy to con- number of collisions. Unfortunately, You'll see output that starts some- cient. It works best if the number of the amount of space currently allo-
figure and manage. the width of the array cannot be thing like this: hash slots is prime. cated for the various block types
LOCK_HEADER BLOCK
adjusted dynamically because the and hash table. If the amount used
Version:114, Active owner:0, Length:262144, Used:85740
PHPServer changes that: size of the array is part of the hash is anywhere near the total length,
Semmask:0x0, Flags: 0x0001
it can be installed with function. Changing the width uncomment this parameter in the
Enqs: 18512, Converts: 490, Rejects:0, Blocks: 0
just four clicks of the changes the result of the function configuration file by removing the
Deadlock scans:0, Deadlocks:0, Scan interval:10

mouse and support for and invalidates all existing entries in leading #, and increase the value.
#LockMemSize = 262144
Firebird is compiled in.
Acquires: 21048, Acquire blocks:0, Spin count:0
the hash table.
PHPServer shares all the Adjusting the lock table to Mutex wait:10.3%
Hash slots:101, Hash lengths (min/avg/max):3/ 15/ 30
qualities of Firebird: it is improve performance to this
… LockMemSize = 1048576
a capable, compact,
easy to install and easy The size of the hash table is set in
to manage solution. the Firebird configuration file. You The seventh and eighth lines suggest Change this line in the configuration
must shut down all activity on all that the hash table is too small and file: The value is bytes. The default lock
#LockHashSlots = 101
PHPServer is a free databases that share the hash table that it is affecting system perform- table is about a quarter of a
download – normally all databases on the ance. In the example, these values megabyte, which is insignificant on
machine – before changes take indicate a problem: Uncomment the line by removing modern computers. Changing the
Read more at:
Mutex wait: 10.3%
effect. The Classic architecture uses lock table size will not take effect
Hash slots: 101, Hash lengths (min/avg/max):3/ 15/ 30

www.fyracle.org/ the lock table more heavily than until all connections to all databases
phpserver.html SuperServer. If you choose the on the server machine shut down.
Classic architecture, you should In the Classic architecture, each the leading #. Choose a value that
check the load on the hash table process makes its own changes to is a prime number less than 2048.
LockHashSlots = 499
periodically and increase the num- the lock table. Only one process is

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Server internals
Inside BLOBs
Author: Dmitri Kouzmenko
Inside BLOBs
This is an excerpt from the book “1000 InterBase & Firebird Tips & Tricks”
by Alexey Kovyazin and Dmitri Kouzmenko, which will be published in 2006.
kdv@ib-aid.com
Author: Alexey Kovyazin

ak@ib-aid.com
How the server Initially, the basic record data on ally contains the BLOB data. The first type is the simplest. If the
works with BLOBs the data page includes a reference Depending on the size of the BLOB, size of BLOB-field data is less than
The BLOB data type is intended for to a “BLOB record” for each non- this BLOB-record will be one of the free space on the data page, it is
storing data of variable size. Fields null BLOB field, i.e. to record-like three types. placed on the data page as a sepa-
of BLOB type allow for storage of structure or quasi-record that actu- rate record of "BLOB" type.
data that cannot be placed in fields
of other types, - for example, pic-
tures, audio files, video fragments,
etc.
From the point view of the database

application developer, using BLOB
fields is as transparent as it is for
other field types (see chapter “Data
types” for details). However, there
is a significant difference between
the internal implementation mecha-
nism for BLOBs and that for other
data.
Unlike the mechanism used for han-

dling other types of fields, the data-
base engine uses a special mecha-
nism to work with BLOB fields. This
mechanism is transparently inte-
grated with other record handling
at the application level and at the
same time has its own means of
page organization. Let's consider in
detail how the BLOB-handling
mechanism works.

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Server internals
Inside BLOBs
The second type is used when the The special header contains the size will be 2 gigabytes. So, if you
size of BLOB is greater than the following information: plan to have very large BLOB
free space on the page. In this fields in your database, you should
case, references to pages contain- •The number of the first blob page experiment with storing data of a
ing the actual BLOB data are in this blob. It is used to check that large size beforehand.
stored in a quasi-record. Thus, a pages belong to one blob.
two-level structure of BLOB-field •A sequence number. This is The segment size mystery
data is used. important in checking the integrity Developers of database applica-
of a BLOB. For a BLOB pointer tions often ask what the Segment
page it is equal to zero. Size parameter in the definition of
•The length of data on a page. As a BLOB is, why we need it and
a page may or may not be filled to whether or not we should set it
the full extent, the length of actual when creating Blob-fields.
data is indicated in the header. In reality, there is no need to set
this parameter. Actually, it is a bit
Maximum BLOB size of a relic, used by the GPRE utility
BLOB Page As the internal structure for storing when pre-processing Embedded
BLOB data can have only 3 levels SQL. When working with BLOBs,
The blob page consists of the fol-
of organization, and the size of GPRE declares a buffer of speci-
lowing parts:
data page is also limited, it is pos- fied size, based on the segment
sible to calculate the maximum size size. Setting the segment size has
of a BLOB. no influence over the allocation
If the size of BLOB-field contents is and the size of segments when
very large, a three-level structure However, this is a theoretical limit storing the BLOB on disk. It also
is used – a quasi-record stores ref- (if you want, you can calculate it), has no influence on performance.
erences to BLOB pointer pages but in practice the limit will be Therefore the segment size can be
which contain references to the much lower. The reason for this safely set to any value, but it is set
actual BLOB data. lower limit is that the length of to 80 bytes by default.
BLOB-field data is determined by a
The whole structure of BLOB stor- variable of ULONG type, i.e. its Information for those who want to
age (except for the quasi-record, maximal size will be equal to 4 know everything: the number 80
of course) is implemented by one gigabytes. was chosen because 80 symbols
page type – the BLOB page type. could be allocated in alphanumer-
Different types of BLOB-pages dif- Moreover, in reality this practical ic terminals.
fer from each other in the presence limit is reduced if a UDF is to be
of a flag (value 0 or 1) defining used for BLOB processing. An
how the server should interpret the internal UDF implementation
given page. assumes that the maximum BLOB

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 TestBed
Testing the NO SAVEPOINT
Author: Vlad Horsun

Testing the NO SAVEPOINT hvlad@users.sourceforge.net
Author: Alexey Kovyazin

feature in InterBase 7.5.1 ak@ib-aid.com
In issue 1 we published Dmitri table, with the following structure: Download the InterBase 7.5.1 trial
Yemanov’s article about the inter- CREATE TABLE TEST ( version from www.borland.com.
ID NUMERIC(18,2),
nals of savepoints. While that article The installation process is obvious
NAME VARCHAR(120),
was still on the desk, Borland and well-described in the InterBase
DESCRIPTION VARCHAR(250),
announced the release of InterBase documentation.
CNT INTEGER,
7.5.1, introducing, amongst other
You can download a backup of the
QRT DOUBLE PRECISION,
things, a NO SAVEPOINT option
test database ready to use from
TS_CHANGE TIMESTAMP,
for transaction management. Is this
http://www.ibdeveloper.com/issu
TS_CREATE TIMESTAMP,
an important improvement for Inter-
Base? We decided to give this e2/testspbackup.zip (~4 Mb) or,
implementation a close look and test NOTES BLOB alternatively, an SQL script for cre-
it some, to discover what it is all ); ating it from
about. http://www.ibdeveloper.com/issu
This table contains 100,000 records, which will be updated during the test. e2/testspdatabasescript.zip (~1
The stored procedure and generators are used to fill the table with test data. Kb).
Testing NO SAVEPOINT You can increase the quantity of records in the test table by calling the
In order to analyze the problem that stored procedure to insert them: If you download the database
SELECT * FROM INSERTRECS(1000);
the new transaction option was backup, the test tables are already
intended to address, and to assess populated with records and you
its real value, we performed several The second table, TEST2DROP, has the same structure as the first and is filled can proceed straight to the section
with the same records as TEST. “Preparing to test”, below.
INSERT INTO TEST2DROP SELECT FROM TEST;
very simple SQL tests. The tests are
all 100% reproducible, so you will If you choose instead to use the
be able to verify our results easily. As you will see, the second table will be dropped immediately after connect. SQL script, you will create data-
We are just using it as a way to increase database size cheaply: the pages base yourself. Make sure you
Database for testing occupied by the TEST2DROP table will be released for reuse after we drop insert 100,000 records into table
the table. With this trick we avoid the impact of database file growth on the TEST using the INSERTRECS stored
The test database file was created in
test results. procedure and then copy all of
InterBase 7.5.1, page size = 4096,
character encoding is NONE. It them to TEST2DROP three or four
contains two tables, one stored pro- Setting test environment times.
cedure and three generators. All that is needed to perform this test is the trial installation package of Inter- After that, perform a backup of this
Base 7.5.1, the test database and an SQL script. database and you will be on the
For the test we will use only one
same position as if you had down-

loaded “ready for use” backup.
Hardware is not a material issue for these tests, since we are only comparing performance with and without the NO
SAVEPOINT option. Our test platform was a modest computer with Pentium-4, 2 GHz, with 512 RAM and an 80GB
Samsung HDD.
Preparing to test
A separate copy of the test database is used for each test case, in order to eliminate any interference between state-
ments. We create four fresh copies of the database for this purpose. Supposing all files are in a directory called
C:\TEST, simply create the four test databases from your test backup file:
gbak –c –user SYSDBA –pass masterkey C:\TEST\testspbackup.gbk C:\TEST\testsp1.ib
SQL test scripts

The first script tests a regular, one-pass update without the NO SAVEPOINT option.
For convenience, the important commands are clarified with comments:

connect "C:\TEST\testsp1.ib" USER "SYSDBA" Password "masterkey"; //Connect
drop table TEST2DROP; // Drop table TEST2DROP to free database pages
commit;
select count(*) from test; // Walk down all records in TEST to place them into cache
commit;
set time on; //enable time statistics for performed statements
set stat on; // enables writes/fetches/memory statistics
commit;
// perform bulk update of all records in TEST table
update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT_TIMESTAMP;
commit;
quit;
The second script tests performance for the same UPDATE with the NO SAVEPOINT option:

News & Events connect "C:\testsp2.ib" USER "SYSDBA" Password "masterkey"; News & Events
drop table TEST2DROP;
Fyracle 0.8.9 commit; IBAnalyst 1.9
Janus has released a select count(*) from test; IBSurgeon has issued new
new version of Oracle- commit; version of IBAnalyst.
mode Firebird, Fyracle. set time on;
Fyracle is a specialized set stat on; Now it can better analyze
build of Firebird 1.5: commit; InterBase or Firebird data-
SET TRANSACTION NO SAVEPOINT; // enable NO SAVEPOINT
it adds temporary base statistics with using
tables, hierarchical
metadata information (in this
commit;
queries and
case connection to database
quit;
a PL/SQL engine.
is required).
Version 0.8.9 adds
support for stored Except for the inclusion of the SET TRANSACTION NO SAVEPOINT statement in the second script, both scripts are IBAnalyst is a tool that assists
procedures written the same, simply testing the behavior of engine in case of the single bulk UPDATE. a user to analyze in detail
in Java. Firebird or InterBase data-
To test sequential UPDATEs, we added several UPDATE statements--we recommend using five.
Fyracle dramatically The script for testing without NO SAVEPOINT would be: base statistics and identify
reduces the cost of port- possible problems with data-
ing Oracle-based appli- connect "E:\testsp3.ib" USER "SYSDBA" Password "masterkey"; base performance, mainte-
cations to Firebird. drop table TEST2DROP; nance and how an applica-
commit; tion interacts with the data-
select count(*) from test;
Common usage
includes the base.
Compiere open source commit;
set time on;
ERP package,
It graphically displays data-
set stat on;
mid-market deployments base statistics and can then
of Developer/ commit; automatically make intelli-
2000 applications update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT_TIMESTAMP; gent suggestions about
and demo CD's update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT_TIMESTAMP; improving database per-
of applications formance and database
without license trouble. maintenance
Read more at: update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT_TIMESTAMP;
commit;
www.ibsurgeon.com/
quit;
www.janus-soft- news.html
ware.com
You can download all the scripts and the raw results of their execution from this location:
http://www.ibdeveloper.com/issue2/testresults.zip

How to perform the test

News & Events News & Events
The easiest way to perform the test is to use isql's INPUT command.
Fast Report 3.19 TECT Software
Suppose you have the scripts located in c:\test\scripts:
>isql
The new version of the
presents
famous Fast Report is. Use CONNECT or CREATE DATABASE to specify a database New versions of nice utilities
SQL>input c:\test\scripts\script01.sql;
now out. FastReport® 3 from TECT Software
is an add-in component www.tectsoft.net.
that gives your applica-
tions the ability to gener- SPGen, Stored Procedure Gen-
Test results erator for Firebird and Inter-
ate reports quickly and
efficiently. FastReport® The single bulk UPDATE Base, creates a standard set of
provides all the tools you stored procs for any table, full
need to develop reports. First, let’s perform the test where the single-pass bulk UPDATE is performed. details can be found here
All variants of FastRe- This is an excerpt from the one-pass script with default transaction settings. http://www.tectsoft.net/
SQL> update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT
port® 3 contains:
Products/Firebird/
_TIMESTAMP;
Visual report designer FIBSPGen.aspx
Current memory = 10509940
with rulers, guides and
Delta memory = 214016

zooming, wizard for And FBMail, Firebird EMail
(FBMail) is a cross platform PHP
Max memory = 10509940
basic types of report,
utility which easily allows the
Elapsed time= 4.63 sec
export filters for html, tiff,
bmp, jpg, xls, pdf out- sending of email's direct from
puts, Dot matrix reports Buffers = 2048 within a database.
support, support for most Reads = 2111
Writes 375
This utility is specifically aimed
popular DB-engines. Full
Fetches = 974980
at ISP's that support Firebird,
WYSIWYG, text rotation
but can easily be used on any
0..360 degrees, memo
SQL> commit;
computer which is connected
object supports simple
to the internet.
html-tags (font color, b, i,
Delta memory = -169188
u, sub, sup), improved Full details
stretching (StretchMode,
can be found here:
ShiftMode properties),
Buffers = 2048
access to DB fields, http://www.tectsoft.net/
Reads = 1
styles, text flow, URLs,
Anchors. Products/Firebird/
Writes 942 FBMail.aspx
Trial! Fetches = 3
Buy Now! This is an excerpt from the one-pass script with NO SAVEPOINT enabled

SQL> SET TRANSACTION NO SAVEPOINT;

News & Events SQL> update TEST set ID = ID+1, QRT = QRT+1, NAME=NAME||'1', ts_change = CURRENT
News & Events
_TIMESTAMP; SQLHammer 1.5
CopyTiger 1.00
Delta memory = 55296
CopyTiger, Enterprise Edition
a new product The interesting
Buffers = 2048
from Microtec, developer tool,
is an advanced Reads = 2102 SQLHammer,
database Writes 350 now has
Fetches = 967154
replication & an Enterprise Edition.
synchronisation SQL> commit;
SQLHammer is
tool for
Delta memory = -8192 a high-performance tool
InterBase/ for rapid
Firebird, Elapsed time= 0.15 sec
Buffers = 2048
database development
powered by
Reads = 1 and administration.
Writes 938
their CopyCat
component set Fetches = 3 It includes a common
(see the article integrated development
about CopyCat Performance appears to be almost the same, whether the NO SAVEPOINT option is enabled or not. environment,
in this issue). registered custom
Sequential bulk UPDATE statements components
For more With the mutlti-pass script (sequential UPDATEs) the raw test results are rather large.
based on the
information about Table 1 Borland Package Library
CopyTiger see: Test results for 5 sequental
mechanism
UPDATEs
www.microtec.fr/ and an Open Tools API
copycat/ct written in
Download Delphi/Object Pascal.
an evaluation www.metadataforge.com
version

In the second UPDATE we start to see the difference. With default transaction settings this
UPDATE takes a very long time - 47 seconds - compared to only 7210 ms with NO SAVE-
POINT enabled. With default transaction settings we can see that memory usage is signifi-
cant, wherease with NO SAVEPOINT no additional memory is used.
The third and all following UPDATE statements with default settings show equal time and
memory usage values and the growth of writes parameters.
Table 1
Test results for 5 sequental UPDATEs
Figure 2
Memory usage while performing UPDATEs with and without NO SAVEPOINT
With NO SAVEPOINT usage we observe that time/memory values and writes growth are
all small and virtually equal for each pass. The corresponding graphs are below:
Inside the UNDO log

Figure 1
So what happened during the execution of the test scripts? What is the secret behind this
Time to perform UPDATEs with and without NO SAVEPOINT magic that NO SAVEPOINT does? Is there any cost attached to these gains?
For convenience, the results are tabulated above.
The first UPDATE statement has almost the same execution time with and without the NO A few words about versions
SAVEPOINT option. However, memory consumption is reduced fivefold when we use NO You probably know already that InterBase is a multi-record-version database engine,
SAVEPOINT. meaning that each time a record is changed, a new version of that record is produced. The

old version does not disappear The NO SAVEPOINT “Release Notes” for InterBase 7.5. SP1, “New in InterBase 7.5.1”, page 2-2).
immediately but is retained as a
backversion.
option Secondly, when a NO SAVEPOINT transaction is rolled back, it is marked as rolled back in the transaction
inventory page. Record version garbage thereby gets stuck in the "interesting" category and prevents the
In fact, the first time a backversion is The NO SAVEPOINT option in
OIT from advancing. Sweep is needed to advance the OIT and back out dead record versions.
written to disk, it is as a delta ver- InterBase 7.5.1 is a workaround for
sion, which saves disk space and the problem of performance loss Fuller details of the NO SAVEPOINT option are provided in the InterBase 7.5.1. Release Notes.
memory usage by writing out only during bulk updates that do multiple
the differences between the old and passes of a table. The theory is: if
Initial situation
the new versions. The engine can using the implicit savepoint man-
agement causes problems then let's Consider the implementation details of the undo-log. Figure 3 shows the initial situation:
rebuild the full old version from the
new version and chains of delta ver- kill the savepoints. No savepoints –
Recall that we perform this test on freshly-restored database, so it is guaranteed that only one version exists
sions. It is only if the same record is no problem :-)
for any record.
updated more than once within the Besides ISQL, it has been surfaced
same transaction that the full back- as a transaction parameter in both
version is written to disk. DSQL and ESQL. At the API level, a
new transaction parameter block
The UNDO log concept (TPB) option isc_tpb_no_savepoint
You may recall from Dmitri’s article can be passed in the isc_start_trans-
how each transaction is implicitly action() function call to disable
enclosed in a "frame" of savepoints, savepoints management. Syntax
each having its own undo log. This details for the latter flavors and for
log stores a record of each change the new tpb option can be found in
in sequence, ready for the possibili- the 7.5.1 release notes.
ty that a rollback will be requested. The effect of specifying the NO
A backversion materializes when- SAVEPOINT transaction parameter
Figure 3
ever an UPDATE or DELETE state- is that no undo log will be created.
However, along with the perform- Initial situation before any UPDATE - only the one record version exists, Undo log is empty
ment is performed. The engine has
to maintain all these backversions in ance gain for sequential bulk
updates, it brings some costs for The first UPDATE
the undo log for the relevant save-
point. transaction management. The first UPDATE statement creates delta backversions on disk (see figure 4). Since deltas store only the
differences between the old and new versions, they are quite small. This operation is fast and it is easy
So, the Undo Log is a mechanism to First and most obvious is that, with work for the memory manager.
manage backversions for save- NO SAVEPOINT enabled, any
points in order to enable the associ- error handling that relies on save- It is simple to visualize the undo log when we perform the first UPDATE/DELETE statement inside the trans-
ated changes to be rolled back. The points is unavailable. Any error action – the engine just records the numbers of all affected records into the bitmap structure. If it needs to
process of Undo logging is quite during a NO SAVEPOINT transac- roll back the changes associated with this savepoint, it can read the stored numbers of the affected records,
complex and maintaining it can tion precludes all subsequent exe- then walk down to the version of each and restore it from the updated version and the backversion stored
consume a lot of resources. cution and leads to rollback (see on disk.

This approach is very fast and economical on memory usage. The engine The engine could write all intermediate versions to disk but there is no reason to do so.
does not waste too many resources to handle this undo log – in fact it reuses These versions are visible only to the modifying transaction and would not be used unless
the existing multi-versioning mechanism. Resource consumption is merely a rollback was required.
the memory used to store the bitmap structure with the backversion num-
bers. We don't see any significant difference here between a transaction
with the default settings and one with the NO SAVEPOINT option enabled.
Figure 5
The second UPDATE creates a new delta backversion for transaction 1,
Figure 4 erases from disk the delta version created by the first UPDATE, and copies
UPDATE1 create small delta version on disk and put record number into UNDO log the version from UPDATE1 into the Undo log.
The second UPDATE
When the second UPDATE statement is performed on the same set of records, we have a This all makes hard work for the memory manager and the CPU, as you can see from the
different situation. growth of the “max mem”\”delta mem” parameters values in the test that uses the default
transaction parameters.
Here is a good place to note that the example we are considering is the most simple situa-
tion, where only the one global (transaction level) savepoint exists. We will also look at the When NO SAVEPOINT is enabled we avoid the expense of maintaining the Undo log. As
difference in the Undo log when an explicit (or enclosed BEGIN… END) savepoint is used. a result, we see execution time, reads/writes and memory usage as low for subsequent
updates as for the first.
To preserve on-disk integrity (remember the ‘careful write’ principle ?) the engine must
compute a new delta between the old version (by transaction1) and new version (by trans- The third UPDATE
action2, update2), store it somewhere on disk, fetch the current full version (by transac-
tion2, update1), put it into the in-memory undo-log, replace it with the new full version (with The third and all subsequent UPDATEs are similar to the second UPDATE, with one excep-
backpointers set to the newly created delta) and erase the old, now superseded delta. As tion – memory usage does not grow any further.
you can see – there is much more work to do, both on disk and in memory.

Original design of IB implement second UPDATE another way but sometime after IB6 Bor- BEGIN... END savepoint framework, the engine has to store a backversion for each associ-
land changed original behavior and we see what we see now.But this theme is for another ated record version in the Undo log.
article;)
For example, if we used an explicit savepoint, e.g. SAVEPOINT Savepoint1, upon perform-
Why is the delta of memory usage zero? The reason is that, beyond the second UPDATE, ing UPDATE2, we would have the situation illustrated in figure 7:
no record version is created. From here on, the update just replaces record data on disk
with the newest one and shifts the superseded version into the Undo log.
A more interesting question is why we see an increase in disk reads and writes during the
test. We would have expected that the third and following UPDATEs would do essentially
equal numbers of read and writes to write the newest versions and move the previous ones
to the undo log. However, we are actually seeing a growing count of writes. We have no
answer for it, but we would be pleased to know.
The following figure (figure 6) helps to illustrate the situation in the Undo log during the
sequential updates. When NO SAVEPOINT is enabled, the only pieces we need to perform
are replacing the version on disk and updating the original backversion. It is fast as the first
UPDATE.
Figure 7
If we have an explicit SAVEPOINT, each new record version associated with it will have a
corresponding backversion in the Undo log of that savepoint
In this case the memory consumption would be expected to increase each time an UPDATE
occurs within the explicit savepoint's scope.
Figure 6 Summary
The third UPDATE overwrites the UPDATE1 version in the Undo log with the UPDATE2 The new transaction option NO SAVEPOINT can solve the problem of excessive resource
version and its own version is written to disk as the latest one. usage growth that can occur with sequential bulk updates. It should be beneficial when
Explicit SAVEPOINT applied appropriately. Because the option can create its own problems by inhibiting the
advance of the OIT, it should be used with caution, of course. The developer will need to take
When an UPDATE statement is going to be performed within its own explicit or implicit extra care about database housekeeping, particularly with respect to timely sweeping.

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Development area
OOD and RDBMS, Part 1
Author: Vladimir Kotlyarevsky
Object-Oriented vlad@contek.ru
Development in RDBMS,
Part 1 of all the methods taken from the tion development - if you develop
Thanks and apologies The problem state-
articles, and of course I have tested your applications just like 20-30
This article is mostly a compilation all my own methods.
ment years ago.
of methods that are already well-
known, though many times it turned Mixing of object-oriented program- What is the problem? As you see, almost all these charac-
out that I on my own have reinvent- ming and RDBMS use is always a teristics sound good, except, proba-
Present-day relational databases
ed a well-known and quite good compromise. I have endeavored to bly, the last one. Today you can
were developed in times when the
wheel. I have endeavored to pro- recommend several approaches in hardly find a software product (in
sun shone brighter, the computers
vide readers with links to publica- which this compromise is minimised almost any area) consisting of more
were slower, mathematics was in
tions I know of that are discussing for both components. I have also than few thousand of lines which is
favour, and OOP had yet to see the
the problem. However, if I missed tried to describe both advantages written without OOP technologies.
light of day. Due to that fact most
someone’s work in the bibliogra- and disadvantages of such a com- OOP languages have been used for
RDBMSs’ have the following char-
phy, and thus violated copyright, promise. a long time for building visual
acteristics in common:
please drop me a message at forms, i.e. in UI development. It is
I should make it clear that the object also quite usual to apply OOP at
vlad@contek.ru. I apologize 1.Everyone got used to them and
approach to database design as the business logic level, if you
beforehand for possible inconven- felt comfortable with them.
described is not appropriate for implement it either on a middle-tier,
ience, and promise to add any nec-
every task. It is still true for the OOP 2.They are quite fast (if you use or on a client. But things fall apart
essary information to the article.
as a whole, too, no matter what them according to certain known when the deal comes closer to the
The sources in the bibliography are OOP apologists may say! ?. I would standards). data storage issues… During the last
listed in the order of their appear- recommend using it for such tasks as ten years there were several
ance in my mind. document storage and processing, 3.They use SQL, which is an easy,
attempts to develop an object-ori-
accounting, etc. comprehensible and time-proved
ented database system, and, as far
The described database structures data manipulation method.
And the last, but not least , I am very as I know, all those attempts were
have been simplified in order to
thankful to Dmitry Kuzmenko, 4.They are based upon a strong rather far from being successful.
illustrate the problems under con-
Alexander Nevsky and other peo- mathematical theory. The characteristics of an OODBMS
sideration as much clearly as possi-
ple who helped me in writing this are the antithesis of those for an
ble, leaving out more unimportant 5.They are convenient for applica-
article. RDBMS. They are unusual and
elements. I have tested the viability

slow; there are no standards for The OID any disadvantages. Even in the And what is more, nobody requires
data access and no underlying All objects are unique, and they pure relational model it does not run-time pointers to contain some
mathematical theory. Perhaps the must be easily identifiable. That is matter whether the surrogate key is additional information about an
OOP developer feels more comfort- why all the objects stored in the unique within a single table or the object except for a memory
able with them, although I am not database should have unique ID- whole database. address. However, there are some
sure… keys from a single set (similar to people who vigorously reject usage
OIDs should never have any real of surrogates. The most brilliant
As a result, everyone continues object pointers in run-time). These world meaning. In other words, the
identifiers are used to link to an argument against surrogates I’ve
using RDBMS, combining object- key should be completely surro- ever heard is that "they conflict with
oriented business logic and domain object, to load an object into a run- gate. I will not list here all the pros
time environment, etc. In the [1] the relational theory». This state-
objects with relational access to the and cons of surrogate keys in comment is quite arguable, since surro-
database, where these objects are article these identifiers are called parison with natural ones: those
OIDs (i.e. Object IDs), in [2] – UINs gate keys, in some sense, are much
stored. who are interested can refer to the closer to that theory than natural
(Unique Identification Number), or [4] article. The simplest explanation
"hyperkey". Let us call them OIDs, ones.
What do we need? is that everything dealing with the
though “hyperkey” is also quite a real world may change (including Those who are interested in more
The thing we need is simple – to beautiful word, isn’t it? ? . the vehicle engine number, network strong evidence supporting the use
develop a set of standard methods card number, name, passport num- of OIDs with the characteristics
that will help us to simplify the First of all, I would like to make a
couple of points concerning key ber, social security card number, described above (pure surrogate,
process of tailoring the OO-layer of and even sex ?. unique at least within the data-
business logic and a relational stor- uniqueness. Database developers
who are used to the classical base), should refer to [1], [2], and
age together. In other words, our Nobody can change their date of [4] articles.
task is to find out how to store approach to database design birth – at least not their de facto
objects in a relational database, would probably be quite surprised date of birth. But birth dates are not The simplest method of OID imple-
and how to implement links at the idea that sometimes it makes unique, anyway.) mentation in a relational database
between the objects. At the same sense to make a table key unique is a field of “integer” type, and a
time we want to keep all the advan- not only within a single table (in Remember the maxim "everything function for generating unique val-
tages provided by the relational terms of OOP – not only within a that can go bad will go bad" ("con- ues of this type. In larger or distrib-
database design and access: certain class), but also within the sequently, everything that cannot uted databases, it probably makes
speed, flexibility, and the power of whole database (all classes). How- go bad…". hum!. But let’s not talk sense to use “int64” or a combina-
relation processing. ever, such strict uniqueness offers about such gloomy things ? ). tion of several integers.
important advantages, which will Changes to some OIDs would
become obvious quite soon. More- immediately lead to changes in all
RDBMS as an object over, it often makes sense to pro- identifiers and links, and thus, as ClassId
vide complete uniqueness in a Uni- Mr. Scott Ambler wrote [1], could All objects stored in a database
storage verse, which provides considerable result in a “huge maintenance should have a persistent analogue
First let’s develop a database struc- benefits in distributed databases nightmare.” As for the surrogate of RTTI, which must be immediately
ture that would be suitable for and replication development. At the key, there is no need to change it, at available through the object identi-
accomplishing the specified task. same time, strict uniqueness of a key least in terms of dependency on the fier. Then, if we know the OID of an
within the database does not have changing world. object, keeping in mind that it is

Description varchar(128),
unique within the database, we can links, is often a very complicated
Deleted smallint,
immediately figure out what type task, to put it mildly ?.
Creation_date timestamp default CURRENT_TIMESTAMP,

the object is. This is the first advan-
Thus each object has to have
Change_date timestamp default CURRENT_TIMESTAMP ,
tage of OID uniqueness. Such an
objective may be accomplished by mandatory attributes (“OID” and
a ClassId object attribute, which “ClassId”) and desirable attributes Owner TOID);
refers to the known types list, which (“Name,” “Description,” “cre-
basically is a table (let us call it ation_date,” “change_date,” and The ClassId attribute (the object type identifier) refers to the OBJECTS table,
“CLASSES”). This table may include “owner”). Of course, when devel- that is, the type description is also an object of a certain type, which has, for
any kind of information – from just oping an application system, you example, well-known ClassId = –100. (You need to add a description of the
a simple type name to detailed type can always add extra attributes known OID). Then the list of all persistent object types that are known to the
metadata, necessary for the appli- specific to your particular require- system is sampled by a query: select OID, Name, Description from
cation domain. ments. This issue will be considered OBJECTS where ClassId = -100).
a little later.
Name, Description, cre- Each object stored in our database will have a single record in the OBJECTS
ation_date, change_date, table referring to it, and hence a corresponding set of attributes. Wherever
The OBJECTS Table we see a link to a certain unidentified object, this fact about all objects
owner
Reading this far leads us to the con- enables us to find out easily what that object is - using a standard method.
Imagine a file system where user clusion that the simplest solution is
has to remember the handles or the to store the standard attributes of all Quite often, only basic information is needed. For instance, we need just a
inodes of files instead of their file- objects in a single table. You could name to display in a lookup combobox. In this case we do not need any-
names ?. It is frequently convenient, support a set of these fixed attrib- thing but the OBJECTS table and a standard method of obtaining the name
though not always necessary, to utes separately for each type, dupli- of an object via the link to it. This is the second advantage.
have an object naming method that cating 5 or 6 fields in each table,
There are also types, simple lookup dictionaries, for example, which do not
is independent of the object type. but why? It is not a duplication of
have any attributes other than those which already are in OBJECTS. Usual-
For example, each object may have information, but it is still a duplica-
ly it is a short name (code) and a full long name that can easily be stored in
a short name and a long name. It is tion of entities. Besides, a single
the “Name” and “Description” fields of the OBJECTS table. Do you remem-
also sometimes convenient to have table would allow us to have a cen-
ber how many simple dictionaries are in your accounting system? It is like-
other object attributes for different tral “well-known” entry point for
ly, that not less than a half! Thus you may consider that you have implement-
purposes, akin to file attributes, searching of any object, and that is
ed half of these dictionaries - that is the third advantage! Why should dif-
such as “creation_date,” one of the most important advan-
ferent dictionaries (entities) be stored in different tables (relations), when
“change_date,” and “owner” (the tages of the described structure.
they have the same attributes? It does not matter that you were told to do so
“owner” attribute can be a string when you were a student!
So let us design this table:
containing the owner’s name, as
create domain TOID as integer not null;
well as a link to an object of the A simple plain dictionary can be retrieved from the OBJECTS
create table OBJECTS(

“user” type). The “Deleted” attrib- table by the following query
OID TOID primary key, select OID, Name, Description

ute is also necessary as an indicator
ClassId TOID, from OBJECTS

of the unavailability of an object.
where ClassId = :LookupId

The physical deletion of records in a
database, full of direct and indirect Name varchar(32),

if you know ClassId for this dictionary which relates one-to-one to the OBJECTS table by the Nevertheless, there are two main disadvantages: implementation of the system of
element. If you only know the type OID field. The "order number" and “comments” attrib- object-relational mapping for this method is less than simple, and there are some diffi-
name, the query becomes a bit more utes are stored in the “Name” and “Description” fields culties in the organization of type inheritance. This method is described in detail in [1]
complex: of the OBJECTS table. “Orders” also refers to and [3]. These articles also describe the implementation of type inheritance methods in
select OID, Name, Description a database.
from OBJECTS
where ClassId = (select OID from OBJECTS where ClassId = -100
Method 2. (See. [5]) All object attributes of any type are stored in the form of a
and Name = :LookupName)

record set in a single table. A simple example would look like this:
create table attributes (
OID TOID, /* link to the master-object of this attribute */
Later I will demonstrate how to add a “OBJECTS” via the “customer” field, since a customer is
attribute_id integer not null,

little more intelligence to such simple also an object, being for example, a part of the "Part-
value varchar(256),
dictionaries. ners" dictionary. You can retrieve all attributes of
“Order” type with the following query:
select o.OID, constraint attributes_pk primary key (OID, attribute_id));
Storing of more complex
objects o.Name as Number,
o.Description as Comment,
connected 1:M with OBJECTS by OID. There is also a table
ord.customer,
It is clear that some objects in real
databases are more complex than create table class_attributes (
those which can be stored in the ord.sum_total OID TOID, /*here is a link to a description-object of the type */
OBJECTS table. The method for stor- from Objects o, Orders ord attribute_id integer not null,
ing them depends on the application where o.OID = ord.OID and ord.OID = :id attribute_name varchar(32),
attribute_type integer,
domain and the object’s internal As you see, everything is simple and usual. You could
constraint class_attributes_pk primary key (OID,attribute_id))
structure. Let’s look at three well- also create a view “orders_view”, and make everything
known methods of object-relational look as it always did. ?
mapping. which describes type metadata – an attribute set (their names and types) for each
If an order has a “lines” section, and a real-world order object type.
Method 1. The objects are stored just should definitely have such a section, we can create sep-
as in a standard relational database, arate table for it, call it e.g. “order_lines”, and relate it All attributes for particular object where the OID is known are retrieved by the query:
with the type attributes mapped to
select attribute_id, value
with the “Orders” table by a relation 1:M.
create table order_lines (
table attributes. For example, docu-
from attributes
id integer not null primary key,
ment objects of the “Order” type
where OID = :oid
object_id TOID, /* reference to order object – 1:M relation */
with such attributes as "order num-
ber," "comments," "customer," and
"order amount" are stored in the item_id TOID, /* reference to ordered item */ or, with names of attributes
table Orders amount numeric(15,4), select a.attribute_id, ca.attribute_name a.value
cost numeric(15,4), from attributes a, class_attributes ca, objects o
create table Orders (
sum numeric(15,4) computed by (amount*cost)) where a.OID = :oid and
OID TOID primary key,
a.OID = o.OID and
customer TOID,
o.ClassId = ca.OID and
One very important advantage of this storage method is that it allows you to
sum_total NUMERIC(15,2)),
work with object sets as you would with normal relational tables (which they
actually are). All the advantages of the relational approach are present. a.attribute_id = ca.attribute_id

In the context of this method, you database, a significant benefit ben- Method 3. Everything is stored in utes, it would be better to to use method 1, since it is the closest to
can also emulate a relational stan- efit since, with this approach the BLOB, and one of the persistent for- the standard relational model and you will retain all the power of
dard. Instead of selecting object number of tables would not mats is applied – a custom format, SQL. If, on the other hand, data processing is not particularly com-
attributes in several records (one increase, no matter how many dif- or, for example, dfm (VCL stream- plex and data sets are not too large and/or you need a simple
attribute per a record) you can get ferent types were stored in a data- ing) from the Borland VCL, or XML, database structure, then it makes sense to use method 2. If a data-
all attributes in a single record by base. or anything you like. There is noth- base is used only as an object storage, and all operations are per-
joining or by using subqueries: ing to comment on here. The advan- formed with run-time instances of objects without using native data-
tages are obvious: object retrieval base tools like SQL (except work with attributes stored in the
select o.OID, logic is simple and fast; no extra OBJECTS table), the third method would probably be the best
o.Name as Number, database structures are necessary choice due to the speed and ease of implementation.
o.Description as Comment,
– just a single BLOB field; you can
To be continued…
a1.value as customer,
store any custom objects, including
absolutely unstructured objects
a2.value as sum_total (such as MS Word documents,
from OBJECTS o
HTML pages, etc). The disadvan-
left join attributes a1 on a1.OID = o.OID and
tage is also obvious: there is nothing
FastReport 3 - new generation

relational about this approach and
a1.attribute_id = 1
of the reporting tools.

you would have to perform all data
left join attributes a2 on a2.OID = o.OID and processing outside of the database,
a2.attribute_id = 2
using the database only for object
where o.OID = :id;

storage.
Editor’s note: Even more impor- Visual report designer with rulers, guides and
zooming, wizards for base type reports, export
Clearly, the more attributes object The main disadvatage is that this tant, it is impossible to query the
filters to html, tiff, bmp, jpg, xls, pdf, Dot matrix
has, the slower the loading process method is so different from the stan- database for objects with certain
reports support, support most popular DB-engines.
will be, since each attribute requires dard relational model that many attributes, as the attribute data is
an additional join in the query. standard techniques used on rela- stored in the BLOB.
Full WYSIWYG, text rotation 0..360 degrees,
tional databases cannot be applied.
memo object supports simple html-tags

Returning to the example with the The speed of object retrieval is sig- It was not difficult to come to the fol-
“Order” document, we see that the
(font color, b, i, u, sub, sup), improved stretching
nificantly lower than the previous lowing conclusions (well I know –
"order number" and "comments" everyone already knows them ?).
(StretchMode, ShiftMode properties),
method 1 and it decreases also
attributes are still stored in the All the three methods described are
Access to DB fields, styles, text flow, URLs, Anchors.
with the growth of the attribute
OBJECTS table, but "customer" and count of a type. It is not very suit- undoubtedly useful and have a right
"order amount" are stored in two to live. Moreover, it sometimes
able for working with objects using
separate records of the “attributes” makes sense to use all three of them
SQL from inside the database, in
table. This approach is described by in a single application. But when
stored procedures, for example. choosing among these three meth-
Anatoliy Tentser in article [5]. Its There is also a certain level of data
advantages are rather important: a ods, take into account the listed
redundancy because three fields advantages and disadvantages of
standardized method of retrieving (“OID,” “attribute_id,” and
and storing object attributes; ease each method. If your application
“value”) are applicable to every involves massive relational data
http://www.fast-report.com/
of extending and changing a type; attribute instead of just the one field
ease in implementing object inheri- processing like searching or group-
described in Method 1. ing, or it is likely to be added in the
tance; very simple structure for the
future, using a certain object attrib-

Replicating and synchronizing
Replicating and synchronizing Author:

Jonathan Neve
InterBase/Firebird jonathan@microtec.fr
www.microtec.fr
databases using CopyCat
PART 1 : Data logging Two-way replication
BASICS OF DATABASE Before anything can be replicated, One obvious difficulty involved in two-way replication is how to
all changes to each database must avoid changes that have been replicated to one database from
REPLICATION of course be logged. CopyCat cre- replicating back to the original database. Since all the changes to
A replicator is a tool for keeping ates a log table and triggers for the database are logged, the changes made by the replicator are
several databases synchronized each table that is to be replicated. also logged, and will therefore bounce back and forth between the
(either wholly or in part), on a con- These triggers insert into the log source and the target databases. How can this problem be avoid-
tinuous basis. Such a tool can have table all the information concerning ed?
many applications: it can allow for the record that was changed (table
name, primary key value(s), etc). The solution CopyCat uses is related to the sub-node management
off-line, local data editing, with a
system described above. Each sub-node is assigned a name, which
punctual synchronization upon
is used when the sub-node logs in to the database. When a sub-
reconnection to main database; it Multi-node replication node replicates its own changes to its parent, the replication triggers
can also be used over a slow con-
Replicating to and from several log the change for all the node's sub-nodes except the current user.
nection, as an alternative to a direct
nodes adds another degree of com- replicates its changes. Each node's Thus, only sub-nodes other than the originator receive the change.
connection to the central database;
plexity. Every change that is made list of sub-nodes is stored in a table
another use would be to make an Conversely, CopyCat logs in to the nodes local database using the
to one database must be applied to in the node's database. (Incidental-
automatic, off-site, incremental node name of its parent as user name. Thus, any change made to
all the others. Furthermore, when ly, the parent node is configured in
backup, by using simple one-way the local database during replication will be logged for all sub-
one database applies this change, it the replicator software itself rather
replication. nodes other than the node's parent, and any change made to the
must indicate to the originating than in the database, and therefore,
database that the change has been no software is needed on nodes parent node will be logged to other sub-nodes, but not to the origi-
Creating a replicator can be quite
applied, without in any way hinder- having no parent – which allows nating node itself.
tricky. Let's examine some of the
key design issues involved in data- ing the other databases from repli- these servers to run Linux, or any
base replication, and explain how cating the same change, either other OS supported by Primary key synchronization
these issues are implemented in before, simultaneously, or after. Interbase/FireBird). One problem with replication is that since data is edited off-line,
Microtec CopyCat, a set of Delphi / there is no centralized way to ensure that the value of a field
In CopyCat, these problems are When a data change occurs in a
C++Builder components for per- remains unique. One common answer to this problem is to use
solved using a simple and flexible replicated table, one line is gener-
forming replication between Inter- GUID values. This is a good solution if you're implementing a new
system. Each replication node can ated per sub-node. Thus, each sub-
base and FireBird databases. database (except that GUID fields are rather large, and therefore,
be have one parent node, and sev- node fetches only the log lines that
eral sub-nodes towards which it concern it. not very well suited for a primary key field), but if you have an exist-

ing database that needs replication, sion of the record: this is a conflict. Difficult database structures
it would be very difficult to replace There are certain database archi-
all primary or unique key fields by CopyCat automatically detects con-
flicts, logs them to a dedicated tectures that are difficult to repli-
GUID values. cate. Consider for example a
table, and disables replication of
Since GUID fields are, in many that record in either direction until “STOCK” table, containing one line
cases, not feasible, CopyCat imple- the conflict is resolved. The conflicts per product, and a field holding the
ments another solution. CopyCat table holds the user names of both current stock value. Suppose that
allows you to define for each pri- nodes involved in the conflict, as for a certain product, the current
mary key field (as well as up to well as a field called stock value being 45, node A adds
three other fields for which unicity is “CHOSEN_USER”. In order to 1 item to stock, setting the stock
to be maintained) a synchroniza- solve the conflict, the user simply value to 46. Simultaneously, node
tion method. In most cases, this will has to put in this field the name of B, adds 2 items to stock thereby set-
be either a generator, or a stored the node which has the correct ver- ting the current stock value to 47.
procedure call, though it could be sion of the record, and automatical- How can such a table then be repli-
any valid SQL clause. Upon repli- ly, upon the next replication, the cated? Neither A nor B have the
cation, this SQL statement is called record will be replicated and the correct value for the field, since nei-
on the server side in order to calcu- conflict resolved. ther take into consideration the
late a unique key value, and the changes from the other node.
resulting value is then applied to the This system was carefully designed
to function correctly even in some of Most replicators would require such
local database. Only after the key an architecture to be altered.
values (if any) have been changed the complex scenarios that are pos-
sible with CopyCat. For instance, Instead of having one record hold
locally is the record replicated to the current stock value of product,
the server. the conflict may in reality be
between two nodes that are not there could be one line per change.
When replicating from the parent directly connected to each other: This would solve the problem. How-
node to the local node however, since CopyCat nodes only ever ever, restructuring large databases
this behaviour does not take place: communicate directly with their par- (and the end-user applications that
the primary key values on the serv- ent, there is no way to tell if another usually go with them) could be a
er are considered to be unique. node may not have a conflicting rather major task. CopyCat was
update for a certain record. Fur- specifically designed to avoid these
thermore, it's entirely possible that problems altogether, rather than
Conflict management require the database structure to be
two nodes (having the same parent)
Suppose a replication node and its should simultaneously attempt to changed.
parent both modify the same record replicate the same record to their
during the same time period. When To solve this kind of problem, Copy-
parent. By using a snapshot-type Cat introduces stored procedure
the replicator connects to its parent transaction, and careful ordering of
to replicate its changes, it has no “replication”. That is, a mechanism
the replication process, these issues for logging stored procedure calls,
way of telling which of the two are handled transparently.
nodes has the most up-to-date ver- and replicating them to other
nodes. When dealing with an

unreplicatable table (like in the example above) one solution is to make a

stored procedure which can be called for updating the table, and using
stored procedure replication in order to replicate each of these calls. Thus,
continuing the example above, instead of replicating the values of the
STOCK table, the nodes would replicate only their changes to these values,
thereby correctly synchronizing and merging the changes to the STOCK
table.
PART 2:
GETTING STARTED WITH COPYCAT
Copycat is available in two distinct forms :
1. As a set of Delphi / C++ Builder components
2. As a standalone Win32 replication tool
We will now present each one of these products.
Below is a concise guide for getting started with the CopyCat component suite:
Download and install the evaluation components from

http://www.microtec.fr/copycat
1. CopyCat Delphi / C++ Builder Component Set

Prepare databases
CopyCat can be obtained as a set of Delphi / C++ Builder components,
1) Open Delphi, and compile the data provider package(s) for the DAC
enabling you to use replication features painlessly and in many different sit-
that you plan to use (currently IBX and FIBPlus are supported). These are
uations, and sparing you the tedious task of writing and testing custom data-
components for interfacing between the CopyCat components and the
base synchronization solutions.
underlying data-access components.
Using these components, replication or synchronization facilities can be
2) Open and run the “Configuration” example project (requires the IBX
seamlessly built into existing solutions. As an example, we have used the
provider).
CopyCat components in an application used by our development team on
their laptops to keep track of programming tasks. When the developers 3) On the “General” tab, fill in the connection parameters, and press “Con-
need to go on-site to visit a customer, the application runs in local mode on nect to Database”.
the laptop. When they return and connect up to the company network, any
changes they have made on-site are automatically synchronized with the 4) On the “Users” tab, provide the list of sub-nodes for the current data-
main Interbase database running on a Linux server. base.
Many more applications are possible since the CopyCat components are 5) On the “Tables” tab, for each table that you want to replicate, set a pri-
very flexible and allow for synchronization of even a single table! ority (relative to the other tables), and double-click on the “PKn generator”

columns to (optionally) fill the primary key synchronization method. Once Features include :
these settings have been made, set the “Created” field to 'Y', so as to gen- • Easy to use installer
erate the meta-data.
• Independent server Administrator tool (CTAdmin)
6) On the “Procedures” tab, set “Created” to 'Y' for each procedure that
you want to replicate, after having set a priority. • Configuration wizard for setting up links to master / slave databases
7) Apply all the generated SQL to all databases that should be replicated. • Robust replication engine based on Microtec CopyCat
8) For each database to be replicated, set the list of sub-nodes (in the • Fault-tolerant connection mechanism allowing for automatic resumption of lost database connections
RPL$USERS table).
• Simple & intuitive control panel
Replicate •Automatic email notification on certain events (conflicts, PK violations, etc.)
1. In Delphi, open the “Replicator” example project.
Visit the CopyTiger homepage to download a time-limited trial version.
2. Drop a provider component on the form, and hook it up to the TCcRepli-
http://www.microtec.fr/copycat/ct
cator's DBProvider property.
3. Setup the LocalDB and RemoteDB properties of the TCcReplicator with SUMMARY
the connection parameters for the local and remote databases.
In today's connected world, database replication and synchronization are topics of great interest among
4. Fill in the user name of the local and all industry professionals. With the advent of Microtec CopyCat, the Interbase / Firebird community is
remote nodes, as well as the SYSDBA obtaining a two-fold benefit :
user name and password (needed for 1. By encapsulating all the functionality of a replicator into Delphi components, CopyCat makes it easier
primary key synchronization). than ever to integrate replication and synchronization facilities into custom applications,
5. Compile and run the example. 2. By providing a standalone tool for the replication of Interbase / Firebird databases, Microtec is respond-
6. Press the “Replicate now” button. ing to another great need in the community – that of having a powerful and easy-to-use replication tool,
and one that can be connected to an existing database without disrupting it's current structure.
2. CopyTiger : CopyCat is being actively developed by Microtec and many new features are being worked on such as
the CopyCat Win32 support for replicating between heterogeneous database types (PostgreSQL, Oracle, MSSQL, MySQL,
standalone replication tool NexusDB, ...) as well as a Linux / Kylix version for the components and the standalone tool.
You can find more information about CopyCat at:

For those who have replication or synchronization needs but do not have http://www.microtec.fr/copycat, or by contacting us at copycat@microtec.fr.
the time or resources to develop their own integrated system, Microtec pro- As as promotional operation, Microtec is offering a 20% discount to IBDeveloper readers for the pur-
poses a standalone database replication tool based on the CopyCat tech- chase of any CopyCat product, for the first 100 licenses sold!
nology called COPYTIGER.
Click on the BUY NOW icon on our web site and enter the following
ShareIt coupon code:
ibd-852-150

Using IBAnalyst
Author: Dmitri Kouzmenko
Understanding Your kdv@ib-aid.com
Database with IBAnalyst

I have been working with InterBase Thankfully, prior to working with database statistics from time to time. indices. It has in-place warnings
since 1994. Back then, most data- InterBase, I was interested in differ- which are available during the
bases were small, and did not ent data structures, how they are Let's take a look at the capabilities browsing of statistics; it also
require any tuning. Of course, there stored and what algorithms they of IBAnalyst. IBAnalyst can take includes hint comments and recom-
were occasions when I had to use. This helped me to interpret the statistics from gstat or the Services mendation reports.
change ibconfig on a server, and output of gstat. At that time I decid- API and compile them into a report
giving you complete information The summary shown in Figure 1
reconfigured hardware or the OS, ed to write a tool that could analyze
about the database, its tables and provides general information about
but that was almost all I could do to gstat output to help in tuning the
tune performance. database or at least to identify the Figure 1 Summary of database statistics
cause of performance problems.
Four years ago, our company
began to provide technical support Long story, but the outcome was
and training of InterBase users. that IBAnalyst was created. In spite
Working with many production of my experience it still allows me to
databases also taught me a lot of find very interesting things or per-
different things. However, most of formance issues in different data-
what I learned concerned applica- bases.
tions – transaction parameters
usage, optimizing queries and Real systems have runtime perform-
result sets. ance that fluctuates like a wave. The
amplitude of such ‘waves’ can be
Of course, I had known for quite low or high, so you can see how
some time about gstat – the tool that performance differs from day to
gives database statistics informa- day (or hour by hour). Actual per-
tion. If you have ever looked in formance depends on many factors,
gstat output or read opguide.pdf including the design of the applica-
about it, you would know that the tion, server configuration, transac-
statistical output looks like just a tion concurrency, version garbage
bunch of numbers and nothing else. in the database and so on. To find
Ok, you can discover fragmentation out what is happening in a data-
information for a particular table or base (both positive and negative
index, but what other useful infor- aspects of performance) you should
mation can be obtained? at the very least take a peek at the

Using IBAnalyst
your database. The warnings or stored for unknown time by the We will examine the next 8 rows as if there any other active transactions between oldest active and
comments shown are based on operating system in its file cache. a group, as they all display aspects next transaction, but there can be such transactions. Usually, if
carefully gathered knowledge InterBase 6 creates databases with of the transaction state of the data- the oldest active gets stuck, there are two possible causes: a) that
obtained from a large number of Forced Writes OFF. base: some transaction is active for a long time or b) the application
real-world production databases. design allows transactions to run for a long time. Both causes pre-
Why is this marked in red on the • The Oldest transaction is the vent garbage collection and consume server resources.
Note: All figures in this article con- IBAnalyst report? The answer is sim- oldest non-committed transaction.
tain gstat statistics which were taken ple – using asynchronous writes can Any lower transaction numbers are • Transactions per day – this is calculated from Next transaction,
from a real-world production data- cause database corruption in cases for committed transactions, and no divided by the number of days passed since the creation of the
base (with the permission of its own- of power, OS or server failure. record versions are available for database to the point where the statistics are retrieved. This can be
ers). such transactions. Transaction num- correct only for production databases, or for databases that are
Tip: It is interesting that modern bers higher than the oldest transac- periodically restored from backup, causing transaction numbering
As I said before, raw database sta- HDD interfaces (ATA, SATA, SCSI) tion are for transactions that can be to be reset.
tistics look cryptic and are hard to do not show any major difference in in any state. This is also called the
interpret. IBAnalyst highlights any performance with Forced Write set As you have already learned, if there any warnings, they are shown
"oldest interesting transaction",
potential problems clearly in yellow On or Off1 . as colored lines, with clear, descriptive hints on how to fix or prevent
because it freezes when a transac-
or red and the detail of the problem the problem.
tion is ended with rollback, and
can be read simply by placing the Next on the report is the mysterious
server can not undo its changes at It should be noted that database statistics are not always useful. Sta-
cursor over the relevant entry and "sweep interval". If positive, it sets
that moment. tistics that are gathered during work and housekeeping operations
reading the hint that is displayed. the size of the gap between the old-
can be meaningless.
est2 and oldest snapshot transaction, • The Oldest snapshot – the old-
What can we discover from the at which the engine is alerted to the est active (i.e., not yet committed) Do not gather statistics if you:
above figure? This is a dialect 3 need to start an automatic garbage transaction that existed at the start
database with a page size of 4096 collection. On some systems, hitting of the transaction that is currently • Just restored your database
bytes. Six to eight years ago devel- this threshold will cause a "sudden the Oldest Active transaction.
opers used a default page size of • Performed backup (gbak –b db.gdb) without the –g switch
performance loss" effect, and as a Indicates lowest snapshot transac-
1024 bytes, but in more recent result it is sometimes recommended tion number that is interested in • Recently performed a manual sweep (gfix –sweep)
times such a small page size could record versions.
that the sweep interval be set to 0
lead to many performance prob- Statistics you get on such occasions will be practically useless. It is
(disabling automatic sweeping
lems. Since this database has a • The Oldest active3 – the oldest also correct that during normal work there can be times where data-
entirely). Here, the sweep interval is
page size of 4k, there is no warning currently active transaction . base is in perfect state, for example, when applications make less
marked yellow, because the value database load than usual (users are at lunch or it's a quiet time in the
displayed, as this page size is okay.
of the sweep gap is negative, • The Next transaction – the business day).
Next, we can see that the Forced which it can be in InterBase 6.0, transaction number that will be
Firebird and Yaffil statistics but not in 1- InterBase 7.5 and Firebird 1.5 have special features that can periodi-
Write parameter is set to OFF and assigned to new transaction
cally flush unsaved pages if Forced Writes is Off.
marked red. InterBase 4.x and 5.x InterBase 7.x. When the value of the
sweep gap is greater than the • Active transactions – IBAnalyst 2 - Oldest transaction is the same Oldest interesting transaction, mentioned
by default had this parameter ON.
will give a warning if the oldest everywhere. Gstat output does not show this transaction as "interesting".
Forced Writes itself is a write cache sweep interval (if the sweep interval
is not 0) the report entry for the active transaction number is 3 - Really it is the oldest transaction that was active when the oldest trans-
method: when ON it writes
30% lower than the daily transac- action currently active started. This is because only new transaction start
changed data immediately to disk, sweep interval will be marked red
tions count. The statistics do not tell moves "Oldest active" forward. In the production systems with regular
but OFF means that writes will be with an appropriate hint. transactions it can be considered as currently oldest active transaction.

Using IBAnalyst
How to seize the moment multi-user conflicts can be tested ly, and what the table size is in the main reason for performance degradation. For some tables
when there is something with two or three concurrently run- megabytes. Most of these warnings there can be a lot of versions that are still "in use". The server can
wrong with the database? ning applications. However, with are customizable. not decide whether they really are in use, because active transac-
Your applications can be designed larger numbers of users, garbage tions potentially need any of these versions. Accordingly, the serv-
collection problems can arise. Such In this database example there are
so well that they will always work er does not consider these versions as garbage, and it takes longer
potential problems can be caught if several activities. First of all, yellow
with transactions and data correct- and longer to construct a correct record from the big chain of ver-
you gather database statistics at the color in the VerLen column warns
ly, not making sweep gaps, not sions whenever a transaction happens to read it. In Figure 2 you
correct moments. that space taken by record versions
accumulating a lot of active trans- can see two tables that have the versions count three times higher
is larger than that occupied by the
actions, not keeping long running than the record count. Using this information you can also check
records themselves. This can result
snapshots and so on. Usually it does Table information whether the fact that your applications are updating these tables so
from updating a lot of fields in a
not happen (sorry, colleagues). frequently is by design, or because of coding mistake or an appli-
Let's take look at another sample record or by bulk deletes. See the
cation design flaw.
The most common reason is that output from IBAnalyst. rows in which MaxVers column is
developers test their applications marked in blue. This shows that only
The IBAnalyst Table statistics view is one version per record is stored and The Index view
running only two or three simultane- also very useful. It can show which
ous users. When the application is consequently this is caused by bulk IIndices are used by database engine to enforce primary key, for-
tables have a lot of record versions, deletes. So, both indications can tell
then used in a production environ- eign key and unique constraints. They also speed up the retrieval of
where a large number of that this is really "bulk deletes", and
ment with fifteen or more simultane- data. Unique indices are the best for retrieving data, but the level of
updates/deletes were made, frag- number in Versions column is close
ous users, the database can behave benefit from non-unique indices depends on the diversity of the
mented tables, with fragmentation to number of deleted records.
unpredictably. Of course, multi-user indexed data.
caused by update/delete or by
mode can work okay because most blobs, and so on. You can see which Long-living active transactions pre- For example, look at ADDR_ADDRESS_IDX6. First of all, the index
Figure 2 Table statistics tables are being updated frequent- vent garbage collection, and this is
name itself tells that it was created manually. If statistics
were taken by the Services API with metadata info, you
can see what columns are indexed (in IBAnalyst 1.83
and greater). For the index under examination you can
see that it has 34999 keys, TotalDup is 34995 and
MaxDup is 25056. Both duplicate columns are marked
in red. This is because there are only 4 unique key val-
ues amongst all the keys in this index, as can be seen
from the Uniques column. Furthermore, the greatest
duplicate chain (key pointing to records with the same
column value) is 25056 – i.e. almost all keys store one
of four unique values. As a result, this index could:
• Reduce the speed of the restore process. Okay, thir-

ty-five thousand keys is not a big deal for modern data-
bases and hardware, but the impact should be noted
anyway.

Using IBAnalyst
• Slow down garbage collection. Indices with a low count of unique Reports
values can impede garbage collection by up to ten times in compar-
ison with a completely unique index. This problem has been solved There is no need to look through the entire report each time, spotting cell color and reading hints for new warnings.
in InterBase 7.1/7.5 and Firebird 2.0. More direct and detailed information can be had by using the Recommendations feature of IBAnalyst. Just load the
statistics and go to the Reports/View Recommendations menu. This report provides a step-by-step analysis, includ-
• Produce unnecessary page reads when the optimizer reads the ing more detailed descriptive warnings about forced writes, sweep interval, database activity, transaction state,
index. It depends on the value being searched in a particular query database page size, sweeping, transaction inventory pages, fragmented tables, tables with lot of record versions,
- searching by an index that has a larger value for MaxDup will be massive deletes/updates, deep indices, optimizer-unfriendly indices, useless indices and even empty tables. All of
slower. Searching by value that has fewer duplicate values will be this information and the accompanying suggestions are dynamically created based on the statistics being loaded.
faster, but only you know what data is stored in that indexed column.
As an example of the report output, let’s have a look at a report generated for the database statistics you saw
That is why IBAnalyst draws your attention to such indices, marking earlier in this article: "Overall size of transaction inventory pages (TIP) is big - 94 kilobytes or 23 pages.
them red and yellow, and including them in the Recommendations Read_committed transaction uses global TIP, but snapshot transactions make own copies of TIP in memory. Big TIP
report. Unfortunately most of the “bad" indices are automatically size can slowdown performance. Try to run sweep manually (gfix -sweep) to decrease TIP size. "
created to enforce foreign-key constraints. In some cases this prob-
lem can be solved by preventing, using triggers, deletes or updates Here is another quote from table/indices part of report: "Versioned tables count: 8. Large amount of record versions
of primary key in lookup tables. But if it is not possible to implement usually slowdown performance. If there are a lot of record versions in table, than garbage collection does not work,
such changes, IBAnalyst will show you "bad" indices on Foreign or records are not being read by any select statement. You can try select count(*) on that tables to enforce garbage
Keys every time you view statistics. collection, but this can take long time (if there are lot of versions and non-unique indices exist) and can be unsuccess-
ful if there is at least one transaction interested in these versions. Here
is the list of tables with version/record ratio greater than 3 :
Table Records Versions Rec/Vers size
CLIENTS_PR 3388 10944 92%
DICT_PRICE 30 1992 45%
DOCS 9 2225 64%
N_PART 13835 72594 83%
REGISTR_NC 241 4085 56%
SKL_NC 1640 7736 170%
STAT_QUICK 17649 85062 110%
UO_LOCK 283 8490 144%
Summary
IBAnalyst is an invaluable tool that assists a user in performing
detailed analysis of Firebird or InterBase database statistics and
identifying possible problems with a database in terms of perform-
ance, maintenance and how an application interacts with the data-
base. It takes cryptic database statistics and displays them in an
easy-to-understand, graphical manner and will automatically make
sensible suggestions about improving database performance and
easing database maintenance.

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Readers feedback
Comments to “Temporary tables”
Readers feedback
We received a lot of feedback emails for article “Working with temporary tables in InterBase 7.5” which was published in issue 1.
The one of them is impressed me and with permission of its respective owner I’d like to publish it.
Alexey Kovyazin, Chief Editor
One thing I'd like to ask you to because of the complexity of a and from then on it is transparent to dynamic, either re user/session
change is re temp tables. I suggest statement. If developers have a fall- users/applications that the data data isolation, or re data where the
you even create another myth box back method to reduce complexity inside is user/session-specific. Web structure is unknown in advance,
for it. It is the sentence in those cases, that's an advantage. applications would be a good but needs to be processed like DB
Much better than asking developers example. data with a fixed structure. That
'Surely, most often temporary to supply their own query plans. Firebird does not have them is
tables were necessary to those The argument reminds me a bit of plainly a lack of an important fea-
developers who had been working Also, 'serious' RDBMS like Informix MySQL reasoning when it comes to ture, in the same category as cross
with MS SQL before they started to had them at least 15 years ago, features their 'RDBMS' does/did DB operations (only through qli,
use InterBase/Firebird.' when MS's database expertise did not have. Transactions / Foreign which means 'unusable'). Both fea-
not go further than MS Access. Cer- Keys / Triggers etc were all bad tures could make Firebird much
This myth does not want to die. tainly those MS database develop- and unnecessary because they did
Temporary tables (TT) are not a more suitable as the basis for OLAP
ers who need TTs to be able to do not have them (officially: they slow systems, an area where Firebird is
means for underskilled DB kids, their job would not have managed down the whole system and intro-
who cannot write any complex SQL lacking considerably.
to deal with an Informix server if duce dependencies). Of course
statement. They are *the* means of complexity was their main problem. they do. To call a flat file system a Well, to be fair, Firebird developers
dealing with data that is *structural- RDBMS is obviously good for mar- are working on both topics.
ly* dynamic, but still needs to be The two preceding paragraphs keting. Now they are putting in all
processed like data with a fixed were about local temp tables. There those essential database features
structure. So all OLAP systems are also global temporary tables Volker Rehn
which were declared crap by them
based on RDBMS are heavily (GTTs). A point in favour of GTTs is not long ago. And you can see volker.rehn@bigpond.com
dependent on this feature - or, if it is that one can give users their own already that their marketing now
not present, it requires a whole lot workspace within a database with- tells us how important those fea-
of unnecessary and complicated out having to setup some clumsy tures are. I bet we won't see MySQL
workarounds. I'm talking out of administration for it. No user id benchmarks for a while ;-) .
experience. scattered around in tables where
they don't belong, no explicit We should not make a similar mis-
Then, there are situations where the cleanup, no demanding role man- take. Temporary tables are impor-
optimizer simply loses the plot agement. Just setup tables as temp, tant if the nature of a system is

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Miscellaneous
Invitations, Subscription,
Donations, Sponsorships
Subscribe now!
This is the first, official To receive future issues
book on Firebird — the
free, independent,
notifications send email to
open source relational subscribe@ibdeveloper.com
database server that
emerged in 2000.
Based on the actual
Firebird Project, this
book will provide you
all you need to know
about Firebird data-
base development, like
installation, multi-plat-
form configuration, Magazine
SQL language, inter- CD
faces, and mainte-
nance.
Donations

The InterBase and Firebird Developer Magazine, Issue 2, 2005

Hochgeladen von

Dokumentinformationen

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

The InterBase and Firebird Developer Magazine, Issue 2, 2005

Hochgeladen von

Copyright:

Verfügbare Formate

The InterBase and Firebird Developer Magazine 2005 ISSUE 2 Contents

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 2

Rock around the blog Editor’s note

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 3

ed to InterBase and Firebird and ward for your feedback!

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 4

On Silly Questions and Idiotic Outcomes Off-topic questions are another

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 5

News & Events

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 6

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 7

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 8

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 9

Deadlock scans:0, Deadlocks:0, Scan interval:10

Hash slots: 101, Hash lengths (min/avg/max):3/ 15/ 30

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 10

Author: Dmitri Kouzmenko

Author: Alexey Kovyazin

From the point view of the database

Unlike the mechanism used for han-

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 11

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 12

Author: Vlad Horsun

Author: Alexey Kovyazin

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 13

loaded “ready for use” backup.

gbak –c –user SYSDBA –pass masterkey C:\TEST\testspbackup.gbk C:\TEST\testsp1.ib

gbak –c –user SYSDBA –pass masterkey C:\TEST\testspbackup.gbk C:\TEST\testsp2.ib

gbak –c –user SYSDBA –pass masterkey C:\TEST\testspbackup.gbk C:\TEST\testsp3.ib

gbak –c –user SYSDBA –pass masterkey C:\TEST\testspbackup.gbk C:\TEST\testsp4.ib

SQL test scripts

For convenience, the important commands are clarified with comments:

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 14

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 15

How to perform the test

Delta memory = 214016

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 16

SQL> SET TRANSACTION NO SAVEPOINT;

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 17

Inside the UNDO log

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 18

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 19

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 20

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 21

Author: Vladimir Kotlyarevsky

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 22

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 23

Creation_date timestamp default CURRENT_TIMESTAMP,

create table OBJECTS(

OID TOID primary key, select OID, Name, Description

ClassId TOID, from OBJECTS

where ClassId = :LookupId

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 24

and Name = :LookupName)

attribute_id integer not null,

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 25

FastReport 3 - new generation

of the reporting tools.

where o.OID = :id;

memo object supports simple html-tags

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 26

Replicating and synchronizing Author:

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 27

©2005 www.ibdeveloper.com All right reserved www.ibdeveloper.com 28