One advantage of the Citadel system over other, less tightly integrated groupware packages is
that it can defer potentially resource-intensive operations until off-hours,
improving interactive performance during the hours when users are online
and active. This is used primarily to perform "delete" operations in batch mode.
This article explains the technological underpinnings, and is mainly intended for developers.
Data model
In order to understand what's going on under the covers, there are several things you need to
know about Citadel's data model:
Any given message may exist in one or more rooms. The second and subsequent copies are not
actual copies, but simply additional references to the same message number, similar to hard
links in a POSIX filesystem.
We keep a reference count of how many rooms are holding any particular message. When the
message is originally saved to a room, the count is set to 1. When the message is copied to an
additional room, the count is incremented. When the message is deleted from a room, the count
is decremented.
Rooms which exist only in the namespace of a single user (in other words, a private mailbox)
actually have the account's user number prepended to the name. For example, if an account
has user number 12345, its inbox is actually named "0000012345.Mail". This prefix is hidden
from the client.
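The naming scheme can be sketched in C as follows. This is an illustration only: make_mailbox_name() and display_name() are hypothetical helpers, not Citadel functions. The only details taken from the example above are the ten-digit, zero-padded user number and the period that separates it from the room name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define PREFIX_DIGITS 10   /* "0000012345" is ten digits wide */

/* Build the internal name of a private room,
 * e.g. (12345, "Mail") becomes "0000012345.Mail".
 * Returns 0 on success, -1 if the result does not fit in buf. */
int make_mailbox_name(char *buf, size_t buflen, long usernum, const char *roomname)
{
    assert(buf != NULL && roomname != NULL);
    int n = snprintf(buf, buflen, "%0*ld.%s", PREFIX_DIGITS, usernum, roomname);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Return the name a client would see: skip the prefix if one is present.
 * Names without a ten-digit prefix are returned unchanged. */
const char *display_name(const char *internal)
{
    assert(internal != NULL);
    for (int i = 0; i < PREFIX_DIGITS; i++) {
        /* A short name hits its terminating NUL here and returns early. */
        if (!isdigit((unsigned char)internal[i])) return internal;
    }
    return (internal[PREFIX_DIGITS] == '.') ? internal + PREFIX_DIGITS + 1 : internal;
}
```

Because the prefix has a fixed width, stripping it for display is a constant-time pointer adjustment and never requires a lookup of the owning account.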
Synchronous operations
Here are some activities which are performed synchronously -- in other words, the user must
wait while they are completed.
Saving a new message, deleting a message, copying a message: we have to adjust the
reference count. This is performed using the AdjRefCount() function, which accepts a
message number and a delta (increment or decrement). In order to make this operation take
as little time as possible, all we do is write these values to the end of a "reference count
adjustments" record that is kept internally.
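To show why this is cheap, here is a minimal sketch of an append-only adjustment queue. It is not the actual AdjRefCount() code: the record layout, the function name queue_refcount_adjustment(), and the use of a stdio FILE as the queue are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical queue record: one pending adjustment. */
struct refcount_adjustment {
    long msgnum;   /* message whose count should change */
    int  delta;    /* +1 for a new reference, -1 for a removed one */
};

/* Queue an adjustment by appending it to an already-open queue file.
 * The message's metadata is neither read nor updated here; that work
 * is deferred. Returns 0 on success, -1 on a seek or write error. */
int queue_refcount_adjustment(FILE *queue, long msgnum, int delta)
{
    assert(queue != NULL);
    struct refcount_adjustment rec;
    memset(&rec, 0, sizeof rec);   /* keep padding bytes deterministic on disk */
    rec.msgnum = msgnum;
    rec.delta = delta;
    if (fseek(queue, 0L, SEEK_END) != 0) return -1;
    return (fwrite(&rec, sizeof rec, 1, queue) == 1) ? 0 : -1;
}
```

The interactive cost is a single small sequential write, no matter how many rooms reference the message.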
Deleting a room: in the past, this was a potentially time-consuming operation, because we
had to wait for the system to delete every message in the room. So instead, we rename
the room, prepending a bogus namespace (9999999999) for a user who does not exist. We also
insert a timestamp and sequence number into the name to ensure that we don't accidentally
create a name which already exists. Since the room now only exists in the namespace of a user
who does not exist, it appears to have been deleted, and we only consumed a few milliseconds
of server time. The real work will happen later...
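The rename trick might look something like the sketch below. The exact layout of the new name is an assumption: the text above says only that a bogus namespace, a timestamp, and a sequence number are added, so the "namespace.timestamp.sequence.oldname" format and the function name mark_room_deleted() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BOGUS_NAMESPACE "9999999999"   /* user number that belongs to nobody */

/* Rename a room into the bogus namespace, in place.
 * The timestamp and the caller-supplied sequence number keep two rooms
 * with the same old name from colliding after deletion.
 * Returns 0 on success, -1 if the new name does not fit (name is untouched). */
int mark_room_deleted(char *name, size_t namelen, time_t now, unsigned seq)
{
    assert(name != NULL);
    char newname[256];
    int n = snprintf(newname, sizeof newname, "%s.%ld.%u.%s",
                     BOGUS_NAMESPACE, (long)now, seq, name);
    if (n < 0 || (size_t)n >= sizeof newname || (size_t)n >= namelen) return -1;
    memcpy(name, newname, (size_t)n + 1);   /* copy including the NUL */
    return 0;
}
```

Note that nothing here touches the room's messages. The cost of a delete is one string format and one record update, regardless of how large the room is.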
Asynchronous operations (or, what happens during a purger run)
Here's where the magic happens. We run a nightly batch job, affectionately known as The
Dreaded Auto-Purger, which is responsible for cleaning everything up. It does a lot of
work in a very specific order, to ensure that it doesn't have to run twice to catch everything.
The code can all be found in modules/expire/serv_expire.c. Here's how it works.
Purge users. If the system is configured to automatically delete inactive accounts,
the user file is scanned, and the date of last login is calculated. Accounts which have not
been accessed in the configured amount of time are deleted. If the system is using an external
source of authentication (such as a PAM database), we instead delete accounts which no longer
exist on the host system. Either way, you will note that we only delete the account itself --
we are not yet deleting rooms or messages which belong to the account.
Purge messages. For rooms which are configured to automatically expire messages older than
a certain age, and for rooms which are configured to keep no more than a specific maximum number
of messages online, we go into those rooms and delete the old messages. This is done in the
same way as an interactive delete: the message pointer is removed and its reference count is
decremented.
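The two expiry rules (by age and by message count) can be illustrated with a small sketch. The types and names below (expire_policy, room_msg, count_expired) are hypothetical and are not taken from serv_expire.c. The sketch also assumes that the room's message list is ordered oldest first and that ages are configured in days.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical expire policy for one room. */
enum expire_mode { EXPIRE_NEVER, EXPIRE_BY_AGE, EXPIRE_BY_COUNT };

struct expire_policy {
    enum expire_mode mode;
    long value;            /* maximum age in days, or maximum message count */
};

struct room_msg {
    long   msgnum;
    time_t saved;          /* when the message was saved */
};

/* msgs[] is ordered oldest first. Returns how many of the leading (oldest)
 * entries should be removed from the room under the given policy. */
size_t count_expired(const struct room_msg *msgs, size_t n,
                     struct expire_policy policy, time_t now)
{
    assert(msgs != NULL || n == 0);
    size_t expired = 0;
    switch (policy.mode) {
    case EXPIRE_BY_AGE: {
        time_t cutoff = now - (time_t)policy.value * 86400;
        while (expired < n && msgs[expired].saved < cutoff) expired++;
        break;
    }
    case EXPIRE_BY_COUNT:
        if (policy.value >= 0 && n > (size_t)policy.value)
            expired = n - (size_t)policy.value;
        break;
    case EXPIRE_NEVER:
        break;
    }
    return expired;
}
```

For each expired entry, the caller would then remove the pointer from the room and queue a reference count adjustment of -1, exactly as for an interactive delete.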
Purge rooms. The system may be configured to automatically expire rooms which
have not been accessed in a certain amount of time; if so, these rooms are deleted now.
We also delete any rooms which exist in a namespace belonging to a user who does not exist.
The latter condition conveniently removes rooms which were deleted, or which belonged to a
user who was deleted. Before deleting a room, we of course delete every message in the room
(again with the same operation: remove the pointer, decrement the reference count).
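The "namespace belonging to a user who does not exist" test is what ties the earlier pieces together, so here is a rough sketch of it. The function names and the plain array of existing user numbers are assumptions for illustration; the real purger works against the user and room databases.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PREFIX_DIGITS 10

/* Extract the owning user number from a name of the form "NNNNNNNNNN.name".
 * Returns -1 for rooms that are not in any user's namespace. */
long long room_owner(const char *roomname)
{
    assert(roomname != NULL);
    for (int i = 0; i < PREFIX_DIGITS; i++) {
        if (!isdigit((unsigned char)roomname[i])) return -1;
    }
    if (roomname[PREFIX_DIGITS] != '.') return -1;
    return strtoll(roomname, NULL, 10);   /* parsing stops at the '.' */
}

/* A room is orphaned when its namespace names a user number that is
 * not in the list of existing accounts. Public rooms are never orphaned. */
bool room_is_orphaned(const char *roomname, const long long *users, size_t nusers)
{
    long long owner = room_owner(roomname);
    if (owner < 0) return false;
    for (size_t i = 0; i < nusers; i++) {
        if (users[i] == owner) return false;
    }
    return true;
}
```

A room renamed into the 9999999999 namespace and a mailbox whose owner was purged in the first step both fail the same check, which is why no separate "deleted" flag is needed.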
Purge visits. The "visits" table contains records which describe the relationship
between one user and one room. It handles things like access control, seen/unseen message
flags, and other flags. At this time we delete any record which refers to a user or
room which no longer exists.
Purge Use Table. The "use table" keeps track of the Message IDs of messages which
recently arrived over a network, including a Citadel network, RSS aggregation, or POP3
aggregation. In the latter two cases, these records are refreshed every time a message
re-appears. We keep this data around in order to keep the same message from being imported
multiple times. At this time, we delete any records which are older than a certain age.
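As an illustration of the age-based cleanup, the sketch below compacts an in-memory table in place. The use_record layout and the purge_use_table() name are hypothetical; the real use table is a database, not an array.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical use table entry. */
struct use_record {
    char   msgid[64];    /* Message ID as seen on the network */
    time_t last_seen;    /* refreshed whenever the message re-appears */
};

/* Drop every record older than max_age seconds, compacting the table
 * in place. Returns the number of records kept. */
size_t purge_use_table(struct use_record *tab, size_t n, time_t now, time_t max_age)
{
    assert(tab != NULL || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (now - tab[i].last_seen <= max_age) {
            if (kept != i) tab[kept] = tab[i];
            kept++;
        }
    }
    return kept;
}
```

Because RSS and POP3 records are refreshed on every re-appearance, an item that is still present in its feed or mailbox never becomes old enough to be purged, and so is never imported a second time.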
Purge EUID Index Table. This table is simply an index of messages by EUID, for rooms
which require it. We delete records which are no longer in use.
Process the reference count adjustment queue. By this time we have accumulated a lot of data
in the reference count adjustment record, both from interactive use and from the purge steps
above. The adjustments are now processed one at a time.
The reference count for each message is kept in the message's metadata record, and we adjust
it by whatever value each record specifies.
When a message's reference count reaches zero, we know that there are no longer any references
to the message anywhere on the system.
Before deleting the message from disk, however, we first must remove it from the full-text
index. That operation is performed at this time.
After the message is de-indexed, it is finally deleted from the message database. Remember,
however, that you will not see an immediate reduction of disk utilization on the host system,
because Berkeley DB does not shrink its files when records are deleted. This space will be
marked as unused, and new messages can potentially be stored there. Therefore on a well-managed
system with a fairly consistent traffic rate and a sensible expire policy, disk utilization
will initially grow until it reaches an equilibrium of new messages vs. expiring messages,
and then it will stay there. On the other hand, if you have no expire policy and your users
never empty their trash folders, you may expect disk utilization to grow indefinitely.
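The queue-processing steps described above can be summarized in a short sketch. Everything here is a simplified stand-in: the metadata lives in an array rather than in the database, and de-indexing and deletion are modeled as flags so that the required ordering is visible. None of these names come from the Citadel source.

```c
#include <assert.h>
#include <stddef.h>

struct refcount_adjustment {
    long msgnum;
    int  delta;
};

/* Hypothetical per-message metadata record. */
struct msg_meta {
    long msgnum;
    int  refcount;
    int  indexed;   /* still present in the full-text index */
    int  stored;    /* still present in the message database */
};

static struct msg_meta *find_meta(struct msg_meta *meta, size_t n, long msgnum)
{
    for (size_t i = 0; i < n; i++) {
        if (meta[i].msgnum == msgnum) return &meta[i];
    }
    return NULL;
}

/* Apply queued adjustments one at a time. When a count reaches zero the
 * message is de-indexed first and only then deleted.
 * Returns the number of messages deleted. */
size_t process_adjustments(const struct refcount_adjustment *queue, size_t nq,
                           struct msg_meta *meta, size_t nmeta)
{
    assert(queue != NULL || nq == 0);
    size_t purged = 0;
    for (size_t i = 0; i < nq; i++) {
        struct msg_meta *m = find_meta(meta, nmeta, queue[i].msgnum);
        if (m == NULL || !m->stored) continue;   /* unknown or already gone */
        m->refcount += queue[i].delta;
        if (m->refcount <= 0) {
            m->indexed = 0;   /* step 1: remove from the full-text index */
            m->stored = 0;    /* step 2: delete from the message database */
            purged++;
        }
    }
    return purged;
}
```

A message that is copied to one room and deleted from another between purger runs contributes a +1 and a -1 to the queue, so its count never touches zero and it survives the run.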