Linux: Mailbox locking over NFS

Why locking?

If several systems write to a mail spool and at least one of them accesses the spool via NFS, every system has to notice that another system has changed the spool file before it writes to it. Otherwise mail will be lost.

This primarily affects mbox and other formats that hold all messages in one file (in contrast to formats like Maildir, which use a separate file for every message).

Locking mechanisms:

Dotlocking:

Dotlocking used to be described as NFS-safe. It works by creating a "mailbox.lock" file next to the mailbox. This locking method may cause problems when caching mechanisms hide the creation or removal of the lock file, or a size or timestamp change of the spool file, from other clients.
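
As a rough illustration of the idea, here is a minimal Python sketch of a dotlock. Instead of a plain exclusive create it uses the common trick of creating a uniquely named temporary file, hard-linking it to the lock name and then checking the link count, because exclusive creates have not always been reliable over NFS. The function names, the naming scheme for the temporary file and the retry parameters are assumptions made for this example and are not taken from any particular mail program. The sketch does nothing about the caching problems described above.

```python
import os
import socket
import time


def acquire_dotlock(mailbox, retries=5, delay=1.0):
    """Try to create mailbox + '.lock'; return True on success."""
    lock_path = mailbox + ".lock"
    # Hypothetical naming scheme: hostname and pid make the name unique
    # across all clients that share the spool directory.
    tmp_path = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    try:
        for _ in range(retries):
            try:
                os.link(tmp_path, lock_path)
            except OSError:
                # The lock may be held by someone else, or the reply may
                # have been lost; the link count below decides.
                pass
            if os.stat(tmp_path).st_nlink == 2:
                return True  # our temporary file is now also the lock file
            time.sleep(delay)
        return False
    finally:
        os.unlink(tmp_path)


def release_dotlock(mailbox):
    """Remove the lock file created by acquire_dotlock()."""
    os.unlink(mailbox + ".lock")
```

A caller would wrap every access to the spool in acquire_dotlock() and release_dotlock() and give up (or retry later) when acquire_dotlock() returns False.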

fcntl() locking:

fcntl() locking implements locking directly via the kernel. This should work without problems on a local machine, but via NFS this requires locking support in the NFS server.

The (old) user-space nfsd does not support locking, while the kernel NFS server (knfsd), which comes with Linux 2.2 and newer, does.

Please note that fcntl() locking will fail if your NFS server doesn't support locking, so the choice of the best locking technique depends not only on the client machine but also on the NFS server machine.
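
For illustration, a minimal Python sketch of taking an exclusive fcntl() lock on a mailbox follows. It relies on the standard fcntl.lockf() function, which is a wrapper around the fcntl() locking calls. The function names are made up for this example. The sketch distinguishes "somebody else holds the lock" from other errors, since an NFS server without locking support typically makes the call fail with a different error (such as ENOLCK), which is simply passed on to the caller here.

```python
import errno
import fcntl
import os


def lock_mailbox(path):
    """Open path read/write and take an exclusive fcntl() lock on it.

    Returns the open file descriptor, or None if the lock is busy.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # LOCK_NB makes the call return at once instead of blocking.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # another process holds the lock
        raise  # any other failure, e.g. no lock service available
    return fd


def unlock_mailbox(fd):
    """Release the lock taken by lock_mailbox() and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

The descriptor returned by lock_mailbox() should be the one used for reading and writing the spool, because closing any descriptor for the file drops the process's fcntl() locks on it.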

Combine fcntl() and dotlocking:

At the moment, the best choice in a Linux environment seems to be a combination of fcntl() locking and dotlocking. This works with servers that support locking (via fcntl()) as well as with servers without locking support (via the dotlock).

Make sure, however, that different programs accessing the mailbox cannot deadlock each other: if a program is unable to get both locks, it should not keep holding the first lock while it waits for the second one.
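
The rule about not holding one lock while waiting for the other can be sketched as follows in Python. This is only an illustration under several assumptions: the function names, the retry parameters and the naming of the temporary file are hypothetical, the fcntl() lock is taken first and the dotlock second, and any failure of the fcntl() call is treated as "busy" for brevity.

```python
import fcntl
import os
import socket
import time


def try_dotlock(mailbox):
    """One attempt to create mailbox + '.lock' via link(); True on success."""
    lock_path = mailbox + ".lock"
    # Hypothetical unique name built from hostname and pid.
    tmp_path = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    os.close(os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644))
    try:
        try:
            os.link(tmp_path, lock_path)
        except OSError:
            pass  # decided by the link count below
        return os.stat(tmp_path).st_nlink == 2
    finally:
        os.unlink(tmp_path)


def lock_both(mailbox, attempts=10, delay=1.0):
    """Take the fcntl() lock and the dotlock; return the locked fd or None."""
    for _ in range(attempts):
        fd = os.open(mailbox, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)  # first lock busy: nothing is held while waiting
        else:
            if try_dotlock(mailbox):
                return fd  # both locks are held
            # Second lock busy: drop the first one before sleeping, so a
            # program that locks in the opposite order can make progress.
            fcntl.lockf(fd, fcntl.LOCK_UN)
            os.close(fd)
        time.sleep(delay)
    return None


def unlock_both(mailbox, fd):
    """Release the dotlock and the fcntl() lock taken by lock_both()."""
    os.unlink(mailbox + ".lock")
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Because every failed attempt releases whatever it already holds before sleeping, two programs that take the locks in different orders cannot block each other forever; at worst they both retry.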

Problems with the NFS client in Linux 2.2.* and newer:

NFS server does not support locking:

The NFS client in Linux 2.2.* and newer assumes that the NFS server it accesses supports locking. If the server does not support locking but the client tries to lock a file on it with fcntl(), the fcntl() call will fail. To avoid this failure, mount the filesystem with the nolock option (in your /etc/fstab).

Caching in the new NFS client code:

The NFS client in Linux 2.2.* and newer caches very aggressively. The drawback is that a program on the client may not notice a change of the spool file on the server. This can cause mail loss, for example if new mail arrives at the server without the client detecting it and the client then overwrites the complete folder with its own (modified) copy.

As a workaround for this caching problem, the fcntl() call in Linux 2.2.* and newer has a side effect: it flushes the cache of the locked file. So using fcntl() locking should work around these problems. The catch is that this trick doesn't work correctly with kernels up to and including 2.2.12 if the server doesn't support locking. In that case, upgrade the client machine to at least Linux 2.2.13!
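
To make the mail-loss scenario concrete, here is a hedged Python sketch of a program that only rewrites the spool if it has not changed since the program read it. It takes the fcntl() lock first, so that on a suitable kernel the check runs against fresh data, and then compares the current size and modification time with the values remembered at read time. The function name and its parameters are assumptions for this example; real mail programs may detect changes differently.

```python
import fcntl
import os


def rewrite_if_unchanged(path, seen_size, seen_mtime, new_contents):
    """Rewrite the spool only if it still matches what was read earlier.

    seen_size and seen_mtime are the st_size and st_mtime values the
    caller recorded when it last read the file. Returns True if the file
    was rewritten, False if it changed in the meantime.
    """
    with open(path, "r+b") as spool:
        # Lock before looking at the file; the stat result is only
        # trustworthy once the lock is held.
        fcntl.lockf(spool, fcntl.LOCK_EX)
        st = os.fstat(spool.fileno())
        if st.st_size != seen_size or st.st_mtime != seen_mtime:
            return False  # new mail arrived: re-read and merge instead
        spool.seek(0)
        spool.truncate()
        spool.write(new_contents)
        spool.flush()
        return True
    # Leaving the with block closes the file, which drops the lock.
```

When the function returns False, the caller has to read the spool again and apply its modifications to the new contents rather than overwrite them.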

Summary:

  • Always use a combination of fcntl() locking and dotlocking.
  • If the server doesn't support locking, use the nolock mount option on the client.
  • If the server doesn't support locking, do not run kernel 2.2.0 to 2.2.12 on the client. Use either 2.0.* or 2.2.13 or newer instead!

See also:

NFS FAQ