Crash tolerance is a new feature (as of release 1.21) that can be
enabled at compile time and used in environments with appropriate
support from the OS and the filesystem. As of version 1.20.90, this
means a Linux kernel 5.12.12 or later and a filesystem that supports
reflink copying, such as XFS, Btrfs, or OCFS2. If these prerequisites
are met, the crash tolerance code will be enabled automatically by the
configure script when building the package.
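
The configure script performs its own detection when the package is built. As an illustration only, an application could also verify the kernel version requirement at run time. The following is a minimal sketch, not part of GDBM: the helper names release_at_least and running_kernel_supports_crash_tolerance are hypothetical, and the check covers only the kernel version, not whether the filesystem supports reflink copying.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper (not a GDBM function): return 1 if a kernel
   release string such as "5.15.0-91-generic" denotes version
   MAJOR.MINOR.PATCH or later, 0 if it is older, and -1 if the string
   cannot be parsed. */
int
release_at_least (const char *release, int major, int minor, int patch)
{
  int a = 0, b = 0, c = 0;

  /* Require at least "major.minor"; a missing patch level counts as 0. */
  if (sscanf (release, "%d.%d.%d", &a, &b, &c) < 2)
    return -1;
  if (a != major)
    return a > major;
  if (b != minor)
    return b > minor;
  return c >= patch;
}

/* Hypothetical helper: apply the check to the running kernel, using
   the 5.12.12 minimum stated above.  Returns -1 if uname fails. */
int
running_kernel_supports_crash_tolerance (void)
{
  struct utsname u;

  if (uname (&u) != 0)
    return -1;
  return release_at_least (u.release, 5, 12, 12);
}
```

A result of 1 from running_kernel_supports_crash_tolerance means only that the kernel is recent enough. The database must still reside on a reflink-capable filesystem, and the library must have been built with crash tolerance enabled.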
The crash-tolerance mechanism, when used correctly, guarantees that a consistent recent state of application data can be recovered following a crash. Specifically, it guarantees that the state of the database file corresponding to the most recent successful gdbm_sync() call can be recovered.
If the new mechanism is used correctly, crashes such as power
outages, OS kernel panics, and (some) application process crashes
will be tolerated. Non-tolerated failures include physical
destruction of storage devices and corruption due to bugs in
application logic. For example, the new mechanism won’t help if a
pointer bug in your application corrupts GDBM’s private in-memory
data, which in turn corrupts the database file.
In the following sections we will describe how to enable crash tolerance in your application and what to do if a crash occurs.