ArangoDB Server WAL Options (MMFiles)

WAL is an acronym for write-ahead log.

The write-ahead log is a sequence of logfiles that are written in an append-only fashion. Full logfiles will eventually be garbage-collected, and the relevant data might be transferred into collection journals and datafiles. Unneeded and already garbage-collected logfiles will either be deleted or kept for the purpose of keeping a replication backlog.

Since ArangoDB 2.2, the MMFiles storage engine will write all data-modification operations into its write-ahead log.

With ArangoDB 3.2, another storage engine option becomes available: RocksDB. When RocksDB is used, the options described below have no useful meaning.

Logfile size

The size of each WAL logfile: --wal.logfile-size

Specifies the filesize (in bytes) for each write-ahead logfile. The logfile size should be chosen so that each logfile can store a considerable amount of documents. The bigger the logfile size is chosen, the longer it will take to fill up a single logfile, which also influences the delay until the data in a logfile will be garbage-collected and written to collection journals and datafiles. It also affects how long logfile recovery will take at server start.
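The trade-off above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of ArangoDB: the function names are hypothetical, the sizes are illustrative values rather than documented defaults, and per-file and per-entry overhead is ignored.

```python
def entries_per_logfile(logfile_size_bytes: int, avg_entry_bytes: int) -> int:
    """Rough number of entries that fit into one logfile (overhead ignored)."""
    if avg_entry_bytes <= 0:
        raise ValueError("avg_entry_bytes must be positive")
    return logfile_size_bytes // avg_entry_bytes


def seconds_to_fill(logfile_size_bytes: int, write_rate_bytes_per_s: float) -> float:
    """Time until a logfile is full at a constant write rate (an assumption)."""
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write_rate_bytes_per_s must be positive")
    return logfile_size_bytes / write_rate_bytes_per_s


# Illustrative values only: a 32 MiB logfile, 1 KiB entries, 4 MiB/s of writes.
LOGFILE_SIZE = 32 * 1024 * 1024
print(entries_per_logfile(LOGFILE_SIZE, 1024))
print(seconds_to_fill(LOGFILE_SIZE, 4 * 1024 * 1024))
```

A larger value passed as --wal.logfile-size raises both numbers, which is the delay before garbage collection that the paragraph above describes.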

Allow oversize entries

Whether or not oversize entries are allowed: --wal.allow-oversize-entries

Whether or not it is allowed to store individual documents that are bigger than would fit into a single logfile. Setting the option to false will make such operations fail with an error. Setting the option to true will make such operations succeed, but with a high potential performance impact. The reason is that for each oversize operation, an individual oversize logfile needs to be created which may also block other operations. The option should be set to false if it is certain that documents will always have a size smaller than a single logfile.
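The decision described above can be sketched as follows. This is an illustration of the documented behavior only, not ArangoDB's implementation; the function and exception names are hypothetical, and the size comparison ignores any per-entry overhead.

```python
class OversizeEntryError(Exception):
    """Hypothetical error for an entry that does not fit into a logfile."""


def choose_logfile(entry_bytes: int, logfile_size_bytes: int,
                   allow_oversize_entries: bool) -> str:
    """Return which kind of logfile would hold the entry, or raise."""
    if entry_bytes <= logfile_size_bytes:
        return "regular"
    if allow_oversize_entries:
        # A dedicated oversize logfile is created for this one operation,
        # which is the expensive path the text warns about.
        return "oversize"
    raise OversizeEntryError(
        f"entry of {entry_bytes} bytes exceeds logfile size {logfile_size_bytes}"
    )


print(choose_logfile(1024, 4096, allow_oversize_entries=False))
print(choose_logfile(8192, 4096, allow_oversize_entries=True))
```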

Number of reserve logfiles

Maximum number of reserve logfiles: --wal.reserve-logfiles

The maximum number of reserve logfiles that ArangoDB will create in a background process. Reserve logfiles are useful in the situation when an operation needs to be written to a logfile but the reserve space in the logfile is too low for storing the operation. In this case, a new logfile needs to be created to store the operation. Creating new logfiles is normally slow, so ArangoDB will try to pre-create logfiles in a background process so there are always reserve logfiles when the active logfile gets full. The number of reserve logfiles that ArangoDB keeps in the background is configurable with this option.

Number of historic logfiles

Maximum number of historic logfiles: --wal.historic-logfiles

The maximum number of historic logfiles that ArangoDB will keep after they have been garbage-collected. If no replication is used, there is no need to keep historic logfiles except for having a local changelog.

In a replication setup, the number of historic logfiles affects the amount of data a slave can fetch from the master’s logs. The more historic logfiles, the more historic data is available for a slave, which is useful if the connection between master and slave is unstable or slow. Not having enough historic logfiles available might lead to logfile data being deleted on the master already before a slave has fetched it.
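As a back-of-the-envelope aid, the backlog kept in historic logfiles can be expressed as a time window. The sketch below is hypothetical and rests on loud assumptions: a constant write rate, full historic logfiles, and illustrative values rather than documented defaults.

```python
def backlog_window_seconds(historic_logfiles: int, logfile_size_bytes: int,
                           write_rate_bytes_per_s: float) -> float:
    """Approximate time span covered by the historic logfiles."""
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write_rate_bytes_per_s must be positive")
    return historic_logfiles * logfile_size_bytes / write_rate_bytes_per_s


# Illustrative: 10 historic logfiles of 32 MiB each, written at 4 MiB/s.
print(backlog_window_seconds(10, 32 * 1024 * 1024, 4 * 1024 * 1024))
```

Under these assumptions, a slave that falls further behind than this window risks finding that the data it needs has already been removed on the master.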

Sync interval

Interval for automatic, non-requested disk syncs: --wal.sync-interval

The interval (in milliseconds) that ArangoDB will use to automatically synchronize data in its write-ahead logs to disk. Automatic syncs will only be performed for not-yet synchronized data, and only for operations that have been executed without the waitForSync attribute.

Flush timeout

WAL flush timeout: --wal.flush-timeout

The timeout (in milliseconds) that ArangoDB will at most wait when flushing a full WAL logfile to disk. When the timeout is reached and the flush is not completed, the operation that requested the flush will fail with a lock timeout error.

Throttling

Throttle writes to the WAL when at least this many operations are waiting for garbage collection: --wal.throttle-when-pending

The maximum value for the number of write-ahead log garbage-collection queue elements. If set to 0, the queue size is unbounded, and no write-throttling will occur. If set to a non-zero value, write-throttling will automatically kick in when the garbage-collection queue contains at least as many elements as specified by this option.

While write-throttling is active, data-modification operations will intentionally be delayed by a configurable amount of time. This is to ensure the write-ahead log garbage collector can catch up with the operations executed.

Write-throttling will stay active until the garbage-collection queue size goes down below the specified value.

Write-throttling is turned off by default.

--wal.throttle-wait

This option determines the maximum wait time (in milliseconds) for operations that are write-throttled. If write-throttling is active and a new write operation is to be executed, it will wait for at most the specified amount of time for the write-ahead log garbage-collection queue size to fall below the throttling threshold. If the queue size decreases before the maximum wait time is over, the operation will be executed normally. If the queue size does not decrease before the wait time is over, the operation will be aborted with an error. This option only has an effect if --wal.throttle-when-pending has a non-zero value, which is not the default.
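The interplay of the two throttling options can be illustrated with a small simulation. This is a sketch of the documented behavior under stated assumptions, not ArangoDB's code: the function, the exception, the polling approach, and the `queue_size_at` callback (which returns the garbage-collection queue size after a given number of simulated milliseconds) are all hypothetical.

```python
from typing import Callable


class ThrottleTimeout(Exception):
    """Hypothetical error for a write that waited longer than allowed."""


def wait_for_throttle(queue_size_at: Callable[[int], int],
                      throttle_when_pending: int,
                      throttle_wait_ms: int,
                      poll_ms: int = 1) -> int:
    """Return the simulated milliseconds a write waits before it may proceed."""
    if throttle_when_pending == 0:
        return 0  # 0 means the queue is unbounded and throttling is off
    waited = 0
    # Throttling is active while the queue holds at least the threshold.
    while queue_size_at(waited) >= throttle_when_pending:
        if waited >= throttle_wait_ms:
            raise ThrottleTimeout(f"queue still full after {waited} ms")
        waited += poll_ms
    return waited


# Illustrative queue that drains by one element per millisecond from 100.
draining = lambda ms: max(0, 100 - ms)
print(wait_for_throttle(draining, throttle_when_pending=50, throttle_wait_ms=1000))
```

With a queue that never drains, the same call raises `ThrottleTimeout` once the simulated wait reaches `throttle_wait_ms`, mirroring the aborted operation described above.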

Number of slots

Maximum number of slots to be used in parallel: --wal.slots

Configures the number of write slots the write-ahead log can give to write operations in parallel. Any write operation will lease a slot and return it to the write-ahead log when it is finished writing the data. A slot will remain blocked until the data in it was synchronized to disk. After that, a slot becomes reusable by following operations. The required number of slots is thus determined by the parallelism of write operations and the disk synchronization speed. Slow disks probably need higher values, and fast disks may get by with a value lower than the default.
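One way to reason about this relationship is a rough occupancy estimate: slots in use are approximately the write rate multiplied by the time a slot stays blocked. The sketch below is an assumption-laden approximation (steady write rate, uniform sync latency), and the function name and sample numbers are hypothetical rather than ArangoDB guidance.

```python
import math


def estimated_slots_in_use(writes_per_second: int, sync_latency_ms: int) -> int:
    """Approximate number of slots blocked at any moment."""
    if writes_per_second < 0 or sync_latency_ms < 0:
        raise ValueError("inputs must be non-negative")
    # Each write holds its slot until the data is synchronized to disk.
    return math.ceil(writes_per_second * sync_latency_ms / 1000)


# Illustrative: the same write load on a slow disk and on a fast disk.
print(estimated_slots_in_use(50_000, sync_latency_ms=100))
print(estimated_slots_in_use(50_000, sync_latency_ms=20))
```

The estimate only indicates a lower bound on a sensible --wal.slots value. Bursts of writes would need headroom above it.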

Ignore logfile errors

Ignore logfile errors when opening logfiles: --wal.ignore-logfile-errors

Ignores any recovery errors caused by corrupted logfiles on startup. When set to false, the recovery procedure on startup will fail with an error whenever it encounters a corrupted (which includes half-written) logfile. This is a safety precaution to prevent data loss in case of disk errors etc. When the recovery procedure aborts because of corruption, any corrupted files can be inspected and fixed (or removed) manually and the server can be restarted afterwards.

Setting the option to true will make the server continue with the recovery procedure even in case it detects corrupt logfile entries. In this case it will stop at the first corrupted logfile entry and ignore all others, which might cause data loss.
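The difference between the two settings can be sketched as a replay loop. This is a simplified model of the documented behavior, not the actual recovery code: entries are represented as hypothetical `(payload, is_valid)` pairs, and the function and exception names are made up for illustration.

```python
from typing import Iterable, List, Tuple


class CorruptedLogfileError(Exception):
    """Hypothetical error raised when recovery refuses to continue."""


def replay(entries: Iterable[Tuple[str, bool]],
           ignore_logfile_errors: bool) -> List[str]:
    """Return the payloads that would be recovered from a logfile."""
    recovered: List[str] = []
    for payload, is_valid in entries:
        if not is_valid:
            if ignore_logfile_errors:
                break  # stop here; this entry and all later ones are ignored
            raise CorruptedLogfileError("corrupted entry, recovery aborted")
        recovered.append(payload)
    return recovered


log = [("insert a", True), ("insert b", False), ("insert c", True)]
print(replay(log, ignore_logfile_errors=True))
```

Note that "insert c" is lost even though it is intact, which is the data loss the text warns about.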

Ignore recovery errors

Ignore recovery errors: --wal.ignore-recovery-errors

Ignores any recovery errors not caused by corrupted logfiles but by logical errors. Logical errors can occur if logfiles or any other server datafiles have been manually edited or the server is somehow misconfigured.

Ignore (non-WAL) datafile errors

Ignore datafile errors when loading collections: --database.ignore-datafile-errors boolean

If set to false, CRC mismatch and other errors in collection datafiles will lead to a collection not being loaded at all. The collection in this case becomes unavailable. If such a collection needs to be loaded during WAL recovery, the WAL recovery will also abort (if not forced with option --wal.ignore-recovery-errors true).

Setting this flag to false protects users from unintentionally using a collection with corrupted datafiles, from which only a subset of the original data can be recovered. Working with such a collection could lead to data loss and follow-up errors. In order to access such a collection, it is required to inspect and repair the collection datafile with the datafile debugger (arango-dfdb).

If set to true, CRC mismatch and other errors during the loading of a collection will lead to the datafile being partially loaded, up to the position of the first error. All data up to the invalid position will be loaded. This will enable users to continue with collection datafiles even if they are corrupted, but this will result in only a partial load of the original data and potential follow-up errors. The WAL recovery will still abort when encountering a collection with a corrupted datafile, at least if --wal.ignore-recovery-errors is not set to true.

Setting the option to true will also automatically repair potentially corrupted VERSION files of databases on startup, so that the startup can proceed.

The default value is false, so collections with corrupted datafiles will not be loaded at all, preventing partial loads and follow-up errors. However, if such a collection is required at server startup, during WAL recovery, the server will abort the recovery and refuse to start.
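The combinations described in this section can be summarized in a small lookup function. It restates only what the text above says about a collection with a corrupted datafile. The function name and the returned labels are hypothetical, and the VERSION file repair is left out.

```python
from typing import Tuple


def corrupted_datafile_outcome(ignore_datafile_errors: bool,
                               ignore_recovery_errors: bool,
                               needed_for_wal_recovery: bool) -> Tuple[str, bool]:
    """Return (how the collection is loaded, whether WAL recovery aborts)."""
    # true: load up to the first error; false: do not load the collection.
    load_state = "partially loaded" if ignore_datafile_errors else "not loaded"
    # Recovery aborts on a corrupted datafile unless it is forced to continue.
    recovery_aborts = needed_for_wal_recovery and not ignore_recovery_errors
    return load_state, recovery_aborts


# The defaults described above: both options false, collection needed at startup.
print(corrupted_datafile_outcome(False, False, needed_for_wal_recovery=True))
```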
