Durability Configuration

Global Configuration

Pre-setting on database creation

There are global configuration values for durability, which can be adjusted by specifying the following configuration options:

default wait for sync behavior: --database.wait-for-sync boolean

Default wait-for-sync value. Can be overwritten when creating a new collection.

The default is false.

force syncing of collection properties to disk: --database.force-sync-properties boolean

Force syncing of collection properties to disk after creating a collection or updating its properties.

If turned off, no fsync will happen for the collection and database properties stored in parameter.json files in the file system. Turning off this option will speed up workloads that create and drop a lot of collections (e.g. test suites).

The default is true.

interval for automatic, non-requested disk syncs: --wal.sync-interval

The interval (in milliseconds) that ArangoDB will use to automatically synchronize data in its write-ahead logs to disk. Automatic syncs will only be performed for not-yet synchronized data, and only for operations that have been executed without the waitForSync attribute.

--rocksdb.sync-interval

The interval (in milliseconds) that ArangoDB will use to automatically synchronize data in RocksDB’s write-ahead logs to disk. Automatic syncs will only be performed for not-yet synchronized data, and only for operations that have been executed without the waitForSync attribute.
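As a rough illustration, the global options above could be collected in one place and turned into arangod command-line arguments. The helper name toStartupArgs and the concrete values are assumptions made for this sketch, not recommended settings; only the option names come from the text above.

```javascript
// Hypothetical helper: turns an options object into "--name value" argument pairs.
function toStartupArgs(options) {
  return Object.entries(options).flatMap(([name, value]) => [`--${name}`, String(value)]);
}

// Illustrative values only; check the defaults of your ArangoDB version.
const durabilityOptions = {
  "database.wait-for-sync": false,
  "database.force-sync-properties": true,
  "rocksdb.sync-interval": 100, // milliseconds
};

const args = toStartupArgs(durabilityOptions);
console.log(["arangod", ...args].join(" "));
```

Keeping the options in one object makes it easier to review the durability settings of a deployment together, for example in a test-suite launcher that turns off force-sync-properties.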

Adjusting at run-time

The total amount of disk storage required by ArangoDB is determined by the size of the write-ahead logfiles plus the sizes of the collection journals and datafiles.

There are the following options for configuring the number and sizes of the write-ahead logfiles:

maximum number of reserve logfiles: --wal.reserve-logfiles

The maximum number of reserve logfiles that ArangoDB will create in a background process. Reserve logfiles are useful in the situation when an operation needs to be written to a logfile but the reserve space in the logfile is too low for storing the operation. In this case, a new logfile needs to be created to store the operation. Creating new logfiles is normally slow, so ArangoDB will try to pre-create logfiles in a background process so there are always reserve logfiles when the active logfile gets full. The number of reserve logfiles that ArangoDB keeps in the background is configurable with this option.

maximum number of historic logfiles: --wal.historic-logfiles

The maximum number of historic logfiles that ArangoDB will keep after they have been garbage-collected. If no replication is used, there is no need to keep historic logfiles except for having a local changelog.

In a replication setup, the number of historic logfiles affects the amount of data a slave can fetch from the master’s logs. The more historic logfiles, the more historic data is available for a slave, which is useful if the connection between master and slave is unstable or slow. Not having enough historic logfiles available might lead to logfile data being deleted on the master before a slave has fetched it.

the size of each WAL logfile: --wal.logfile-size

Specifies the filesize (in bytes) for each write-ahead logfile. The logfile size should be chosen so that each logfile can store a considerable amount of documents. The bigger the logfile size is chosen, the longer it will take to fill up a single logfile, which also influences the delay until the data in a logfile will be garbage-collected and written to collection journals and datafiles. It also affects how long logfile recovery will take at server start.

whether or not oversize entries are allowed: --wal.allow-oversize-entries

Whether or not it is allowed to store individual documents that are bigger than would fit into a single logfile. Setting the option to false will make such operations fail with an error. Setting the option to true will make such operations succeed, but with a high potential performance impact. The reason is that for each oversize operation, an individual oversize logfile needs to be created, which may also block other operations. The option should be set to false if it is certain that documents will always have a size smaller than a single logfile.
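To get a feeling for how these options combine, the following sketch estimates the disk space taken by write-ahead logfiles. It is a simplification under stated assumptions, not ArangoDB's own accounting: it assumes one active logfile plus the configured number of reserve and historic logfiles, and it ignores logfiles still waiting for garbage collection as well as oversize logfiles. The function name and the sample values are hypothetical.

```javascript
// Rough steady-state estimate (an assumption for illustration only):
// one active logfile + reserve logfiles + historic logfiles, all of equal size.
function estimateWalBytes({ logfileSize, reserveLogfiles, historicLogfiles }) {
  const activeLogfiles = 1;
  return logfileSize * (activeLogfiles + reserveLogfiles + historicLogfiles);
}

const MiB = 1024 * 1024;

// Sample values chosen for the example, not defaults.
const walBytes = estimateWalBytes({
  logfileSize: 32 * MiB,   // --wal.logfile-size
  reserveLogfiles: 3,      // --wal.reserve-logfiles
  historicLogfiles: 10,    // --wal.historic-logfiles
});

console.log(`${walBytes / MiB} MiB`); // 32 * (1 + 3 + 10) = 448 MiB
```

The sizes of the collection journals and datafiles come on top of this figure.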

When data gets copied from the write-ahead logfiles into the journals or datafiles of collections, files will be created on the collection level. How big these files are is determined by the following global configuration value:

--database.maximal-journal-size size

Maximal size of journal in bytes. Can be overwritten when creating a new collection. Note that this also limits the maximal size of a single document.

The default is 32MB.

Per-collection configuration

Pre-setting during collection creation

You can also configure the durability behavior on a per-collection basis. Use the ArangoDB shell to change these properties.

gets or sets the properties of a collection

collection.properties()

Returns an object containing all collection properties.

  • waitForSync: If true, creating a document will only return after the data was synced to disk.

  • journalSize: The size of the journal in bytes. This option is meaningful for the MMFiles storage engine only.

  • isVolatile: If true, the collection data will be kept in memory only and ArangoDB will not write or sync the data to disk. This option is meaningful for the MMFiles storage engine only.

  • keyOptions (optional): additional options for key generation. This is a JSON object containing the following attributes (note: some of the attributes are optional):

    • type: the type of the key generator used for the collection.
    • allowUserKeys: if set to true, then it is allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator will solely be responsible for generating keys, and supplying own key values in the _key attribute of documents is considered an error.
    • increment: increment value for the autoincrement key generator. Not used for other key generator types.
    • offset: initial offset value for the autoincrement key generator. Not used for other key generator types.
  • indexBuckets: number of buckets into which indexes using a hash table are split. The default is 16, and this number has to be a power of 2 and less than or equal to 1024. This option is meaningful for the MMFiles storage engine only.

For very large collections one should increase this to avoid long pauses when the hash table has to be initially built or resized, since buckets are resized individually and can be initially built in parallel. For example, 64 might be a sensible value for a collection with 100,000,000 documents. Currently, only the edge index respects this value, but other index types might follow in future ArangoDB versions. Changes (see below) are applied when the collection is loaded the next time.
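The constraint on indexBuckets (a power of 2, at most 1024) can be checked before a value is passed to the server. The following is a minimal sketch of such a check; the function name isValidIndexBuckets is hypothetical and not part of the ArangoDB API.

```javascript
// Hypothetical client-side check for the indexBuckets rule stated above:
// a positive integer that is a power of 2 and not larger than 1024.
function isValidIndexBuckets(n) {
  return Number.isInteger(n)
    && n >= 1
    && n <= 1024
    && (n & (n - 1)) === 0; // a power of 2 has exactly one bit set
}

console.log(isValidIndexBuckets(64));   // true
console.log(isValidIndexBuckets(100));  // false, not a power of 2
console.log(isValidIndexBuckets(2048)); // false, larger than 1024
```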

In a cluster setup, the result will also contain the following attributes:

  • numberOfShards: the number of shards of the collection.

  • shardKeys: contains the names of document attributes that are used to determine the target shard for documents.

  • replicationFactor: determines how many copies of each shard are kept on different DBServers.

collection.properties(properties)

Changes the collection properties. properties must be an object with one or more of the following attributes:

  • waitForSync: If true, creating a document will only return after the data was synced to disk.

  • journalSize: The size of the journal in bytes. This option is meaningful for the MMFiles storage engine only.

  • indexBuckets: See above. Changes are only applied when the collection is loaded the next time. This option is meaningful for the MMFiles storage engine only.

  • replicationFactor: Changes the number of shard copies kept on different DBServers. Valid values are integers in the range 1-10 (cluster only).

Note: it is not possible to change the journal size after the journal or datafile has been created. Changing this parameter will only affect newly created journals. Also note that you cannot lower the journal size to less than the size of the largest document already stored in the collection.

Note: some other collection properties, such as type, isVolatile, or keyOptions, cannot be changed once the collection is created.
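The rules above can be turned into a small guard that is run before calling collection.properties(properties). This is only a sketch: checkPropertyUpdate is a hypothetical helper, and the list of changeable attributes is taken from the description above rather than from the server.

```javascript
// Attributes that the text above lists as changeable after creation.
const CHANGEABLE = new Set(["waitForSync", "journalSize", "indexBuckets", "replicationFactor"]);

// Hypothetical helper: returns the update unchanged if it looks acceptable,
// otherwise throws before any request is sent.
function checkPropertyUpdate(update) {
  const rejected = Object.keys(update).filter((key) => !CHANGEABLE.has(key));
  if (rejected.length > 0) {
    throw new Error(`cannot be changed after creation: ${rejected.join(", ")}`);
  }
  if ("replicationFactor" in update) {
    const rf = update.replicationFactor;
    if (!Number.isInteger(rf) || rf < 1 || rf > 10) {
      throw new RangeError("replicationFactor must be an integer in the range 1-10");
    }
  }
  return update;
}

// In arangosh this could be used as:
//   db.example.properties(checkPropertyUpdate({ waitForSync: true }));
console.log(checkPropertyUpdate({ waitForSync: true }));
```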

Examples

Read all properties

    arangosh> db.example.properties();
    {
      "doCompact" : true,
      "journalSize" : 33554432,
      "isSystem" : false,
      "isVolatile" : false,
      "waitForSync" : false,
      "keyOptions" : {
        "type" : "traditional",
        "allowUserKeys" : true,
        "lastValue" : 0
      },
      "indexBuckets" : 8
    }

Change a property

    arangosh> db.example.properties({ waitForSync : true });
    {
      "doCompact" : true,
      "journalSize" : 33554432,
      "isSystem" : false,
      "isVolatile" : false,
      "waitForSync" : true,
      "keyOptions" : {
        "type" : "traditional",
        "allowUserKeys" : true,
        "lastValue" : 0
      },
      "indexBuckets" : 8
    }

Adjusting at run-time

The journal size can also be adjusted on a per-collection level using the collection’s properties method.

Per-operation configuration

Many data-modification operations, as well as ArangoDB’s transactions, allow you to specify a waitForSync attribute. When set, it ensures that the operation data has been synchronized to disk by the time the operation returns.
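For a single insert, this could look like the sketch below. The wrapper insertDurably is hypothetical; it relies on collection.insert(data, options) accepting a waitForSync option. A stand-in collection object is used so that the snippet also runs outside arangosh, where it would be replaced by a real collection such as db.example.

```javascript
// Hypothetical wrapper: requests a disk sync for this one operation,
// regardless of the collection's own waitForSync setting.
function insertDurably(collection, doc) {
  return collection.insert(doc, { waitForSync: true });
}

// Stand-in for an arangosh collection (an assumption for this sketch);
// it only records the calls it receives.
const calls = [];
const fakeCollection = {
  insert(doc, options) {
    calls.push({ doc, options });
    return { _key: "1" };
  },
};

insertDurably(fakeCollection, { name: "test" });
console.log(calls[0].options);

// In arangosh: insertDurably(db.example, { name: "test" });
```

Requesting the sync per operation keeps the collection default fast while still making selected writes durable before they return.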

Disk-Usage Configuration (MMFiles engine)

The amount of disk space used by the MMFiles engine is determined by a few configuration options.