Introduction to Replication

Replication allows you to copy data onto another machine. It forms the basis of all disaster recovery and failover features that ArangoDB offers.

ArangoDB offers asynchronous and synchronous replication, depending on which type of ArangoDB deployment you are using. Since ArangoDB 3.2, synchronous replication is the only replication type used in a cluster, whereas asynchronous replication is only available between single-server nodes. Future versions of ArangoDB may reintroduce asynchronous replication for the cluster.

We will describe the pros and cons of each of them in the following sections.

Asynchronous replication

In ArangoDB, any write operation is logged to the write-ahead log. When using asynchronous replication, slaves connect to a master and apply all the events from the log locally, in the same order. After that, they have the same state of data as the master database.
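The log-replay idea can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions, not ArangoDB code: the `Master` and `Slave` classes, their methods, and the tuple layout of the log entries are all hypothetical, and a plain list stands in for the write-ahead log.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master (hypothetical): logs every write, then applies it locally."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # stand-in for the write-ahead log

    def set(self, key, value):
        self.log.append(("set", key, value))
        self.data[key] = value

    def remove(self, key):
        self.log.append(("remove", key, None))
        self.data.pop(key, None)


@dataclass
class Slave:
    """Toy slave (hypothetical): replays the master's log in order."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries already applied

    def catch_up(self, master):
        # Apply only the entries not seen yet, in the order they were logged.
        for op, key, value in master.log[self.applied:]:
            if op == "set":
                self.data[key] = value
            else:
                self.data.pop(key, None)
            self.applied += 1


master = Master()
slave = Slave()
master.set("a", 1)
master.set("b", 2)
master.remove("a")
slave.catch_up(master)  # slave.data now matches master.data
```

Because the slave applies the log only when it catches up, it can lag behind the master between two calls to `catch_up`. That lag is what makes this style of replication asynchronous.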

Synchronous replication

Synchronous replication only works within a cluster and is typically used for mission-critical data which must be accessible at all times. Synchronous replication stores a copy of a shard's data on another database server and keeps it in sync. Essentially, when storing data after enabling synchronous replication, the cluster waits for all replicas to write all the data before acknowledging the write operation to the client. This naturally increases the latency a bit, since one more network hop is needed for each write. However, it enables the cluster to immediately fail over to a replica whenever an outage has been detected, without losing any committed data, and mostly without even signaling an error condition to the client.
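The acknowledgement rule can be illustrated with a small Python sketch. It is a simplified model, not the cluster's actual protocol: the `Leader` and `Follower` classes and their methods are hypothetical, replicas are plain in-memory dictionaries, and failure handling is left out entirely.

```python
class Follower:
    """Toy replica (hypothetical) that stores a copy of the data."""

    def __init__(self):
        self.data = {}

    def store(self, key, value):
        self.data[key] = value
        return True  # acknowledgement sent back to the leader


class Leader:
    """Toy leader (hypothetical) that confirms a write only after all replicas did."""

    def __init__(self, followers):
        self.data = {}
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value
        # Wait for every follower before confirming the write to the client.
        acks = [follower.store(key, value) for follower in self.followers]
        return all(acks)


followers = [Follower(), Follower()]
leader = Leader(followers)
acknowledged = leader.write("order-1", {"total": 42})
```

In this model, every write that has been acknowledged is already present on each follower, which is why a follower could take over without losing committed data.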

Synchronous replication is organized such that every shard has a leader and r-1 followers, where r denotes the replication factor. The number of followers can be controlled using the replicationFactor parameter whenever you create a collection. The replicationFactor parameter is the total number of copies being kept, that is, one plus the number of followers.
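The relationship between the replication factor and the number of followers can be written down as a short Python sketch. Both function names are hypothetical, and the round-robin placement is only an assumption made for illustration; it does not describe how a real cluster distributes shards.

```python
def follower_count(replication_factor):
    """Number of followers per shard: total copies minus the leader."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def place_shard(servers, replication_factor, offset=0):
    """Pick one leader and r-1 followers (round-robin, illustrative only)."""
    if replication_factor > len(servers):
        raise ValueError("not enough servers for this replication factor")
    chosen = [
        servers[(offset + i) % len(servers)]
        for i in range(replication_factor)
    ]
    return {"leader": chosen[0], "followers": chosen[1:]}


placement = place_shard(["A", "B", "C"], replication_factor=2, offset=2)
```

With a replication factor of 1, the shard has a leader and no followers, so no replica is available for failover.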

Satellite collections

Satellite collections are synchronously replicated collections with a dynamic replicationFactor. They replicate all data to all database servers, allowing the database servers to join data locally instead of doing heavy network operations.
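The benefit of a local join can be sketched in Python. This is a toy model: the server names, the `orders` and `products` data, and both helper functions are hypothetical, and dictionaries stand in for database servers.

```python
# Each toy server holds its own shard of a large "orders" collection.
servers = {
    "db1": {"orders": [{"id": 1, "product": "p1"}]},
    "db2": {"orders": [{"id": 2, "product": "p2"},
                       {"id": 3, "product": "p1"}]},
}
products = {"p1": "keyboard", "p2": "mouse"}


def replicate_to_all(servers, name, data):
    """Copy a small collection to every server; returns the number of copies."""
    for storage in servers.values():
        storage[name] = dict(data)
    return len(servers)  # the copy count follows the number of servers


def local_join(storage):
    """Join the local orders shard against the local products copy."""
    return [(order["id"], storage["products"][order["product"]])
            for order in storage["orders"]]


copies = replicate_to_all(servers, "products", products)
joined_on_db2 = local_join(servers["db2"])
```

Each server resolves the join from its own storage, so no lookup has to cross the network. The number of copies also grows with the number of servers, which is the sense in which the replication factor is dynamic in this model.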

Satellite collections are an enterprise-only feature.