5.1. Introduction to Replication

One of CouchDB’s strengths is the ability to synchronize two copies of the same database. This enables users to distribute data across several nodes or data centers, but also to move data closer to clients.

Replication involves a source and a destination database, which can be on the same or on different CouchDB instances. The aim of replication is that at the end of the process, all active documents in the source database are also in the destination database, and all documents that were deleted in the source database are also deleted in the destination database (if they even existed).

5.1.1. Transient and Persistent Replication

There are two different ways to set up a replication. The first one that was introduced into CouchDB leads to a replication that could be called transient. Transient means that there are no documents backing up the replication, so after a restart of the CouchDB server the replication will disappear. Later, the _replicator database was introduced, which keeps documents containing your replication parameters. Such a replication can be called persistent. Transient replications were kept for backward compatibility. Both kinds of replication can have different replication states.

5.1.2. Triggering, Stopping and Monitoring Replications

A persistent replication is controlled through a document in the _replicator database, where each document describes one replication process (see Replication Settings). For setting up a transient replication, the API endpoint /_replicate can be used. A replication is triggered either by sending a JSON object to the /_replicate endpoint or by storing it as a document in the _replicator database.
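As a rough illustration of the two entry points, the sketch below builds (but does not send) both HTTP requests using only Python's standard library. The server address `http://localhost:5984`, the database names `db_a` and `db_b`, and the document ID `a_to_b` are placeholders chosen for this example, and authentication is left out.

```python
import json
import urllib.request

COUCH = "http://localhost:5984"  # assumed server address (placeholder)


def replication_body(source, target, continuous=False):
    """Return the JSON object that describes one replication."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    return body


def transient_request(body):
    """Build a POST to /_replicate, which starts a transient replication."""
    return urllib.request.Request(
        f"{COUCH}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def persistent_request(doc_id, body):
    """Build a PUT that stores the body as a document in _replicator."""
    return urllib.request.Request(
        f"{COUCH}/_replicator/{doc_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


body = replication_body(f"{COUCH}/db_a", f"{COUCH}/db_b")
transient = transient_request(body)
persistent = persistent_request("a_to_b", body)
# To actually send a request: urllib.request.urlopen(transient)
```

The same JSON body is used in both cases; only the endpoint and HTTP method differ.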

If a replication is currently running, its status can be inspected through the active tasks API (see /_active_tasks, Replication Status and /_scheduler/jobs).

For document-based replications, /_scheduler/docs can be used to get a complete state summary. This API is preferred as it will show the state of the replication document before it becomes a replication job.
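A state summary could be post-processed along these lines. The sample response is made up for illustration and shows only a few fields; the field names `docs`, `doc_id` and `state` reflect the response layout as understood here and should be checked against the server version in use.

```python
import json
from collections import Counter

# Illustrative data shaped like a GET /_scheduler/docs response.
# The documents and their states are invented for this example.
SAMPLE_RESPONSE = json.loads("""
{
  "total_rows": 3,
  "offset": 0,
  "docs": [
    {"database": "_replicator", "doc_id": "a_to_b", "state": "running"},
    {"database": "_replicator", "doc_id": "b_to_a", "state": "pending"},
    {"database": "_replicator", "doc_id": "archive", "state": "completed"}
  ]
}
""")


def summarize_states(response):
    """Count replication documents per state."""
    return Counter(doc["state"] for doc in response["docs"])


def docs_in_state(response, state):
    """Return the IDs of replication documents in the given state."""
    return [doc["doc_id"] for doc in response["docs"] if doc["state"] == state]


summary = summarize_states(SAMPLE_RESPONSE)
running = docs_in_state(SAMPLE_RESPONSE, "running")
```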

For transient replications, there is no way to query their state once the job is finished.

A replication can be stopped by deleting the document, or by updating it with its cancel property set to true.
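The two ways of stopping a replication might be sketched as follows. This assumes that a transient replication is cancelled by re-posting its original body to /_replicate with `cancel` added, and that a replication document is removed with an ordinary document DELETE carrying its current revision. The server address, document ID and revision string are placeholders.

```python
import json
import urllib.parse
import urllib.request

COUCH = "http://localhost:5984"  # assumed server address (placeholder)


def cancel_request(original_body):
    """Build a POST to /_replicate that repeats the body with cancel=true."""
    cancel_body = dict(original_body, cancel=True)
    return urllib.request.Request(
        f"{COUCH}/_replicate",
        data=json.dumps(cancel_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_request(doc_id, rev):
    """Build a DELETE for a replication document in _replicator."""
    query = urllib.parse.urlencode({"rev": rev})
    return urllib.request.Request(
        f"{COUCH}/_replicator/{doc_id}?{query}",
        method="DELETE",
    )


original = {"source": "db_a", "target": "db_b", "continuous": True}
cancel = cancel_request(original)
delete = delete_request("a_to_b", "1-abc")  # "1-abc" is a made-up revision
```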

5.1.3. Replication Procedure

During replication, CouchDB will compare the source and the destination database to determine which documents differ between them. It does so by following the Changes Feeds on the source and comparing the documents to the destination. Changes are submitted to the destination in batches, where they can introduce conflicts. Documents that already exist on the destination in the same revision are not transferred. As the deletion of documents is represented by a new revision, a document deleted on the source will also be deleted on the target.

A replication task will finish once it reaches the end of the changes feed. If its continuous property is set to true, it will wait for new changes to appear until the task is canceled. Replication tasks also create checkpoint documents on the destination to ensure that a restarted task can continue from where it stopped, for example after it has crashed.
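The procedure can be pictured with a deliberately simplified in-memory model. This is not CouchDB's implementation: the changes feed is a plain list, each document has a single revision instead of a revision tree, conflicts are ignored, and the checkpoint is just a returned sequence number. All names are invented for the illustration.

```python
def replicate(changes, target, checkpoint=0, batch_size=2):
    """Run one one-way pass over a changes feed.

    changes: list of (seq, doc_id, rev, body); body is None for a deletion.
    target:  dict mapping doc_id -> (rev, body), modified in place.
    Returns (new_checkpoint, number_of_revisions_transferred).
    """
    transferred = 0
    # Resume after the last checkpoint instead of starting from scratch.
    pending = [change for change in changes if change[0] > checkpoint]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for seq, doc_id, rev, body in batch:
            current = target.get(doc_id)
            if current is not None and current[0] == rev:
                continue  # same revision already on the target: skip
            target[doc_id] = (rev, body)  # a deletion is just a new revision
            transferred += 1
        checkpoint = batch[-1][0]  # record progress after each batch
    return checkpoint, transferred


source_changes = [
    (2, "b", "1-y", {"v": 2}),
    (3, "a", "2-z", None),       # "a" was deleted on the source
    (4, "c", "1-w", {"v": 3}),
]
target = {"a": ("1-x", {"v": 1}), "b": ("1-y", {"v": 2})}

checkpoint, sent = replicate(source_changes, target)

# A later change is picked up by resuming from the stored checkpoint.
source_changes.append((5, "c", "2-q", {"v": 4}))
checkpoint2, sent2 = replicate(source_changes, target, checkpoint=checkpoint)
```

In the first pass, document "b" is skipped because the target already holds the same revision, while the deletion of "a" and the new document "c" are transferred.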

When a replication task is initiated on the sending node, it is called push replication; if it is initiated by the receiving node, it is called pull replication.

5.1.4. Master - Master replication

One replication task will only transfer changes in one direction. To achieve master-master replication, it is possible to set up two replication tasks in opposite directions. When a change is replicated from database A to B by the first task, the second task from B to A will discover that the new change on B already exists in A and will wait for further changes.
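A pair of opposite replication documents could be generated like this. The document IDs `a_to_b` and `b_to_a` and the database URLs are placeholders; the sketch only builds the two JSON objects and leaves storing them in the _replicator database to the caller.

```python
def bidirectional_docs(db_a, db_b, continuous=True):
    """Return two replication documents with source and target swapped."""
    forward = {
        "_id": "a_to_b",  # hypothetical document ID
        "source": db_a,
        "target": db_b,
        "continuous": continuous,
    }
    backward = {
        "_id": "b_to_a",  # hypothetical document ID
        "source": db_b,
        "target": db_a,
        "continuous": continuous,
    }
    return forward, backward


forward, backward = bidirectional_docs(
    "http://node1.example.com:5984/db",
    "http://node2.example.com:5984/db",
)
```

Setting `continuous` on both documents keeps each task waiting for further changes, as described above.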

5.1.5. Controlling which Documents to Replicate

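There are three options for controlling which documents are replicated, and which are skipped: defining documents as being local, using Selector Objects, and using Filter Functions.

Local documents are never replicated.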

Selector Objects can be included in a replication document (see Replication Settings). A selector object contains a query expression that is used to test whether a document should be replicated.

Filter Functions can be used in a replication (see Replication Settings). The replication task evaluates the filter function for each document in the changes feed. The document is only replicated if the filter returns true.

Note

Using a selector provides performance benefits when compared with using a filter function. You should use Selector Objects where possible.
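The sketch below shows how a selector or a filter reference might be added to a replication body. The design document name `app`, the filter name `by_type` and the `type` field are invented for the example. The small `matches_equality` helper only imitates the simplest selector form (plain field equality) to show the idea of testing a document; it is not an implementation of CouchDB's selector syntax.

```python
def with_selector(body, selector):
    """Return a copy of the replication body with a selector object."""
    return dict(body, selector=selector)


def with_filter(body, filter_name, query_params=None):
    """Return a copy of the body that references a filter function.

    filter_name is assumed to have the form "designdoc/filtername".
    """
    doc = dict(body, filter=filter_name)
    if query_params:
        doc["query_params"] = query_params
    return doc


def matches_equality(doc, selector):
    """Toy check: every selector field must equal the document's field."""
    return all(key in doc and doc[key] == value for key, value in selector.items())


base = {"source": "db_a", "target": "db_b"}
selected = with_selector(base, {"type": "order"})
filtered = with_filter(base, "app/by_type", {"type": "order"})

keep = matches_equality({"_id": "o1", "type": "order"}, selected["selector"])
skip = matches_equality({"_id": "u1", "type": "user"}, selected["selector"])
```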

Note

When using replication filters that depend on the document’s content, deleted documents may pose a problem, since the document passed to the filter will not contain any of the document’s content. This can be resolved by adding a _deleted:true field to the document instead of using the DELETE HTTP method, paired with the use of a validate document update handler to ensure the fields required for replication filters are always present. Take note, though, that the deleted document will still contain all of its data (including attachments)!
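The workaround can be sketched as a helper that produces the body to PUT in place of an HTTP DELETE. The `keep_fields` option, which trims the document down to the fields the filter needs, is an assumption added for this example and is not described above; without it the body keeps all of the document's data. The `by_type` function is a Python stand-in for a filter, used only to show that the deleted document still passes.

```python
def tombstone(doc, keep_fields=None):
    """Return a copy of doc marked as deleted via the _deleted field.

    keep_fields (hypothetical option): if given, only these fields plus
    _id and _rev are kept, so less data is retained.
    """
    if keep_fields is None:
        body = dict(doc)
    else:
        body = {
            key: value
            for key, value in doc.items()
            if key in keep_fields or key in ("_id", "_rev")
        }
    body["_deleted"] = True
    return body


def by_type(doc):
    """Stand-in for a content-based replication filter."""
    return doc.get("type") == "order"


order = {"_id": "o1", "_rev": "2-abc", "type": "order", "total": 10}
full = tombstone(order)
trimmed = tombstone(order, keep_fields={"type"})
passes_filter = by_type(trimmed)
```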

5.1.6. Migrating Data to Clients

Replication can be especially useful for bringing data closer to clients. PouchDB implements the replication algorithm of CouchDB in JavaScript, making it possible to make data from a CouchDB database available in an offline browser application, and synchronize changes back to CouchDB.

Source: http://docs.couchdb.org/en/stable/replication/intro.html