5.3. Replicator Database

Changed in version 2.1.0: Scheduling replicator was introduced. Replication states, by default, are not written back to documents anymore. There are new replication job states and new API endpoints _scheduler/jobs and _scheduler/docs.

The _replicator database works like any other in CouchDB, but documents added to it will trigger replications. Create (PUT or POST) a document to start a replication. DELETE a replication document to cancel an ongoing replication.

These documents have exactly the same content as the JSON objects we used to POST to _replicate (fields source, target, create_target, continuous, doc_ids, filter, query_params, use_checkpoints, checkpoint_interval).

Replication documents can have a user defined _id (handy for finding a specific replication request later). Design Documents (and _local documents) added to the replicator database are ignored.

The default replicator database is _replicator. Additional replicator databases can be created. To be recognized as such by the system, their database names should end with /_replicator.

5.3.1. Basics

Let’s say you POST the following document into _replicator:

    {
        "_id": "my_rep",
        "source": "http://myserver.com/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "create_target": true,
        "continuous": true
    }
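As a sketch of how this might be done with curl (the adm:pass admin account is an assumption carried over from the later examples, not part of this document body), the document can be POSTed to the default replicator database:

    # Store the replication document in _replicator; the replicator picks it up
    $ curl -X POST http://adm:pass@localhost:5984/_replicator \
           -H "Content-Type: application/json" \
           -d '{"_id": "my_rep",
                "source": "http://myserver.com/foo",
                "target": "http://user:pass@localhost:5984/bar",
                "create_target": true,
                "continuous": true}'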

In the couch log you’ll see 2 entries like these:

    [notice] 2017-04-05T17:16:19.646716Z node1@127.0.0.1 <0.29432.0> -------- Replication `"a81a78e822837e66df423d54279c15fe+continuous+create_target"` is using:
        4 worker processes
        a worker batch size of 500
        20 HTTP connections
        a connection timeout of 30000 milliseconds
        10 retries per request
        socket options are: [{keepalive,true},{nodelay,false}]
    [notice] 2017-04-05T17:16:19.646759Z node1@127.0.0.1 <0.29432.0> -------- Document `my_rep` triggered replication `a81a78e822837e66df423d54279c15fe+continuous+create_target`

Replication state of this document can then be queried from http://adm:pass@localhost:5984/_scheduler/docs/_replicator/my_rep.
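For example, fetching that endpoint with curl (using the adm:pass credentials shown in the URL above):

    # Ask the scheduler about the state of the my_rep replication document
    $ curl http://adm:pass@localhost:5984/_scheduler/docs/_replicator/my_rep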

    {
        "database": "_replicator",
        "doc_id": "my_rep",
        "error_count": 0,
        "id": "a81a78e822837e66df423d54279c15fe+continuous+create_target",
        "info": null,
        "last_updated": "2017-04-05T19:18:15Z",
        "node": "node1@127.0.0.1",
        "proxy": null,
        "source": "http://myserver.com/foo/",
        "start_time": "2017-04-05T19:18:15Z",
        "state": "running",
        "target": "http://adm:*****@localhost:5984/bar/"
    }

The state is running. That means the replicator has scheduled this replication job to run. The replication document contents stay the same. Previously, before version 2.1, it was updated with the triggered state.

The replication job will also appear in http://adm:pass@localhost:5984/_scheduler/jobs.
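For example, with the same admin credentials as above:

    # List all replication jobs currently known to the scheduler
    $ curl http://adm:pass@localhost:5984/_scheduler/jobs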

    {
        "jobs": [
            {
                "database": "_replicator",
                "doc_id": "my_rep",
                "history": [
                    {
                        "timestamp": "2017-04-05T19:18:15Z",
                        "type": "started"
                    },
                    {
                        "timestamp": "2017-04-05T19:18:15Z",
                        "type": "added"
                    }
                ],
                "id": "a81a78e822837e66df423d54279c15fe+continuous+create_target",
                "node": "node1@127.0.0.1",
                "pid": "<0.1174.0>",
                "source": "http://myserver.com/foo/",
                "start_time": "2017-04-05T19:18:15Z",
                "target": "http://adm:*****@localhost:5984/bar/",
                "user": null
            }
        ],
        "offset": 0,
        "total_rows": 1
    }

_scheduler/jobs shows more information, such as a detailed history of state changes. If a persistent replication has not yet started, has failed, or has completed, information about its state can only be found in _scheduler/docs. Keep in mind that some replication documents could be invalid and never become a replication job. Others might be delayed because they are fetching data from a slow source database.

If there is an error, for example if the source database is missing, the replication job will crash and retry after a wait period. Each successive crash results in a longer waiting period.

For example, POST-ing this document

    {
        "_id": "my_rep_crashing",
        "source": "http://myserver.com/missing",
        "target": "http://user:pass@localhost:5984/bar",
        "create_target": true,
        "continuous": true
    }

when the source database is missing, will result in periodic starts and crashes with an increasingly larger interval. The history list from _scheduler/jobs for this replication would look something like this:

    [
        {
            "reason": "db_not_found: could not open http://adm:*****@localhost:5984/missing/",
            "timestamp": "2017-04-05T20:55:10Z",
            "type": "crashed"
        },
        {
            "timestamp": "2017-04-05T20:55:10Z",
            "type": "started"
        },
        {
            "reason": "db_not_found: could not open http://adm:*****@localhost:5984/missing/",
            "timestamp": "2017-04-05T20:47:10Z",
            "type": "crashed"
        },
        {
            "timestamp": "2017-04-05T20:47:10Z",
            "type": "started"
        }
    ]

_scheduler/docs shows a shorter summary:

    {
        "database": "_replicator",
        "doc_id": "my_rep_crashing",
        "error_count": 6,
        "id": "cb78391640ed34e9578e638d9bb00e44+create_target",
        "info": "db_not_found: could not open http://adm:*****@localhost:5984/missing/",
        "last_updated": "2017-04-05T20:55:10Z",
        "node": "node1@127.0.0.1",
        "proxy": null,
        "source": "http://adm:*****@localhost:5984/missing/",
        "start_time": "2017-04-05T20:38:34Z",
        "state": "crashing",
        "target": "http://adm:*****@localhost:5984/bar/"
    }

Repeated crashes are described as a crashing state. The -ing suffix implies this is a temporary state. The user could at any moment create the missing database, and the replication job would then return to normal.
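For instance, a sketch of that recovery (again assuming the adm:pass admin account from the earlier examples) is simply to create the database the job is looking for:

    # Create the missing source database; the crashing job recovers on its next retry
    $ curl -X PUT http://adm:pass@localhost:5984/missing
    {"ok":true}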

5.3.2. Documents describing the same replication

Let's suppose two documents are added to the _replicator database in the following order:

    {
        "_id": "my_rep",
        "source": "http://myserver.com/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "create_target": true,
        "continuous": true
    }

and

    {
        "_id": "my_rep_dup",
        "source": "http://myserver.com/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "create_target": true,
        "continuous": true
    }

Both describe exactly the same replication (only their _ids differ). In this case document my_rep triggers the replication, while my_rep_dup will fail. Inspecting _scheduler/docs explains exactly why it failed:

    {
        "database": "_replicator",
        "doc_id": "my_rep_dup",
        "error_count": 1,
        "id": null,
        "info": "Replication `a81a78e822837e66df423d54279c15fe+continuous+create_target` specified by document `my_rep_dup` already started, triggered by document `my_rep` from db `_replicator`",
        "last_updated": "2017-04-05T21:41:51Z",
        "source": "http://myserver.com/foo/",
        "start_time": "2017-04-05T21:41:51Z",
        "state": "failed",
        "target": "http://adm:*****@localhost:5984/bar/"
    }

Notice the state for this replication is failed. Unlike crashing, the failed state is terminal. As long as both documents are present, the replicator will not retry the my_rep_dup replication. Another reason for failure is a malformed document, for example if the worker process count is specified as a string ("worker_processes": "a few") instead of an integer.
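A minimal sketch of such a malformed replication document (the _id and other values here are illustrative, not taken from the original text):

    {
        "_id": "my_rep_malformed",
        "source": "http://myserver.com/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "worker_processes": "a few"
    }

Because worker_processes must be an integer, this document cannot be turned into a replication job and stays in the failed state until it is fixed or deleted.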

5.3.3. Replication Scheduler

Once replication jobs are created they are managed by the scheduler. The scheduler is the replication component which periodically stops some jobs and starts others. This behavior makes it possible to have a larger number of jobs than the cluster could run simultaneously. Replication jobs which keep failing will be penalized and forced to wait. The wait time increases exponentially with each consecutive failure.

When deciding which jobs to stop and which to start, the scheduler uses a round-robin algorithm to ensure fairness. Jobs which have been running the longest time will be stopped, and jobs which have been waiting the longest time will be started.

Note

Non-continuous (normal) replications are treated differently once they start running. See the Normal vs Continuous Replications section for more information.

The behavior of the scheduler can be configured via the max_jobs, interval and max_churn options. See the Replicator configuration section for additional information.
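As a sketch, a tuning block in the configuration file might look like this (the values shown are illustrative, not recommendations from this section):

    [replicator]
    ; maximum number of replication jobs running concurrently on a node
    max_jobs = 500
    ; how often, in milliseconds, the scheduler re-evaluates which jobs to run
    interval = 60000
    ; maximum number of jobs to stop and start during one rescheduling cycle
    max_churn = 20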

5.3.4. Replication states

Replication jobs during their life-cycle pass through various states. This is a diagram of all the states and the transitions between them:

[Replication state diagram]

Blue and yellow shapes represent replication job states.

Trapezoidal shapes represent external APIs; that's how users interact with the replicator. Writing documents to _replicator is the preferred way of creating replications, but posting to the _replicate HTTP endpoint is also supported.

Six-sided shapes are internal API boundaries. They are optional for this diagram and are only shown as additional information to help clarify how the replicator works. There are two processing stages: the first is where replication documents are parsed and become replication jobs, and the second is the scheduler itself. The scheduler runs replication jobs, periodically stopping and starting some. Jobs posted via the _replicate endpoint bypass the first component and go straight to the scheduler.

5.3.4.1. States descriptions

Before explaining the details of each state, it is worth noting the color and shape of each state in the diagram:

Blue vs yellow partitions states into "healthy" and "unhealthy", respectively. Unhealthy states indicate something has gone wrong and might need the user's attention.

Rectangle vs oval separates "terminal" states from "non-terminal" ones. Terminal states are those which will not transition to other states any more. Informally, jobs in a terminal state will not be retried and don't consume memory or CPU resources.

- Initializing: Indicates the replicator has noticed the change from the replication document. Jobs should transition quickly through this state. Being stuck here for a while could mean there is an internal error.
- Failed: The replication document could not be processed and turned into a valid replication job for the scheduler. This state is terminal and requires user intervention to fix the problem. A typical reason for ending up in this state is a malformed document. For example, specifying an integer for a parameter which accepts a boolean. Another reason for failure could be specifying a duplicate replication. A duplicate replication is a replication with identical parameters but a different document ID.
- Error: The replication document update could not be turned into a replication job. Unlike the Failed state, this one is temporary, and the replicator will keep retrying periodically. There is an exponential backoff applied in case of consecutive failures. The main reason this state exists is to handle filtered replications with custom user functions. The filter function content is needed in order to calculate the replication ID. A replication job cannot be created until the function code is retrieved. Because retrieval happens over the network, temporary failures have to be handled.
- Running: The replication job is running normally. This means there might be a change feed open, and if changes are noticed, they would be processed and posted to the target. The job is still considered Running even if its workers are currently not streaming changes from source to target and are just waiting on the change feed. Continuous replications will most likely end up in this state.
- Pending: The replication job is not running and is waiting its turn. This state is reached when the number of replication jobs added to the scheduler exceeds replicator.max_jobs. In that case the scheduler will periodically stop and start subsets of jobs trying to give each one a fair chance at making progress.
- Crashing: The replication job has been successfully added to the replication scheduler. However, an error was encountered during the last run. The error could be a network failure, a missing source database, a permissions error, etc. Repeated consecutive crashes result in an exponential backoff. This state is considered temporary (non-terminal) and replication jobs will be periodically retried. The maximum backoff interval is around a day or so.
- Completed: This is a terminal, successful state for non-continuous replications. Once in this state the replication is "forgotten" by the scheduler and it doesn't consume any more CPU or memory resources. Continuous replication jobs will never reach this state.

5.3.4.2. Normal vs Continuous Replications

Normal (non-continuous) replications, once started, will be allowed to run to completion. That behavior is to preserve their semantics of replicating a snapshot of the source database to the target. For example, if new documents are added to the source after the replication is started, those updates should not show up on the target database. Stopping and restarting a normal replication would violate that constraint.

Warning

When there is a mix of continuous and normal replications, once normal replications are scheduled to run, they might temporarily starve continuous replication jobs.

However, normal replications will still be stopped and rescheduled if an operator reduces the value for the maximum number of replications. This is so that, if an operator decides replications are overwhelming a node, they have the ability to recover. Any stopped replications will be resubmitted to the queue to be rescheduled.

5.3.5. Compatibility Mode

Previous versions of the CouchDB replicator wrote state updates back to replication documents. In cases where user code programmatically reads those states, compatibility mode can be enabled via a configuration setting:

    [replicator]
    update_docs = true

In this mode the replicator will continue to write state updates to the documents.

To effectively disable the scheduling behavior, which periodically stops and starts jobs, set the max_jobs configuration setting to a large number. For example:

    [replicator]
    max_jobs = 9999999

See the Replicator configuration section for other replicator configuration options.

5.3.6. Canceling replications

To cancel a replication simply DELETE the document which triggered the replication. To update a replication, for example to change the number of workers or the source, simply update the document with new data. If there is extra application-specific data in the replication documents, that data is ignored by the replicator.
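A sketch of canceling the my_rep replication from the earlier example with curl (the revision below is a placeholder, since deleting requires the document's current _rev):

    # Fetch the document to learn its current revision
    $ curl http://adm:pass@localhost:5984/_replicator/my_rep
    # Cancel the replication by deleting the document
    $ curl -X DELETE "http://adm:pass@localhost:5984/_replicator/my_rep?rev=<current-rev>"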

5.3.7. Server restart

When CouchDB is restarted, it checks its _replicator databases and restarts replications described by documents if they are not already in a completed or failed state. If they are, they are ignored.

5.3.8. Clustering

In a cluster, replication jobs are balanced evenly among all the nodes, such that a replication job runs on only one node at a time.

Every time there is a cluster membership change, that is when nodes are added or removed, as happens in a rolling reboot, the replicator application will notice the change, rescan all the documents and running replications, and re-evaluate their cluster placement in light of the new set of live nodes. This mechanism also provides replication fail-over in case a node fails. Replication jobs started from replication documents (but not those started from the _replicate HTTP endpoint) will automatically migrate to one of the live nodes.

5.3.9. Additional Replicator Databases

Imagine the replicator database (_replicator) has these two documents which represent pull replications from servers A and B:

    {
        "_id": "rep_from_A",
        "source": "http://aserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/foo_a",
        "continuous": true
    }

    {
        "_id": "rep_from_B",
        "source": "http://bserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/foo_b",
        "continuous": true
    }

Now, without stopping and restarting CouchDB, add another replicator database. For example another/_replicator:

    $ curl -X PUT http://user:pass@localhost:5984/another%2F_replicator/
    {"ok":true}

Note

A / character in a database name, when used in a URL, should be escaped.

Then add a replication document to the new replicator database:

    {
        "_id": "rep_from_X",
        "source": "http://xserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/foo_x",
        "continuous": true
    }

From now on, there are three replications active in the system: two replications from A and B, and a new one from X.

Then remove the additional replicator database:

    $ curl -X DELETE http://user:pass@localhost:5984/another%2F_replicator/
    {"ok":true}

After this operation, the replication pulling from server X will be stopped and the replications in the _replicator database (pulling from servers A and B) will continue.

5.3.10. Replicating the replicator database

Imagine you have in server C a replicator database with the two following pull replication documents in it:

    {
        "_id": "rep_from_A",
        "source": "http://aserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/foo_a",
        "continuous": true
    }

    {
        "_id": "rep_from_B",
        "source": "http://bserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/foo_b",
        "continuous": true
    }

Now you would like to have the same pull replications going on in server D, that is, you would like to have server D pull replicating from servers A and B. You have two options:

  • Explicitly add two documents to server D's replicator database
  • Replicate server C's replicator database into server D's replicator database (see the sketch below)

Both alternatives accomplish exactly the same goal.
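A minimal sketch of the second option, assuming servers C and D are reachable at c.example.com and d.example.com with the credentials shown (none of these hosts appear in the original text), is a one-shot replication of the replicator database itself:

    # Copy server C's _replicator database into server D's _replicator database
    $ curl -X POST http://user:pass@d.example.com:5984/_replicate \
           -H "Content-Type: application/json" \
           -d '{"source": "http://user:pass@c.example.com:5984/_replicator",
                "target": "http://user:pass@d.example.com:5984/_replicator"}'

Once the documents arrive on D, its replicator will pick them up and start the corresponding pull replications from A and B.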

5.3.11. Delegations

Replication documents can have a custom user_ctx property. This property defines the user context under which a replication runs. For the old way of triggering a replication (POSTing to /_replicate/), this property is not needed. That's because information about the authenticated user is readily available during the replication, which is not persistent in that case. Now, with the replicator database, the problem is that information about which user is starting a particular replication is only present when the replication document is written. The information in the replication document and the replication itself are persistent, however. This implementation detail implies that in the case of a non-admin user, a user_ctx property containing the user's name and a subset of their roles must be defined in the replication document. This is enforced by the document update validation function present in the default design document of the replicator database. The validation function also ensures that non-admin users are unable to set the value of the user context's name property to anything other than their own user name. The same principle applies for roles.

For admins, the user_ctx property is optional, and if it's missing it defaults to a user context with name null and an empty list of roles, which means design documents won't be written to local targets. If writing design documents to local targets is desired, the role _admin must be present in the user context's list of roles.

Also, for admins the user_ctx property can be used to trigger a replication on behalf of another user. This is the user context that will be passed to local target database document validation functions.

Note

The user_ctx property only has an effect for local endpoints.

Example delegated replication document:

    {
        "_id": "my_rep",
        "source": "http://bserver.com:5984/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "continuous": true,
        "user_ctx": {
            "name": "joe",
            "roles": ["erlanger", "researcher"]
        }
    }

As stated before, the user_ctx property is optional for admins, while being mandatory for regular (non-admin) users. When the roles property of user_ctx is missing, it defaults to the empty list [].

5.3.12. Selector Objects

Including a Selector Object in the replication document enables you to use a query expression to determine if a document should be included in the replication.

The selector specifies fields in the document, and provides an expression to evaluate with the field content or other data. If the expression resolves to true, the document is replicated.

The selector object must:

  • Be structured as valid JSON.
  • Contain a valid query expression.
The syntax for a selector is the same as the selector syntax used for _find.

Using a selector is significantly more efficient than using a JavaScript filter function, and is the recommended option if filtering on document attributes only.
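As an illustrative sketch (the document ID, field name and value below are invented for this example, not taken from the original text), a replication document with a selector might look like:

    {
        "_id": "rep_with_selector",
        "source": "http://myserver.com/foo",
        "target": "http://user:pass@localhost:5984/bar",
        "continuous": true,
        "selector": {
            "type": "order"
        }
    }

Only documents whose type field equals "order" would be replicated to the target.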
