Replica Set Oplog

The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases.

Note

Starting in MongoDB 4.0, unlike other capped collections, the oplog can grow past its configured size limit to avoid deleting the majority commit point.

MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the local.oplog.rs collection, which allows them to maintain the current state of the database.

To facilitate replication, all replica set members send heartbeats (pings) to all other members. Any secondary member can import oplog entries from any other member.

Each operation in the oplog is idempotent. That is, oplog operations produce the same results whether applied once or multiple times to the target dataset.
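To illustrate why idempotency matters for replay, the following minimal Python sketch compares a relative increment with an absolute assignment. The apply_inc and apply_set helpers and the dictionary-based document are hypothetical stand-ins for illustration; they are not MongoDB's actual oplog format or apply logic.

```python
def apply_inc(doc, field, amount):
    """Relative update: the result depends on how many times it is applied."""
    updated = dict(doc)
    updated[field] = updated.get(field, 0) + amount
    return updated


def apply_set(doc, field, value):
    """Absolute update: applying it again leaves the document unchanged."""
    updated = dict(doc)
    updated[field] = value
    return updated


original = {"_id": 1, "qty": 5}

# Replaying a relative increment a second time drifts past the intended value.
inc_once = apply_inc(original, "qty", 2)
inc_twice = apply_inc(inc_once, "qty", 2)

# Replaying an absolute assignment a second time changes nothing.
set_once = apply_set(original, "qty", 7)
set_twice = apply_set(set_once, "qty", 7)
```

An operation recorded in the second form can be applied again after an interruption without changing the outcome, which is the property the paragraph above describes.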

Oplog Size

When you start a replica set member for the first time, MongoDB creates an oplog of a default size if you do not specify the oplog size. [1]

For Unix and Windows systems
The default oplog size depends on the storage engine:

Storage Engine              Default Oplog Size      Lower Bound   Upper Bound
In-Memory Storage Engine    5% of physical memory   50 MB         50 GB
WiredTiger Storage Engine   5% of free disk space   990 MB        50 GB
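The rule in the table above amounts to taking 5% of the relevant capacity and clamping it between the two bounds. A minimal Python sketch of that arithmetic follows. The default_oplog_size function and the BOUNDS mapping are hypothetical names, and treating MB and GB as powers of 1024 is an assumption; the table does not say which convention the server uses.

```python
MB = 1024 ** 2  # assumption: binary megabytes
GB = 1024 ** 3  # assumption: binary gigabytes

# (lower bound, upper bound) in bytes, taken from the table above.
BOUNDS = {
    "inMemory": (50 * MB, 50 * GB),
    "wiredTiger": (990 * MB, 50 * GB),
}


def default_oplog_size(storage_engine, capacity_bytes):
    """Return 5% of capacity_bytes, clamped to the engine's bounds.

    capacity_bytes is physical memory for the in-memory engine and
    free disk space for WiredTiger.
    """
    lower, upper = BOUNDS[storage_engine]
    five_percent = capacity_bytes * 5 // 100
    return min(max(five_percent, lower), upper)
```

For example, under these assumptions a WiredTiger member with 10 GB of free disk space would get the 990 MB lower bound, because 5% of 10 GB falls below it.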

For 64-bit macOS systems
The default oplog size is 192 MB of either physical memory or free disk space depending on the storage engine:

Storage Engine              Default Oplog Size
In-Memory Storage Engine    192 MB of physical memory
WiredTiger Storage Engine   192 MB of free disk space

In most cases, the default oplog size is sufficient. For example, if an oplog is 5% of free disk space and fills up in 24 hours of operations, then secondaries can stop copying entries from the oplog for up to 24 hours without becoming too stale to continue replicating. However, most replica sets have much lower operation volumes, and their oplogs can hold much higher numbers of operations.
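The 24-hour example above is a simple ratio of oplog size to the rate at which operations fill it. The sketch below expresses that estimate in Python. The function names and the idea of measuring the write rate in megabytes of oplog per hour are assumptions made for illustration; real workloads are rarely this uniform.

```python
def oplog_window_hours(oplog_size_mb, oplog_mb_per_hour):
    """Estimate how many hours of operations the oplog can hold."""
    if oplog_mb_per_hour <= 0:
        raise ValueError("oplog_mb_per_hour must be positive")
    return oplog_size_mb / oplog_mb_per_hour


def can_resume(downtime_hours, oplog_size_mb, oplog_mb_per_hour):
    """True if a secondary offline for downtime_hours is still within the window."""
    return downtime_hours < oplog_window_hours(oplog_size_mb, oplog_mb_per_hour)
```

With a 4800 MB oplog filling at 200 MB per hour, the estimated window is 24 hours; at a tenth of that write rate the same oplog would cover roughly ten days.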

Before mongod creates an oplog, you can specify its size with the oplogSizeMB option. Once you have started a replica set member for the first time, use the replSetResizeOplog administrative command to change the oplog size. replSetResizeOplog enables you to resize the oplog dynamically without restarting the mongod process.
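As a rough illustration of issuing replSetResizeOplog from a driver, the sketch below builds the command document and sends it through a client object. It assumes a PyMongo MongoClient connected directly to the member whose oplog should change; the resize_oplog helper is a hypothetical name, and the example passes the size in megabytes as a floating-point value.

```python
def resize_oplog(client, size_mb):
    """Ask the connected member to resize its oplog to size_mb megabytes.

    client is assumed to behave like pymongo.MongoClient, exposing
    client.admin.command(<command document>).
    """
    # The command name must be the first key; dicts keep insertion order.
    command = {"replSetResizeOplog": 1, "size": float(size_mb)}
    return client.admin.command(command)
```

Because the command acts on the member it is sent to, a call such as resize_oplog(client, 16000) would need to be repeated against each member whose oplog should be resized.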

[1]	Starting in MongoDB 4.0, the oplog can grow past its configured size limit to avoid deleting the majority commit point.

Workloads that Might Require a Larger Oplog Size

If you can predict your replica set’s workload to resemble one of the following patterns, then you might want to create an oplog that is larger than the default. Conversely, if your application predominantly performs reads with a minimal amount of write operations, a smaller oplog may be sufficient.

The following workloads might require a larger oplog size.

Updates to Multiple Documents at Once

The oplog must translate multi-updates into individual operations in order to maintain idempotency. This can use a great deal of oplog space without a corresponding increase in data size or disk use.
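The effect described above can be sketched by expanding one multi-document update into one entry per matched document. The expand_multi_update function is hypothetical, and the entry layout (op, o2, o) is a simplified imitation of an oplog update entry, not an exact reproduction of what the server writes.

```python
def expand_multi_update(collection, predicate, field, value):
    """Return one oplog-style entry per matched document.

    Each entry targets a single _id and sets an absolute value, so
    replaying it any number of times gives the same result.
    """
    return [
        {"op": "u", "o2": {"_id": doc["_id"]}, "o": {"$set": {field: value}}}
        for doc in collection
        if predicate(doc)
    ]


docs = [{"_id": i, "status": "new"} for i in range(1000)]

# A single logical update that matches every document...
entries = expand_multi_update(docs, lambda d: d["status"] == "new", "status", "done")

# ...becomes as many entries as there are matched documents.
entry_count = len(entries)
```

Here one statement produces 1000 entries even though the stored data is essentially the same size afterwards.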

Deletions Equal the Same Amount of Data as Inserts

If you delete roughly the same amount of data as you insert, the database will not grow significantly in disk use, but the size of the operation log can be quite large.

Significant Number of In-Place Updates

If a significant portion of the workload is updates that do not increase the size of the documents, the database records a large number of operations but does not change the quantity of data on disk.
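The last two patterns share one trait: the number of recorded operations grows while the stored data does not. The toy model below makes that visible by counting documents and oplog entries separately. The run_workload function is hypothetical and deliberately simplified; it counts one entry per write and ignores entry sizes and writes that match nothing.

```python
def run_workload(operations):
    """Apply (kind, _id) operations; return (document count, oplog entry count)."""
    documents = set()
    oplog_entries = 0
    for kind, doc_id in operations:
        if kind == "insert":
            documents.add(doc_id)
        elif kind == "delete":
            documents.discard(doc_id)
        elif kind == "update":
            pass  # in-place update: the document count is unchanged
        else:
            raise ValueError(f"unknown operation kind: {kind}")
        oplog_entries += 1  # every write is recorded, whatever its net effect
    return len(documents), oplog_entries


# Deletions balance inserts: each document is inserted, then deleted.
churn = [(kind, i) for i in range(500) for kind in ("insert", "delete")]

# In-place updates: one document is rewritten many times.
updates = [("insert", 0)] + [("update", 0)] * 999
```

In this model both workloads record 1000 operations while ending with zero documents and one document respectively.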

Oplog Status

To view oplog status, including the size and the time range of operations, issue the rs.printReplicationInfo() method. For more information on oplog status, see Check the Size of the Oplog.
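For readers working from a driver rather than the shell, the time range of operations can be approximated by reading the oldest and newest oplog entries. The sketch below assumes a PyMongo-style collection object for local.oplog.rs whose entries carry a ts field with a .time attribute in seconds since the epoch; oplog_time_range is a hypothetical helper and is not how rs.printReplicationInfo() is implemented.

```python
def oplog_time_range(oplog):
    """Return (first_ts, last_ts, hours_between) for an oplog collection."""
    # $natural order follows insertion order in a capped collection.
    first = oplog.find_one(sort=[("$natural", 1)])
    last = oplog.find_one(sort=[("$natural", -1)])
    if first is None or last is None:
        raise ValueError("oplog is empty")
    seconds = last["ts"].time - first["ts"].time
    return first["ts"], last["ts"], seconds / 3600
```

With PyMongo this might be called as oplog_time_range(client.local["oplog.rs"]), given sufficient privileges to read the local database.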

Replication Lag and Flow Control

Under various exceptional situations, updates to a secondary’s oplog might lag behind the desired performance time. Use db.getReplicationInfo() from a secondary member and the replication status output to assess the current state of replication and determine if there is any unintended replication delay.
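One way to read the replication status output is to compare each secondary's last applied operation time with the primary's. The sketch below does this for a document shaped like replSetGetStatus output, using the members, name, stateStr, and optimeDate fields. The replication_lag function, the host names, and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def replication_lag(status):
    """Map each secondary's name to its lag behind the primary."""
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no primary in replica set status")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical status document with one primary and one secondary.
sample_status = {
    "members": [
        {"name": "db1.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2018, 11, 16, 12, 31, 35)},
        {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2018, 11, 16, 12, 31, 31)},
    ]
}

lag = replication_lag(sample_status)
```

A lag that stays near zero suggests the secondary is keeping up; a value that keeps growing is the kind of unintended delay the paragraph above refers to.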

Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum value flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set/sharded cluster must have: featureCompatibilityVersion (FCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if FCV is not 4.2 or if read concern majority is disabled.

See Replication Lag for more information.
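As a hedged illustration of adjusting the flow control target, the sketch below sends a setParameter command for flowControlTargetLagSeconds. It assumes a PyMongo-style client, that the parameter can be changed at run time on the deployment in question, and that only positive integer values are accepted; set_flow_control_target_lag is a hypothetical helper.

```python
def set_flow_control_target_lag(client, seconds):
    """Set flowControlTargetLagSeconds on the connected member."""
    if seconds <= 0:
        # Assumption: the server only accepts values greater than zero.
        raise ValueError("seconds must be greater than 0")
    command = {"setParameter": 1, "flowControlTargetLagSeconds": int(seconds)}
    return client.admin.command(command)
```

Since flow control limits writes on the primary, a call like this would normally be directed at the primary.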

Slow Oplog Application

Starting in version 4.2 (also available starting in version 4.0.6), secondary members of a replica set now log oplog entries that take longer than the slow operation threshold to apply. These messages are logged for the secondaries under the REPL component with the text applied op: <oplog entry> took <num>ms.

2018-11-16T12:31:35.886-0500 I REPL   [repl writer worker 13] applied op: command { ... }, took 112ms
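When scanning log files for these messages, a small parser can pull out the entry text and the duration. The sketch below is written against the sample line above only; the SLOW_OP pattern and parse_slow_oplog_line function are hypothetical, and other server versions or log formats may need a different pattern.

```python
import re

# Matches the tail of a message such as:
#   applied op: command { ... }, took 112ms
SLOW_OP = re.compile(r"applied op: (?P<entry>.*), took (?P<ms>\d+)ms$")


def parse_slow_oplog_line(line):
    """Return (oplog entry text, duration in ms), or None if the line does not match."""
    match = SLOW_OP.search(line.strip())
    if match is None:
        return None
    return match.group("entry"), int(match.group("ms"))


line = (
    "2018-11-16T12:31:35.886-0500 I REPL   [repl writer worker 13] "
    "applied op: command { ... }, took 112ms"
)
parsed = parse_slow_oplog_line(line)
```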

Slow oplog application logging on secondaries is:

Not affected by the slowOpSampleRate; i.e. all slow oplog entries are logged by the secondary.
Not affected by the logLevel/systemLog.verbosity level (or the systemLog.component.replication.verbosity level); i.e. for oplog entries, the secondary logs only the slow oplog entries. Increasing the verbosity level does not log all oplog entries.
Not captured by the profiler and not affected by the profiling level.

For more information on setting the slow operation threshold, see

mongod --slowms
slowOpThresholdMs
The profile command or db.setProfilingLevel() shell helper method.