cleanupOrphaned

Definition

  • cleanupOrphaned

New in version 2.6.

Deletes from a shard the orphaned documents whose shard key values fall into a single range, or a single contiguous set of ranges, that does not belong to the shard. For example, if two contiguous ranges do not belong to the shard, cleanupOrphaned examines both ranges for orphaned documents.

To run, issue cleanupOrphaned in the admin database directly on the mongod instance that is the primary replica set member of the shard. You do not need to disable the balancer before running cleanupOrphaned.

Note

Do not run cleanupOrphaned on a mongos instance.
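One way to guard against running the command on the wrong process is to inspect the output of the isMaster command first. The following is a minimal sketch, assuming that a mongos reports msg: "isdbgrid" in that output and that a replica set primary reports ismaster: true together with a setName. The helper name canRunCleanupOrphaned is hypothetical.

```javascript
// Hypothetical helper: decide from an isMaster result whether this
// connection looks like the primary of a shard's replica set.
function canRunCleanupOrphaned(isMasterResult) {
    // Assumption: a mongos identifies itself with msg: "isdbgrid".
    if (isMasterResult.msg === "isdbgrid") {
        return false;
    }
    // Assumption: a replica set primary reports ismaster: true and a setName.
    return isMasterResult.ismaster === true &&
           typeof isMasterResult.setName === "string";
}

// In the mongo shell, connected directly to the mongod:
//   if (canRunCleanupOrphaned(db.isMaster())) { /* issue cleanupOrphaned */ }
```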

cleanupOrphaned has the following syntax:

    db.runCommand( {
       cleanupOrphaned: "<database>.<collection>",
       startingFromKey: <minimumShardKeyValue>,
       secondaryThrottle: <boolean>,
       writeConcern: <document>
    } )

cleanupOrphaned has the following fields:

  • cleanupOrphaned (string)

    The namespace, i.e. both the database and the collection name, of the sharded collection for which to clean the orphaned data.

  • startingFromKey (document)

    Optional. The shard key value that determines the lower bound of the cleanup range. The default value is MinKey.

    If the range that contains the specified startingFromKey value belongs to a chunk owned by the shard, cleanupOrphaned continues to examine the next ranges until it finds a range not owned by the shard. See Determine Range for details.

  • secondaryThrottle (boolean)

    Optional. If true, each delete operation must be replicated to a secondary before the cleanup operation proceeds further. If false, do not wait for replication. Defaults to false.

    Independent of the secondaryThrottle setting, after the final delete, cleanupOrphaned waits for all deletes to replicate to a majority of replica set members before returning.

  • writeConcern (document)

    Optional. A document that expresses the write concern that the secondaryThrottle will use to wait for the secondaries when removing orphaned data.

    writeConcern requires secondaryThrottle: true.
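The dependency between writeConcern and secondaryThrottle can be checked before the command is sent. The following is a minimal sketch, not part of the command itself: buildCleanupOrphanedCommand is a hypothetical helper, and the { w: 2 } write concern in the usage comment is only an illustrative value.

```javascript
// Hypothetical helper: assemble a cleanupOrphaned command document and
// enforce the rule that writeConcern requires secondaryThrottle: true.
function buildCleanupOrphanedCommand(namespace, options) {
    options = options || {};
    // The command name must be the first field of the document.
    var command = { cleanupOrphaned: namespace };

    if (options.startingFromKey !== undefined) {
        command.startingFromKey = options.startingFromKey;
    }
    if (options.secondaryThrottle !== undefined) {
        command.secondaryThrottle = options.secondaryThrottle;
    }
    if (options.writeConcern !== undefined) {
        if (options.secondaryThrottle !== true) {
            throw new Error("writeConcern requires secondaryThrottle: true");
        }
        command.writeConcern = options.writeConcern;
    }
    return command;
}

// In the mongo shell, on the primary of the shard:
//   db.adminCommand( buildCleanupOrphanedCommand( "test.info", {
//      startingFromKey: { x: 10 },
//      secondaryThrottle: true,
//      writeConcern: { w: 2 }   // illustrative value only
//   } ) )
```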

Behavior

Performance

cleanupOrphaned scans the documents in the shard to determine whether the documents belong to the shard. As such, running cleanupOrphaned can impact performance; however, performance will depend on the number of orphaned documents in the range.

To remove all orphaned documents in a shard, you can run the command in a loop (see Remove All Orphaned Documents from a Shard for an example). If you are concerned about the performance impact of this operation, you may prefer to include a pause between iterations.

Alternatively, to mitigate the impact of cleanupOrphaned, you may prefer to run the command at off-peak hours.
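A loop with a pause between iterations might look like the following sketch. The helper name cleanupWithPause and its parameters are hypothetical; the command sender and the pause function are passed in so that the loop does not depend on a particular shell. The usage comment assumes the mongo shell's sleep() function, which takes a duration in milliseconds.

```javascript
// Hypothetical helper: repeat cleanupOrphaned until no stoppedAtKey is
// returned, pausing between iterations to limit the load on the shard.
//   runAdminCommand: function that sends a command document and returns the reply
//   pause:           function that blocks for the given number of milliseconds
function cleanupWithPause(runAdminCommand, namespace, pauseMillis, pause) {
    var nextKey = {};
    var iterations = 0;

    while (nextKey != null) {
        var result = runAdminCommand({
            cleanupOrphaned: namespace,
            startingFromKey: nextKey
        });
        iterations++;

        if (result.ok != 1) {
            break; // failure or timeout: stop instead of retrying blindly
        }

        nextKey = result.stoppedAtKey; // absent after the uppermost range
        if (nextKey != null) {
            pause(pauseMillis);
        }
    }
    return iterations;
}

// In the mongo shell, on the primary of the shard:
//   cleanupWithPause(function (cmd) { return db.adminCommand(cmd); },
//                    "test.user", 1000, sleep);
```

Stopping on the first unsuccessful reply is a design choice of this sketch; a script could instead log the failure and retry later.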

Determine Range

The cleanupOrphaned command uses the startingFromKey value, if specified, to determine the start of the range to examine for orphaned documents:

  • If the startingFromKey value falls into a range for a chunk not owned by the shard, cleanupOrphaned begins examining at the start of this range, which may not necessarily be the startingFromKey.
  • If the startingFromKey value falls into a range for a chunk owned by the shard, cleanupOrphaned moves on to the next range until it finds a range for a chunk not owned by the shard.

The cleanupOrphaned command deletes orphaned documents beginning at the start of the determined range and stops at the start of the next chunk range that belongs to the shard.

Consider the following key space with documents distributed across Shard A and Shard B.

Diagram of shard key value space, showing chunk ranges and shards.

Shard A owns:

  • Chunk 1 with the range { x: MinKey } --> { x: -75 },
  • Chunk 2 with the range { x: -75 } --> { x: 25 }, and
  • Chunk 4 with the range { x: 175 } --> { x: 200 }.

Shard B owns:

  • Chunk 3 with the range { x: 25 } --> { x: 175 } and
  • Chunk 5 with the range { x: 200 } --> { x: MaxKey }.

If on Shard A, the cleanupOrphaned command runs with startingFromKey: { x: -70 } or any other value belonging to the range for Chunk 1 or Chunk 2, the command examines the Chunk 3 range of { x: 25 } --> { x: 175 } to delete orphaned data.

If on Shard B, the cleanupOrphaned command runs with startingFromKey: { x: -70 } or any other value belonging to the range for Chunk 1, the command examines the combined contiguous range for Chunk 1 and Chunk 2, namely { x: MinKey } --> { x: 25 }, to delete orphaned data.
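The range selection described above can be modeled with a short sketch. This is a simplified illustration, not the server's implementation: shard keys are plain numbers, -Infinity and Infinity stand in for MinKey and MaxKey, and the function name determineCleanupRange is hypothetical. It takes only the chunks owned by the shard, sorted by lower bound, and returns the range that the command would examine.

```javascript
// Simplified model: shard keys are plain numbers, with -Infinity and
// Infinity standing in for MinKey and MaxKey (an assumption of this sketch).
// ownedChunks lists the chunks owned by the shard, sorted by min.
function determineCleanupRange(ownedChunks, startingFromKey) {
    var key = startingFromKey;
    var lower = -Infinity; // start of the current range not owned by the shard

    for (var i = 0; i < ownedChunks.length; i++) {
        var chunk = ownedChunks[i];
        if (key < chunk.min) {
            // The key lies in a gap before this owned chunk: clean that gap.
            return { min: lower, max: chunk.min };
        }
        lower = chunk.max;
        if (key < chunk.max) {
            // The key lies inside an owned chunk: move on to the next range.
            key = chunk.max;
        }
    }
    // Past the last owned chunk: the range runs to MaxKey, unless the shard
    // owns everything up to MaxKey (this sketch then returns null).
    return lower === Infinity ? null : { min: lower, max: Infinity };
}

// Key space from the example: Shard A owns Chunks 1, 2 and 4,
// Shard B owns Chunks 3 and 5.
var shardA = [ { min: -Infinity, max: -75 }, { min: -75, max: 25 }, { min: 175, max: 200 } ];
var shardB = [ { min: 25, max: 175 }, { min: 200, max: Infinity } ];

var rangeOnA = determineCleanupRange(shardA, -70); // { min: 25, max: 175 }
var rangeOnB = determineCleanupRange(shardB, -70); // { min: -Infinity, max: 25 }
```

On Shard A the key falls into an owned chunk, so the sketch skips forward to the Chunk 3 range; on Shard B the key falls into a range the shard does not own, so the whole contiguous gap below Chunk 3 is selected.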

Required Access

On systems running with authorization, you must have clusterAdmin privileges to run cleanupOrphaned.

Output

Return Document

Each cleanupOrphaned command returns a document containing a subset of the following fields:

  • cleanupOrphaned.ok

    Equal to 1 on success.

    A value of 1 indicates that cleanupOrphaned scanned the specified shard key range, deleted any orphaned documents found in that range, and confirmed that all deletes replicated to a majority of the members of that shard's replica set. If confirmation does not arrive within 1 hour, cleanupOrphaned times out.

    A value of 0 could indicate either of two cases:

      • cleanupOrphaned found orphaned documents on the shard but could not delete them.
      • cleanupOrphaned found and deleted orphaned documents, but could not confirm replication before the 1 hour timeout. In this case, replication does occur, but only after cleanupOrphaned returns.

  • cleanupOrphaned.stoppedAtKey

    The upper bound of the cleanup range of shard keys. If present, the value corresponds to the lower bound of the next chunk on the shard. The absence of the field signifies that the cleanup range was the uppermost range for the shard.
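A script that consumes the return document has three outcomes to distinguish. The following sketch shows one way to do so; describeCleanupResult and the status strings it returns are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: classify a cleanupOrphaned return document.
function describeCleanupResult(result) {
    if (result.ok != 1) {
        // Either the deletes failed or replication was not confirmed in time.
        return { status: "failed", nextKey: null };
    }
    if (result.stoppedAtKey == null) {
        // No stoppedAtKey: the cleanup range was the uppermost range.
        return { status: "finished", nextKey: null };
    }
    // More ranges may follow; resume from the returned key.
    return { status: "continue", nextKey: result.stoppedAtKey };
}
```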

Examples

The following examples run the cleanupOrphaned command directly on the primary of the shard.

Remove Orphaned Documents for a Specific Range

For a sharded collection info in the test database, a shard owns a single chunk with the range: { x: MinKey } --> { x: 10 }.

The shard also contains documents whose shard key values fall in a range for a chunk not owned by the shard: { x: 10 } --> { x: MaxKey }.

To remove orphaned documents within the { x: 10 } --> { x: MaxKey } range, you can specify a startingFromKey with a value that falls into this range, as in the following example:

    db.adminCommand( {
       "cleanupOrphaned": "test.info",
       "startingFromKey": { x: 10 },
       "secondaryThrottle": true
    } )

Or you can specify a startingFromKey with a value that falls into the previous range, as in the following:

    db.adminCommand( {
       "cleanupOrphaned": "test.info",
       "startingFromKey": { x: 2 },
       "secondaryThrottle": true
    } )

Since { x: 2 } falls into a range that belongs to a chunk owned by the shard, cleanupOrphaned examines the next range to find a range not owned by the shard, in this case { x: 10 } --> { x: MaxKey }.

Remove All Orphaned Documents from a Shard

cleanupOrphaned examines documents from a single contiguous range of shard keys. To remove all orphaned documents from the shard, you can run cleanupOrphaned in a loop, using the returned stoppedAtKey as the next startingFromKey, as in the following:

    var nextKey = { };
    var result;

    // Repeat until the return document no longer contains a stoppedAtKey.
    while ( nextKey != null ) {
      result = db.adminCommand( { cleanupOrphaned: "test.user", startingFromKey: nextKey } );

      if (result.ok != 1) {
         print("Unable to complete at this time: failure or timeout.");
      }

      printjson(result);

      // stoppedAtKey is absent after the uppermost range, which ends the loop.
      nextKey = result.stoppedAtKey;
    }