Create Chunks in a Sharded Cluster

In most situations a sharded cluster will create/split and distribute chunks automatically without user intervention. However, in a limited number of cases, MongoDB cannot create enough chunks or distribute data fast enough to support the required throughput.

For example, you may want to ingest a large volume of data into a cluster that is unbalanced, or where the ingestion itself will lead to data imbalance, as happens with monotonically increasing or decreasing shard keys. Pre-splitting the chunks of an empty sharded collection can help with throughput in these cases.

Alternatively, starting in MongoDB 4.0.3, if you define the zones and zone ranges before sharding an empty or non-existent collection, the shard collection operation creates chunks for the defined zone ranges, plus any additional chunks needed to cover the entire range of shard key values, and performs an initial chunk distribution based on the zone ranges. For more information, see Empty Collection.

Warning

Only pre-split chunks for an empty collection. Manually splitting chunks for a populated collection can lead to unpredictable chunk ranges and sizes as well as inefficient or ineffective balancing behavior.

To split empty chunks manually, you can run the split command:

Example

To create chunks for documents in the myapp.users collection using the email field as the shard key, use the following operation in the mongo shell:

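  // Split on two-letter email prefixes: every first letter 'a'..'z',
  // combined with every sixth second letter ('a', 'g', 'm', 's', 'y').
  for ( var x=97; x<97+26; x++ ){
    for ( var y=97; y<97+26; y+=6 ) {
      var prefix = String.fromCharCode(x) + String.fromCharCode(y);
      db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
    }
  }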

This assumes a collection size of 100 million documents.
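
To see which split points the loop above produces before running it against a cluster, the same iteration can be sketched in plain JavaScript (for example in Node.js). The helper name splitPoints is hypothetical and used only for illustration; the snippet collects the prefixes into an array instead of issuing split commands.

```javascript
// Hypothetical helper: builds the same two-letter prefixes as the
// pre-split loop, but only collects them and does not contact a cluster.
function splitPoints() {
  const points = [];
  for (let x = 97; x < 97 + 26; x++) {       // first letter: 'a' through 'z'
    for (let y = 97; y < 97 + 26; y += 6) {  // second letter: 'a', 'g', 'm', 's', 'y'
      points.push(String.fromCharCode(x) + String.fromCharCode(y));
    }
  }
  return points;
}

const points = splitPoints();
console.log(points.length);              // 130
console.log(points.slice(0, 6).join(" ")); // aa ag am as ay ba
```

The loop yields 26 x 5 = 130 distinct split points. Assuming the empty collection starts out as a single chunk, applying all of them would leave 131 chunks. Changing the step of the inner loop is one way to adjust that number for a different expected collection size.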