Sync Modules

New in version Kraken.

The multisite functionality of RGW, introduced in Jewel, makes it possible to create multiple zones and mirror data and metadata between them. Sync modules are built on top of the multisite framework and allow data and metadata to be forwarded to a different external tier. A sync module allows a set of actions to be performed whenever a change in data occurs (metadata operations such as bucket or user creation are also regarded as changes in data). Because RGW multisite changes are eventually consistent at remote sites, changes are propagated asynchronously. This unlocks use cases such as backing up the object storage to an external cloud cluster or to a custom backup solution using tape drives, or indexing metadata in Elasticsearch.

A sync module configuration is local to a zone. The sync module determines whether the zone exports data or can only consume data that was modified in another zone. As of Luminous, the supported sync plugins are elasticsearch; rgw, which is the default sync plugin that synchronises data between the zones; and log, which is a trivial sync plugin that logs the metadata operations that happen in the remote zones. The following sections use a zone with the elasticsearch sync module as the example; the process is similar for configuring any other sync plugin.

Requirements and Assumptions

Let us assume a simple multisite configuration, as described in the Multisite docs, of two zones, us-east and us-west. Let's add a third zone, us-east-es, which only processes metadata from the other sites. This zone can be in the same Ceph cluster as us-east or in a different one. It only consumes metadata from other zones, and RGWs in this zone will not serve any end user requests directly.

Configuring Sync Modules

Create the third zone in the same way as described in the Multisite docs, for example:

  # radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-es \
      --access-key={system-key} --secret={secret} --endpoints=http://rgw-es:80

A sync module can be configured for this zone with the following command:

  # radosgw-admin zone modify --rgw-zone={zone-name} --tier-type={tier-type} --tier-config={set of key=value pairs}

For example, for the elasticsearch sync module:

  # radosgw-admin zone modify --rgw-zone={zone-name} --tier-type=elasticsearch \
      --tier-config=endpoint=http://localhost:9200,num_shards=10,num_replicas=1

For the supported tier-config options, refer to the elasticsearch sync module docs.
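Since the tier-config argument is a single comma-separated list of key=value pairs, it can be easier to keep the pairs separate and join them just before calling radosgw-admin. The following is a minimal sketch under that assumption: `join_tier_config` is a hypothetical helper (not part of Ceph), and the zone name and option values are simply the ones used in the example above. The script only prints the resulting command so that it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): join key=value arguments into one
# comma-separated string suitable for --tier-config.
join_tier_config() {
    out=""
    for pair in "$@"; do
        # Reject anything that is not of the form key=value.
        case "$pair" in
            *=*) ;;
            *) echo "not a key=value pair: $pair" >&2; return 1 ;;
        esac
        if [ -z "$out" ]; then
            out="$pair"
        else
            out="$out,$pair"
        fi
    done
    printf '%s\n' "$out"
}

# Example values taken from the elasticsearch example in the text.
tier_config=$(join_tier_config \
    endpoint=http://localhost:9200 \
    num_shards=10 \
    num_replicas=1)

# Print the command instead of running it, so it can be checked first.
echo "radosgw-admin zone modify --rgw-zone=us-east-es --tier-type=elasticsearch --tier-config=$tier_config"
```

The helper does no escaping, so it is only suitable for values that do not themselves contain commas.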

Finally, update the period:

  # radosgw-admin period update --commit

Now start and enable the radosgw service in the zone:

  # systemctl start ceph-radosgw@rgw.`hostname -s`
  # systemctl enable ceph-radosgw@rgw.`hostname -s`
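The steps in this section can be collected into one script. The sketch below is illustrative rather than a definitive procedure: the `run` and `configure_es_zone` functions, the `DRY_RUN` switch and the placeholder keys are assumptions introduced here, and the zone, zonegroup and endpoint values are the example values from the text. By default the script only prints the commands (dry-run mode); setting `DRY_RUN=0` executes them.

```shell
#!/bin/sh
# Sketch of the whole procedure. All values below are illustrative
# assumptions; replace them with the values for your cluster.
ZONEGROUP="us"
ZONE="us-east-es"
ZONE_ENDPOINT="http://rgw-es:80"
ES_ENDPOINT="http://localhost:9200"
ACCESS_KEY="${ACCESS_KEY:-SYSTEM_ACCESS_KEY}"   # placeholder, not a real key
SECRET_KEY="${SECRET_KEY:-SYSTEM_SECRET_KEY}"   # placeholder, not a real key
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

configure_es_zone() {
    # 1. Create the third zone.
    run radosgw-admin zone create --rgw-zonegroup="$ZONEGROUP" --rgw-zone="$ZONE" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" \
        --endpoints="$ZONE_ENDPOINT" || return 1
    # 2. Attach the elasticsearch sync module to the zone.
    run radosgw-admin zone modify --rgw-zone="$ZONE" --tier-type=elasticsearch \
        --tier-config="endpoint=$ES_ENDPOINT,num_shards=10,num_replicas=1" || return 1
    # 3. Commit the change to the period.
    run radosgw-admin period update --commit || return 1
    # 4. Start and enable the gateway; fall back to uname -n when
    #    hostname -s is not available.
    unit="ceph-radosgw@rgw.$(hostname -s 2>/dev/null || uname -n)"
    run systemctl start "$unit" || return 1
    run systemctl enable "$unit" || return 1
}

configure_es_zone
```

Running the script as is prints five lines, each starting with `+ `, one per command. Once the gateway is running, `radosgw-admin sync status` can be used in the new zone to check that metadata is being pulled from the other zones.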