Communicating between Replication Manager Sites

Replication Manager provides all of the communications between sites needed for an application that performs all of its write operations on the master site and only performs read operations on client sites.

For applications with a simple data and transaction model, Replication Manager also provides automatic write forwarding as a configurable option. When write forwarding is configured, some write operations can be performed on clients. In addition, an application can use Replication Manager message channels to send its own messages to other sites in the replication group using Replication Manager’s internal communications infrastructure.

These approaches are described in the following sections.

Configuring for Write Forwarding

By default, write operations cannot be performed on a replication client site. Replication Manager provides a configurable option that allows forwarding of simple client put and delete operations to the master site for processing. These operations must use an implicit NULL transaction ID to be forwarded. Any other write operation that specifies a non-NULL transaction or uses a cursor returns an error. This option is turned off by default.

To configure write forwarding, call the DB_ENV->rep_set_config() method with the DB_REPMGR_CONF_FORWARD_WRITES option. (See the DB_REPMGR_CONF_FORWARD_WRITES section in the Berkeley DB C API Reference Guide for more information.)

The master must have an open database handle for the database on which a forwarded write operation is being performed. All sites in the replication group should have the same value for this configuration option.
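As a rough illustration of the configuration step and of the kind of write that qualifies for forwarding, the following C sketch turns the option on and then performs a simple put with a NULL transaction handle. The helper names enable_write_forwarding() and forwarded_put() are hypothetical, and the sketch assumes the environment and database handles have already been created and opened elsewhere.

```c
#include <string.h>
#include <db.h>

/* Hypothetical helper: turn on write forwarding for this environment.
 * The same setting should be applied on every site in the group. */
int
enable_write_forwarding(DB_ENV *dbenv)
{
    return (dbenv->rep_set_config(dbenv,
        DB_REPMGR_CONF_FORWARD_WRITES, 1));
}

/* Hypothetical helper: a simple put issued on a client site.
 * The transaction argument is NULL, which is what makes the
 * operation eligible for forwarding to the master. */
int
forwarded_put(DB *dbp, const char *k, const char *v)
{
    DBT key, data;

    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data = (void *)k;
    key.size = (u_int32_t)strlen(k) + 1;
    data.data = (void *)v;
    data.size = (u_int32_t)strlen(v) + 1;

    return (dbp->put(dbp, NULL, &key, &data, 0));
}
```

A put that passes a non-NULL transaction handle, or a write made through a cursor, would not be forwarded and returns an error instead.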

The following restrictions apply to the use of write forwarding:

  • The application cannot use Replication Manager message channels for any other purpose.

  • All sites in the replication group must be on platforms of the same endianness.

  • Use of write forwarding is not supported with BDB SQL HA.

  • Write forwarding cannot be performed on databases using callbacks or values that must be supplied for each DB->open() call to avoid database corruption. Examples include the callbacks or values defined by the DB->set_dup_compare(), DB->set_bt_compare(), DB->set_h_hash() and DB->set_heapsize() methods.

  • Write forwarding cannot be performed on a partitioned database.

  • Bulk put and del operations using DB_MULTIPLE or DB_MULTIPLE_KEY flags are not supported.

For more information, see Berkeley DB Getting Started with Replicated Applications.

Using Replication Manager message channels

The sites in a replication group frequently need to communicate with one another. Replication Manager handles most of these messages for you internally. However, your application may need to pass messages of its own, beyond those that Replication Manager exchanges for its internal workings.

For this reason, you can access and use the Replication Manager’s internal message channels. You do this by using the DB_CHANNEL class, and by implementing a message handling function on each of your sites.

Note that an example of using Replication Manager message channels is available in the distribution. See Ex_rep_chan: a Replication Manager channel example for more information.

DB_CHANNEL

The DB_CHANNEL class provides a series of methods which allow you to send messages to the other sites in your replication group. You create a DB_CHANNEL handle using the DB_ENV->repmgr_channel() method. When you are done with the handle, close it using the DB_CHANNEL->close() method. A closed handle must never be accessed again. Note that all channel handles should be closed before the associated environment handle is closed. Also, allow all message operations to complete on the channel before closing the handle.

When you create a DB_CHANNEL handle, you indicate what channel you want to use. Possibilities are:

  • The numerical env ID of a remote site in the replication group.

  • DB_EID_MASTER

    Messages sent on this channel are sent only to the master site. Note that messages are always sent to the current master, even if the master has changed since the channel was opened.

    If the local site is the master, then sending messages on this channel will result in the local site receiving those messages echoed back to itself.

Sending messages over a message channel

You can send any message you want over a message channel. The message can be as simple as a character string or as complex as a large data structure. However, before you can send the message, you must encapsulate it within one or more DBTs. This means marshaling the message if it is contained within a complex data structure.

The methods that you use to send messages all accept an array of DBTs. This means that in most circumstances it is perfectly acceptable to send multi-part messages.
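To make the marshaling step concrete, here is a minimal sketch in plain C that packs a small structure into a byte buffer with a fixed field layout and unpacks it again. The order_msg structure, its fields, and both function names are hypothetical and are not part of Berkeley DB; the big-endian layout is one possible choice, not a requirement of the library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical application message; not part of Berkeley DB. */
struct order_msg {
    uint32_t id;
    uint16_t qty;
    char     name[16];
};

/* Bytes on the wire: 4 (id) + 2 (qty) + 16 (name). */
#define ORDER_MSG_WIRE_SIZE (4 + 2 + 16)

/* Write the fields into buf in a fixed big-endian layout.
 * buf must hold at least ORDER_MSG_WIRE_SIZE bytes.
 * Returns the number of bytes written. */
size_t
order_msg_marshal(const struct order_msg *m, unsigned char *buf)
{
    buf[0] = (unsigned char)(m->id >> 24);
    buf[1] = (unsigned char)(m->id >> 16);
    buf[2] = (unsigned char)(m->id >> 8);
    buf[3] = (unsigned char)(m->id);
    buf[4] = (unsigned char)(m->qty >> 8);
    buf[5] = (unsigned char)(m->qty);
    memcpy(buf + 6, m->name, sizeof(m->name));
    return (ORDER_MSG_WIRE_SIZE);
}

/* Rebuild the structure from a received buffer.
 * Returns 0 on success, -1 if the length is not the expected size. */
int
order_msg_unmarshal(const unsigned char *buf, size_t len,
    struct order_msg *m)
{
    if (len != ORDER_MSG_WIRE_SIZE)
        return (-1);
    m->id = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    m->qty = (uint16_t)(((unsigned)buf[4] << 8) | buf[5]);
    memcpy(m->name, buf + 6, sizeof(m->name));
    return (0);
}
```

The marshaled buffer is what a DBT would then describe: its data field points at the buffer and its size field holds the number of bytes written. Writing each field byte by byte keeps the layout independent of the byte order of the sending and receiving machines.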

Messages may be sent either asynchronously or synchronously. To send a message asynchronously, use the DB_CHANNEL->send_msg() method. This method sends its message and then immediately returns without waiting for any sort of a response.

To send a message synchronously, use the DB_CHANNEL->send_request() method. This method blocks until it receives a response from the site to which it sent the message (or until a timeout threshold is reached).
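The two sending styles might look roughly like the following C sketch. The helper names are hypothetical. The five-second timeout is an arbitrary value chosen for illustration, on the assumption that the timeout is expressed in microseconds, and DB_DBT_MALLOC is just one way of configuring the response DBT.

```c
#include <string.h>
#include <db.h>

/* Hypothetical helper: open a channel to whichever site is master. */
int
open_master_channel(DB_ENV *dbenv, DB_CHANNEL **chanp)
{
    return (dbenv->repmgr_channel(dbenv, DB_EID_MASTER, chanp, 0));
}

/* Hypothetical helper: asynchronous send of a single-part message.
 * Returns as soon as the message is handed off. */
int
send_async(DB_CHANNEL *chan, void *buf, u_int32_t len)
{
    DBT msg;

    memset(&msg, 0, sizeof(msg));
    msg.data = buf;
    msg.size = len;
    return (chan->send_msg(chan, &msg, 1, 0));
}

/* Hypothetical helper: synchronous send. Blocks until the remote
 * site responds or the timeout expires. */
int
send_sync(DB_CHANNEL *chan, void *buf, u_int32_t len, DBT *response)
{
    DBT req;

    memset(&req, 0, sizeof(req));
    req.data = buf;
    req.size = len;

    /* Let the library allocate the response buffer; the caller
     * is then responsible for freeing response->data. */
    memset(response, 0, sizeof(*response));
    response->flags = DB_DBT_MALLOC;

    /* Assumed: timeout in microseconds (5 seconds here). */
    return (chan->send_request(chan, &req, 1, response, 5000000, 0));
}
```

When the application has finished with the channel, and all outstanding message operations on it have completed, it would close the handle with DB_CHANNEL->close().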

Message Responses

Message responses are required if a message is sent on a channel using the DB_CHANNEL->send_request() method. That method accepts the address of a single DBT which is used to receive the response from the remote site.

Message responses are encapsulated in a single DBT. The response can be anything from a complex data structure, to a string, to a simple type, to no information at all. In the last case, receipt of the DBT is sufficient to indicate that the request was received at the remote site.

Responses are sent back from the remote system using its message handling function. Usually that function calls DB_CHANNEL->send_msg() to send a single response.

The response must be contained in a single DBT. If a multi-part response is required by the application, you can configure the response DBT that you provide to DB_CHANNEL->send_request() for bulk operations.

Receiving messages

Messages received at a remote site are handled using a callback function. This function is configured for the local environment using the DB_ENV->repmgr_msg_dispatch() method. For best results, the message dispatch function should be configured for the local environment before replication is started. In this way, you do not run the risk of missing messages sent after replication has started but before the message dispatch function is configured for the environment.

The callback configured by DB_ENV->repmgr_msg_dispatch() accepts four parameters of note:

  • A response channel. This is the channel the function will use to respond to the message, if a response is required. To respond to the message, the function uses the DB_CHANNEL->send_msg() method.

  • An array of DBTs. These hold the message that this function must handle.

  • A numerical value that indicates how many elements the previously described array holds.

  • A flag that indicates whether the message requires a response. If the flag is set to DB_REPMGR_NEED_RESPONSE, then the function should send a single DBT in response by calling the DB_CHANNEL->send_msg() method on the channel provided to this function.
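The parameters listed above fit together roughly as in the following C sketch of a dispatch callback. The names handle_message() and install_dispatch() are hypothetical, the fixed "ok" reply is only a placeholder, and the application-specific processing of the request is left out.

```c
#include <string.h>
#include <db.h>

/* Hypothetical message handler. Berkeley DB calls it with the
 * response channel, the array of DBTs holding the message, the
 * number of elements in that array, and the response flag. */
static void
handle_message(DB_ENV *dbenv, DB_CHANNEL *channel,
    DBT *request, u_int32_t nrequest, u_int32_t cb_flags)
{
    static char ack[] = "ok";   /* placeholder reply */
    DBT reply;

    /* Application-specific processing of request[0] through
     * request[nrequest - 1] would go here. */
    (void)request;
    (void)nrequest;

    /* Reply only when the sender is waiting for a response. */
    if (cb_flags & DB_REPMGR_NEED_RESPONSE) {
        memset(&reply, 0, sizeof(reply));
        reply.data = ack;
        reply.size = (u_int32_t)sizeof(ack);
        if (channel->send_msg(channel, &reply, 1, 0) != 0)
            dbenv->errx(dbenv, "failed to send response");
    }
}

/* Hypothetical helper: register the handler for this environment,
 * ideally before replication is started. */
int
install_dispatch(DB_ENV *dbenv)
{
    return (dbenv->repmgr_msg_dispatch(dbenv, handle_message, 0));
}
```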

For an example of using this callback, see the operation_dispatch() function, which is available with the ex_rep_chan example in your product distribution.