Messenger v2

What is it

The messenger v2 protocol, or msgr2, is the second major revision of Ceph’s on-wire protocol. It brings with it several key features:

  • A secure mode that encrypts all data passing over the network

  • Improved encapsulation of authentication payloads, enabling future integration of new authentication modes like Kerberos

  • Earlier and improved feature advertisement and negotiation, enabling future protocol revisions

Ceph daemons can now bind to multiple ports, allowing both legacy Ceph clients and new v2-capable clients to connect to the same cluster.

By default, monitors now bind to the new IANA-assigned port 3300 (ce4h or 0xce4) for the new v2 protocol, while also binding to the old default port 6789 for the legacy v1 protocol.

Address formats

Prior to Nautilus, all network addresses were rendered like 1.2.3.4:567/89012, where there was an IP address, a port, and a nonce to uniquely identify a client or daemon on the network. Starting with Nautilus, we now have three different address types:

  • v2: v2:1.2.3.4:578/89012 identifies a daemon binding to a port speaking the new v2 protocol

  • v1: v1:1.2.3.4:578/89012 identifies a daemon binding to a port speaking the legacy v1 protocol. Any address that was previously shown without any prefix is now shown as a v1: address.

  • TYPE_ANY addresses identify a client that can speak either version of the protocol. Prior to Nautilus, clients would appear as 1.2.3.4:0/123456, where the port of 0 indicates they are clients and do not accept incoming connections. Starting with Nautilus, these clients are now internally represented by a TYPE_ANY address, and still shown with no prefix, because they may connect to daemons using the v2 or v1 protocol, depending on what protocol(s) the daemons are using.
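
The three address forms above can be told apart mechanically. The following is a minimal sketch, assuming IPv4 addresses only; `EntityAddr` and `parse_addr` are hypothetical names used for illustration and are not part of Ceph, and the label "any" simply stands in for an unprefixed TYPE_ANY address.

```python
from typing import NamedTuple


class EntityAddr(NamedTuple):
    kind: str   # "v2", "v1", or "any" (no prefix, i.e. a TYPE_ANY client)
    host: str
    port: int   # 0 means the entity does not accept incoming connections
    nonce: int


def parse_addr(text: str) -> EntityAddr:
    """Parse '[v1:|v2:]HOST:PORT/NONCE' (IPv4 only in this sketch)."""
    kind = "any"
    for prefix in ("v1", "v2"):
        if text.startswith(prefix + ":"):
            kind, text = prefix, text[len(prefix) + 1:]
            break
    hostport, _, nonce = text.partition("/")
    host, _, port = hostport.rpartition(":")
    if not host or not port or not nonce:
        raise ValueError(f"malformed address: {text!r}")
    return EntityAddr(kind, host, int(port), int(nonce))


if __name__ == "__main__":
    print(parse_addr("v2:1.2.3.4:578/89012"))
    print(parse_addr("1.2.3.4:0/123456"))
```

An address with no prefix and port 0 comes back with kind "any", matching the client case described in the last bullet.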

Because daemons now bind to multiple ports, they are now described by a vector of addresses instead of a single address. For example, dumping the monitor map on a Nautilus cluster now includes lines like:

  epoch 1
  fsid 50fcf227-be32-4bcb-8b41-34ca8370bd16
  last_changed 2019-02-25 11:10:46.700821
  created 2019-02-25 11:10:46.700821
  min_mon_release 14 (nautilus)
  0: [v2:10.0.0.10:3300/0,v1:10.0.0.10:6789/0] mon.foo
  1: [v2:10.0.0.11:3300/0,v1:10.0.0.11:6789/0] mon.bar
  2: [v2:10.0.0.12:3300/0,v1:10.0.0.12:6789/0] mon.baz

The bracketed list or vector of addresses means that the same daemon can be reached on multiple ports (and protocols). Any client or other daemon connecting to that daemon will use the v2 protocol (listed first) if possible; otherwise it will fall back to the legacy v1 protocol. Legacy clients will only see the v1 addresses and will continue to connect as they did before, with the v1 protocol.
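
The "first usable entry wins" behaviour described above can be sketched as follows. This is an illustration only, not Ceph's implementation: `parse_addrvec` and `pick_addr` are hypothetical helpers, and a legacy client is modelled simply as one whose supported set contains only "v1".

```python
from typing import List, Optional, Set, Tuple

AddrVec = List[Tuple[str, str]]


def parse_addrvec(text: str) -> AddrVec:
    """Split '[v2:A,v1:B]' into [('v2', 'A'), ('v1', 'B')], keeping order."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError(f"expected a bracketed vector: {text!r}")
    entries: AddrVec = []
    for item in inner[1:-1].split(","):
        proto, _, rest = item.strip().partition(":")
        if proto not in ("v1", "v2") or not rest:
            raise ValueError(f"malformed entry: {item!r}")
        entries.append((proto, rest))
    return entries


def pick_addr(addrvec: AddrVec, supported: Set[str]) -> Optional[Tuple[str, str]]:
    """Return the first entry whose protocol the connecting side speaks."""
    for proto, addr in addrvec:
        if proto in supported:
            return proto, addr
    return None


if __name__ == "__main__":
    vec = parse_addrvec("[v2:10.0.0.10:3300/0,v1:10.0.0.10:6789/0]")
    print(pick_addr(vec, {"v1", "v2"}))  # a v2-capable peer
    print(pick_addr(vec, {"v1"}))        # a legacy peer
```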

Starting in Nautilus, the mon_host configuration option and the -m <mon-host> command line option support the same bracketed address vector syntax.

Bind configuration options

Two new configuration options control whether the v1 and/or v2 protocol is used:

  • ms_bind_msgr1 [default: true] controls whether a daemon binds to a port speaking the v1 protocol

  • ms_bind_msgr2 [default: true] controls whether a daemon binds to a port speaking the v2 protocol

Similarly, two options control whether IPv4 and IPv6 addresses are used:

  • ms_bind_ipv4 [default: true] controls whether a daemon binds to an IPv4 address

  • ms_bind_ipv6 [default: false] controls whether a daemon binds to an IPv6 address

Connection modes

The v2 protocol supports two connection modes:

  • crc mode provides:

    • a strong initial authentication when the connection is established (with cephx, mutual authentication of both parties with protection from a man-in-the-middle or eavesdropper), and

    • a crc32c integrity check to protect against bit flips due to flaky hardware or cosmic rays

    crc mode does not provide:

    • secrecy (an eavesdropper on the network can see all post-authentication traffic as it goes by) or

    • protection from a malicious man-in-the-middle (who can deliberately modify traffic as it goes by, as long as they are careful to adjust the crc32c values to match)

  • secure mode provides:

    • a strong initial authentication when the connection is established (with cephx, mutual authentication of both parties with protection from a man-in-the-middle or eavesdropper), and

    • full encryption of all post-authentication traffic, including a cryptographic integrity check.

In Nautilus, secure mode uses the AES-GCM stream cipher, which is generally very fast on modern processors (e.g., faster than a SHA-256 cryptographic hash).
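
To make the limits of crc mode concrete, the sketch below implements the standard CRC-32C (Castagnoli) checksum bit by bit and shows both cases from the list above: an accidental bit flip is caught, while an attacker who rewrites the payload and recomputes the checksum goes unnoticed. It uses the common CRC-32C conventions (initial value and final XOR of 0xFFFFFFFF); how msgr2 seeds and frames its checksums is not covered here, and `frame` and `verify` are hypothetical helpers.

```python
from typing import Tuple


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C using the reflected Castagnoli polynomial."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def frame(payload: bytes) -> Tuple[bytes, int]:
    """Sender side: attach a checksum to the payload."""
    return payload, crc32c(payload)


def verify(payload: bytes, checksum: int) -> bool:
    """Receiver side: recompute and compare."""
    return crc32c(payload) == checksum


if __name__ == "__main__":
    payload, checksum = frame(b"write 4096 bytes to object A")

    # Accidental corruption: one flipped bit, checksum left unchanged.
    corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
    print("bit flip detected:", not verify(corrupted, checksum))

    # Deliberate tampering: the attacker needs no key to recompute the CRC.
    forged = b"write 4096 bytes to object B"
    print("forgery accepted:", verify(forged, crc32c(forged)))
```

A keyed cryptographic integrity check, as in secure mode, closes the second case because the attacker cannot compute a valid tag without the session key.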

Connection mode configuration options

For most connections, there are options that control which modes are used:

  • ms_cluster_mode is the connection mode (or permitted modes) used for intra-cluster communication between Ceph daemons. If multiple modes are listed, the modes listed first are preferred.

  • ms_service_mode is a list of permitted modes for clients to use when connecting to the cluster.

  • ms_client_mode is a list of connection modes, in order of preference, for clients to use (or allow) when talking to a Ceph cluster.

There is a parallel set of options that apply specifically to monitors, allowing administrators to set different (usually more secure) requirements on communication with the monitors.

  • ms_mon_cluster_mode is the connection mode (or permitted modes) to use between monitors.

  • ms_mon_service_mode is a list of permitted modes for clients or other Ceph daemons to use when connecting to monitors.

  • ms_mon_client_mode is a list of connection modes, in order of preference, for clients or non-monitor daemons to use when connecting to monitors.
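
One way to read the interaction between a client's preference list and a service's permitted list is sketched below. It assumes the negotiated mode is the first client-preferred mode that the service also permits; this is an illustrative model, not a description of Ceph's internal negotiation code, and `negotiate_mode` is a hypothetical name.

```python
from typing import List, Optional


def negotiate_mode(client_modes: List[str],
                   service_modes: List[str]) -> Optional[str]:
    """Return the first client-preferred mode the service permits, if any."""
    permitted = set(service_modes)
    for mode in client_modes:
        if mode in permitted:
            return mode
    return None  # no common mode: the connection cannot be established


if __name__ == "__main__":
    # Client prefers secure; service permits both modes.
    print(negotiate_mode(["secure", "crc"], ["crc", "secure"]))
    # Monitors that permit only secure reject a crc-only client.
    print(negotiate_mode(["crc"], ["secure"]))
```

Under this model, restricting ms_mon_service_mode to secure forces every peer that talks to the monitors into secure mode, regardless of the order in the peer's own list.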

Transitioning from v1-only to v2-plus-v1

By default, ms_bind_msgr2 is true starting with Nautilus 14.2.z. However, until the monitors start using v2, only limited services will start advertising v2 addresses.

For most users, the monitors are binding to the default legacy port 6789 for the v1 protocol. When this is the case, enabling v2 is as simple as:

  ceph mon enable-msgr2

If the monitors are bound to non-standard ports, you will need to specify an additional port for v2 explicitly. For example, if your monitor mon.a binds to 1.2.3.4:1111, and you want to add v2 on port 1112:

  ceph mon set-addrs a [v2:1.2.3.4:1112,v1:1.2.3.4:1111]

Once the monitors bind to v2, each daemon will start advertising a v2 address when it is next restarted.
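
When several monitors use non-standard ports, the argument to ceph mon set-addrs can be generated rather than typed by hand. The helper below is a minimal sketch; `set_addrs_command` is a hypothetical function that only builds the argument list shown in the example above and does not run anything.

```python
from typing import List


def set_addrs_command(mon_id: str, host: str,
                      v1_port: int, v2_port: int) -> List[str]:
    """Build the argv for 'ceph mon set-addrs' with v2 listed first."""
    for port in (v1_port, v2_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    if v1_port == v2_port:
        raise ValueError("v1 and v2 must use different ports")
    addrvec = f"[v2:{host}:{v2_port},v1:{host}:{v1_port}]"
    return ["ceph", "mon", "set-addrs", mon_id, addrvec]


if __name__ == "__main__":
    argv = set_addrs_command("a", "1.2.3.4", v1_port=1111, v2_port=1112)
    print(" ".join(argv))
```

Returning an argument list rather than a single string means the brackets never pass through a shell, so they cannot be misread as a glob pattern if the list is later handed to something like subprocess.run.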

Updating ceph.conf and mon_host

Prior to Nautilus, a CLI user or daemon would normally discover the monitors via the mon_host option in /etc/ceph/ceph.conf. The syntax for this option has expanded starting with Nautilus to support the new bracketed list format. For example, an old line like:

  mon_host = 10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789

can be changed to:

  mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0],[v2:10.0.0.2:3300/0,v1:10.0.0.2:6789/0],[v2:10.0.0.3:3300/0,v1:10.0.0.3:6789/0]

However, when default ports are used (3300 and 6789), they can be omitted:

  mon_host = 10.0.0.1,10.0.0.2,10.0.0.3

Once v2 has been enabled on the monitors, ceph.conf may need to be updated to either specify no ports (this is usually simplest), or explicitly specify both the v2 and v1 addresses. Note, however, that the new bracketed syntax is only understood by Nautilus and later, so do not make that change on hosts that have not yet had their ceph packages upgraded.
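
The equivalence between the short and the explicit mon_host forms can be sketched as a small expansion routine. This is an illustration under stated assumptions: only bare IPv4 entries (no port, no brackets) are expanded using the default ports 3300 and 6789; entries that already carry a port or a bracketed vector are passed through unchanged, because how such entries are interpreted is not covered here. `split_top_level` and `expand_mon_host` are hypothetical names.

```python
from typing import List

V2_PORT = 3300  # default msgr2 port
V1_PORT = 6789  # default legacy port


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not inside a [...] vector."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def expand_mon_host(mon_host: str) -> str:
    """Expand bare IPv4 entries into explicit [v2:...,v1:...] vectors."""
    expanded = []
    for entry in split_top_level(mon_host):
        if ":" not in entry and not entry.startswith("["):
            entry = f"[v2:{entry}:{V2_PORT}/0,v1:{entry}:{V1_PORT}/0]"
        expanded.append(entry)
    return ",".join(expanded)


if __name__ == "__main__":
    print(expand_mon_host("10.0.0.1,10.0.0.2,10.0.0.3"))
```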

When you are updating ceph.conf, note that the new ceph config generate-minimal-conf command (which generates a barebones config file with just enough information to reach the monitors) and the ceph config assimilate-conf command (which moves config file options into the monitors’ configuration database) may be helpful. For example:

  # ceph config assimilate-conf < /etc/ceph/ceph.conf
  # ceph config generate-minimal-conf > /etc/ceph/ceph.conf.new
  # cat /etc/ceph/ceph.conf.new
  # minimal ceph.conf for 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
  [global]
  fsid = 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
  mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
  # mv /etc/ceph/ceph.conf.new /etc/ceph/ceph.conf
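
Before moving a generated file into place, it can be worth checking that it still carries the two values a client needs to find the cluster. The sketch below reads a minimal config like the one above with Python's configparser. It assumes the file is plain INI with underscore-style keys; Ceph's own parser is more permissive than this, and `read_minimal_conf` is a hypothetical helper.

```python
import configparser
from typing import Dict

SAMPLE = """\
# minimal ceph.conf for 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
[global]
fsid = 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
"""


def read_minimal_conf(text: str) -> Dict[str, str]:
    """Return fsid and mon_host from the [global] section."""
    # Only '=' is a delimiter, so colons inside addresses are left alone;
    # interpolation is disabled so values are returned verbatim.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    section = parser["global"]
    return {"fsid": section["fsid"], "mon_host": section["mon_host"]}


if __name__ == "__main__":
    for key, value in read_minimal_conf(SAMPLE).items():
        print(f"{key}: {value}")
```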

Protocol

For a detailed description of the v2 wire protocol, see msgr2 protocol.