Terminology
RMR stands for Reliable Multicast over RTRS (RDMA transport). Conventional reliable multicast in networking only covers “writing” packets to a group of peers. RMR also supports “reading” packets from a group of hosts, which makes it a reliable read/write multicast group. This section introduces the terminology used in the RMR code and documentation.
Group
A group of storage servers in a cluster mutually responsible for mirroring data is called a pool (also referred to in code and docs as an RMR group, RMG, or RTRS multicast group).
Equivalent concepts in other systems: an MD-RAID+RNBD “configuration” or a DRBD “resource” at the level of hosts.
pool_name
The name of the pool. Must be ASCII and must match across the compute client and all storage nodes when creating a pool.
group_id
A u32 computed as a jhash() of the pool name. It identifies the pool in protocol message headers (rmr_msg_hdr). The value is derived automatically from the pool name and is not set by the user directly.
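Because the ID is derived from the pool name, two nodes that use the same name arrive at the same ID without exchanging it, which is why the name must match everywhere. The user-space sketch below illustrates only that property. It deliberately swaps in 32-bit FNV-1a for the kernel's jhash(), so its output does not match real RMR group IDs; the function name sketch_group_id and the absence of a seed are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative stand-in: 32-bit FNV-1a over the bytes of the pool name.
 * RMR uses the kernel's jhash(); this is NOT that function and will not
 * reproduce real group_id values.
 */
static uint32_t sketch_group_id(const char *pool_name)
{
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */
	size_t len;

	assert(pool_name != NULL);
	len = strlen(pool_name);

	for (size_t i = 0; i < len; i++) {
		h ^= (uint8_t)pool_name[i];
		h *= 16777619u;		/* FNV 32-bit prime */
	}
	return h;
}
```

Calling sketch_group_id("pool-a") on the client and on every storage node yields the same value, while a name that differs on any node yields a different ID and the peers would not recognize each other's messages as belonging to the same pool.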
member_id
Integer ID for a storage node. Uniquely identifies a particular storage node within a pool. Set by the user when creating a server pool.
Client-side structures
rmr_clt_pool
Structure holding client-specific data for a pool. Includes members that hold references for inflight IO tracking, recovery work, IO unit tracking, stats, etc.
A client pool created on the compute client serves IO to and from an upper-layer client (BRMR, etc.). A client pool created on a storage node is used by the RMR server pool to sync data to and from other storage nodes over internal connections. To create such a sync pool, pass the parameter sync=y when creating the RMR client pool.
rmr_clt_sess
Represents an RTRS connection to a storage node. For each RTRS connection opened, RMR maintains one rmr_clt_sess in the global g_sess_list. Objects are identified by session name in the format <client-hostname@server-hostname>.
Multiple pools that replicate to the same storage node share a single rmr_clt_sess. See Client Sessions for details.
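The sharing described above amounts to a lookup-or-create on the session name. The following user-space sketch shows one way this could look. The struct layout, the reference count, the singly linked list, and the helper name sketch_sess_get are all assumptions for illustration and are not taken from the RMR source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SESSNAME_MAX 128	/* assumed limit for the sketch */

/* Hypothetical, simplified stand-in for rmr_clt_sess. */
struct sketch_clt_sess {
	char name[SESSNAME_MAX];	/* "client-hostname@server-hostname" */
	int refcount;			/* assumed: number of pools sharing it */
	struct sketch_clt_sess *next;
};

/* Stand-in for the global g_sess_list (a plain linked list here). */
static struct sketch_clt_sess *g_sess_list;

/* Return the existing session for this host pair, or create a new one. */
static struct sketch_clt_sess *sketch_sess_get(const char *clt, const char *srv)
{
	char name[SESSNAME_MAX];
	struct sketch_clt_sess *s;
	int n;

	assert(clt != NULL && srv != NULL);

	n = snprintf(name, sizeof(name), "%s@%s", clt, srv);
	if (n < 0 || (size_t)n >= sizeof(name))
		return NULL;		/* name does not fit */

	/* Reuse a session that another pool already opened. */
	for (s = g_sess_list; s; s = s->next) {
		if (strcmp(s->name, name) == 0) {
			s->refcount++;
			return s;
		}
	}

	/* First pool to reach this storage node: create the session. */
	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;
	strcpy(s->name, name);		/* safe: name fits in SESSNAME_MAX */
	s->refcount = 1;
	s->next = g_sess_list;
	g_sess_list = s;
	return s;
}
```

With this shape, two pools that replicate to the same storage node receive the same object from sketch_sess_get(), while a pool that targets a different node gets a separate one.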
rmr_clt_pool_sess
Represents a replication leg for an RMR pool. An RMR pool with a replication factor of 2 has two rmr_clt_pool_sess objects in its session list. Each uses an rmr_clt_sess to send IO and command messages over RTRS.
sessname
User-assigned name for an rmr_clt_pool_sess. Can be any string. The in-house convention is <clt_hostname@server_hostname>.
stg_members
An xarray on the pool that maps member_id to the corresponding rmr_clt_pool_sess. It is the authoritative list of storage members for a pool and is used by the IO path to iterate over members for write replication and dirty map piggyback.
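To make the member_id to session mapping concrete, here is a user-space sketch in which a small fixed-size pointer table stands in for the kernel xarray. The type names, the SKETCH_MAX_MEMBERS bound, and the writes_sent counter are hypothetical; the real IO path is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_MEMBERS 16	/* arbitrary bound for the sketch */

/* Hypothetical, simplified stand-in for rmr_clt_pool_sess. */
struct sketch_pool_sess {
	unsigned int member_id;
	unsigned int writes_sent;	/* counts replicated writes in this sketch */
};

/* Stand-in for the stg_members xarray: a sparse table indexed by member_id. */
struct sketch_pool {
	struct sketch_pool_sess *stg_members[SKETCH_MAX_MEMBERS];
};

/* Register a member; fails if the ID is out of range or already taken. */
static int sketch_add_member(struct sketch_pool *pool,
			     struct sketch_pool_sess *ps)
{
	assert(pool != NULL && ps != NULL);

	if (ps->member_id >= SKETCH_MAX_MEMBERS ||
	    pool->stg_members[ps->member_id])
		return -1;
	pool->stg_members[ps->member_id] = ps;
	return 0;
}

/* Send one write to every member present; returns the number of legs used. */
static unsigned int sketch_replicate_write(struct sketch_pool *pool)
{
	unsigned int legs = 0;

	for (size_t id = 0; id < SKETCH_MAX_MEMBERS; id++) {
		struct sketch_pool_sess *ps = pool->stg_members[id];

		if (!ps)
			continue;	/* empty slot, like a hole in an xarray */
		ps->writes_sent++;
		legs++;
	}
	return legs;
}
```

Member IDs do not need to be contiguous: a pool with members 1 and 5 replicates each write over exactly two legs, and the empty slots are skipped.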
pool_md
struct rmr_pool_md is the pool metadata structure. It holds the persisted description of a pool: pool name, group_id, chunk_size, mapped_size, queue_depth, map version, and an srv_md array with one entry per storage member. On assemble, the client reads pool_md from the server to learn which members belong to the pool and reconstruct dirty maps.
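As a rough picture of the fields listed above, the sketch below lays them out as a C structure and shows how a client might enumerate the members recorded in it. The field types, array bounds, the in_use flag, and the contents of the per-member entry are assumptions; the actual struct rmr_pool_md layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NAME_LEN 64	/* assumed limit */
#define SKETCH_MAX_SRV  8	/* assumed limit */

/* Assumed per-member entry; the real srv_md contents are not shown here. */
struct sketch_srv_md {
	uint8_t  in_use;	/* hypothetical "slot occupied" flag */
	uint32_t member_id;
};

/* Hypothetical mirror of the fields named in the text. */
struct sketch_pool_md {
	char     pool_name[SKETCH_NAME_LEN];
	uint32_t group_id;
	uint32_t chunk_size;
	uint64_t mapped_size;
	uint32_t queue_depth;
	uint64_t map_ver;
	struct sketch_srv_md srv_md[SKETCH_MAX_SRV];
};

/* Copy the IDs of all members recorded in the metadata into out[]. */
static size_t sketch_md_members(const struct sketch_pool_md *md,
				uint32_t *out, size_t cap)
{
	size_t n = 0;

	assert(md != NULL && out != NULL);

	for (size_t i = 0; i < SKETCH_MAX_SRV && n < cap; i++) {
		if (md->srv_md[i].in_use)
			out[n++] = md->srv_md[i].member_id;
	}
	return n;
}
```

A client that has read such a structure from a server could use the returned IDs to decide which storage nodes to connect to before rebuilding its dirty maps.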
Server-side structures
rmr_srv_pool
Structure holding server-specific data for a pool. Includes members that hold references for the sync thread, backend io_store, dirty map, last_io tracking, and metadata sync work.
last_io
An array of rmr_id_t entries in rmr_srv_pool, one slot per queue depth position. Each storage node records the IDs of the most recently processed IOs in this array and persists it to the backend. During recovery, when all sessions are in RECONNECTING state and no authoritative dirty map is available, the client compares last_io across storage nodes to determine which node has the most up-to-date data before re-enabling the pool.
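The comparison step can be pictured with the sketch below. It rests on a strong simplifying assumption that is not confirmed by the text: each IO ID is reduced to a single monotonically increasing 64-bit counter, so the node whose last_io array contains the highest value is treated as the most up to date. The real rmr_id_t type and the actual comparison rule may differ, and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_QUEUE_DEPTH 4	/* arbitrary; one slot per queue depth position */

/* Assumption: an IO ID is a monotonically increasing counter. */
typedef uint64_t sketch_id_t;

/* Hypothetical view of what one storage node reports during recovery. */
struct sketch_node {
	uint32_t    member_id;
	sketch_id_t last_io[SKETCH_QUEUE_DEPTH];
};

/* Highest IO ID recorded by one node. */
static sketch_id_t sketch_newest_io(const struct sketch_node *n)
{
	sketch_id_t newest = 0;

	for (size_t i = 0; i < SKETCH_QUEUE_DEPTH; i++) {
		if (n->last_io[i] > newest)
			newest = n->last_io[i];
	}
	return newest;
}

/*
 * Pick the node that has seen the newest IO. Ties keep the earlier node.
 * Returns NULL when no nodes are given.
 */
static const struct sketch_node *
sketch_pick_latest(const struct sketch_node *nodes, size_t count)
{
	const struct sketch_node *best = NULL;

	assert(nodes != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (!best ||
		    sketch_newest_io(&nodes[i]) > sketch_newest_io(best))
			best = &nodes[i];
	}
	return best;
}
```

Under this assumption, if member 1 last recorded ID 12 and member 2 last recorded ID 14, member 2 is selected as the source of truth before the pool is re-enabled.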
rmr_srv_sess
Server-side representation of an RTRS connection from a compute client or peer storage node. Maintained in the server’s global g_sess_list.
rmr_srv_pool_sess
Server-side per-pool session. Tracks the state of a single client connection within the context of a specific server pool. The server-side counterpart to rmr_clt_pool_sess.