One Pool One Device¶
In RMR, a pool is scoped to a single mapped block device. Replicating multiple devices on the same compute client therefore uses multiple RMR pools — one per device — even when those pools all connect to the same set of storage nodes. The cost is bounded: pools connecting to the same storage node share a single RTRS session, so the network footprint scales with the number of unique storage nodes, not with the number of devices. See Client Sessions for the structural split (rmr_clt_pool_sess per leg, rmr_clt_sess per RTRS connection) that enables this sharing.
Properties of the model¶
No cross-device coupling. A network or backend failure on one device’s pool does not affect any other device’s pool. Each pool is an independent object with its own state, dirty maps, and recovery work.
Per-device leg manipulation. Adding or removing a rmr_clt_pool_sess leg on a pool changes replication for exactly one device. Migrating a device to a new storage node is straightforward — add a leg on the new node, let it sync, remove the leg on the old node — and does not disturb replication for any other device.

Per-device replication topology. Replication factor and the set of storage nodes are chosen independently per device.
Single-device dirty map. A server pool’s dirty map tracks chunks for one device. There is no device dimension to disambiguate; chunks are identified by chunk number alone.
Implementation¶
BRMR client. brmr_clt_map_device() calls brmr_clt_create_pool(), which calls rmr_clt_open() once per BRMR pool. Each device is mapped through its own BRMR pool with a unique pool name. The block-device tag set, request queue, and refcount all live on the BRMR pool struct.

BRMR server. Each brmr_srv_blk_dev is bound to one rmr_pool via its pool field. Backend store registration is therefore one-to-one with an RMR server pool.

Dirty map. struct rmr_dirty_id_map is held per pool and per peer member, indexed by member_id. There is no device key.
The one-to-one binding is enforced at the BRMR layer’s user-facing flow rather than by an explicit kernel-level check. The BRMR pool struct retains some legacy fields (a per-pool device list and a shared tag set) from the earlier multi-device-per-pool design, but brmr_clt_map_device() always creates a fresh pool for each call and never reuses an existing one. These legacy fields are candidates for removal once they are confirmed unnecessary for the one-pool-one-device model.
Vestiges of the original multi-device design¶
The original RMR design considered carrying multiple devices in a single pool, with IOs differentiated by a 128-bit identifier rmr_id_t = (u64 a, u64 b) — one field would have carried a device ID. To keep an error on one device from disrupting others within the shared session, that direction also envisioned channels: per-device control planes inside a single RMR session.
Channels were never built, and there are no remnants of them in the code today. The 128-bit rmr_id_t, however, is still around. It has been repurposed, not retired:
id.b is the starting chunk number for an IO. id.a is the count of consecutive chunks the IO touches, starting at id.b. See rmr_map_calc_chunk() in rmr/rmr-map.c for the encoding.
id.a is currently constrained to 1 — the client IO submission path enforces this with BUG_ON(id.a > 1) in rmr/rmr-clt.c. Parts of the multi-chunk infrastructure are already in place (rmr_map_calc_chunk() computes the count, and dirty-map iteration loops in rmr-map.c already iterate over id.a chunks), but the sync/wait-list interaction needed for IOs that span multiple chunks has not been worked out.
The two-u64 identifier is wider than the current model strictly needs. The id.a / id.b split is planned to be removed once the multi-chunk picture is settled, in favour of a single chunk-number field. No concrete refactor is staged in the code yet.