[RFC PATCH 3/4] cxl/extent: Support extents in sharable CDAT regions

From: John Groves

Date: Thu Apr 23 2026 - 19:52:46 EST


From: John Groves <John@xxxxxxxxxx>

The previous cleanup (per-tag assembly, ordering, integrity) intentionally
left extents in sharable CDAT regions rejected at the per-extent gate:
cxl_validate_extent() returned -ENXIO for any shared_extn_seq != 0. The
group-level 1..n contiguity check (cxl_check_group_seq()) was written so
that turning the gate off would leave the sequencing contract enforced.

Turn the gate off and replace it with a partition-aware check.

The CDAT DSMAS entry for each DC partition already carries the
ACPI_CDAT_DSMAS_SHAREABLE bit, propagated into
cxl_dpa_partition::perf.shareable by the CDAT parse path. That bit is
the authoritative answer to "is this extent multi-host sharable":

- Sharable partition. An extent's tag (UUID) is the allocation
  identity that every sharing host uses, so the tag must be
  non-null. The device stamps each extent with its position in the
  allocation via shared_extn_seq, so the field must be non-zero.
  The group-level check then verifies the sorted group is exactly
  1, 2, ..., n.

- Non-sharable partition. No multi-host coordination is required,
  so the tag is optional (null UUID means "untagged") and
  shared_extn_seq must be 0.

Any cross-mixing (sharable + null tag, sharable + seq == 0,
non-sharable + seq != 0) is a device firmware bug: the extent is
rejected with -ENXIO and a rate-limited, per-extent firmware-bug message.
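
For illustration only, the two regimes can be sketched as a small
userspace decision function. This is not the driver code: struct
demo_extent, demo_tag_is_null() and demo_validate_extent() are
hypothetical stand-ins, and the partition's sharable flag is passed in
as a plain bool rather than read from part->perf.shareable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the extent fields the check reads. */
struct demo_extent {
	uint8_t  tag[16];          /* all-zero means "untagged" (null UUID) */
	uint16_t shared_extn_seq;  /* assumed already in CPU byte order */
};

static bool demo_tag_is_null(const uint8_t tag[16])
{
	static const uint8_t zero[16];

	return memcmp(tag, zero, sizeof(zero)) == 0;
}

/*
 * Return 0 when the extent is consistent with the partition's regime,
 * -ENXIO for any of the three cross-mixing cases.
 */
static int demo_validate_extent(bool part_shareable,
				const struct demo_extent *ext)
{
	assert(ext != NULL);

	if (part_shareable) {
		if (demo_tag_is_null(ext->tag))
			return -ENXIO;	/* sharable + null tag */
		if (ext->shared_extn_seq == 0)
			return -ENXIO;	/* sharable + seq == 0 */
		return 0;
	}

	if (ext->shared_extn_seq != 0)
		return -ENXIO;		/* non-sharable + seq != 0 */

	return 0;			/* tag is optional here */
}
```

In this sketch an untagged extent with seq == 0 passes only on a
non-sharable partition, and a tagged extent with seq == 0 passes there
too, since the tag is optional in that regime.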

Refactor to support the check:

- Rename cxl_validate_extent_partition() to cxl_extent_dc_partition()
  and return the matching DC partition (NULL on not-found). One
  call site; the caller now has the partition in hand and can read
  perf.shareable without a second lookup.

- Rewrite cxl_validate_extent() around the partition's sharable
  flag as described above. Drop the blanket shared_extn_seq != 0
  reject.

- Update the comments on extent_seq_compare() and
  cxl_check_group_seq() to drop the "sharable not yet surfaced"
  caveats; both branches of each helper are now reachable.
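
As a rough userspace sketch of the "return the partition instead of a
status code" shape, assuming simplified types: struct demo_range,
struct demo_partition, demo_parts[] and demo_find_partition() are
hypothetical, ranges are inclusive, and any partition-mode filtering
the driver may do is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_range {
	uint64_t start;
	uint64_t end;		/* inclusive */
};

struct demo_partition {
	struct demo_range range;
	bool shareable;		/* stands in for perf.shareable */
};

/* Made-up layout: one non-sharable and one sharable partition. */
static const struct demo_partition demo_parts[] = {
	{ .range = { 0x0000, 0x0fff }, .shareable = false },
	{ .range = { 0x1000, 0x1fff }, .shareable = true  },
};

static bool demo_range_contains(const struct demo_range *outer,
				const struct demo_range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/*
 * Return the partition that fully contains @ext, or NULL if none does.
 * The caller can then read ->shareable without a second lookup.
 */
static const struct demo_partition *
demo_find_partition(const struct demo_partition *parts, size_t nr,
		    const struct demo_range *ext)
{
	assert(parts != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (demo_range_contains(&parts[i].range, ext))
			return &parts[i];
	}

	return NULL;
}
```

An extent that straddles two partitions is contained by neither, so the
lookup returns NULL for it just as it does for an extent outside every
partition.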

The cxl_add_pending() pipeline (sort, cross-More-chain uniqueness,
sequence-number integrity, alignment, attach, online) is unchanged:
all of it was already written for both regimes.
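
The sort and sequence-integrity steps can be pictured with the
following userspace sketch. It is an approximation under stated
assumptions: the driver sorts a list with list_sort(), while this
example uses a plain stable insertion sort over an array of sequence
numbers, and demo_sort_seq() and demo_group_seq_ok() are hypothetical
names that do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stable insertion sort: equal keys keep their arrival order. */
static void demo_sort_seq(uint16_t *seq, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		uint16_t key = seq[i];
		size_t j = i;

		while (j > 0 && seq[j - 1] > key) {
			seq[j] = seq[j - 1];
			j--;
		}
		seq[j] = key;
	}
}

/*
 * Accept a group whose sequence numbers are either all zero
 * (non-sharable regime) or, once sorted, exactly 1, 2, ..., n
 * (sharable regime). Gaps, duplicates and zero/non-zero mixes fail.
 */
static bool demo_group_seq_ok(uint16_t *seq, size_t n)
{
	assert(seq != NULL && n > 0);

	demo_sort_seq(seq, n);

	/* Sorted ascending and unsigned: a zero maximum means all zero. */
	if (seq[n - 1] == 0)
		return true;

	for (size_t i = 0; i < n; i++) {
		if ((size_t)seq[i] != i + 1)
			return false;
	}

	return true;
}
```

Because the array is sorted first, a single pass comparing each element
with its 1-based index is enough to catch both a gap and a duplicate.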

Signed-off-by: John Groves <John@xxxxxxxxxx>
Signed-off-by: John Groves <john@xxxxxxxxxx>
---
drivers/cxl/core/mbox.c | 94 ++++++++++++++++++++++++++++-------------
1 file changed, 65 insertions(+), 29 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 804b7846b5726..3ffcd90698f3c 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -939,14 +939,20 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
}
EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");

-static int cxl_validate_extent_partition(struct cxl_memdev_state *mds,
- struct cxl_extent *extent,
- struct range *ext_range)
+/*
+ * Find the DC (Dynamic Capacity) partition that fully contains @ext_range,
+ * or NULL if the extent falls outside every DC partition on this memdev.
+ * The returned pointer is owned by mds->cxlds.part[] and lives for the
+ * lifetime of the memdev.
+ */
+static const struct cxl_dpa_partition *
+cxl_extent_dc_partition(struct cxl_memdev_state *mds,
+ struct cxl_extent *extent,
+ struct range *ext_range)
{
struct cxl_dev_state *cxlds = &mds->cxlds;
struct device *dev = mds->cxlds.dev;

- /* Extents must be within the DC partition boundary */
for (int i = 0; i < cxlds->nr_partitions; i++) {
struct cxl_dpa_partition *part = &cxlds->part[i];
struct range partition_range = {
@@ -959,25 +965,37 @@ static int cxl_validate_extent_partition(struct cxl_memdev_state *mds,

if (range_contains(&partition_range, ext_range)) {
dev_dbg(dev, "DC extent DPA %pra (DCR:%pra)(%pU)\n",
- &ext_range, &partition_range, extent->uuid);
- return 0;
+ ext_range, &partition_range, extent->uuid);
+ return part;
}
}

dev_err_ratelimited(dev,
"DC extent DPA %pra (%pU) is not in a valid DC partition\n",
- &ext_range, extent->uuid);
- return -ENXIO;
+ ext_range, extent->uuid);
+ return NULL;
}

/*
- * CXL 3.1 permits both tagged (non-null UUID) and untagged (null UUID)
- * extents. The spec is silent on whether untagged extents from different
- * events may be aggregated; we allow them to be combined into a single
- * dax device for simplicity. Sharable extents (shared_extn_seq != 0) are
- * not supported yet and are rejected here.
+ * Per-extent validation for an Add-Capacity event. Two regimes, chosen
+ * by the DC partition's CDAT-advertised sharability:
*
- * Partition boundary and region-attachment are validated separately.
+ * Sharable partition (DSMAS flag ACPI_CDAT_DSMAS_SHAREABLE,
+ * reflected in part->perf.shareable):
+ * - A non-null tag (UUID) is required. The tag is the allocation
+ * identity that every host sharing the allocation uses.
+ * - shared_extn_seq must be non-zero. Together with the other
+ * members of the tag group it forms the 1..n contiguous set that
+ * cxl_check_group_seq() enforces.
+ *
+ * Non-sharable partition:
+ * - The tag is optional; null UUID is permitted.
+ * - shared_extn_seq must be 0. Sequencing is meaningless when
+ * only one host consumes the allocation.
+ *
+ * Any cross-mixing (sharable partition with null tag or seq == 0;
+ * non-sharable partition with non-zero seq) is a device firmware bug.
+ * Partition-boundary and region-attachment checks are separate.
*/
static int cxl_validate_extent(struct cxl_memdev_state *mds,
struct cxl_extent_list_node *pos)
@@ -990,16 +1008,36 @@ static int cxl_validate_extent(struct cxl_memdev_state *mds,
le64_to_cpu(extent->length) - 1,
};
uuid_t *uuid = (uuid_t *)extent->uuid;
+ const struct cxl_dpa_partition *part;
+ u16 seq = le16_to_cpu(extent->shared_extn_seq);

- if (le16_to_cpu(extent->shared_extn_seq) != 0) {
- dev_dbg(dev,
- "DC extent DPA %pra (%pU) is sharable; not supported\n",
- &ext_range, uuid);
+ part = cxl_extent_dc_partition(mds, extent, &ext_range);
+ if (!part)
return -ENXIO;
+
+ if (part->perf.shareable) {
+ if (uuid_is_null(uuid)) {
+ dev_err_ratelimited(dev,
+ "DC extent DPA %pra: sharable-partition extent has null tag (firmware bug)\n",
+ &ext_range);
+ return -ENXIO;
+ }
+ if (seq == 0) {
+ dev_err_ratelimited(dev,
+ "DC extent DPA %pra (%pU): sharable-partition extent missing shared_extn_seq (firmware bug)\n",
+ &ext_range, uuid);
+ return -ENXIO;
+ }
+ return 0;
}

- if (cxl_validate_extent_partition(mds, extent, &ext_range))
+ /* Non-sharable partition. */
+ if (seq != 0) {
+ dev_err_ratelimited(dev,
+ "DC extent DPA %pra (%pU): non-sharable partition but shared_extn_seq=%u (firmware bug)\n",
+ &ext_range, uuid, seq);
return -ENXIO;
+ }

return 0;
}
@@ -1322,10 +1360,8 @@ static bool cxl_extent_dcd_aligned(const struct cxl_extent *extent)
* arrival order is a sufficient definition of "the order the device
* sent them." list_sort() is stable, so when every element in a group
* has shared_extn_seq == 0, ties fall back to list order — which is
- * arrival order via list_add_tail() in add_to_pending_list(). Thus
- * the same comparator gives the right answer for both cases, and the
- * code stays correct if/when sharable (sequenced) extents become
- * supported.
+ * arrival order via list_add_tail() in add_to_pending_list(). One
+ * comparator, both regimes.
*/
static int extent_seq_compare(void *priv,
const struct list_head *a,
@@ -1437,12 +1473,12 @@ static bool cxl_tag_already_committed(struct cxl_memdev_state *mds,
* acceptance would surface a dax device whose backing layout does
* not reflect the device's allocation.
*
- * NOTE: at the time of this patch cxl_validate_extent() still rejects
- * any extent with shared_extn_seq != 0 per-extent (sharable extents are
- * not yet surfaced). This group-level check therefore only exercises
- * the all-zero arm in the current driver; the non-zero arms are in
- * place so that lifting the per-extent restriction does not leave the
- * sequencing-integrity contract unenforced.
+ * cxl_validate_extent() enforces the per-extent partition/sharable
+ * consistency (sharable partition -> non-null tag + non-zero seq;
+ * non-sharable -> seq == 0), so by the time a group reaches this
+ * check all members agree on regime. This helper then enforces the
+ * group-level invariants the per-extent check cannot see: that
+ * sharable groups form an exact 1..n set with no gap or duplicate.
*/
static int cxl_check_group_seq(struct device *dev,
const uuid_t *tag,
--
2.53.0