Re: [PATCH 1/2] cxl/region: fix region leak when attach_target fails in cxl_add_to_region
From: Gregory Price
Date: Mon Feb 23 2026 - 15:15:33 EST
On Mon, Feb 23, 2026 at 11:48:42AM -0800, Alison Schofield wrote:
> On Fri, Feb 20, 2026 at 11:30:12PM -0500, Gregory Price wrote:
> > cxl_add_to_region() ignores the return value of attach_target(). When
> > attach_target() fails (e.g. cxl_port_setup_targets() returns -ENXIO),
> > the auto-discovered region remains registered with its HPA resource
> > consumed but never reaches COMMIT state. Subsequent region creation
> > attempts fail with -ENOSPC because the HPA range is already reserved.
> >
> > Track whether this call to cxl_add_to_region() created the region, and
> > call drop_region() on attach_target() failure to unregister it and
> > release the HPA resource. Pre-existing regions are left alone since
> > other endpoints may already be attached.
>
> I see you dropping this, perhaps just for the moment, because
> the drop_region() you wanted to use is not available yet.
>
Yeah it's not a particularly useful cleanup in the current
infrastructure because nothing actually uses this pattern (yet).

> This looks a lot like
> https://lore.kernel.org/linux-cxl/2a613604c0cdda6d9f838ae9b47ea6d936c5e4ce.1769746294.git.alison.schofield@xxxxxxxxx/
>
> cxl/region: Unregister auto-created region when assembly fails
>
> When auto-created region assembly fails the region remains registered
> but disabled. The region continues to reserve its memory resource,
> preventing DAX from registering the memory.
>
> Unregister the region on assembly failure to release the resource.
>
> And the review comments on that one, or at least on that thread in
> general, were to leave all the broken things in place.
> I didn't agree with that, and hope to see this version move ahead
> when you have the drop_region you need.
>
>
The important note here is the difference between auto-regions and
manually created regions. For auto-regions, you might have another
endpoint show up looking for the partially created region - and then
just go off and create it anyway because it thinks it was first.

But in my driver, I'm explicitly converting these auto-regions into
other things, and if that fails it causes *all other* region creation to
fail - even if it wasn't actually dependent on that original region.

This is only an issue if you have two devices unbind/bind cycling at
the same time - i.e.

  echo 0000:d0:00.00 > cxl_pci/unbind
  echo 0000:e0:00.00 > cxl_pci/unbind
  echo 0000:d0:00.00 > mydriver/bind
  echo 0000:e0:00.00 > mydriver/bind

If the platform has pre-programmed and locked the decoders, and one of
the two devices fails to probe and leaves a hanging partially
created region, the other device will fail too.

It's a pretty narrow failure scenario.

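For illustration only, here is a rough user-space toy model of the
cleanup pattern from the patch description: remember whether this call
created the region, and drop it only in that case when the attach step
fails. This is not kernel code. struct hpa_window, region_create(),
region_drop(), attach_target_model() and add_to_region_model() are all
hypothetical names made up for the sketch, and the single free-space
counter is an assumed stand-in for the real HPA resource tracking.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an HPA window: just a count of unreserved bytes. */
struct hpa_window {
	size_t free_bytes;
};

/* Toy region: remembers which window it reserved space from. */
struct region {
	struct hpa_window *win;
	size_t size;
	bool registered;
};

/* Reserve space for a new region; -ENOSPC if the window is exhausted. */
static int region_create(struct hpa_window *win, struct region *r, size_t size)
{
	if (size > win->free_bytes)
		return -ENOSPC;
	win->free_bytes -= size;
	r->win = win;
	r->size = size;
	r->registered = true;
	return 0;
}

/* Unregister a region and give its reservation back to the window. */
static void region_drop(struct region *r)
{
	if (!r->registered)
		return;
	r->win->free_bytes += r->size;
	r->registered = false;
}

/* Hypothetical stand-in for the attach step: fails on request. */
static int attach_target_model(bool fail)
{
	return fail ? -ENXIO : 0;
}

/*
 * Model of the pattern: only a region created by *this* call is dropped
 * on attach failure. A pre-existing region is left alone, since other
 * endpoints may already be attached to it.
 */
static int add_to_region_model(struct hpa_window *win, struct region *r,
			       size_t size, bool attach_fails, bool cleanup)
{
	bool created = false;
	int rc;

	if (!r->registered) {
		rc = region_create(win, r, size);
		if (rc)
			return rc;
		created = true;
	}

	rc = attach_target_model(attach_fails);
	if (rc && created && cleanup)
		region_drop(r);

	return rc;
}
```

With cleanup disabled, a failed attach leaves the reservation behind and
a later region_create() of the same size on that window returns -ENOSPC.
With cleanup enabled, the space is released and the next attempt can
succeed.
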
~Gregory