Re: [PATCH v2] cxl/port: Fix missing port lock in cxl_dport_remove()

From: Bowman, Terry

Date: Tue Jun 09 2026 - 11:20:12 EST


On 6/9/2026 2:40 AM, Richard Cheng wrote:
> On Mon, Jun 08, 2026 at 05:35:23PM +0800, Dave Jiang wrote:
>>
>>
>> On 6/8/26 3:35 PM, Terry Bowman wrote:
>>> xa_erase() in cxl_dport_remove() runs without the port device lock,
>>> creating a race with any caller that does xa_load() on port->dports
>>> and then dereferences the returned dport pointer. A concurrent
>>> cxl_dport_remove() can erase and free the dport between the xa_load()
>>> and the caller acquiring the port lock, causing a use-after-free.
>>>
>>> For non-root ports the port lock is already held by the caller on two
>>> paths:
>>>
>>> 1. Driver unbind: devres_release_all() is called from
>>> __device_release_driver() which holds port->dev.mutex.
>>>
>>> 2. Dynamic endpoint removal: cxl_detach_ep() takes the port lock
>>> before calling del_dports() -> del_dport() -> devres_release_group(),
>>> which synchronously runs cxl_dport_remove().
>>>
>>> Use cond_cxl_root_lock/unlock(), which only acquires the port lock when
>>> the port is a root port and the lock is therefore not already held.
>>> This matches the pattern used in __devm_cxl_add_dport() for the same
>>> reason.
>>>
>>> The write-side fix to cxl_dport_remove() is necessary but not
>>> sufficient. Callers that obtain a dport pointer via cxl_mem_find_port()
>>> use a lockless xa_load() and must not dereference that pointer until a
>>> lock that excludes free_dport()/kfree() is held.
>>>
>
> Hi Terry,
>
> I think the mechanism is right, cond_cxl_root_lock() in cxl_dport_remove()
> is a no-op for non-root ports and a real qcquire only for root dports, so
> no deadlock.
>
> But this only cover the 2 cxl_mem_find_port() callers. The sibling
> cxl_pci_find_port() has similar lockless xa_load() plus deref, and those
> callers aren't fied.
>
> Could you either extend the same fix in this series?
> I'm happy to send a follow-up for other readers if that's easier for you.
>

Youre right there are other callsites requiring similar changes. You're follow-up
would helpful. It will allow me to return to the error handling series.
Thanks.

>>> For root ports, dport_to_host() returns uport_dev, so all three devres
>>> actions (free_dport, cxl_dport_remove, cxl_dport_unlink) are registered
>>> on uport_dev. __device_release_driver() holds uport_dev->mutex for the
>>> full teardown sequence including kfree(dport). Holding uport_dev->mutex
>>> on the read side therefore excludes concurrent dport freeing.
>>>
>>> Fix rcd_pcie_cap_emit() by passing NULL to cxl_mem_find_port() to avoid
>>> capturing a lockless dport pointer, then re-fetching dport inside the
>>> uport_dev guard via cxl_find_dport_by_dev(). The previous guard on
>>> root->dev was wrong: cxl_dport_remove() releases root->dev before
>>> free_dport() runs, so root->dev does not protect against concurrent
>>> kfree(dport).
>>>
>>> Fix cxl_mem_probe() similarly: pass NULL to cxl_mem_find_port(), then
>>> re-fetch dport inside scoped_guard(device, &parent_port->dev) for the
>>> VH path, and re-fetch again inside scoped_guard(device, uport_dev) for
>>> the RCH path. This closes both the TOCTOU window between the lockless
>>> xa_load() and the guard acquisition, and the window between the two
>>> sequential guards in the RCH path where a concurrent surprise removal
>>> could free dport before devm_cxl_add_endpoint() dereferences it.
>>>
>>> Reported-by: Sashiko
>>> Fixes: 391785859e7e ("cxl/port: Move dport tracking to an xarray")
>>> Link: https://lore.kernel.org/linux-cxl/20260505173029.2718246-1-terry.bowman@xxxxxxx/
>>> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
>>> Reviewed-by: Ben Cheatham <Benjamin.Cheatham@xxxxxxx>
>>> ---
>>> drivers/cxl/core/port.c | 10 +++++++
>>> drivers/cxl/mem.c | 65 +++++++++++++++++++++++++++++++----------
>>> drivers/cxl/pci.c | 17 +++++++----
>>> 3 files changed, 72 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>> index c5aacd7054f1..0b8f144596e8 100644
>>> --- a/drivers/cxl/core/port.c
>>> +++ b/drivers/cxl/core/port.c
>>> @@ -1092,8 +1092,18 @@ static void cxl_dport_remove(void *data)
>>> struct cxl_dport *dport = data;
>>> struct cxl_port *port = dport->port;
>>>
>>> + /*
>>> + * For non-root ports the port lock is already held by the caller
>>> + * via devres_release_all() during driver unbind, which holds
>>> + * port->dev.mutex throughout. Acquiring it again unconditionally
>>> + * would deadlock. Use cond_cxl_root_lock() which only acquires
>>> + * when the port is a root port and the lock is therefore not yet
>>> + * held.
>>> + */
>>> + cond_cxl_root_lock(port);
>>> port->nr_dports--;
>>> xa_erase(&port->dports, (unsigned long) dport->dport_dev);
>>> + cond_cxl_root_unlock(port);
>
> In the comment above, maybe worth adding some contents about this is
> also what makes the RCH reads safe. It's no obvious for me.
>
> Best regards,
> Richard Cheng.
>

For RCH/RCD case, the cond_cxl_root_lock() only synchronizes accesses to the
xarray. The RP in the RCH/RCD context is a SW construct. We need to hold the
port->uport_dev->mutex to prevent kfree().

- Terry


>>> put_device(dport->dport_dev);
>>> }
>>>
>>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>>> index fcffe24dcb42..345b56f215ff 100644
>>> --- a/drivers/cxl/mem.c
>>> +++ b/drivers/cxl/mem.c
>>> @@ -70,9 +70,9 @@ static int cxl_mem_probe(struct device *dev)
>>> struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>> struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>> struct cxl_dev_state *cxlds = cxlmd->cxlds;
>>> - struct device *endpoint_parent;
>>> struct cxl_dport *dport;
>>> struct dentry *dentry;
>>> + bool rch = false;
>>> int rc;
>>>
>>> if (!cxlds->media_ready)
>>> @@ -107,8 +107,7 @@ static int cxl_mem_probe(struct device *dev)
>>> if (rc)
>>> return rc;
>>>
>>> - struct cxl_port *parent_port __free(put_cxl_port) =
>>> - cxl_mem_find_port(cxlmd, &dport);
>>> + struct cxl_port *parent_port __free(put_cxl_port) = cxl_mem_find_port(cxlmd, NULL);
>>> if (!parent_port) {
>>> dev_err(dev, "CXL port topology not found\n");
>>> return -ENXIO;
>>> @@ -123,21 +122,57 @@ static int cxl_mem_probe(struct device *dev)
>>> }
>>> }
>>>
>>> - if (dport->rch)
>>> - endpoint_parent = parent_port->uport_dev;
>>> - else
>>> - endpoint_parent = &parent_port->dev;
>>> -
>>> - scoped_guard(device, endpoint_parent) {
>>> - if (!endpoint_parent->driver) {
>>> - dev_err(dev, "CXL port topology %s not enabled\n",
>>> - dev_name(endpoint_parent));
>>> + scoped_guard(device, &parent_port->dev) {
>>> + /*
>>> + * Re-fetch dport under the port lock to close the TOCTOU
>>> + * window between cxl_mem_find_port()'s lockless xa_load() and
>>> + * this guard acquisition. A concurrent surprise removal can
>>> + * free the dport in that window.
>>> + */
>>> + dport = cxl_find_dport_by_dev(parent_port, cxlmd->dev.parent->parent);
>>> + if (!dport) {
>>> + dev_err(dev, "CXL port topology %s not found\n",
>>> + dev_name(&parent_port->dev));
>>> return -ENXIO;
>>> }
>>> + rch = dport->rch;
>>> +
>>> + if (!rch) {
>>> + if (!parent_port->dev.driver) {
>>> + dev_err(dev, "CXL port topology %s not enabled\n",
>>> + dev_name(&parent_port->dev));
>>> + return -ENXIO;
>>> + }
>>> + rc = devm_cxl_add_endpoint(&parent_port->dev, cxlmd, dport);
>>> + if (rc)
>>> + return rc;
>>> + }
>>> + }
>>>
>>> - rc = devm_cxl_add_endpoint(endpoint_parent, cxlmd, dport);
>>> - if (rc)
>>> - return rc;
>>> + if (rch) {
>>> + struct device *uport_dev = parent_port->uport_dev;
>>> +
>>> + scoped_guard(device, uport_dev) {
>>> + if (!uport_dev->driver) {
>>> + dev_err(dev, "CXL port topology %s not enabled\n",
>>> + dev_name(uport_dev));
>>> + return -ENXIO;
>>> + }
>>> + /*
>>> + * Re-fetch dport under uport_dev lock. uport_dev->mutex
>>> + * is held for the full devres teardown sequence including
>>> + * free_dport()/kfree(), so this excludes concurrent
>>> + * hotplug removal through the entire dereference.
>>> + */
>>> + dport = cxl_find_dport_by_dev(parent_port, cxlmd->dev.parent->parent);
>>> + if (!dport) {
>>> + dev_err(dev, "CXL RCH dport not found\n");
>>> + return -ENXIO;
>>> + }
>>> + rc = devm_cxl_add_endpoint(uport_dev, cxlmd, dport);
>>> + if (rc)
>>> + return rc;
>>> + }
>>
>> Still reviewing the patch, but thoughts on moving the two new big blocks to a helper function?
>>
>> DJ
>>
>>> }
>>>
>>> if (cxlmd->attach) {
>>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>>> index bace662dc988..710a62a66429 100644
>>> --- a/drivers/cxl/pci.c
>>> +++ b/drivers/cxl/pci.c
>>> @@ -708,10 +708,10 @@ static ssize_t rcd_pcie_cap_emit(struct device *dev, u16 offset, char *buf, size
>>> {
>>> struct cxl_dev_state *cxlds = dev_get_drvdata(dev);
>>> struct cxl_memdev *cxlmd = cxlds->cxlmd;
>>> - struct device *root_dev;
>>> struct cxl_dport *dport;
>>> + struct device *root_dev;
>>> struct cxl_port *root __free(put_cxl_port) =
>>> - cxl_mem_find_port(cxlmd, &dport);
>>> + cxl_mem_find_port(cxlmd, NULL);
>>>
>>> if (!root)
>>> return -ENXIO;
>>> @@ -720,13 +720,20 @@ static ssize_t rcd_pcie_cap_emit(struct device *dev, u16 offset, char *buf, size
>>> if (!root_dev)
>>> return -ENXIO;
>>>
>>> - if (!dport->regs.rcd_pcie_cap)
>>> - return -ENXIO;
>>> -
>>> guard(device)(root_dev);
>>> if (!root_dev->driver)
>>> return -ENXIO;
>>>
>>> + /*
>>> + * Fetch dport under uport_dev lock to protect against concurrent
>>> + * hotplug removal. uport_dev->mutex is held for the entire devres
>>> + * teardown sequence including free_dport(), so holding it here
>>> + * excludes concurrent kfree(dport).
>>> + */
>>> + dport = cxl_find_dport_by_dev(root, cxlmd->dev.parent->parent);
>>> + if (!dport || !dport->regs.rcd_pcie_cap)
>>> + return -ENXIO;
>>> +
>>> switch (width) {
>>> case 2:
>>> return sysfs_emit(buf, "%#x\n",
>>
>>