Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

From: Davidlohr Bueso

Date: Tue Mar 10 2026 - 14:47:25 EST


On Sun, 08 Mar 2026, Sungwoo Kim wrote:

> A race exists when two concurrent sysfs writes to delete_region specify
> the same region name. Both calls succeed in cxl_find_region_by_name()
> (which only does device_find_child_by_name and takes a reference), and
> both then proceed to call devm_release_action(). The first call atomically
> removes and releases the devres entry successfully. The second call finds
> no matching entry, causing devres_release() to return -ENOENT, which trips
> the WARN_ON.

afaict the splat is also triggerable via devres_release_all(), i.e. unbinding
the host bridge. Basically cxl_find_region_by_name() succeeds because the
region hasn't been device_del()'d yet:

CPU0                                          CPU1
devres_release_all()
  // take devres_lock
  remove_nodes(devres_head) // mv to local todo
  // drop devres_lock                         delete_region_store()
                                                cxlr = cxl_find_region_by_name() // success
                                                devm_release_action(unregister_region)
                                                  devres_release()
                                                    devres_remove()
                                                      // hold devres_lock
                                                      find_dr(devres_head) // does not find it
                                                  WARN_ON(-ENOENT)
  release_nodes() // drain todo
    unregister_region(cxlr) // release() cb
      device_del()

> Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> removes the devres tracking entry and returns an error code.

While devm_remove_action_nowarn() has only a single driver user (gpio), using it
here would seem to fit the requirement of independent lifetime management; and
ultimately these races seem benign as unregister_region() is only being called
once.
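
To make the "only being called once" point concrete, here is a rough userspace
model of the two interleavings discussed above, replayed sequentially in one
thread. It is only a sketch: every model_* and replay_* name is hypothetical,
the real devres_lock is not modelled, and I am assuming the patched
delete_region_store() calls unregister_region() only when the removal
succeeded, which may not match the patch exactly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_region {
	bool action_registered;	/* devres entry still on the device's list */
	int unregister_calls;	/* how many times the release callback ran */
};

/* Models devm_remove_action_nowarn(): drop the entry, report if it was there. */
static int model_remove_action(struct model_region *r)
{
	if (!r->action_registered)
		return -ENOENT;	/* no WARN_ON(), just an error code */
	r->action_registered = false;
	return 0;
}

/* Models unregister_region(): should run exactly once per region. */
static void model_unregister(struct model_region *r)
{
	r->unregister_calls++;
}

/* Assumed shape of the patched delete path: only the remover unregisters. */
static int model_delete_region(struct model_region *r)
{
	int rc = model_remove_action(r);

	if (rc)
		return rc;	/* lost the race, someone else owns teardown */
	model_unregister(r);
	return 0;
}

/* Replay: two writers name the same region, both lookups already succeeded. */
static int replay_two_writers(void)
{
	struct model_region r = { .action_registered = true };
	int first = model_delete_region(&r);
	int second = model_delete_region(&r);

	assert(first == 0 && second == -ENOENT);
	return r.unregister_calls;
}

/* Replay: unbind detaches the entry, a sysfs delete arrives, then the drain. */
static int replay_unbind_vs_delete(void)
{
	struct model_region r = { .action_registered = true };
	bool detached = r.action_registered;	/* CPU0: remove_nodes() */
	int rc;

	r.action_registered = false;		/* entry moved to local todo */
	rc = model_delete_region(&r);		/* CPU1: entry not found */
	assert(rc == -ENOENT);
	if (detached)
		model_unregister(&r);		/* CPU0: release_nodes() */
	return r.unregister_calls;
}
```

In both replays the loser of the race just sees -ENOENT and backs off, and the
release callback runs a single time, from whichever path actually removed the
entry.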

> ------------[ cut here ]------------
> WARNING: drivers/base/devres.c:824 at devm_release_action drivers/base/devres.c:824 [inline], CPU#0: syz.1.12224/47589
> WARNING: drivers/base/devres.c:824 at devm_release_action+0x2b2/0x360 drivers/base/devres.c:817, CPU#0: syz.1.12224/47589

I see you are using syzkaller; I added cxl support as well a while back, based
on the usb fuzzing approach, and also triggered this issue (which was in my
to-investigate backlog, so glad you ran into this).

Thanks,
Davidlohr