[PATCH v2] software node: Only the managing device can unreference managed software node
From: mike . isely
Date: Tue Mar 03 2026 - 16:14:00 EST
From: Mike Isely <mike.isely@xxxxxxxxxxxxxxxxx>
Explanation here is lengthy because the problem is subtle.
A scenario exists where device_create_managed_software_node() is used
to create an swnode instance that will be implicitly shared to a child
device despite best intentions not to permit such sharing (per the
comment in device_create_managed_software_node()). I encountered this
with the sfp kernel module when it was instantiated with properties
via a call to platform_device_register_full() - it will create hwmon
child devices which get all property references. Unfortunately with
just a "managed" boolean in struct swnode handling this, then
kobject_put() gets called for the managed aspect when the child device
goes away instead of the parent. This leads to premature freeing of
the swnode structure, followed by use-after-free problems, heap
corruption, and generally chaos / crashes / misbehavior in the kernel.
This commit changes that boolean into a pointer to the actual managing
struct device, which is then checked against the struct device
instance that is actually going away (via the usual call back into
software_node_notify_remove()). Thus the child device removal is
ignored as it should, and we only do the kobject_put() when the actual
managing struct device instance goes away. We effectively carry a
little bit more information now so that we can be sure to clean up
only when the correct struct device instance is actually going away.
Note that while we are now keeping a pointer to a struct device here,
this is safe to do because the pointer itself only stays in use while
the pointed-to device remains valid. (So no need to be concerned
about additional reference counting.)
Here's a succinct, generic sequence that spells out the steps which
lead to trouble:
1. Somebody creates a struct device instance.
2. Somebody calls device_create_managed_software_node() associating
it with that struct device instance. This causes two kobject
references to be counted against the swnode instance (one for the
struct device and one because the managed flag was set). refcount=2
3. Some time later, a child struct device is created and the
properties of the parent are shared with the child. This causes
another kobject reference to be counted against the swnode instance.
refcount=3
4. Some time later, that child struct device instance goes
away (perhaps due to a hotplug removal). During the tear-down,
software_node_notify_remove() notices that the managed field of the
swnode instance is set, so TWO kobject references are removed
instead of the single reference created by the previous step.
refcount=1
5. Repeat step 3. refcount=2
6. Repeat step 4. refcount=0
7. Now the kobject reference count inside the swnode instance has
dropped to zero and so the swnode instance is released. Notice that
the parent device is still holding a physical reference, now
pointing into freed memory. Chaos likely follows. If lucky, a NULL
pointer kernel oops is the result.
The patch fixes this by causing step 4 to only do the second reference
decrement if the device being torn down is the same one that
established the managing relationship in the first place. Then
everything stays balanced.
I confirmed this exact sequence of behavior in the context of the sfp
kernel module with appropriate printk debug trace added, when
instantiated with platform_device_register_full() with properties
provided. While that's may be hard for to reproduce without sfp
hardware, anything else which performs those steps should produce
similar results.
Note that because platform_device_register_full() will implicitly
employ a managed swnode instance any time it is passed properties in
its pdevinfo argument, then by implication anyone calling
platform_device_register_full() in that manner - which happens in
multiple other places in the kernel - is at risk for the same issue.
Signed-off-by: Mike Isely <mike.isely@xxxxxxxxxxxxxxxxx>
Signed-off-by: Mike Isely <isely@xxxxxxxxx>
---
drivers/base/swnode.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
index 53b3f0061ad12..18a903a35197c 100644
--- a/drivers/base/swnode.c
+++ b/drivers/base/swnode.c
@@ -36,8 +36,8 @@ struct swnode {
struct list_head children;
struct swnode *parent;
+ struct device *managing_dev;
unsigned int allocated:1;
- unsigned int managed:1;
};
static DEFINE_IDA(swnode_root_ids);
@@ -1060,7 +1060,7 @@ int device_create_managed_software_node(struct device *dev,
if (IS_ERR(fwnode))
return PTR_ERR(fwnode);
- to_swnode(fwnode)->managed = true;
+ to_swnode(fwnode)->managing_dev = dev;
set_secondary_fwnode(dev, fwnode);
if (device_is_registered(dev))
@@ -1104,7 +1104,7 @@ void software_node_notify_remove(struct device *dev)
sysfs_remove_link(&dev->kobj, "software_node");
kobject_put(&swnode->kobj);
- if (swnode->managed) {
+ if (swnode->managing_dev == dev) {
set_secondary_fwnode(dev, NULL);
kobject_put(&swnode->kobj);
}
--
2.47.3
---
Correct issue in drivers/base/swnode.c that can lead to use-after-free
due to kobject reference counting error, which itself is due to
incorrect behavior with the "managed" struct swnode flag in
circumstances involving child struct device instances where the parent
struct device is managing a struct swnode.
Use-after-free in this case leads to likely kernel heap corruption, so
any manner of chaos can result, if left unaddressed.