Re: [PATCH] Devcoredump: fix use-after-free issue when releasing devcd device

From: Mukesh Ojha
Date: Fri Oct 27 2023 - 02:23:56 EST




On 10/27/2023 11:25 AM, Yu Wang wrote:
With the sample code below, releasing the devcd device may hit a
use-after-free.

struct my_coredump_state {
	struct completion dump_done;
	...
};

static void my_coredump_free(void *data)
{
	struct my_coredump_state *dump_state = data;
	...
	complete(&dump_state->dump_done);
}

static void my_dev_release(struct device *dev)
{
	kfree(dev);
}

static void my_coredump(void)
{
	struct my_coredump_state dump_state;
	struct device *new_device =
		kzalloc(sizeof(*new_device), GFP_KERNEL);

	...
	new_device->release = my_dev_release;
	device_initialize(new_device);
	...
	device_add(new_device);
	...
	init_completion(&dump_state.dump_done);
	dev_coredumpm(new_device, NULL, &dump_state, datalen, GFP_KERNEL,
		      my_coredump_read, my_coredump_free);
	wait_for_completion(&dump_state.dump_done);
	device_del(new_device);
	put_device(new_device);
}

In the devcoredump framework, devcd_dev_release() is called when the
devcd device is released. It calls the free() callback first and then
tries to delete the symlink in the sysfs directory of the failing device.
Even though it checks 'devcd->failing_dev->kobj.sd' before that,
nothing ensures the node is still available when kernfs_find_ns()
accesses it; refer to the diagram below:

Thread A waits for 'dump_state.dump_done' at #A-1-2 after
calling dev_coredumpm().
When thread B calls devcd->free() at #B-2-1, it wakes
thread A at #A-1-2, and thread A then calls device_del() to
delete the device.
If #B-2-2 runs before #A-3-1 but #B-4 runs after #A-4, thread B
hits a use-after-free when it accesses
'devcd->failing_dev->kobj.sd'.

#A-1-1: dev_coredumpm()
#A-1-2: wait_for_completion(&dump_state.dump_done)
#A-1-3: device_del()
#A-2: kobject_del()
#A-3-1: sysfs_remove_dir() --> set kobj->sd=NULL
#A-3-2: kernfs_put()
#A-4: kmem_cache_free() --> free kobj->sd

#B-1: devcd_dev_release()
#B-2-1: devcd->free(devcd->data)
#B-2-2: check devcd->failing_dev->kobj.sd
#B-2-3: sysfs_delete_link()
#B-3: kernfs_remove_by_name_ns()
#B-4: kernfs_find_ns() --> access devcd->failing_dev->kobj.sd

To fix this issue, move the operations on devcd->failing_dev ahead of
the free() callback in devcd_dev_release().

Signed-off-by: Yu Wang <quic_yyuwang@xxxxxxxxxxx>
---
drivers/base/devcoredump.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index 91536ee05f14..35c704ddfeae 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -83,9 +83,6 @@ static void devcd_dev_release(struct device *dev)
{
struct devcd_entry *devcd = dev_to_devcd(dev);
- devcd->free(devcd->data);
- module_put(devcd->owner);
-
/*
* this seems racy, but I don't see a notifier or such on
* a struct device to know when it goes away?

Does this comment become obsolete now?

-Mukesh

@@ -95,6 +92,8 @@ static void devcd_dev_release(struct device *dev)
"devcoredump");
put_device(devcd->failing_dev);
+ devcd->free(devcd->data);
+ module_put(devcd->owner);
kfree(devcd);
}
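
For anyone following the reordering, here is a rough user-space sketch
of what the hunks above change. It is only a model: every name in it
(model_trace, model_release_before(), and so on) is hypothetical and
none of it is kernel API. It records the order of the three steps in
devcd_dev_release() and checks whether the failing device is still
touched after the free() callback has run, which is the window the
commit message describes. It does not model the threads themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three steps in devcd_dev_release(). */
enum model_step {
	MODEL_FREE_CB,	/* stands in for devcd->free(devcd->data) */
	MODEL_UNLINK,	/* stands in for sysfs_delete_link() on failing_dev */
	MODEL_PUT_DEV,	/* stands in for put_device(devcd->failing_dev) */
};

#define MODEL_MAX_STEPS 3

struct model_trace {
	enum model_step steps[MODEL_MAX_STEPS];
	size_t count;
};

static void model_record(struct model_trace *t, enum model_step s)
{
	assert(t->count < MODEL_MAX_STEPS);
	t->steps[t->count++] = s;
}

/* Order before the patch: callback first, failing_dev touched afterwards. */
static void model_release_before(struct model_trace *t)
{
	model_record(t, MODEL_FREE_CB);
	model_record(t, MODEL_UNLINK);
	model_record(t, MODEL_PUT_DEV);
}

/* Order after the patch: all failing_dev accesses precede the callback. */
static void model_release_after(struct model_trace *t)
{
	model_record(t, MODEL_UNLINK);
	model_record(t, MODEL_PUT_DEV);
	model_record(t, MODEL_FREE_CB);
}

/* Returns 1 if any failing_dev step is recorded after the free callback. */
static int model_touches_dev_after_free(const struct model_trace *t)
{
	int callback_ran = 0;
	size_t i;

	for (i = 0; i < t->count; i++) {
		if (t->steps[i] == MODEL_FREE_CB)
			callback_ran = 1;
		else if (callback_ran)
			return 1;
	}
	return 0;
}
```

Feeding an empty trace through model_release_before() and then
model_touches_dev_after_free() reports the old order as touching the
failing device after the callback; the same check on
model_release_after() does not.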