Re: [PATCH v2] drivers: core: synchronize really_probe() and dev_uevent()

From: Greg Kroah-Hartman
Date: Fri Jul 12 2024 - 04:57:00 EST


On Thu, Jul 11, 2024 at 05:07:21PM -0700, Dan Williams wrote:
> Dirk Behme wrote:
> > Synchronize the dev->driver usage in really_probe() and dev_uevent().
> > These can run in different threads, which can result in the following
> > race condition for dev->driver uninitialization:
>
> This fix introduces an ABBA deadlock scenario via the known antipattern
> of taking the device_lock() within device attributes that are removed
> while the lock is held.

Ugh, yes :(

device attributes should not be taking that lock. Don't we have a
different call for an attribute that will be removing itself?

> Lockdep splat below. I previously reported this on a syzbot report
> against nvdimm subsystems with a more complicated splat [1], but this one
> is more straightforward.
>
> Recall that this lockdep report is not widespread because CXL and
> NVDIMM are among the only subsystems that add lockdep coverage to
> device_lock() with a local key.
>
> [1]: http://lore.kernel.org/667a2ae44c0c0_5be92947e@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.notmuch
>
> One potential hack is something like this if it is backstopped with
> synchronization between unregistering drivers from buses relative to
> uevent callbacks for those buses:
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 2b4c0624b704..dfba73ef39af 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2640,6 +2640,7 @@ static const char *dev_uevent_name(const struct kobject *kobj)
>  static int dev_uevent(const struct kobject *kobj, struct kobj_uevent_env *env)
>  {
>  	const struct device *dev = kobj_to_dev(kobj);
> +	struct device_driver *driver;
>  	int retval = 0;
>
>  	/* add device node properties if present */
> @@ -2668,8 +2669,14 @@ static int dev_uevent(const struct kobject *kobj, struct kobj_uevent_env *env)
>  	if (dev->type && dev->type->name)
>  		add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
>
> -	if (dev->driver)
> -		add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> +	/*
> +	 * While it is likely that this races driver detach, it is
> +	 * unlikely that any driver attached with this device is racing being
> +	 * freed relative to a uevent for the same device
> +	 */
> +	driver = READ_ONCE(dev->driver);
> +	if (driver)
> +		add_uevent_var(env, "DRIVER=%s", driver->name);
>
>  	/* Add common DT information about the device */
>  	of_device_uevent(dev, env);
>

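For anyone following along, the single-load idea in the hunk above can be
sketched in userspace roughly as follows. This is only an illustration under
stated assumptions: the READ_ONCE() macro here is a simplified stand-in built
on the GCC/Clang __typeof__ extension (not the kernel's definition), the two
structs are trimmed-down hypothetical versions of the real ones, and
format_driver_var() is an invented helper name. As the patch comment notes,
the snapshot only guarantees that the NULL check and the dereference see the
same pointer; it does not by itself keep the driver from being freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Simplified userspace stand-in for the kernel's READ_ONCE(). Assumes the
 * GCC/Clang __typeof__ extension; illustrative only.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct device_driver {
	const char *name;
};

struct device {
	struct device_driver *driver;	/* may be cleared by a concurrent detach */
};

/*
 * Format "DRIVER=<name>" into buf, or leave buf empty when no driver is
 * bound. dev->driver is loaded exactly once into a local, so the NULL
 * check and the later dereference always use the same pointer value.
 * Returns the snprintf() result, or 0 when no driver is bound.
 */
static int format_driver_var(const struct device *dev, char *buf, size_t len)
{
	struct device_driver *driver;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	driver = READ_ONCE(dev->driver);
	if (driver)
		return snprintf(buf, len, "DRIVER=%s", driver->name);

	return 0;
}
```

The original code read dev->driver twice (once for the test, once for the
dereference), so a detach between the two loads could turn the second one
into a NULL dereference; the local copy closes that particular window.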
I'll take this patch for now if you also want to include a revert of the
locking patch that caused your splat.

thanks,

greg k-h