Re: [PATCH] driver core: Fix uevent_show() vs driver detach race

From: Dan Williams
Date: Fri Jul 12 2024 - 19:49:36 EST


Tetsuo Handa wrote:
> On 2024/07/13 4:42, Dan Williams wrote:
> > @@ -2668,8 +2670,12 @@ static int dev_uevent(const struct kobject *kobj, struct kobj_uevent_env *env)
> > if (dev->type && dev->type->name)
> > add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> >
> > - if (dev->driver)
> > - add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> > + /* Synchronize with module_remove_driver() */
> > + rcu_read_lock();
> > + driver = READ_ONCE(dev->driver);
> > + if (driver)
> > + add_uevent_var(env, "DRIVER=%s", driver->name);
> > + rcu_read_unlock();
> >
>
> Given that read of dev->driver is protected using RCU,
>
> > @@ -97,6 +98,9 @@ void module_remove_driver(struct device_driver *drv)
> > if (!drv)
> > return;
> >
>
> where is
>
> dev->driver = NULL;
>
> performed prior to

It happens in __device_release_driver() and several places in the driver
probe failure path. However, the point of this patch is that the
"dev->driver = NULL" event does not really matter for this sysfs
attribute.

This attribute just wants to opportunistically report the driver name to
userspace, but that result is ephemeral. That is, as soon as dev_uevent()
adds a DRIVER environment variable, that result can be invalidated
before userspace has a chance to do anything with it.

Even with the current device_lock() solution, userspace cannot depend on
the driver still being attached when it goes to act on the DRIVER
environment variable.
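
To make the userspace side of that concrete, here is a rough sketch of a
consumer that treats DRIVER as a hint only. The helper name
uevent_driver_hint() is made up for illustration, and the only thing it
assumes is that the uevent text is a series of "KEY=value" lines:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of the DRIVER= line from
 * uevent-style text ("KEY=value\n" lines) into out.
 * Returns 1 if a DRIVER= line was found, 0 if it was absent.
 * Either answer may already be stale by the time the caller looks at it.
 */
static int uevent_driver_hint(const char *text, char *out, size_t len)
{
	static const char key[] = "DRIVER=";
	const size_t klen = sizeof(key) - 1;
	const char *line = text;

	if (len == 0)
		return 0;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t n = eol ? (size_t)(eol - line) : strlen(line);

		if (n >= klen && strncmp(line, key, klen) == 0) {
			size_t vlen = n - klen;

			/* Truncate rather than overflow the caller's buffer. */
			if (vlen >= len)
				vlen = len - 1;
			memcpy(out, line + klen, vlen);
			out[vlen] = '\0';
			return 1;
		}
		line = eol ? eol + 1 : NULL;
	}

	/* No DRIVER= line: a normal outcome, not an error. */
	out[0] = '\0';
	return 0;
}
```

A caller that goes on to act on the returned name still has to cope with
the driver having been detached in the meantime, with or without
device_lock() in the kernel.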

> > + /* Synchronize with dev_uevent() */
> > + synchronize_rcu();
> > +
>
> this synchronize_rcu(), in order to make sure that
> READ_ONCE(dev->driver) in dev_uevent() observes NULL?

No, this synchronize_rcu() is to make sure that if dev_uevent() wins the
race and observes that dev->driver is not NULL, it is still safe to
dereference that pointer because the 'struct device_driver' object is
still live.

A 'struct device_driver' instance is typically static data in a kernel
module that does not get freed until after driver_unregister(). Calls to
driver_unregister() typically only happen at module removal time. So
this synchronize_rcu() delays module removal until dev_uevent() finishes
reading driver->name.
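
For anyone who wants to poke at the ordering outside the kernel, here is a
minimal userspace analogy. It is not the kernel implementation: every
fake_* name is invented for this sketch, and a plain reader counter with
sequentially consistent C11 atomics stands in for RCU. It also folds the
"dev->driver = NULL" store and the grace-period wait into one function,
whereas in the kernel they live in different places.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct fake_driver { const char *name; };
struct fake_device { _Atomic(struct fake_driver *) driver; };

/* Assumption: a reader counter approximates RCU read-side sections. */
static atomic_int fake_readers;

static void fake_rcu_read_lock(void)   { atomic_fetch_add(&fake_readers, 1); }
static void fake_rcu_read_unlock(void) { atomic_fetch_sub(&fake_readers, 1); }

/* Wait until no reader is inside a read-side section. */
static void fake_synchronize_rcu(void)
{
	while (atomic_load(&fake_readers) > 0)
		sched_yield();
}

/* Reader: report DRIVER=<name> if a driver is currently visible. */
static int fake_dev_uevent(struct fake_device *dev, char *buf, size_t len)
{
	struct fake_driver *drv;
	int found = 0;

	assert(len > 0);
	buf[0] = '\0';

	fake_rcu_read_lock();
	drv = atomic_load(&dev->driver);
	if (drv) {
		/* Safe: the remover cannot pass its wait while we are here. */
		snprintf(buf, len, "DRIVER=%s", drv->name);
		found = 1;
	}
	fake_rcu_read_unlock();

	return found;
}

/* Remover: unpublish the driver, wait for readers, then tear it down. */
static void fake_driver_remove(struct fake_device *dev, struct fake_driver *drv)
{
	atomic_store(&dev->driver, NULL);
	fake_synchronize_rcu();
	drv->name = NULL;	/* stands in for the module memory going away */
}
```

A reader that loaded the pointer before the store of NULL holds the
counter above zero until it has finished with drv->name, so the teardown
step cannot run underneath it. A reader that arrives after the wait has
completed sees NULL and skips the DRIVER variable.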