Re: [PATCH] drivers: core: Make dev->driver usage safe in dev_uevent()

From: Eugeniu Rosca
Date: Tue Apr 30 2024 - 04:25:06 EST


Hi Greg,

On Tue, Apr 30, 2024 at 09:20:10AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Apr 30, 2024 at 06:55:31AM +0200, Dirk Behme wrote:
> > Inspired by the function dev_driver_string() in the same file make sure
> > in case of uninitialization dev->driver is used safely in dev_uevent(),
> > as well.
>
> I think you are racing and just getting "lucky" with your change here,
> just like dev_driver_string() is doing there (that READ_ONCE() really
> isn't doing much to protect you...)

I hope the details below shed more light on the repro:
https://gist.github.com/erosca/1e8a87fbcc9e5ad0fecd32ebcb6266c3

To improve the occurrence rate:
- a dummy ds90ux9xx-dummy driver was used
- a dummy i2c node was added to DTS
- a dummy pr_alert() was added to dev_uevent() @ drivers/base/core.c
- UBSAN + KASAN enabled in .config

> > This change is based on the observation of the following race condition:
> >
> > Thread #1:
> > ==========
> >
> > really_probe() {
> > ...
> > probe_failed:
> > ...
> > device_unbind_cleanup(dev) {
> > ...
> > dev->driver = NULL; // <= Failed probe sets dev->driver to NULL
> > ...
> > }
> > ...
> > }
> >
> > Thread #2:
> > ==========
> >
> > dev_uevent() {
>
> Wait, how can dev_uevent() be called if probe fails? Who is calling
> that?

dev_uevent() is called when userspace reads /sys/bus/i2c/devices/<dev>/uevent.
It is not triggered directly by the probe failure.
Please see the gist/notes above.
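
For illustration only, a minimal userspace sketch of the reader side,
assuming the attribute is simply opened and read in a loop. The helper
names read_uevent_once() and hammer_uevent() are hypothetical and not
taken from the gist; the path is whatever <dev> resolves to on the target.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read a uevent attribute once into buf (NUL-terminated).
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 * On a real target each such read ends up in dev_uevent().
 */
long read_uevent_once(const char *path, char *buf, size_t len)
{
	FILE *f;
	size_t n;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);

	return (long)n;
}

/*
 * Re-read the attribute `iterations` times to widen the race window.
 * Returns how many reads succeeded.
 */
long hammer_uevent(const char *path, long iterations)
{
	char buf[4096];
	long ok = 0;
	long i;

	for (i = 0; i < iterations; i++)
		if (read_uevent_once(path, buf, sizeof(buf)) >= 0)
			ok++;

	return ok;
}
```

Calling hammer_uevent() on the uevent file of the dummy i2c device while
its probe keeps failing is one way to exercise Thread #2 from the patch
description.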

[--- cut ---]

> > - if (dev->driver)
> > - add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> > + /* dev->driver can change to NULL underneath us because of unbinding
> > + * or failing probe(), so be careful about accessing it.
> > + */
> > + drv = READ_ONCE(dev->driver);
> > + if (drv)
> > + add_uevent_var(env, "DRIVER=%s", drv->name);
>
> Again, you are just reducing the window here. Maybe a bit, but not all
> that much overall as there is no real lock present.

The main objective of the patch is to "cache" dev->driver in a local
variable, so that the NULL check and the dereference use the same value
even if a parallel thread clears dev->driver in between.
A refined/minimal locking alternative (if feasible) is welcome.
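
To make the difference concrete, a userspace analogue, assuming a C11
relaxed atomic load as a rough stand-in for READ_ONCE(). struct drv and
struct dev are simplified, hypothetical stand-ins, not the kernel
definitions:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct drv { const char *name; };
struct dev { _Atomic(struct drv *) driver; };

/*
 * Two separate loads: another thread may store NULL between the check
 * and the dereference, which mirrors the original dev_uevent() code.
 */
int emit_driver_racy(struct dev *d, char *buf, size_t len)
{
	if (atomic_load_explicit(&d->driver, memory_order_relaxed))
		return snprintf(buf, len, "DRIVER=%s",
				atomic_load_explicit(&d->driver,
						     memory_order_relaxed)->name);
	return 0;
}

/*
 * Single load into a local: the check and the dereference see the same
 * pointer value. This does NOT keep the pointed-to object alive; with
 * no lock or reference held, it can still go away underneath us.
 */
int emit_driver_snapshot(struct dev *d, char *buf, size_t len)
{
	struct drv *drv = atomic_load_explicit(&d->driver,
					       memory_order_relaxed);

	if (!drv)
		return 0;

	return snprintf(buf, len, "DRIVER=%s", drv->name);
}
```

The snapshot variant removes the NULL dereference between check and use,
but as noted above it only narrows the window with respect to the
object's lifetime.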

>
> So how is this actually solving anything? And who is calling a uevent
> on a device that is not probed properly? Userspace? Within the kernel?
> Something else?

Repro details are provided in the gist/notes above.

BR, Eugeniu