Re: [PATCH] driver core: fix possible missing of device probe

From: Ming Lei
Date: Fri Sep 28 2012 - 09:42:39 EST


On Fri, Sep 28, 2012 at 4:46 PM, Russell King - ARM Linux
<linux@xxxxxxxxxxxxxxxx> wrote:
>
> Actually, this description is rubbish. It's not about the SMP barriers,
> or read/write device/driver lists, or about two CPUs (my test setup only
> has one CPU.)

Could you point out in detail what is wrong with my description?

Looks the description has been posted for about 12 days...

>
> It's about threads, and the relative timing of those threads through

I do not mention threads case in one CPU because the context in
which device_add runs will always see the driver added into
bus's driver list after applying the patch in this situation. And it
is obvious under UP case.

But in SMP case, it is not so obvious like UP, driver_register and
device_add will be run in two individual CPU, so memory barrier
plays a role in making device_add see the new added driver in
the situation described in the commit log, and avoid the problem.

> the driver model code. All in all, what with the error path issue, and
> now the blatently wrong description, I'm not gaining much confidence in
> Ming Lei.
>
> The below is actually what's happening, according to my analysis - and
> you'll notice that the device list has absolutely nothing to do with it:

I don't think your analysis is complete for the two reasons:

- the problem isn't only triggered by deferred probe, and can be triggered
by device_add too
- your analysis doesn't cover SMP case

In fact, it is just a race between driver_register and device_add or
bus_probe_device, and should be nothing to do with deferred probe.

See my posted description again:

Inside bus_add_driver(), one device might be added into
the bus or probed which is triggered by deferred probe
just after completing of driver_attach() and before
'klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers)',
so the device won't be probed by this driver.

>
> Thread 0 Thread 1 Thread 2
> driver_attach()
> bus_for_each_dev()
> __driver_attach(, devA)
> driver_probe_device(, devA)
> really_probe(devA, )
> driver_deferred_probe_add(devA)
> driver_attach()
> bus_for_each_dev()
> __driver_attach(, devA)
> driver_probe_device(, devB)
> really_probe(devB, )
> driver_bound(devB)
> driver_deferred_probe_trigger()
> deferred_probe_work_func()
> bus_probe_device(devA)
> device_attach(devA)
> bus_for_each_drv()
> __device_attach(, devA)
> *fails to find driver*
> *devA dropped from
> deferred probe list*
> klist_add_tail(klist_drivers)

Didn't my description cover the description above? ( bus_probe_device is
called between driver_attach and klist_add_tail)

Also it is only one of the two situations addressed by the patch.

>
> The reason this fix works is because the order in thread 0 means that
> when thread 2 comes to re-probe the device, it does find the driver,

You mean the driver_attach on same driver can happen at the same time?
If so, it is simply wrong.

> and if the resources that the driver needs are still not found (and it
> again returns -EPROBE_DEFER) the device will be placed back on the
> deferred probe list.

Thanks,
--
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/