Re: [PATCH v3 1/9] driver core: Don't let a device probe until it's ready

From: Greg Kroah-Hartman

Date: Fri Apr 03 2026 - 03:05:05 EST


On Thu, Apr 02, 2026 at 05:49:47PM -0700, Douglas Anderson wrote:
> The moment we link a "struct device" into the list of devices for the
> bus, it's possible probe can happen. This is because another thread
> can load the driver at any time and that can cause the device to
> probe. This has been seen in practice with a stack crawl that looks
> like this [1]:
>
> really_probe()
> __driver_probe_device()
> driver_probe_device()
> __driver_attach()
> bus_for_each_dev()
> driver_attach()
> bus_add_driver()
> driver_register()
> __platform_driver_register()
> init_module() [some module]
> do_one_initcall()
> do_init_module()
> load_module()
> __arm64_sys_finit_module()
> invoke_syscall()
>
> As a result of the above, it was seen that device_links_driver_bound()
> could be called for the device before "dev->fwnode->dev" was
> assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
> being called which meant that other devices waiting on our driver's
> sub-nodes were stuck deferring forever.
>
> It's believed that this problem is showing up suddenly for two
> reasons:
> 1. Android has recently (last ~1 year) implemented an optimization to
> the order it loads modules [2]. When devices opt-in to this faster
> loading, modules are loaded one-after-the-other very quickly. This
> is unlike how other distributions do it. The reproduction of this
> problem has only been seen on devices that opt-in to Android's
> "parallel module loading".
> 2. Android devices typically opt-in to fw_devlink, and the most
> noticeable issue is the NULL "dev->fwnode->dev" in
> device_links_driver_bound(). fw_devlink is somewhat new code and
> also not in use by all Linux devices.
>
> Even though the specific symptom where "dev->fwnode->dev" wasn't
> assigned could be fixed by moving that assignment higher in
> device_add(), other parts of device_add() (like the call to
> device_pm_add()) are also important to run before probe. Only moving
> the "dev->fwnode->dev" assignment would likely fix the current
> symptoms but lead to difficult-to-debug problems in the future.
>
> Fix the problem by preventing probe until device_add() has run far
> enough that the device is ready to probe. If somehow we end up trying
> to probe before we're allowed, __driver_probe_device() will return
> -EPROBE_DEFER which will make certain the device is noticed.
>
> In the race condition that was seen with Android's faster module
> loading, we will temporarily add the device to the deferred list and
> then take it off immediately when device_add() probes the device.
>
> Instead of adding another flag to the bitfields already in "struct
> device", instead add a new "flags" field and use that. This allows us
> to freely change the bit from different thread without holding the
> device lock and without worrying about corrupting nearby bits.
>
> [1] Captured on a machine running a downstream 6.6 kernel
> [2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
> Reviewed-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@xxxxxxxxxx>
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> ---
> v1: https://lore.kernel.org/r/20260320200656.RFC.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
> v2: https://lore.kernel.org/r/20260330072839.v2.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
>
> As of v2 this feels like a very safe change. It doesn't change the
> ordering of any steps of probe and it _just_ prevents the early probe
> from happening.
>
> I ran tests where I turned the printout "Device not ready_to_probe" on
> and I could see the printout happening, evidence of the race occurring
> from other printouts, and things successfully being resolved.
>
> I kept Alan and Rafael's Reviewed-by tags in v3 even though I changed
> the implementation to use "flags" as per Danilo's request. This didn't
> seem like a major enough change to remove tags, but please shout if
> you'd like your tag removed.
>
> Changes in v3:
> - Use a new "flags" bitfield
> - Add missing \n in probe error message
>
> Changes in v2:
> - Instead of adjusting the ordering, use "ready_to_probe" flag
>
> drivers/base/core.c | 13 +++++++++++++
> drivers/base/dd.c | 12 ++++++++++++
> include/linux/device.h | 15 +++++++++++++++
> 3 files changed, 40 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 09b98f02f559..8d4028a2165f 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3688,6 +3688,19 @@ int device_add(struct device *dev)
> fw_devlink_link_device(dev);
> }
>
> + /*
> + * The moment the device was linked into the bus's "klist_devices" in
> + * bus_add_device() then it's possible that probe could have been
> + * attempted in a different thread via userspace loading a driver
> + * matching the device. "DEV_FLAG_READY_TO_PROBE" being unset would have
> + * blocked those attempts. Now that all of the above initialization has
> + * happened, unblock probe. If probe happens through another thread
> + * after this point but before bus_probe_device() runs then it's fine.
> + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> + * will notice (under device_lock) that the device is already bound.
> + */
> + set_bit(DEV_FLAG_READY_TO_PROBE, &dev->flags);
> +
> bus_probe_device(dev);
>
> /*
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 37c7e54e0e4c..3aead51173f8 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
> if (dev->driver)
> return -EBUSY;
>
> + /*
> + * In device_add(), the "struct device" gets linked into the subsystem's
> + * list of devices and broadcast to userspace (via uevent) before we're
> + * quite ready to probe. Those open pathways to driver probe before
> + * we've finished enough of device_add() to reliably support probe.
> + * Detect this and tell other pathways to try again later. device_add()
> + * itself will also try to probe immediately after setting
> + * "DEV_FLAG_READY_TO_PROBE".
> + */
> + if (!test_bit(DEV_FLAG_READY_TO_PROBE, &dev->flags))
> + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready to probe\n");
> +
> dev->can_match = true;
> dev_dbg(dev, "bus: '%s': %s: matched device with driver %s\n",
> drv->bus->name, __func__, drv->name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index e65d564f01cd..5b7ace5921aa 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -458,6 +458,18 @@ struct device_physical_location {
> bool lid;
> };
>
> +/**
> + * enum struct_device_flags - Flags in struct device
> + *
> + * Should be accessed with thread-safe bitops.
> + *
> + * @DEV_FLAG_READY_TO_PROBE: If set then device_add() has finished enough
> + * initialization that probe could be called.
> + */
> +enum struct_device_flags {
> + DEV_FLAG_READY_TO_PROBE,
> +};

If you are going to want this to be a bit value, please use BIT(X) and
not an enum, as that's going to just get confusing over time.

Also, none of this manual test_bit()/set_bit() stuff, please give us
real "accessors" for this like:
bool device_ready_to_probe(struct device *dev);

so that it's obvious what is happening.

thanks,

greg k-h