Re: [PATCH] Revert "driver core: enforce device_lock for driver_match_device()"

From: Gui-Dong Han

Date: Sun Mar 01 2026 - 20:55:38 EST


On Mon, Mar 2, 2026 at 8:25 AM Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
>
> This reverts commit dc23806a7c47 ("driver core: enforce device_lock for
> driver_match_device()") and commit 289b14592cef ("driver core: fix
> inverted "locked" suffix of driver_match_device()").
>
> While technically correct, there is a major downside to this approach:
>
> When a device is already present in the system and a driver is
> registered on the same bus, we iterate over all devices registered on
> this bus to see if one of them matches. If we come across an already
> bound one where the corresponding driver crashed while holding the
> device lock (e.g. in probe()) we can't make any progress anymore.
>
> However, drivers are typically the least tested code in the kernel and
> hence it is a case that is likely to happen regularly. Besides hurting
> developer ergonomics, it potentially decreases chances of shutting
> things down cleanly and obtaining logs in production environments as
> well. [1]
>
> This came up in the context of a firewire bug, which only in combination
> with the reverted commit, caused the machine to hang. [2]
>
> Thus, revert commit dc23806a7c47 ("driver core: enforce device_lock for
> driver_match_device()") and add a brief note clarifying that an
> implementer of struct bus_type must not expect match() to be called with
> the device lock held.
>
> Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@xxxxxxxxxx/ [1]
> Link: https://lore.kernel.org/all/67f655bb-4d81-4609-b008-68d200255dd2@xxxxxxxxxxxx/ [2]
> Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Closes: https://lore.kernel.org/driver-core/CAHk-=wgJ_L1C=HjcYJotg_zrZEmiLFJaoic+PWthjuQrutrfJw@xxxxxxxxxxxxxx/
> Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>

This aligns with our previous analysis:
https://lore.kernel.org/lkml/CALbr=LZ4v7N=tO1vgOsyj9AS+XuNbn6kG-QcF+PacdMjSo0iyw@xxxxxxxxxxxxxx/

Enforcing the device_lock here amplifies the impact of an oops during
probe, so moving to a separate lock for driver_override is definitely
a better approach.

The revert looks good to me.

As a minor note, it might be worth adding a tag for the initial Tegra
issue where this lock stall was first observed. No big deal if you
decide to leave it as is, though.

Reviewed-by: Gui-Dong Han <hanguidong02@xxxxxxxxx>

> ---
> drivers/base/base.h | 11 +----------
> drivers/base/dd.c | 2 +-
> include/linux/device/bus.h | 2 ++
> 3 files changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 79d031d2d845..1af95ac68b77 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -179,19 +179,10 @@ void device_release_driver_internal(struct device *dev, const struct device_driv
> void driver_detach(const struct device_driver *drv);
> void driver_deferred_probe_del(struct device *dev);
> void device_set_deferred_probe_reason(const struct device *dev, struct va_format *vaf);
> -static inline int driver_match_device_locked(const struct device_driver *drv,
> - struct device *dev)
> -{
> - device_lock_assert(dev);
> -
> - return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> -}
> -
> static inline int driver_match_device(const struct device_driver *drv,
> struct device *dev)
> {
> - guard(device)(dev);
> - return driver_match_device_locked(drv, dev);
> + return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> }
>
> static inline void dev_sync_state(struct device *dev)
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 0354f209529c..bea8da5f8a3a 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -928,7 +928,7 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
> bool async_allowed;
> int ret;
>
> - ret = driver_match_device_locked(drv, dev);
> + ret = driver_match_device(drv, dev);
> if (ret == 0) {
> /* no match */
> return 0;
> diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
> index 99c3c83ea520..63de5f053c33 100644
> --- a/include/linux/device/bus.h
> +++ b/include/linux/device/bus.h
> @@ -35,6 +35,8 @@ struct fwnode_handle;
> * otherwise. It may also return error code if determining that
> * the driver supports the device is not possible. In case of
> * -EPROBE_DEFER it will queue the device for deferred probing.
> + * Note: This callback may be invoked with or without the device
> + * lock held.
> * @uevent: Called when a device is added, removed, or a few other things
> * that generate uevents to add the environment variables.
> * @probe: Called when a new device or driver add to this bus, and callback
>
> base-commit: 78437ab3b769f80526416570f60173c89858dd84
> --
> 2.53.0
>