Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=

From: Yafang Shao
Date: Wed Dec 06 2023 - 09:09:21 EST


On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > occurred due to the driver's asynchronous probe behavior. Specifically,
> > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > disk can be any of /dev/sdX, which breaks applications that rely on
> > /dev/sda, notably the monitoring systems that watch the root disk.
>
> Device names are never guaranteed to be stable. ALWAYS use a persistent
> name such as a filesystem label or similar. Look at /dev/disk/ for the
> ways to do this properly.

The root disk is typically identified as /dev/sda or /dev/vda, right?
That is because the root disk, which holds the operating system,
cannot be removed or hotplugged, so it usually remains the first disk
in the system. With synchronous probing, the root disk keeps the same
name across boots.
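
For comparison, the alternative raised above (looking the root disk up at
run time instead of assuming /dev/sda) can be sketched for a monitoring
tool roughly as follows. This is only an illustration under assumptions:
the helper names root_source() and resolve_persistent() are made up for
this example, and it assumes /proc/self/mounts reports a real block
device for "/", which is not the case on every setup (for instance with
an overlay root).

```python
import os


def root_source(mounts_text, mount_point="/"):
    """Return the device field of the last mount entry for mount_point.

    mounts_text is the content of a mounts table such as
    /proc/self/mounts. Returns None if no entry matches.
    """
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            source = fields[0]
    return source


def resolve_persistent(path):
    """Follow a symlink such as /dev/disk/by-label/<label> to the
    kernel name it currently points at (e.g. /dev/sdb2)."""
    return os.path.realpath(path)


if __name__ == "__main__":
    # Hypothetical usage: print whatever currently backs "/".
    with open("/proc/self/mounts") as f:
        print(root_source(f.read()))
```

A tool written this way keeps working whether the root disk shows up as
/dev/sda or /dev/sdb after a given boot.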

>
> > To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> > to enforce synchronous probe behavior for specific drivers.
>
> This should be a per-bus thing, not a driver-specific thing as drivers
> for the same bus could have differing settings here which would cause a
> mess.
>
> Please just revert the scsi bus functionality if you have had
> regressions here, it's not a driver-core thing to do.

Are you suggesting that we revert the asynchronous probe code in the
SCSI driver? Reverting to synchronous probing would restore stable
naming, but asynchronous probing can shorten reboot time under
certain conditions, so a revert is likely to meet some resistance.
That's why I prefer to introduce a kernel parameter for it.

--
Regards
Yafang