Re: [PATCH 6/8] amd64_edac: enforce synchronous probe

From: Dmitry Torokhov
Date: Wed Mar 18 2015 - 17:41:36 EST


On Wed, Mar 18, 2015 at 05:02:26PM -0400, Tejun Heo wrote:
> Hello,
>
> On Wed, Mar 18, 2015 at 01:26:05PM -0700, Dmitry Torokhov wrote:
> > Tejun, I lost you here. Certainly you are not arguing for going through
> > the drivers one by one and making their module init code to engage
> > async_schedule to continue the device creation in link order (well,
> > sorta, since deferred probing already violates it).
>
> Kind of, yes, but not by driver-by-driver, but by subsystem. I don't
> think we have too many subsystems where probing order matters and the
> ordering only matters within each subsystem.

I do not think you can always define this by subsystem. SCSI and libata
are the exception rather than the rule, I think. Take I2C, for example.
Does the probing order matter? Not if the I2C device happens to be an
input device. But maybe it does if it is a serial port. But maybe not
if you can deal with it being probed out of order. And you probably
can, since many systems are already prepared to handle -EPROBE_DEFER.

And I think libata and SD still rely on the underlying PCI devices being
probed synchronously. Try probing PCI asynchronously and watch your
disks get renumbered. And if we try to ensure that all devices are
registered in a given order, we will end up stalling the boot process:
while you can do some of the probing simultaneously, you still have to
wait until the slow device is done before allowing drivers "after" the
slow one to register their child objects.
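
To make the stall concrete, here is a rough userspace model (not kernel
code). It assumes all probes start at t=0 and run in parallel, and the
probe durations are made-up numbers. The function name and its parameters
are hypothetical, chosen only for this illustration.

```python
from itertools import accumulate


def registration_times(probe_durations, preserve_order):
    """Return the time at which each device may register its children.

    All probes are assumed to start at t=0 and run in parallel. With
    preserve_order=True a device may not register before every device
    ahead of it in link order has finished probing.
    """
    if not preserve_order:
        # Each device registers as soon as its own probe completes.
        return list(probe_durations)
    # Running maximum: device i is gated by the slowest of devices 0..i.
    return list(accumulate(probe_durations, max))


# Hypothetical probe times in ms; the second device is the slow one.
durations = [10, 500, 20, 30]
print(registration_times(durations, preserve_order=False))
print(registration_times(durations, preserve_order=True))
```

In this model, everything linked after the slow device waits for it when
order is preserved, even though its own probe finished long before.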

>
> > Also, it is not only kernel that may not be prepared for asynchronous
> > probing, but userspace as well. And I do not think that we should be
> > working towards preserving the init order in the long run as more and
> > more bits become hot pluggable and we should be able to handle devices
> > come and go gracefully anyway.
>
> It's not about supporting or not supporting hotplugging. Most of the
> storage devices support hotplugging but still maintain boot order and
> at least for storage devices there are pretty good reasons for doing
> so especially as we can do so w/o giving up on parallel probing. The

You are overstating the boot order guarantees that storage provides.
Yes, you can scan devices and partitions simultaneously on the same
controller, but it will break if controllers are registered in a
different order. And if you are delaying registration of one controller
because another that you consider "first" has not finished probing, you
are stalling the boot process. That may be OK for "leaf" devices, which
disks usually are, but not when you are delaying initialization of a
device that is in the middle of the device tree.

> problem is that if you hinge enabling of general async probing on
> virtually all userlands being okay with storage devices (or ttyS
> devices and so on), we won't be able to enable this, ever.

Sure we will. I am pretty sure we will do that for ChromeOS reasonably
soon, and I am sure other distributions can follow if needed.

So, to reiterate: right now we are synchronous by default. Certain
drivers can whitelist themselves to be probed asynchronously when we are
sure userspace can handle it. We have a module option that userspace can
use so that drivers registered while a module is being loaded are probed
asynchronously. I expect newer systemd will start using it so that it
can time out its module loading workers. We also have a debug kernel
option that can force everything asynchronous. I expect developers of
various distributions to try it once they are comfortable with systemd
loading modules asynchronously, and then we can change it to a normal
option and consider switching the default from sync to async. I really
do not believe that we should continue building kernel infrastructure
to help userspace pretend that the world is static and unchanging.
Userspace is already aware of this past boot; it is time for it to
recognize the same during boot.
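
The policy described above could be modelled roughly as follows. This is
a sketch, not the code from the patch series: the names are hypothetical,
and the precedence between the knobs (a driver that forces synchronous
probing wins over everything else) is my assumption, taken from the intent
of the "enforce synchronous probe" patches in this thread.

```python
from enum import Enum


class ProbeType(Enum):
    """Hypothetical per-driver preference, named for illustration only."""
    DEFAULT = 0        # driver expresses no preference
    PREFER_ASYNC = 1   # driver whitelisted itself for async probing
    FORCE_SYNC = 2     # driver must be probed synchronously


def probes_async(probe_type, module_async=False, debug_force_async=False):
    """Decide whether a driver is probed asynchronously.

    Assumed precedence: FORCE_SYNC always wins, then the driver's own
    whitelist, then the per-module option set by userspace, then the
    debug option. With nothing set, the default is synchronous.
    """
    if probe_type is ProbeType.FORCE_SYNC:
        return False
    if probe_type is ProbeType.PREFER_ASYNC:
        return True
    return module_async or debug_force_async


# Synchronous by default.
print(probes_async(ProbeType.DEFAULT))
# Userspace asked for async probing when loading the module.
print(probes_async(ProbeType.DEFAULT, module_async=True))
# A driver that enforces synchronous probing ignores the debug option.
print(probes_async(ProbeType.FORCE_SYNC, debug_force_async=True))
```

Switching the default from sync to async would amount to changing only
the last return statement in this model.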

Thanks.

--
Dmitry