Re: Please revert 5adc55da4a7758021bcc374904b0f8b076508a11(PCI_MULTITHREAD_PROBE)

From: Cornelia Huck
Date: Tue May 08 2007 - 15:22:10 EST


On Tue, 8 May 2007 11:30:03 -0700 (PDT),
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> It's better because then the *driver* can decide whether it supports
> threaded probing or not.

Is there any reason why a driver shouldn't be able to handle probing
several devices at once? What if two devices become hotplugged at the
same time? Does this imply that the bus _always_ needs to do some
serializing there?

> Also, quite frankly, when this came up I already posted a much better
> approach for allowing arbitrary threading. It was ignored, but maybe you
> want to take a look.
>
> See
>
> http://lkml.org/lkml/2006/10/27/185
>
> for last time this came up. Any driver can decide to do
>
> execute_in_parallel(myprobe, myarguments);
>
> and that really *simple* thing allows synchronization points
> ("wait_for_parallel()") and arbitrary setting of the allowed parallelism.

Yeah, I saw this back then. But this is much too inflexible. How do you
determine the settings? On a large system, you could throttle too much,
obliterating most of the benefits of the parallel approach; on a small
system, you could overwhelm the system.

> I really think that is a *much* better approach. It allows, for example,
> the driver to do certain things (like name allocations and any
> particularly *fast* parts of probing) synchronously, and then do slow
> things (like spinning up disks and actually reading the partition tables)
> asynchronously.

Hm, I think "asynchronous" is the key word here, not "multithreaded".
Thinking some more of it, we really want to do the following:
- look at the device:
- do the fast stuff
- kick off the slow stuff
- look at the next device
- ...
- wait until all devices said they are finished
- each device needs to signal completion when the slow stuff has been
done (or hit a timeout)

(By some curious accident, that is what s390 cio does :))

This way, we are much more flexible than with the execute_in_parallel()
approach. This async approach especially looks suited to stuff where
some kind of operation is submitted to the hardware and we need to wait
for an interrupt/completion. And for that kind of operations, it scales
equally well on small or big machines.

> Anyway, I will refuse to merge anything that does generic bus-level
> (unless the bus is something off-the-board like SCSI) parallelism, unless
> you can really show me how superior and stable it is. I seriously doubt
> you can. We tried it once, and it was a total disaster.

Maybe the criterion should be more "a resonably unified set of
devices". s390 channel attached devices are really all the same (at a
certain not driver-specific level), and the same may be true for
something like scsi. pci may be just too diverse with too many
dissimilar drivers.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/