Re: Please revert 5adc55da4a7758021bcc374904b0f8b076508a11(PCI_MULTITHREAD_PROBE)

From: Linus Torvalds
Date: Wed May 09 2007 - 13:22:34 EST




On Wed, 9 May 2007, Greg KH wrote:

> On Tue, May 08, 2007 at 01:58:10PM -0700, David Miller wrote:
> >
> > 1) A proper dependency system is necessary
> >
> > 2) Proper mutual exclusion for shared system resources/registers/etc.
> > that are poked at in an ad-hoc unlocked manner currently
> >
> > Is that basically what it boils down to?
>
> Yes, that's about it.
>
> Number 1 seemed to cause the most crashes, I don't think number 2 ever
> caused any problems, but it might have, there were too many weird oopses
> to be able to rule that out.

One issue is that a lot of shared resources and their locking really
aren't known until *after* you've done a first-level probing.

The classic example of this really is a cardbus controller, and almost any
multi-function PCI device. Yes, they are "independent" PCI devices in
their own right, but they almost invariably have some shared state.

A bus driver that probes them concurrently is simply broken.

And no, the solution is not to special-case multi-function devices and
always probe the subfunctions serially. That would suck for many things
(disk controllers are *also* often subfunctions). The solution really *is*
to probe the devices serially, and let the layer that actually knows what
it is doing (the low-level driver) decide how it goes from there.

I can almost guarantee that the same is true of most other buses. For
example, I wouldn't be surprised at all to hear that you shouldn't probe
the individual LUN's of many "SCSI" devices concurrently. The number of
bugs in things like multifunction card readers (total lockup if you read
the wrong config pages or even try to read past the end of the flash etc)
is just scary.

(Server people seem to think that "SCSI" == "high-end expensive hardware
that we paid too much for", and yes, that's sometimes true, but "SCSI"
also equals "el-cheapo stuff that sells for $5 and talks something that
looks enough like SCSI commands that we want to consider it SCSI").

So I'm really convinced that the bus layer should be serial and then have
some capability to allow lower levels to do the things *they* know is fine
to do independently in a parallel way. But anything that makes that be a
bus-level choice is almost guaranteed to be broken on just about all
buses!

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/