Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQstorm and rogue DMA access
From: Tejun Heo
Date: Wed Mar 14 2007 - 22:38:06 EST
Stephen Hemminger wrote:
> The problem is the BIOS is busted on these machines. How much effort
> do we want to put into dealing with systems with broken BIOS?
> I would rather have the root cause fixed than creating a bandaid that
> has to be maintained for all the other architectures and platforms.
For sky2/skge, it might be caused by a broken BIOS. For some ATA devices,
it's just the hardware which is designed that way. Also, on non-x86
machines and during resume, there's no BIOS to nudge chips into a sane
state. This is an existing problem which has to be solved. How much
effort we are gonna put into it is certainly debatable.
Also, the current implementation doesn't have any arch dependent part.
It's wholly contained in the arch independent PCI layer, but it might be
beneficial to have arch dependent hooks (IRQ line enable/disable?) in
the future.
> What if the device with the IRQ problem is never loaded? Sometimes
> devices aren't loaded until after boot.
What do you mean by loading a device? Do you mean loading the driver for
the device? The patch as posted is probably not a complete solution.
We probably need to make sure during early boot and resume that IRQs and
bus mastering are turned off on all devices where possible, and let low
level drivers enable them as needed, after a certain amount of
initialization has been performed.
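
To make the early boot / resume idea concrete, here is a rough user-space
sketch. It is not code from the patch: struct fake_pci_dev and
fake_pci_quiesce_all() are hypothetical stand-ins, and only the two
command register bit values are real (they match the PCI spec and
pci_regs.h). Devices older than PCI 2.3 have no INTx disable bit, which is
one reason for the "where possible" above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real PCI command register bits. */
#define PCI_COMMAND_MASTER       0x0004  /* bus mastering (DMA) enable */
#define PCI_COMMAND_INTX_DISABLE 0x0400  /* INTx assertion disable */

/* Hypothetical stand-in for struct pci_dev: only the command register. */
struct fake_pci_dev {
    uint16_t command;
};

/*
 * Turn off bus mastering and INTx on every device in the array.
 * Returns how many devices actually had to be changed.
 */
static size_t fake_pci_quiesce_all(struct fake_pci_dev *devs, size_t n)
{
    size_t touched = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint16_t old = devs[i].command;
        uint16_t quiet = (uint16_t)((old & ~PCI_COMMAND_MASTER) |
                                    PCI_COMMAND_INTX_DISABLE);

        if (quiet != old) {
            devs[i].command = quiet;
            touched++;
        }
    }
    return touched;
}
```

A second pass over the same array changes nothing, so running it at both
boot and resume would be harmless in this model.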
> If you use MSI interrupts, they aren't shared so there isn't a problem.
> Maybe the root cause of this is bad MSI emulation handling in BIOS.
Yes, if MSI is used things are better.
> Any change like this has to be done without changing device drivers.
> Changing the skge/sky2 drivers as special case is not acceptable.
I dunno about that. What I'm proposing is an alternative two-step PCI
initialization - the first step enables the device just enough for
initialization/reset and the second one enables full access. We're
already doing part of it for bus master. I'm proposing to expand that
approach and have the generic PCI layer handle both steps. As you can
see, it doesn't add noticeable complexity to drivers. I think it's even
clearer than doing pci_set_master() explicitly.
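
As an illustration of the two steps, here is a minimal user-space model of
the command register. The names fake_pci_prepare() and fake_pci_activate()
are made up for this sketch and are not the functions in the patch; which
bits each step touches is also an assumption, based on the goal of keeping
IRQs and DMA off until the device has been reset. The bit values themselves
are the real ones from the PCI spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real PCI command register bits. */
#define PCI_COMMAND_IO           0x0001  /* respond to I/O space */
#define PCI_COMMAND_MEMORY       0x0002  /* respond to memory space */
#define PCI_COMMAND_MASTER       0x0004  /* bus mastering (DMA) enable */
#define PCI_COMMAND_INTX_DISABLE 0x0400  /* INTx assertion disable */

/* Hypothetical stand-in for struct pci_dev: only the command register. */
struct fake_pci_dev {
    uint16_t command;
};

/*
 * Step 1 (hypothetical): enable just enough access to reset and
 * initialize the device. Register access is on, DMA and INTx are off.
 */
static void fake_pci_prepare(struct fake_pci_dev *dev)
{
    assert(dev != NULL);
    dev->command |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
    dev->command = (uint16_t)(dev->command & ~PCI_COMMAND_MASTER);
    dev->command |= PCI_COMMAND_INTX_DISABLE;
}

/*
 * Step 2 (hypothetical): the driver has reset the device, so allow
 * it to raise INTx and to master the bus.
 */
static void fake_pci_activate(struct fake_pci_dev *dev)
{
    assert(dev != NULL);
    dev->command = (uint16_t)(dev->command & ~PCI_COMMAND_INTX_DISABLE);
    dev->command |= PCI_COMMAND_MASTER;
}
```

In this model a device that firmware left with bus mastering on ends up
with it off after the first step, and only gets it back once the driver
asks for full access.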
If this way of solving the problem is chosen, eventually most drivers
should be converted to the new initialization steps. And there is no way
to do this without modifying the low level driver. Only the low level
driver knows when full blown access can be enabled, and that must happen
before registering the device with the upper layer (e.g. ATA/SCSI, netif).
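
The ordering constraint can be sketched as a toy probe routine. Everything
here is hypothetical (struct fake_dev and all fake_*() helpers are invented
for illustration and don't correspond to any kernel API); the only point is
that each step refuses to run before the previous one, and registration
with the upper layer comes last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state tracking which init steps have run. */
struct fake_dev {
    bool prepared;    /* limited access enabled */
    bool reset_done;  /* driver-specific reset/init finished */
    bool active;      /* IRQ and DMA fully enabled */
    bool registered;  /* visible to the upper layer */
};

static int fake_prepare(struct fake_dev *d)
{
    d->prepared = true;
    return 0;
}

/* Driver-specific reset; needs register access from the prepare step. */
static int fake_hw_reset(struct fake_dev *d)
{
    if (!d->prepared)
        return -1;
    d->reset_done = true;
    return 0;
}

/* Full access is only allowed once the device is in a known state. */
static int fake_activate(struct fake_dev *d)
{
    if (!d->reset_done)
        return -1;
    d->active = true;
    return 0;
}

/* The upper layer may start issuing commands immediately. */
static int fake_register_upper(struct fake_dev *d)
{
    if (!d->active)
        return -1;
    d->registered = true;
    return 0;
}

/* Probe in the order only the low level driver can know. */
static int fake_probe(struct fake_dev *d)
{
    assert(d != NULL);
    if (fake_prepare(d) || fake_hw_reset(d) || fake_activate(d))
        return -1;
    return fake_register_upper(d);
}
```

Calling fake_register_upper() on a fresh device fails in this model, which
is the case the generic layer can't detect on its own.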
sky2/skge aren't exceptions; they would be converted to the new model
like most if not all other drivers. It may take two years, maybe five,
but as a start just converting ATA and network drivers shouldn't take
too long, and that would help in a lot of cases.
Thanks.
--
tejun