Re: Linux 2.6.20-rc6 - sky2 resume breakage

From: Jeff Garzik
Date: Tue Jan 30 2007 - 03:03:25 EST


Ingo Molnar wrote:
btw., it would be great if you could help us here: could you perhaps, from a past example, outline a specific case of such an ATA/USB IRQ storm and how it occured (precisely) - and what the fix was? I'd like to analyze a specific case to make sure the genirq layer recovers from such cases more gracefully. In general, i think the IRQ subsystem needs to become more failure-resilient and needs to become more auto-learning (and these two dont stand in the way of good performance). This problem of shared IRQs will be with us for at least another 10 years, if not more. (for example ISA is /still/ not dead everywhere and it was already legacy technology 15 years ago when Linux was started.)


Easy to name an example, as they are pretty generic. When sharing irqs -- usually ATA is configured to PCI native (IO-APIC-fasteoi) -- any interrupt storm causes the other devices sharing that irq to crap themselves (kernel turns off irq, suggests irqpoll, etc.)

ATA is unfortunately easier to cause interrupt storms than most because the standard PCI IDE definition has __no__ possible way to indicate certain interrupt conditions are pending. You have to /know/ that you are expecting an interrupt, which causes problems if the hardware decides to send the interrupt early or late, rather than when its expected. Most modern hardware has a read/write/clear interrupt status register that gives you an immediate summary of the pending interrupt conditions, and an easy way to ack the pending events. ATA does not have any such capability.

That said, stuff like AHCI or sata_sil or sata_sil24 do have modern designs with the expected interrupt status register(s), so they do not suffer from the problems suffered by the more legacy-like hardware (ata_piix, sata_via, pata_*)

Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/