Re: Possibly SATA related freeze killed networking and RAID

From: Tejun Heo
Date: Fri Nov 30 2007 - 18:57:26 EST


Phillip Susi wrote:
> Tejun Heo wrote:
>> Because SFF ATA controllers don't have an IRQ pending bit. You don't know
>> whether IRQ is raised or not. Plus, accessing the status register which
>> clears pending IRQ can be very slow on PATA machines. It has to go
>> through the PCI and ATA bus and come back. So, unconditionally trying
>> to clear IRQ by accessing Status can incur noticeable overhead if the
>> IRQ is shared with devices which raise a lot of IRQs.
>
> There HAS to be a way to determine if that device generated the
> interrupt, or the interrupt can not be shared. Since the kernel said
> nobody cared about the interrupt, that indicates that the sata driver
> checked the status register and realized the sata chip didn't generate
> the interrupt, and returned to the kernel letting it know that the
> interrupt was not for it.

Surprise, surprise. There's no way to tell whether the controller
raised an interrupt or not if no command is in progress. As I said
before, there's no IRQ pending bit. While processing commands, you can
tell by looking at other status registers, but when there's nothing in
flight and the controller decides it's a good time to raise a
spurious interrupt, there's no way you can tell. That dang SFF
interface is like 15+ years old.

But we can still make things pretty robust. We're working on it.

Thanks.

--
tejun