enable_ioapic_irq broken in arch/i386/kernel/irq.c

Claus-Justus Heine (claus@momo.math.rwth-aachen.de)
20 Apr 1998 11:30:55 +0200


Hi!

I think I have found a bug in enable_ioapic_irq() in arch/i3886/kernel/irq.c
(all(?) 2.1.* kernel versions up to 2.1.97), Dual PII system.

Symptoms:

The system freezes when using ide devices and SCSI devices at the same
time; it freezes more often with slow devices (e.g. when copying data
from an IDE cdrom to the SCSI disk)

Reason:

The two processors are spinning on the io_request_lock spinlock which
appears to be held by a third thread (not true, see below). They do so
in the SCSI resp. ide interrupt routines (found this by modifying the
spinlock() macro to printk the name of the calling __FUNCTION__ after
100000000 iterations.).

Explanation:

Why can this be? Normally, this should be impossible as
spin_lock_irqsave() calls __cli() and thus disables the local
interrupts; therefore the local processor -- that one that holds the
lock -- shouldn't be interrupted and the locked section of code should
be executed without interruptions. I first thought that the ide
interrupt somewhere calls a __sti(), but this is _not_ the
case.

Nevertheless the ide_intr() routine starts running on the _same_
processor while either unplug_device() or add_request() in
ll_rw_blk.c hold the io_request_lock spinlock.

The solution to this riddle is the implementation of the enable_irq()
function. Some functions in drivers/block/ide.c first disable all ide
related irq's by calling disable_irq() and later enable them again
with enable_irq(). So far so good. But the current implementation of
enable_irq() likes to call the irq handlers for possibly pending irq's
itself; now, if the caller of enable_irq() holds the io_request_lock
and the handler for a pending interrupt also tries to obtain that lock
then a deadlock is not only possible but even guaranteed.

This explains also why the deadlocks happen less often when
e.g. enabling 32bit IO-mode or DMA transfers with the cdrom drive as
this means that the spinlock is held for a shorter time and therefor
the chance that another irq occurs while the ide IO routines hold the
spinlock is reduced.

Now, there is already another way implemented of handling pending
irq's in enable_ioapic_irq(); it is just "#if 0"ed. I enabled the
block with the "#if 0":

#if 0
/*
* In the SMP+IOAPIC case it might happen that there are an unspecified
* number of pending IRQ events unhandled. These cases are very rare,
* so we 'resend' these IRQs via IPIs, to the same CPU. It's much
* better to do it this way as thus we dont have to be aware of
* 'pending' interrupts in the IRQ path, except at this point.
*/
if (!disabled_irq[irq] && irq_events[irq]) {
if (!ipi_pending[irq]) {
ipi_pending[irq] = 1;
--irq_events[irq];
send_IPI(cpu,IO_APIC_VECTOR(irq));
}
}
spin_unlock_irqrestore(&irq_controller_lock, flags);
#else
if (!disabled_irq[irq] && irq_events[irq]) {

[snip] (code that calls interrupt hanlders directly follows here)

It appears to work. In any case: the other code block is definitely
buggy. It violates the cli()/sti() versus enable_irq()/disable_irq()
semantics. AFAIK calling enable_irq() should NOT result in the
execution of the interrupt handler on the same processor if interrupts
had been locally switched off on that processor using __cli().

As the "#if 0" block appears to work, I'd suggest to enable it by
default? But maybe it implies other problems. Otherwise the "#else"
block should check whether interrupts are disabled locally. But when
to execute the pending interrupt in that case? Seems to be
tricky. I'll use the "#if 0" block until I run into other
problems.

Thanks for listening to all that blurb.

Cheers

Claus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu