Re: Oops in 2.2.10-ac10

Patrick J. LoPresti (patl@cag.lcs.mit.edu)
14 Jul 1999 13:56:39 -0400


I may have found the problem. The files I am looking at are all in
arch/i386/kernel and have had no relevant updates since 2.2.10.

There appears to be a race condition between irq.c:free_irq() and
io_apic.c:do_level_ioapic_IRQ().

The problem is that do_level_ioapic_IRQ() releases irq_controller_lock
before passing "action" to handle_IRQ_event(). (See io_apic.c line
1127.)

The close routine of the eepro100 driver, like most drivers (?), calls
free_irq(), and since the spinlock is not held during
handle_IRQ_event(), there is nothing to stop free_irq() from kfree-ing
the action structure *before* it is used by handle_IRQ_event(). (Yes,
the driver does disable interrupts before calling free_irq(), but
there is still a race condition.)

Shouldn't free_irq() check the IRQ_INPROGRESS flag before doing
kfree(action) ? I am not sure what it should do if the flag is set;
it basically needs to release the lock and defer the kfree() until the
IRQ is finished. Alternatively, the lock could be retained during the
call to handle_IRQ_event(), but that is probably longer than the lock
should be held...

Note that I am not an experienced kernel hacker, so I may just be
confused about all of this. If so, feel free to correct me.

- Pat

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/