Re: 2.6.19-rc <-> ThinkPads

From: Linus Torvalds
Date: Wed Nov 01 2006 - 14:57:36 EST




On Wed, 1 Nov 2006, Andi Kleen wrote:
>
> Fix race in IO-APIC routing entry setup.
>
> Interrupt could happen between setting the IO-APIC entry
> and setting its interrupt data.

This doesn't fix anything at all.

The interrupt can come in on another CPU, and if we end up having an
affinity change due to that, we then have "set_ioapic_affinity_irq()"
called on that other irq, and it might get to mess with the cpumask
because we dropped the ioapic_lock.

In other words, the problem is not that interrupts were re-enabled; the
problem is literally that the locking is _wrong_.

It's a small window, but we simply should not release the ioapic_lock in
between setting the routing and doing the "set_native_irq_info()" call.
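
A minimal user-space sketch of that rule, not the actual io_apic.c code:
a pthread mutex stands in for the ioapic_lock spinlock, and irq_state,
write_entry_locked(), set_irq_info_locked(), setup_irq(), set_affinity()
and read_irq() are all hypothetical names made up for illustration. The
point it tries to show is that the low-level helpers take no lock
themselves, so the caller can hold one lock across both updates.

```c
#include <pthread.h>
#include <stdint.h>

#define NR_IRQS 16

/* Hypothetical model of the two pieces of state that must stay in sync. */
struct irq_state {
	uint32_t route;   /* stands in for the IO-APIC routing entry */
	uint32_t cpumask; /* stands in for what set_native_irq_info() records */
};

static struct irq_state irq_table[NR_IRQS];

/* User-space stand-in for ioapic_lock (a spinlock in the kernel). */
static pthread_mutex_t ioapic_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold ioapic_lock. Deliberately does no locking of its own. */
static void write_entry_locked(unsigned int irq, uint32_t vector, uint32_t mask)
{
	/* Assumed layout: destination in the top byte, vector in the low byte. */
	irq_table[irq].route = ((mask & 0xffu) << 24) | (vector & 0xffu);
}

/* Caller must hold ioapic_lock. */
static void set_irq_info_locked(unsigned int irq, uint32_t mask)
{
	irq_table[irq].cpumask = mask & 0xffu;
}

/* Routing and IRQ info are written inside ONE critical section. */
int setup_irq(unsigned int irq, uint32_t vector, uint32_t mask)
{
	if (irq >= NR_IRQS)
		return -1;
	pthread_mutex_lock(&ioapic_lock);
	write_entry_locked(irq, vector, mask);
	set_irq_info_locked(irq, mask);
	pthread_mutex_unlock(&ioapic_lock);
	return 0;
}

/* An affinity change takes the same lock, so it cannot land between
 * the two writes done by setup_irq(). */
int set_affinity(unsigned int irq, uint32_t mask)
{
	if (irq >= NR_IRQS)
		return -1;
	pthread_mutex_lock(&ioapic_lock);
	write_entry_locked(irq, irq_table[irq].route & 0xffu, mask);
	set_irq_info_locked(irq, mask);
	pthread_mutex_unlock(&ioapic_lock);
	return 0;
}

/* Take a consistent snapshot of one IRQ's state. */
int read_irq(unsigned int irq, struct irq_state *out)
{
	if (irq >= NR_IRQS)
		return -1;
	pthread_mutex_lock(&ioapic_lock);
	*out = irq_table[irq];
	pthread_mutex_unlock(&ioapic_lock);
	return 0;
}
```

If write_entry_locked() took and released the lock itself, set_affinity()
on another thread could run after the routing write but before
set_irq_info_locked(), and the two fields could end up disagreeing.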

So I think doing the locking inside "ioapic_write_entry()" is simply
fundamentally wrong. When you did the cleanup, your commit message talked
about how it might add a few more lock/unlock things:

	In a few cases the IO APIC lock is taken more often now, but this
	isn't a problem because it's all initialization/shutdown only
	slow path code.

but the point is, this is not about "performance". It's about
_correctness_.

Linus