Symptoms:
~~~~~~~~~
Between ten and thirty "stuck on smp_invalidate_needed ..." messages
after the second CPU has been fired off.
Reason:
~~~~~~~
smp_flush_tlb() sets the smp_invalidate_needed mask to "cpu_present_map".
It then calls smp_message_pass(MSG_ALL_BUT_SELF, ) to propagate the
flush_tlb() request to all other CPUs and then calls local_flush_tlb().
The problem is that smp_flush_tlb() doesn't clear the bit belonging to
the local CPU itself in "smp_invalidate_needed". As soon as the other
CPUs have been fired off, this is no longer a problem as
smp_message_pass() then clears the bit.
BUT: smp_flush_tlb() is called two or three times before the other
CPUs have been fired off. In this case smp_message_pass() is a no-op,
so it doesn't clear the local CPU's bit in smp_invalidate_needed either.
This causes some "stuck on smp_invalidate_needed ..." messages. The
problem eventually "fixes" itself when either of the following things
happens:
a) somebody calls smp_flush_tlb() or otherwise sends an
MSG_INVALIDATE_TLB message
b) the first CPU enters an irq context while another CPU is already
executing an interrupt, in which case get_irqlock() and
wait_on_irq() will eventually call check_smp_invalidate().
The patch below fixes the problem by simply clearing the bit belonging
to the local processor in smp_flush_tlb().
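
For anyone who wants to see the sequence in isolation, here is a rough
userspace model of it. This is only a sketch, not kernel code: the
model_* names, the two-CPU present map and the choice of CPU 0 as the
local CPU are all made up for illustration, and the only behaviour it
mimics is what is described above (smp_message_pass() being a no-op
until the other CPUs are running, and clearing the local bit afterwards).

```c
#include <assert.h>

#define MODEL_THIS_CPU 0        /* hypothetical: the local (boot) CPU is CPU 0 */

/* hypothetical stand-ins for cpu_present_map and smp_invalidate_needed */
static unsigned long model_present_map = 0x3UL;   /* two CPUs present */
static unsigned long model_invalidate_needed;

/*
 * Stand-in for smp_message_pass(): a no-op until the other CPUs have
 * been fired off; afterwards it clears the local CPU's bit.
 */
static void model_message_pass(int others_running)
{
    if (!others_running)
        return;
    model_invalidate_needed &= ~(1UL << MODEL_THIS_CPU);
}

/*
 * Stand-in for smp_flush_tlb(). Returns 1 if the local CPU's bit is
 * still set afterwards (the "stuck" case), 0 otherwise.
 */
static int model_flush_tlb(int others_running, int apply_patch)
{
    assert(MODEL_THIS_CPU < (int)(8 * sizeof(unsigned long)));

    model_invalidate_needed = model_present_map;
    model_message_pass(others_running);

    /* the real code calls local_flush_tlb() at this point */

    if (apply_patch)    /* corresponds to the clear_bit() line in the patch */
        model_invalidate_needed &= ~(1UL << MODEL_THIS_CPU);

    return (int)((model_invalidate_needed >> MODEL_THIS_CPU) & 1UL);
}
```

In this model, model_flush_tlb(0, 0) returns 1 (the local bit is left
set before the other CPUs are up), while model_flush_tlb(0, 1) and
model_flush_tlb(1, 0) both return 0.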
Cheers
Claus
########################################################################
--- linux-2.1/arch/i386/kernel/smp.c.old Sat Jul 25 18:58:50 1998
+++ linux-2.1/arch/i386/kernel/smp.c Sat Jul 25 19:00:18 1998
@@ -1395,6 +1395,7 @@
 	 */
 	local_flush_tlb();
+	clear_bit(smp_processor_id(), &smp_invalidate_needed);
 	__restore_flags(flags);