[PATCH] i386 tlb flush bug fix

From: Manfred (manfred@colorfullife.com)
Date: Thu Jul 20 2000 - 05:50:24 EST


I found a tlb flush race in switch_mm:

If
* the cpu is in lazy tlb mode
* it perform a thread switch without changing active_mm.
* a flush ipi arrives while the cpu is in switch_mm, after the
set_bit(cpu_vm_mask), but before setting cpu_tlbstate[].state.

then the cpu will miss all tlb flush ipis until the next reschedule.

My first bugfix left a tiny race, below is a new bugfix.

I've tested it on my Dual-pII, but I don't have any special multi threaded
test apps.

Please test it!

--
	Manfred

<<<<<<<<<<<<< // $Header$ // Kernel Version: // VERSION = 2 // PATCHLEVEL = 4 // SUBLEVEL = 0 // EXTRAVERSION = -test4 --- 2.4/include/asm-i386/pgalloc.h Sat May 13 10:35:31 2000 +++ build-2.4/include/asm-i386/pgalloc.h Thu Jul 20 10:54:47 2000 @@ -240,7 +240,6 @@ #define TLBSTATE_OK 1 #define TLBSTATE_LAZY 2 -#define TLBSTATE_OLD 3 struct tlb_state { --- 2.4/include/asm-i386/mmu_context.h Sat May 13 10:35:31 2000 +++ build-2.4/include/asm-i386/mmu_context.h Thu Jul 20 11:14:35 2000 @@ -27,8 +27,9 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu) { - set_bit(cpu, &next->cpu_vm_mask); if (prev != next) { + /* stop flush ipis for the previous mm */ + clear_bit(cpu, &prev->cpu_vm_mask); /* * Re-load LDT if necessary */ @@ -38,20 +39,22 @@ cpu_tlbstate[cpu].state = TLBSTATE_OK; cpu_tlbstate[cpu].active_mm = next; #endif + set_bit(cpu, &next->cpu_vm_mask); /* Re-load page tables */ asm volatile("movl %0,%%cr3": :"r" (__pa(next->pgd))); - clear_bit(cpu, &prev->cpu_vm_mask); } #ifdef CONFIG_SMP else { - int old_state = cpu_tlbstate[cpu].state; cpu_tlbstate[cpu].state = TLBSTATE_OK; if(cpu_tlbstate[cpu].active_mm != next) BUG(); - if(old_state == TLBSTATE_OLD) + if(!test_and_set_bit(cpu, &next->cpu_vm_mask)) { + /* We were in lazy tlb mode and leave_mm disabled + * tlb flush IPI delivery. We must flush our tlb. + */ local_flush_tlb(); + } } - #endif } --- 2.4/arch/i386/kernel/smp.c Fri Jul 14 09:01:50 2000 +++ build-2.4/arch/i386/kernel/smp.c Thu Jul 20 11:20:57 2000 @@ -207,7 +207,7 @@ * These mean you can really definitely utterly forget about * writing to user space from interrupts. (Its not allowed anyway). * - * Optimizations Manfred Spraul <manfreds@colorfullife.com> + * Optimizations Manfred Spraul <manfred@colorfullife.com> */ static volatile unsigned long flush_cpumask; @@ -216,23 +216,45 @@ static spinlock_t tlbstate_lock = SPIN_LOCK_UNLOCKED; #define FLUSH_ALL 0xffffffff +/* + * We cannot call mmdrop() because we are in interrupt context, + * instead update mm->cpu_vm_mask. + */ static void inline leave_mm (unsigned long cpu) { if (cpu_tlbstate[cpu].state == TLBSTATE_OK) BUG(); clear_bit(cpu, &cpu_tlbstate[cpu].active_mm->cpu_vm_mask); - cpu_tlbstate[cpu].state = TLBSTATE_OLD; } /* * * The flush IPI assumes that a thread switch happens in this order: - * 1) set_bit(cpu, &new_mm->cpu_vm_mask); - * 2) update cpu_tlbstate - * [now the cpu can accept tlb flush request for the new mm] - * 3) change cr3 (if required, or flush local tlb,...) - * 4) clear_bit(cpu, &old_mm->cpu_vm_mask); - * 5) switch %%esp, ie current + * [cpu0: the cpu that switches] + * 1) switch_mm() either 1a) or 1b) + * 1a) thread switch to a different mm + * 1a1) clear_bit(cpu, &old_mm->cpu_vm_mask); + * Stop ipi delivery for the old mm. This is not synchronized with + * the other cpus, but smp_invalidate_interrupt ignore flush ipis + * for the wrong mm, and in the worst case we perform a superflous + * tlb flush. + * 1a2) set cpu_tlbstate to TLBSTATE_OK + * Now the smp_invalidate_interrupt won't call leave_mm if cpu0 + * was in lazy tlb mode. + * 1a3) update cpu_tlbstate[].active_mm + * Now cpu0 accepts tlb flushes for the new mm. + * 1a4) set_bit(cpu, &new_mm->cpu_vm_mask); + * Now the other cpus will send tlb flush ipis. + * 1a4) change cr3. + * 1b) thread switch without mm change + * cpu_tlbstate[].active_mm is correct, cpu0 already handles + * flush ipis. + * 1b1) set cpu_tlbstate to TLBSTATE_OK + * 1b2) test_and_set the cpu bit in cpu_vm_mask. + * Atomically set the bit [other cpus will start sending flush ipis], + * and test the bit. + * 1b3) if the bit was 0: leave_mm was called, flush the tlb. + * 2) switch %%esp, ie current * * The interrupt must handle 2 special cases: * - cr3 is changed before %%esp, ie. it cannot use current->{active_,}mm. @@ -249,8 +271,6 @@ * * 1) Flush the tlb entries if the cpu uses the mm that's being flushed. * 2) Leave the mm if we are in the lazy tlb mode. - * We cannot call mmdrop() because we are in interrupt context, - * instead update cpu_tlbstate. */ asmlinkage void smp_invalidate_interrupt (void) >>>>>>>>>>>>>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jul 23 2000 - 21:00:13 EST