Re: [PATCH - sort of] x86: Livelock in handle_pte_fault

From: Linus Torvalds
Date: Wed May 22 2013 - 11:01:51 EST


On Wed, May 22, 2013 at 5:33 AM, Rik van Riel <riel@xxxxxxxxxx> wrote:
>
> That sounds like maybe we DO want a TLB flush on spurious
> page faults, so we get rid of this problem.

Hmm. If it was just the Geode, I wouldn't be surprised. But with a Celeron too?

Anyway, worth testing..

> We can get flush_tlb_fix_spurious_fault to do a local TLB
> invalidate of just the address in question by removing the
> x86-specific dummy version, falling back to the asm-generic
> version that does something.
>
> Can you test the attached patch?

I think you should also remove the

	if (flags & FAULT_FLAG_WRITE)

test in handle_pte_fault(). If the fault is spurious, it might happen
on reads too, I think.
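
To make the difference concrete, here is a toy user-space model of the
retry loop. It is only a sketch, not kernel code: every toy_* name is
made up for illustration, and it assumes a single page whose cached
translation is stale (the page table already permits the access) and
hardware that does not drop that cached entry when it faults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policies for the spurious-fault path (illustration only). */
enum toy_flush_policy {
	TOY_FLUSH_NEVER,	/* models a dummy, do-nothing flush */
	TOY_FLUSH_ON_WRITE,	/* models flushing only for write faults */
	TOY_FLUSH_ALWAYS,	/* models flushing on any spurious fault */
};

/* One page, one cached translation. */
struct toy_mmu {
	bool pte_ok;		/* page table permits the access */
	bool tlb_valid;		/* a cached translation exists */
	bool tlb_ok;		/* cached translation permits the access */
};

/* Model of one access: refill from the page table only on a TLB miss. */
static bool toy_access(struct toy_mmu *m)
{
	if (!m->tlb_valid) {
		m->tlb_ok = m->pte_ok;
		m->tlb_valid = true;
	}
	return m->tlb_ok;
}

/* The PTE is already fine, so all the handler can do is (maybe) flush. */
static void toy_handle_spurious_fault(struct toy_mmu *m, bool is_write,
				      enum toy_flush_policy policy)
{
	if (policy == TOY_FLUSH_ALWAYS ||
	    (policy == TOY_FLUSH_ON_WRITE && is_write))
		m->tlb_valid = false;	/* local invalidate of this address */
}

/* Retry the access until it succeeds or max_faults faults were taken. */
static int toy_count_faults(bool is_write, enum toy_flush_policy policy,
			    int max_faults)
{
	/* Assumed starting state: page table is correct, TLB entry is stale. */
	struct toy_mmu m = { .pte_ok = true, .tlb_valid = true, .tlb_ok = false };
	int faults = 0;

	assert(max_faults > 0);
	while (faults < max_faults && !toy_access(&m)) {
		faults++;
		toy_handle_spurious_fault(&m, is_write, policy);
	}
	return faults;
}
```

Under those assumptions, toy_count_faults(true, TOY_FLUSH_ON_WRITE, 1000)
returns 1, but toy_count_faults(false, TOY_FLUSH_ON_WRITE, 1000) runs
into the cap of 1000, which is the livelock in miniature. With
TOY_FLUSH_ALWAYS both the read and the write case return 1.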

RT people - does RT do anything special with the page tables?

Stanislav, the patch you sent out may well work, but it's damned odd.
On UP, we don't do the leave_mm() optimization that makes that code
necessary. So I agree with Rik that the real bug is more likely somewhere else
(and infinite page faults do imply the TLB not getting flushed by the
page fault exception), and your patch might just be working around it
by simply flushing the TLB at least when switching between threads,
which still happens.

Linus