Re: page fault scalability patch final : i386 tested, x86_64support added

From: Benjamin Herrenschmidt
Date: Tue Aug 31 2004 - 23:34:54 EST


On Sat, 2004-08-28 at 09:20, Christoph Lameter wrote:
> Signed-off-by: Christoph Lameter <clameter@xxxxxxx>
>
> This is the fifth (and hopefully final) release of the page fault
> scalability patches. The scalability patches avoid locking during the
> creation of page table entries for anonymous memory in a threaded
> application. The performance increases significantly for more than 2
> threads running concurrently.

Sorry for "waking up" late on this one but we've been kept busy by
a lot of other things.

The removal of the page table lock has other more subtle side effects
on ppc64 (and ppc32 too) that aren't trivial to solve. Typically, due
to the way we use the hash table as a TLB cache.

For example, out ptep_test_and_clear will first clear the PTE and then
flush the hash table entry. If in the meantime another CPU gets in,
takes a fault, re-populates the PTE and fills the hash table via
update_mmu_cache, we may end up with 2 hash PTEs for the same linux
PTE at least for a short while. This is a potential cause of checkstop
on ppc CPUs.

There may be other subtle races of that sort I haven't encovered yet.

We need to spend more time on our (ppc/ppc64) side to figure out what
is the extent of the problem. We may have a cheap way to fix most of the
issues using the PAGE_BUSY bit we have in the PTEs as a lock, but we
don't have that facility on ppc32.

I think there wouldn't be a problem if we could guarantee exclusion
between page fault and clearing of a PTE (that is basically having the
swapper take the mm write sem) but I don't think that's realistic, oh
well, not that I understand anything about the swap code anyways...

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/